Last tested: Dec 12, 2019
The reaper is the Looker process responsible for dropping inactive persistent derived tables (PDTs) from the scratch schema.
Schedule:
Unlike the regenerator, there is only one reaper thread for all connections; in a clustered deployment, that thread runs on the master node. The reaper runs at most once an hour, subject to the cron expression specified in the PDT Maintenance Schedule of the connection settings.
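One way to read "at most once an hour" is as a minimum spacing between reaper runs, with the maintenance cron supplying the moments at which a run may start. The sketch below illustrates that reading only; the names should_reap, simulate, and MIN_REAP_INTERVAL are hypothetical and are not Looker internals.

```python
from datetime import datetime, timedelta

# Assumption: "at most every hour" means at least one hour between runs.
MIN_REAP_INTERVAL = timedelta(hours=1)


def should_reap(now, last_reap):
    """Return True if a maintenance tick at `now` may start the reaper.

    `last_reap` is None when the reaper has not run yet.
    """
    if last_reap is None:
        return True
    return now - last_reap >= MIN_REAP_INTERVAL


def simulate(ticks):
    """Given maintenance-cron tick times, return the ticks on which the reaper runs."""
    runs = []
    last = None
    for tick in ticks:
        if should_reap(tick, last):
            runs.append(tick)
            last = tick
    return runs
```

Under this reading, a maintenance schedule that fires every five minutes would still start the reaper only once per hour.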
The process:
The reaper uses the active_derived_tables table in the internal database to determine which tables it should drop from the scratch schema. Before the reaper drops any tables, it updates the active_derived_tables table.
Updating the active_derived_tables table:
The reaper first acquires a list of all derived tables in use in production-built LookML and then deletes rows from the active_derived_tables table for any tables not in that list.
Next, the reaper gathers a list of all tables that currently exist on the scratch schema and then deletes rows from the active_derived_tables table for any tables that are not in that list.
Next, the reaper removes rows from the active_derived_tables table for any persist_for PDTs that have expired.
After this, the active_derived_tables table is up to date with the active tables that live on the scratch schema.
In its updated state, the active_derived_tables table contains a list of all active tables that exist on the scratch schema. Note that this table does not include tables that should exist on the scratch schema but, for some reason, do not (e.g. a table that has not been built yet, or one that was dropped by a user or by error).
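The three pruning passes above can be sketched as successive filters over the rows of active_derived_tables. This is a minimal illustration, not Looker's implementation: the row shape (table name mapped to an optional expiry time), the function name refresh_active_tables, and the example table names are all assumptions made for the sketch.

```python
from datetime import datetime


def refresh_active_tables(active_rows, production_tables, scratch_tables, now):
    """Return the rows that survive the three pruning passes.

    active_rows: dict of table name -> expiry datetime for persist_for PDTs,
                 or None for tables without an expiry (hypothetical shape).
    production_tables: set of derived table names used by production LookML.
    scratch_tables: set of table names that currently exist on the scratch schema.
    """
    # Pass 1: remove rows for tables no longer used in production LookML.
    rows = {t: exp for t, exp in active_rows.items() if t in production_tables}
    # Pass 2: remove rows for tables that do not exist on the scratch schema.
    rows = {t: exp for t, exp in rows.items() if t in scratch_tables}
    # Pass 3: remove rows for persist_for PDTs whose expiry has passed.
    rows = {t: exp for t, exp in rows.items() if exp is None or exp > now}
    return rows


# Hypothetical example data.
now = datetime(2019, 12, 12, 12, 0)
active = {
    "LR$AB_orders": None,                          # in LookML and on the scratch schema
    "LR$AB_old_view": None,                        # no longer in production LookML
    "LR$AB_missing": None,                         # in LookML but not on the scratch schema
    "LC$AB_cache": datetime(2019, 12, 12, 11, 0),  # persist_for expired an hour ago
}
production = {"LR$AB_orders", "LR$AB_missing", "LC$AB_cache", "LR$AB_unbuilt"}
scratch = {"LR$AB_orders", "LR$AB_old_view", "LC$AB_cache"}

refreshed = refresh_active_tables(active, production, scratch, now)
```

Only LR$AB_orders survives all three passes. LR$AB_unbuilt is in production LookML but has not been built, so it never appears in the result, which matches the note above about tables that should exist but do not.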
Now that the active_derived_tables table is up to date, the reaper is ready to begin dropping tables.
Checking reg_key and dropping tables:
For every table on the scratch schema, the reaper first checks whether the table's reg_key is valid. The reg_key is the two letters following LX$ in the table name, where X is C or R. The connection_reg_r3 table on the scratch schema contains a list of all valid reg_keys.
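Extracting the reg_key from a table name, as described above, can be sketched with a regular expression. The function name extract_reg_key and the example table names are hypothetical. The pattern accepts any two characters after the prefix, which is an assumption: the text only says "letters".

```python
import re

# "L", then C or R, then "$", then the two characters that follow.
# Assumption: any two characters are accepted as the reg_key.
REG_KEY_PATTERN = re.compile(r"^L[CR]\$(.{2})")


def extract_reg_key(table_name):
    """Return the two-character reg_key from a scratch table name, or None."""
    match = REG_KEY_PATTERN.match(table_name)
    return match.group(1) if match else None
```

For a hypothetical table named LR$ABxyz_orders, this returns AB. A name that does not start with LC$ or LR$ yields None.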
If the reg_key is valid:
- If there is a row for this table in the active_derived_tables table, the table will not be dropped.
- If there is not a row for this table in the active_derived_tables table, the table will be dropped.
If the reg_key is not valid, the reaper checks the connection_reg_r3 table to see if this reg_key exists in that table.
- If there is a row in the connection_reg_r3 table corresponding to this table's reg_key, the table will not be dropped.
- If there is not a row in the connection_reg_r3 table corresponding to this table's reg_key, the table will be dropped.
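The two branches above reduce to a small decision function. In this sketch the outcome of the validity check is passed in as a flag rather than computed, because how the reaper decides validity is not modeled here. The function name should_drop, its parameters, and the example values are hypothetical.

```python
def should_drop(table_name, reg_key, reg_key_is_valid, active_tables, registered_keys):
    """Return True if the reaper would drop this scratch table.

    reg_key_is_valid: outcome of the reaper's validity check (an input here).
    active_tables: set of table names that have a row in active_derived_tables.
    registered_keys: set of reg_keys that have a row in connection_reg_r3.
    """
    if reg_key_is_valid:
        # Valid key: the table is kept only if active_derived_tables has a row for it.
        return table_name not in active_tables
    # Invalid key: the table is kept only if its key has a row in connection_reg_r3.
    return reg_key not in registered_keys


# Hypothetical example data.
active_tables = {"LR$AB_orders"}
registered_keys = {"AB", "CD"}
```

With this data, a valid-key table named LR$AB_orders is kept, a valid-key table with no active row is dropped, and an invalid-key table is kept only when its key (for example CD) appears in registered_keys.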
Enemy Reaping
The reaper typically only reaps tables for its Looker instance - it can identify the instance a PDT belongs to based on the instance hash in the PDT table name.
Under very specific circumstances, two instances can have the same instance hash and the reaper for one instance will reap PDTs that are active on another instance. This is known as enemy reaping and can cause problems with the reaping process.
This content is subject to limited support.