Usage
derived_table: {
cluster_keys: ["customer_city", "customer_state"]
…
}
}
Hierarchycluster_keys - or - cluster_keys |
Default ValueNone AcceptsOne or more clustered column namesSpecial Rulescluster_keys is supported only on specific dialects
|
Definition
Clustering a partitioned table sorts the data in a partition based on the values in the clustered columns and organizes the clustered columns in optimally sized storage blocks. Clustering can improve the performance and reduce the cost of queries that filter on or aggregate by the clustered columns.
See the Dialect support for
cluster_keys
section below for the list of dialects that supportcluster_keys
.
The
cluster_keys
parameter works only with tables that are persistent, such as PDTs and aggregate tables.cluster_keys
is not supported for derived tables without a persistence strategy.In addition, the
cluster_keys
parameter is not supported for derived tables that are defined usingcreate_process
orsql_create
.Lastly, Google BigQuery supports clustering on partitioned tables only. The
cluster_keys
parameter works only with PDTs or aggregate tables that are also partitioned using thepartition_keys
parameter.
To add a clustered column to a persistent derived table (PDT) or an aggregate table, use the cluster_keys
parameter and supply the names of the columns you want clustered in the database table.
Examples
Create a customer_order_facts
native derived table on a Google BigQuery database, partitioned on the date
column and clustered on the city
, age_tier
, and gender
columns to optimize queries that are filtered or aggregated on those columns:
Things to consider
Google BigQuery tables can partition only on date fields
Google BigQuery tables can be partitioned only on a date or timestamp column. If you want to add clustered columns to a PDT or aggregate table that does not include date or time-based data, one way to do that is to add a date column using a SQL statement such as SELECT CURRENT_DATETIME() as now
, and then use partition_keys
to partition on the new column. You can then use clustering on other columns in your PDT or aggregate table.
Dialect support for cluster_keys
The ability to use cluster_keys
depends on the database dialect your Looker connection is using. In Looker 22.6, the following dialects support cluster_keys
: