This article is written with help from Emma Ware, Training Content Developer, Customer Education
The Problem
As a general rule in SQL — and, by extension, Looker — you cannot group a query by the results of an aggregate function (represented in Looker as measures). You can only group by unaggregated fields (represented in Looker as dimensions). If you try to aggregate a measure in Looker, you will see the following error:
What if you need a way to change a measure (COUNT, SUM, AVG, MAX, etc.) into a dimension, so that you can group by it to aggregate it (like a SUM of a COUNT or an AVG of a SUM), filter on it (in the WHERE clause rather than a HAVING clause), or pivot on it in an Explore?
The Solution
Internally at Looker, we call this process dimensionalizing a measure. This is because we redefine a measure as a dimension. This is accomplished by creating a derived table that includes the measure we want to dimensionalize in its SQL definition.
The Process
The following example is based on a sample e-commerce data set. The end goal for this example is to create an average measure based on an existing Total Revenue SUM measure.
The steps below outline how to generate a SQL-based derived table. You can elect to create a LookML-based derived table, or Native Derived Table (NDT), as an alternative to SQL.
- You start by setting up a base Explore. Then choose the appropriate fields for the base query, including the measure you want to dimensionalize. For example, you can choose a Salesperson Name dimension and a Count of Sales measure.
In our example use case, Total Revenue is grouped by Users State and Users ID: - From the Explore, choose Open in SQL Runner from the SQL tab to open the query in SQL Runner:
Test and add the derived table SQL to your LookML project directly from SQL Runner. - After running the query in SQL Runner (by clicking the Run button) and confirming the results, choose Add to Project from the gear menu. At this point you need to remove any row limit clause in the derived table SQL to ensure that all desired results are included in the query.
You can also choose Get Derived Table LookML from the menu to copy and paste the generated LookML into your project manually.
- Select a project name from the Project drop-down, enter a name for your derived table view file, and click Add:
- Now that the derived table is in a view file, you can create a measure that aggregates the dimensionalized measure. For example, you can now create an average measure for the new Total Revenue dimension,
order_items_total_revenue
: - Double-check that a primary key is defined in the derived table. If your derived table does not have a primary key, see the "Using
row_number
to Generate a Primary Key for a Derived Table" section of the Dealing with Non-Unique Primary Keys Help Center article. - Join the new view into the original Explore (or create a new Explore) to be able to create queries and content with the new fields:
Conclusion
Dimensionalizing measures with Looker derived tables unlocks new capabilities and provides the power to get further insights into your data. With the ability to group by a dimensionalized measure, filter by it in a WHERE clause (instead of HAVING), pivot on it, and create other dimensions based on it, you can take your Explore queries, and content, to the next dimension.
Check out the additional resources below for more inspiration and information.
Additional Resources
- Depending on your derived table query complexity and size, consider adding persistence for performance optimization. For even more information about persistence and optimization, check out the Identifying and Building PDTs for Performance Optimization Help Center article.