View the original community article here
Last tested: Mar 25, 2019
The Problem
In MySQL, you can write a measure which applies arithmetic between a measure and dimension.
measure: bad_measure {
type: number
sql: ${measure} * ${dimension} ;;
}
In Redshift (and most other databases), the same measure throws an error: ERROR: column must appear in the GROUP BY clause or be used in an aggregate function
. Why isn't every dialect as cool as MySQL?
The Reasoning
MySQL is not the cool one here. MySQL makes some really weird assumptions about grouping. Take for example this query which selects all the "years" in my video game database. Year isn't a primary key, and we aren't grouping by year, so we see several duplicates. This seems expected.
The problem arises when we also select a COUNT(*) without any grouping. As expected, in this query we get back one row with the full count of records. But we also see the "year" here as well, with a random value chosen. Most databases would have errored here instead of showing us a year.
While this may seem purely cosmetic, this can lead to bad data fast once you start doing arithmetic between the aggregated count and unaggregated year. Since you don't know which year will be used, your result in this query is probably unexpected and incorrect.
This content is subject to limited support.