6. Grouping Sets, Cube, and Rollup
Last updated
Last updated
(generate multiple grouping sets in reporting)
It a set of columns by which you group by using the GROUP BY
clause
It is denoted by a comma separated list of columns placed inside parenthesis i.e (column1, column2, ...)
The sales table stores the number of products sold by brand and segment
To get number of products sold by brand and segment
To get number of products sold by a brand
To get number of products sold by a segment
To get number of products sold for all brands and segments
Because UNION ALL requires all result sets to have the same number of columns with compatible data types, you need to adjust the queries by adding NULL to the selection list
it has two main problems.
First, it is quite lengthy.
Second, it has a performance issue because PostgreSQL has to scan the sales table separately for each query.
Using the GROUPING SETS clause
GROUPING()
function returns bit 0 if argument is member of current grouping set and 1 otherwise
(define multiple grouping sets that include all possible combinations of dimensions)
It allows us to generate multiple grouping sets
CUBE subclause is a short way to define multiple grouping sets so the following are equivilant
In general, if the number of columns specified in CUBE
is n
, then we will have 2n combinations
Consider the sales
table
The following query uses CUBE
subclause to generate multiple grouping sets
(generate reports that contain totals and subtotals)
It is different from CUBE
subclause, ROLLUP
does not generate all possible grouping sets based on specified columns, it just makes a subset of those
It assumes a heirarchy among the input columns and generates all grouping sets that make sense considering the heirarchy. This is the reason why ROLLUP
is often used to generate the subtotals and grand total for reports
Example, CUBE (c1, c2, c3)
makes all eight possible grouping sets
However, ROLLUP (c1, c2, c3)
generates only four grouping sets, assuming the heirarchy c1 > c2 > c3
as follows
Common use of ROLLUP
is to calculate aggregations of data by year, month and date, considering the heirarchy year > month > date
It is also possible to do a partial roll up to reduce the number of subtotals generated.
If you change the order of brand and segment, the result will be different as follows:
The following statement performs a partial roll-up:
This is rental
table