# 6. Grouping Sets, Cube, and Rollup

### Grouping Sets

**(*****generate multiple grouping sets in reporting*****)**

* It a set of columns by which you group by using the `GROUP BY` clause
* It is denoted by a comma separated list of columns placed inside parenthesis i.e `(column1, column2, ...)`

#### Example

The sales table stores the number of products sold by brand and segment

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FYh2lgOUP555EljRo1u05%2Fimage.png?alt=media&#x26;token=6b2d0db2-1900-4e59-8eab-1206047733cf" alt=""><figcaption></figcaption></figure>

**To get number of products sold by brand and segment**

```sql
SELECT
    brand,
    segment,
    SUM (quantity)
FROM
    sales
GROUP BY
    brand,
    segment;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2Fi5KQcDRhQYBPIdoC4QXo%2Fimage.png?alt=media&#x26;token=f7ca9b39-c62b-43ce-ade4-7bc0dc524b40" alt=""><figcaption></figcaption></figure>

**To get number of products sold by a brand**

```sql
SELECT
    brand,
    SUM (quantity)
FROM
    sales
GROUP BY
    brand;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FyI9PH7rRkL4TEhtlCvS1%2Fimage.png?alt=media&#x26;token=9dbf20fb-6b4d-41c7-8385-77f8f80d02d1" alt=""><figcaption></figcaption></figure>

**To get number of products sold by a segment**

```sql
SELECT
    segment,
    SUM (quantity)
FROM
    sales
GROUP BY
    segment;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FwyaHjJmsn34Lb6rrKMk0%2Fimage.png?alt=media&#x26;token=2e81c153-404d-49d0-89bb-5aee3ba82dcb" alt=""><figcaption></figcaption></figure>

**To get number of products sold for all brands and segments**

```sql
SELECT SUM (quantity) FROM sales;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FMJWA6tLuuRi2KZU4d4Kw%2Fimage.png?alt=media&#x26;token=836f4ac4-962a-4e0b-ba47-da0a6876d1dd" alt=""><figcaption></figcaption></figure>

#### **To get all of above in a single query**

```sql
SELECT
    brand,
    segment,
    SUM (quantity)
FROM
    sales
GROUP BY
    brand,
    segment

UNION ALL

SELECT
    brand,
    NULL,
    SUM (quantity)
FROM
    sales
GROUP BY
    brand

UNION ALL

SELECT
    NULL,
    segment,
    SUM (quantity)
FROM
    sales
GROUP BY
    segment

UNION ALL

SELECT
    NULL,
    NULL,
    SUM (quantity)
FROM
    sales;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2Fti0QIuTUqMfLwZpA9Rn6%2Fimage.png?alt=media&#x26;token=b0e0c193-2634-4e62-98eb-42e564969ff9" alt=""><figcaption></figcaption></figure>

* Because UNION ALL requires all result sets to have the same number of columns with compatible data types, you need to adjust the queries by adding NULL to the selection list
* it has two main problems.
  1. First, it is quite lengthy.
  2. Second, it has a performance issue because PostgreSQL has to scan the sales table separately for each query.

**Using the GROUPING SETS clause**

```sql
SELECT
	brand,
	segment,
	SUM (quantity)
FROM
	sales
GROUP BY
	GROUPING SETS (
		(brand, segment),
		(brand),
		(segment),
		()
	);
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2Fm5AdNGQAmcCdV385mkeH%2Fimage.png?alt=media&#x26;token=4b878c28-7fcb-4481-a6c9-76df8d093a87" alt=""><figcaption></figcaption></figure>

#### Grouping function

`GROUPING()` function returns bit 0 if argument is member of current grouping set and 1 otherwise

```sql
SELECT
	GROUPING(brand) grouping_brand,
	GROUPING(segment) grouping_segment,
	brand,
	segment,
	SUM (quantity)
FROM
	sales
GROUP BY
	GROUPING SETS (
		(brand),
		(segment),
		()
	)
ORDER BY
	brand,
	segment;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FRlfZw6t51TSuRhRV73Ta%2Fimage.png?alt=media&#x26;token=b5b5a321-2653-478a-84ce-fa216a75a929" alt=""><figcaption></figcaption></figure>

### Cube

**(*****define multiple grouping sets that include all possible combinations of dimensions*****)**

* It allows us to generate multiple grouping sets
* CUBE subclause is a short way to define multiple grouping sets so the following are equivilant

```sql
CUBE (c1, c2, c3)

GROUPING SETS (
	(c1, c2, c3),
	(c1, c2),
	(c1, c3),
	(c2, c3),
	(c1),
	(c2),
	(c3),
	()
)
```

* In general, if the number of columns specified in `CUBE` is `n`, then we will have 2n combinations

#### Example

Consider the `sales` table

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FJ9tETaSqDezrvcYtXgeY%2Fimage.png?alt=media&#x26;token=0f5bf220-8a3a-435f-8dc4-7e601ab2ade1" alt=""><figcaption></figcaption></figure>

The following query uses `CUBE` subclause to generate multiple grouping sets

```sql
SELECT
	brand,
	segment,
	sum (quantity)
FROM
	sales
GROUP BY
	CUBE (brand, segment)
ORDER BY
	brand,
	segment;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FCQzCyKO3KWDYJnfsSzfZ%2Fimage.png?alt=media&#x26;token=1b198af3-5df9-482e-a7ee-cb6956443ede" alt=""><figcaption></figcaption></figure>

### Rollup

**(*****generate reports that contain totals and subtotals*****)**

* It is different from `CUBE` subclause, `ROLLUP` does not generate all possible grouping sets based on specified columns, it just makes a subset of those
* It assumes a heirarchy among the input columns and generates all grouping sets that make sense considering the heirarchy. This is the reason why `ROLLUP` is often used to generate the subtotals and grand total for reports
* Example, `CUBE (c1, c2, c3)` makes all eight possible grouping sets

```sql
(c1, c2, c3)
(c1, c2)
(c2, c3)
(c1, c3)
(c1)
(c2)
(c3)
()
```

* However, `ROLLUP (c1, c2, c3)` generates only four grouping sets, assuming the heirarchy `c1 > c2 > c3` as follows

```sql
(c1, c2, c3)
(c1, c2)
(c1)
()
```

* Common use of `ROLLUP` is to calculate aggregations of data by year, month and date, considering the heirarchy `year > month > date`
* It is also possible to do a partial roll up to reduce the number of subtotals generated.

```sql
SELECT
    c1,
    c2,
    c3,
    aggregate(c4)
FROM
    table_name
GROUP BY
    c1,
    ROLLUP (c2, c3);
```

#### Example (number of products sold by brand (subtotal) and by all brands and segments (total))

```sql
SELECT
	brand,
	segment,
	SUM (quantity)
FROM
	sales
GROUP BY
	ROLLUP (brand, segment)
ORDER BY
	brand,
	segment;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FDIYnyBNqfC13FW099E1r%2Fimage.png?alt=media&#x26;token=4891ebc1-4794-4a59-8ecc-57b67a0793d4" alt=""><figcaption></figcaption></figure>

If you change the order of brand and segment, the result will be different as follows:

```sql
SELECT
    segment,
    brand,
    SUM (quantity)
FROM
    sales
GROUP BY
    ROLLUP (segment, brand)
ORDER BY
    segment,
    brand;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FvLNC9RbmD19EgH0UfBdt%2Fimage.png?alt=media&#x26;token=9719ea66-65b5-4c61-99f6-42d9736beac0" alt=""><figcaption></figcaption></figure>

The following statement performs a partial roll-up:

```sql
SELECT
    segment,
    brand,
    SUM (quantity)
FROM
    sales
GROUP BY
    segment,
    ROLLUP (brand)
ORDER BY
    segment,
    brand;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FbVIAeXnGxEHVnEmAXOAa%2Fimage.png?alt=media&#x26;token=37856af9-b517-4b87-8f96-cba797b1691b" alt=""><figcaption></figcaption></figure>

#### Example (finds the number of rental per day, month, and year)

This is `rental` table

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FP33o0j0G4Lwevk1Xw3oA%2Fimage.png?alt=media&#x26;token=78e28182-73e6-4130-91a9-09545f1e80c6" alt=""><figcaption></figcaption></figure>

```sql
SELECT
	EXTRACT (YEAR FROM rental_date) y,
	EXTRACT (MONTH FROM rental_date) m,
	EXTRACT (DAY FROM rental_date) d,
	COUNT (rental_id)
FROM
	rental
GROUP BY
	ROLLUP (
		EXTRACT (YEAR FROM rental_date),
		EXTRACT (MONTH FROM rental_date),
		EXTRACT (DAY FROM rental_date)
	);
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FJNMHr929EtKrwJsdoy1m%2Fimage.png?alt=media&#x26;token=a061173d-ad90-47ba-9938-d40f5a948c67" alt=""><figcaption></figcaption></figure>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://notes.nomanaziz.me/development/database/sql/basics/6.-grouping-sets-cube-and-rollup.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
