# 4. Grouping Data

### Group By

**(*****divide rows into groups and applies an aggregate function on each*****)**

* It divides the rows returned from `SELECT` statement into groups
* For each group, we can apply an **aggregate function**
* PostgreSQL evaluates this clause something like this
  \*

  ```
  <figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FMqWxrLciE4Msz3fVc7Pu%2Fimage.png?alt=media&#x26;token=4eb03613-b846-4a82-9f60-024168600179" alt=""><figcaption></figcaption></figure>
  ```

#### Example 1 (Group By without an aggregate function)

```sql
SELECT
	customer_id
FROM
	payment
GROUP BY
	customer_id;
```

In this case, it works like `DISTINCT` clause that removes duplicate rows from the result set

#### Example 2 (Group By with SUM() function)

To select the total amount that each customer has been paid, you use the GROUP BY clause to divide the rows in the payment table into groups grouped by customer id

```sql
SELECT
	customer_id,
	SUM(amount)
FROM
	payment
GROUP BY
	customer_id;
ORDER BY
	SUM(amount) DESC;
```

#### Example 3 (Group By with JOIN clause)

The following statement uses the GROUP BY clause with the INNER JOIN clause the get the total amount paid by each customer. Unlike the previous example, this query joins the payment table with the customer table and group customers by their names.

```sql
SELECT
	first_name || ' ' || last_name full_name,
	SUM(amount) amount
FROM
	payment
INNER JOIN
	customer USING (customer_id)
GROUP BY
	first_name || ' ' || last_name
ORDER BY
	amount DESC;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FPcegGXrV0srIMCD9KSm6%2Fimage.png?alt=media&#x26;token=a57f502a-0e7f-45a5-8671-d5faf0bd8eef" alt=""><figcaption></figcaption></figure>

#### Example 4 (Group By with COUNT() function)

To find the number of payment transactions that each staff has processed, we can

```sql
SELECT
	staff_id,
	COUNT (payment_id)
FROM
	payment
GROUP BY
	staff_id;
```

#### Example 5 (Group By with multiple columns)

```sql
SELECT
	customer_id,
	staff_id,
	SUM(amount)
FROM
	payment
GROUP BY
	staff_id,
	customer_id
ORDER BY
	customer_id;
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FmIsTpKbZ87R8uGTvhyuH%2Fimage.png?alt=media&#x26;token=f69a96db-c7cb-4376-831d-98cf1829764d" alt=""><figcaption></figcaption></figure>

#### Example 6 (Group By with date column)

To group the payment by dates, we can use `DATE()` function to convert timestamps

```sql
SELECT
	DATE(payment_date) paid_date,
	SUM(amount) sum
FROM
	payment
GROUP BY
	DATE(payment_date);
```

<figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2F7Z3uat18izN1XvCOE76y%2Fimage.png?alt=media&#x26;token=04c82871-363f-41d9-9f49-908f4b10e75b" alt=""><figcaption></figcaption></figure>

### Having

**(*****apply conditions to groups*****)**

* It specifies a search condition for a group or an aggregate
* It is often used with `GROUP BY` clause to filter groups or aggregates
* Its evaluation sequence is
  \*

  ```
  <figure><img src="https://1920086362-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FDfv51K0WXLZdwTryHQZc%2Fuploads%2FU0XkJqKgnQqkmI6fYk38%2Fimage.png?alt=media&#x26;token=022f1ce4-6090-49e2-8c89-bdb055b7bb22" alt=""><figcaption></figcaption></figure>
  ```
* Since, it is evaluated before `SELECT` clause, **we cannot use column aliases in HAVING clause**

#### HAVING vs WHERE

* `WHERE` allows us to filter rows based on a specified condition
* `HAVING` allows us to filter groups of rows according to specified condition

#### Example 1 (HAVING with SUM() function)

To find total amount of each customer who have been spending more than 200

```sql
SELECT
	customer_id,
	SUM(amount)
FROM
	payment
GROUP BY
	customer_id
HAVING
	SUM(amount) > 200;
```

#### Example 2 (HAVING with COUNT)

To select a store that has more than 300 customers

```sql
SELECT
	store_id,
	COUNT (customer_id)
FROM
	customer
GROUP BY
	store_id
HAVING
	COUNT (customer_id) > 300;
```

***


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://notes.nomanaziz.me/development/database/sql/basics/4.-grouping-data.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
