4. Grouping Data
Group By
(divide rows into groups and applies an aggregate function on each)
It divides the rows returned from
SELECT
statement into groupsFor each group, we can apply an aggregate function
PostgreSQL evaluates this clause something like this
Example 1 (Group By without an aggregate function)
SELECT
customer_id
FROM
payment
GROUP BY
customer_id;
In this case, it works like DISTINCT
clause that removes duplicate rows from the result set
Example 2 (Group By with SUM() function)
To select the total amount that each customer has been paid, you use the GROUP BY clause to divide the rows in the payment table into groups grouped by customer id
SELECT
customer_id,
SUM(amount)
FROM
payment
GROUP BY
customer_id;
ORDER BY
SUM(amount) DESC;
Example 3 (Group By with JOIN clause)
The following statement uses the GROUP BY clause with the INNER JOIN clause the get the total amount paid by each customer. Unlike the previous example, this query joins the payment table with the customer table and group customers by their names.
SELECT
first_name || ' ' || last_name full_name,
SUM(amount) amount
FROM
payment
INNER JOIN
customer USING (customer_id)
GROUP BY
first_name || ' ' || last_name
ORDER BY
amount DESC;

Example 4 (Group By with COUNT() function)
To find the number of payment transactions that each staff has processed, we can
SELECT
staff_id,
COUNT (payment_id)
FROM
payment
GROUP BY
staff_id;
Example 5 (Group By with multiple columns)
SELECT
customer_id,
staff_id,
SUM(amount)
FROM
payment
GROUP BY
staff_id,
customer_id
ORDER BY
customer_id;

Example 6 (Group By with date column)
To group the payment by dates, we can use DATE()
function to convert timestamps
SELECT
DATE(payment_date) paid_date,
SUM(amount) sum
FROM
payment
GROUP BY
DATE(payment_date);

Having
(apply conditions to groups)
It specifies a search condition for a group or an aggregate
It is often used with
GROUP BY
clause to filter groups or aggregatesIts evaluation sequence is
Since, it is evaluated before
SELECT
clause, we cannot use column aliases in HAVING clause
HAVING vs WHERE
WHERE
allows us to filter rows based on a specified conditionHAVING
allows us to filter groups of rows according to specified condition
Example 1 (HAVING with SUM() function)
To find total amount of each customer who have been spending more than 200
SELECT
customer_id,
SUM(amount)
FROM
payment
GROUP BY
customer_id
HAVING
SUM(amount) > 200;
Example 2 (HAVING with COUNT)
To select a store that has more than 300 customers
SELECT
store_id,
COUNT (customer_id)
FROM
customer
GROUP BY
store_id
HAVING
COUNT (customer_id) > 300;
Last updated