Your finance lead wants a breakdown of total revenue by country, but the revenue is in order_items and the country is in customers. Neither table alone can answer the question. You need to join them and then aggregate. That's the pattern this lesson covers.
The Basic Pattern: JOIN First, Then Aggregate
The reason this combination matters is that most real reporting questions span more than one table. A JOIN with aggregate functions is not two separate operations you stitch together; it's one query where the join runs first and the aggregation runs second.
The structure is always the same: join the tables you need, then GROUP BY the dimension you want to summarise by, then apply the aggregate function to the measure column.
Here's the simplest version. Total revenue per customer:
SELECT c.first_name, c.last_name, SUM(o.total_amount) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
The JOIN brings customers and orders together on customer_id. The GROUP BY collapses the rows so each customer appears once. The SUM adds up every total_amount for that customer across all their orders.
Notice customer_id is in GROUP BY even though it's not in SELECT. That's a common practice to ensure grouping is unambiguous, especially when two customers might share the same name.
Counting Rows Across Tables
COUNT works the same way. If you want to know how many orders each customer has placed, join the tables and count the order rows per customer:
SELECT c.first_name, c.last_name, COUNT(o.order_id) AS order_count
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY order_count DESC
COUNT(o.order_id) counts the number of non-NULL order_id values for each group. Because we're using an [INNER JOIN](learn/sql-inner-join), only customers who have at least one order appear in the result. Customers with no orders are excluded.
If you need to include customers with zero orders, switch to a LEFT JOIN. The COUNT would then return zero for those customers because o.order_id would be NULL.
Aggregating Across Three Tables
The pattern extends to more tables without changing the logic. The join chain runs first, and then you aggregate from whichever table holds the measure column.
Total items sold per product, with the product name:
SELECT p.product_name, SUM(oi.quantity) AS total_sold
FROM products p
JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.product_id, p.product_name
ORDER BY total_sold DESC
To add the brand alongside the product name, it's already in products, so no extra join is needed. To add something like the customer's country, you'd extend the chain through orders and then customers, as covered in the [joining multiple tables lesson](learn/sql-joining-multiple-tables).
Filtering Before and After Aggregation
You can filter rows before grouping with WHERE and filter groups after grouping with [HAVING](learn/sql-having-clause).
WHERE applies to the joined rows before any aggregation:
SELECT c.country, SUM(o.total_amount) AS revenue
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'delivered'
GROUP BY c.country
ORDER BY revenue DESC
Only delivered orders enter the aggregation. Cancelled or pending orders are excluded before the SUM runs.
HAVING applies after aggregation, so it can reference the result of the aggregate function:
SELECT c.country, SUM(o.total_amount) AS revenue
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'delivered'
GROUP BY c.country
HAVING SUM(o.total_amount) > 1000
ORDER BY revenue DESC
This returns only countries where delivered revenue exceeds 1000. The WHERE cuts the rows first. The HAVING cuts the groups second. Both can appear in the same query.
How JOIN with Aggregate Connects to Other Concepts
Understanding [aggregate functions](learn/sql-count-sum-avg-min-max) like COUNT and SUM independently is a prerequisite here. Once you know what they do on a single table, combining them with a JOIN is a direct extension, not a new concept.
The column aliasing rules from single-table queries carry over unchanged. Always prefix column names with the table alias when the query touches more than one table, including inside GROUP BY and HAVING.
Common Mistakes
1. Grouping by a column that isn't in SELECT or an aggregate
Including a non-aggregated column in SELECT without adding it to GROUP BY causes an error in most databases.
-- wrong: last_name is in SELECT but not in GROUP BY
SELECT c.first_name, c.last_name, COUNT(o.order_id) AS order_count
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.first_name
Add every non-aggregated column from SELECT to the GROUP BY clause.
-- correct
SELECT c.first_name, c.last_name, COUNT(o.order_id) AS order_count
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.first_name, c.last_name
2. Using WHERE to filter on an aggregate result
WHERE doesn't have access to aggregate results because it runs before GROUP BY.
-- wrong
SELECT c.country, COUNT(o.order_id) AS order_count
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE COUNT(o.order_id) > 5
GROUP BY c.country
Move aggregate conditions to HAVING, which runs after grouping.
-- correct
SELECT c.country, COUNT(o.order_id) AS order_count
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.country
HAVING COUNT(o.order_id) > 5
3. Forgetting to alias-prefix aggregated columns from joined tables
When multiple joined tables share a column name, using the column without a prefix in the aggregate function causes an ambiguity error.
-- wrong: status exists in both orders and payments
SELECT o.order_id, COUNT(status) AS status_count
FROM orders o
JOIN payments pay ON o.order_id = pay.order_id
GROUP BY o.order_id
Always prefix the column with its table alias inside the aggregate function.
-- correct
SELECT o.order_id, COUNT(pay.status) AS payment_count
FROM orders o
JOIN payments pay ON o.order_id = pay.order_id
GROUP BY o.order_id
Quick Recap
- The JOIN runs first and creates a combined result set; GROUP BY and aggregate functions then work on that result.
- Every non-aggregated column in SELECT must appear in GROUP BY.
- WHERE filters joined rows before aggregation; HAVING filters groups after aggregation.
- COUNT with an INNER JOIN excludes rows with no match; use a LEFT JOIN to include unmatched rows with a count of zero.
- Alias-prefix every column in aggregate functions when the query joins more than one table.
- This pattern produces most standard reporting outputs: totals per customer, counts per category, revenue per region.
amount collected and the number of payments per method in the payments table, using the aliases total_collected and payment_count.method in payments, then use SUM on amount and COUNT(*).SELECT pay.method, SUM(pay.amount) AS total_collected, COUNT(*) AS payment_count
FROM payments pay
GROUP BY pay.method
department from the employees table and the number of employees in that department, using the alias employee_count, sorted from highest to lowest.department in employees and use COUNT(*). Sort with ORDER BY employee_count DESC.SELECT e.department, COUNT(*) AS employee_count
FROM employees e
GROUP BY e.department
ORDER BY employee_count DESC
first_name and last_name alongside the number of orders they handled, using the alias order_count.employees and orders on employee_id, GROUP BY employee_id, first_name, and last_name, then COUNT order_id.SELECT e.first_name, e.last_name, COUNT(o.order_id) AS order_count
FROM employees e
JOIN orders o ON e.employee_id = o.employee_id
GROUP BY e.employee_id, e.first_name, e.last_name
first_name, last_name, and total total_amount they handled across all orders, using the alias total_handled, but only include employees where that total exceeds 500, sorted from highest to lowest.employees and orders on employee_id, GROUP BY employee columns, use SUM on total_amount, then filter with HAVING.SELECT e.first_name, e.last_name, SUM(o.total_amount) AS total_handled
FROM employees e
JOIN orders o ON e.employee_id = o.employee_id
GROUP BY e.employee_id, e.first_name, e.last_name
HAVING SUM(o.total_amount) > 500
ORDER BY total_handled DESC
brand from products and the total quantity sold across all order items, using the alias total_qty_sold, sorted from highest to lowest.products and order_items on product_id, GROUP BY brand, and SUM quantity.SELECT p.brand, SUM(oi.quantity) AS total_qty_sold
FROM products p
JOIN order_items oi ON p.product_id = oi.product_id
GROUP BY p.brand
ORDER BY total_qty_sold DESC
Yes. You JOIN the tables first, then GROUP BY a column from any of the joined tables. The aggregate functions run on the combined rows after the join.
Use COUNT(*) or COUNT(column) in your SELECT after joining the tables. GROUP BY the column you want to count by, such as a customer name or product category.
WHERE runs after the JOIN but before GROUP BY. It filters the combined rows before aggregation happens.
WHERE filters individual rows before grouping. HAVING filters groups after aggregation. If you want to exclude groups based on a COUNT or SUM result, use HAVING.