Lesson 57

SQL SET Operators: UNION, INTERSECT, EXCEPT

Combine, compare, and subtract query results using SQL SET operators to answer multi-source reporting questions in a single statement.


A data analyst needs one list that combines all cities where customers live and all cities where employees are based. The data sits in two separate tables with no shared key to join on. SET operators are built for exactly this: stacking or comparing results from separate queries without needing a relationship between the tables.

How SET Operators Work

SET operators sit between two SELECT statements. Both queries must return the same number of columns, and the corresponding columns must have compatible data types. The column names in the final result come from the first query.

The basic structure for all SET operators looks like this:

SELECT column FROM table_one
OPERATOR
SELECT column FROM table_two;

The four operators are UNION, UNION ALL, INTERSECT, and EXCEPT. Each one does something different with the two result sets.

Key idea: Both SELECT statements on either side of a SET operator must have the same number of columns in the same order. The data types of matching columns must be compatible.

UNION and UNION ALL: Stacking Results

UNION combines two result sets into one. By default it removes duplicate rows. UNION ALL keeps everything, including duplicates.

Return a single list of all cities where either customers or employees are located:

SELECT c.city FROM customers c
UNION
SELECT e.email FROM employees e;

Wait, that is the wrong columns. Let us fix that. The city column only exists on customers. Employees have no city column in this schema. A better real example: return a combined list of all unique email addresses from both customers and employees:

SELECT c.email FROM customers c
UNION
SELECT e.email FROM employees e;

UNION deduplicates. If the same email appears in both tables, it appears once. To keep duplicates, use UNION ALL:

SELECT c.email FROM customers c
UNION ALL
SELECT e.email FROM employees e;

UNION ALL returns every row from both queries without checking for duplicates. It is faster and the right choice when you know duplicates are not possible, or when you specifically need to count them.

Key idea: Use UNION ALL when you know there are no duplicates or when duplicates are meaningful. Use UNION only when you need a deduplicated result and the extra processing cost is acceptable.

INTERSECT: Finding the Overlap

INTERSECT returns only the rows that appear in both result sets. It is useful when you need to identify shared values across two sources.

Return all first_name values that appear in both customers and employees:

SELECT c.first_name FROM customers c
INTERSECT
SELECT e.first_name FROM employees e;

Only names that exist in both tables appear in the result. This is a simple name match, not a person match, but it illustrates the operator clearly. INTERSECT automatically deduplicates, so each matching value appears once.

EXCEPT: Subtracting One Set from Another

EXCEPT returns rows from the first query that do not appear in the second query result. The order matters: flipping the queries produces a different result.

Return all first_name values that appear in customers but not in employees:

SELECT c.first_name FROM customers c
EXCEPT
SELECT e.first_name FROM employees e;

Flip the order and you get names in employees that are not in customers:

SELECT e.first_name FROM employees e
EXCEPT
SELECT c.first_name FROM customers c;

EXCEPT also deduplicates by default. Each value that passes the subtraction appears once in the result.

Sorting a SET Operator Result

You can sort the combined result with ORDER BY, but it goes at the very end of the full statement, after the last SELECT. It applies to the whole combined result, not to one side.

Return all email addresses from customers and employees, sorted alphabetically:

SELECT c.email FROM customers c
UNION
SELECT e.email FROM employees e
ORDER BY email;

The column name in ORDER BY refers to the column name from the first SELECT. You cannot put ORDER BY between the two queries.

How This Connects to Other Concepts

SET operators are different from [JOINs](learn/sql-joins). A JOIN combines columns from two tables horizontally, matching rows by a shared key. A SET operator stacks or compares rows vertically, requiring the same column structure on both sides.

[UNION ALL is often faster than UNION](learn/sql-union) for combining large result sets, for the same reason that [SELECT DISTINCT](learn/sql-select-distinct) adds overhead: deduplication requires sorting or hashing the full result.

Common Mistakes

1. Different number of columns on each side

Both SELECT statements in a SET operation must return the same number of columns. Mismatched column counts cause a syntax error.

-- wrong: left returns 2 columns, right returns 1
SELECT c.first_name, c.last_name FROM customers c
UNION
SELECT e.first_name FROM employees e;

Match the column count on both sides. Add a placeholder or a second column to the shorter SELECT.

-- correct
SELECT c.first_name, c.last_name FROM customers c
UNION
SELECT e.first_name, e.last_name FROM employees e;

2. Putting ORDER BY in the wrong place

ORDER BY placed after the first SELECT applies only to that half and causes an error in most SQL databases when a SET operator follows it.

-- wrong: ORDER BY between the two queries
SELECT c.email FROM customers c
ORDER BY email
UNION
SELECT e.email FROM employees e;

Move ORDER BY to the end of the entire statement so it applies to the full combined result.

-- correct
SELECT c.email FROM customers c
UNION
SELECT e.email FROM employees e
ORDER BY email;

3. Expecting column names from the second query

The result set takes its column names from the first SELECT. Aliasing columns in the second SELECT has no effect on the output column names.

-- wrong expectation: 'employee_email' will not appear as the column name
SELECT c.email FROM customers c
UNION
SELECT e.email AS employee_email FROM employees e;

Put the alias on the first SELECT if you want to control the output column name.

-- correct
SELECT c.email AS contact_email FROM customers c
UNION
SELECT e.email FROM employees e;

Quick Recap

  • SET operators combine results from two SELECT statements that have the same number of columns with compatible data types.
  • UNION combines and deduplicates; UNION ALL combines and keeps all rows including duplicates.
  • INTERSECT returns only rows that appear in both result sets.
  • EXCEPT returns rows from the first query that do not appear in the second query.
  • ORDER BY goes at the end of the full statement and applies to the entire combined result.
  • Column names in the output always come from the first SELECT statement.
Exercise 1 Easy
Write a query to return a combined list of all unique first_name values from both the customers table and the employees table, with no duplicates.
Hint: Use UNION between two SELECT statements, each selecting first_name from their respective table.
Solution
SELECT c.first_name FROM customers c
UNION
SELECT e.first_name FROM employees e;
Exercise 2 Easy
Write a query to return a combined list of all first_name values from both customers and employees, including duplicates.
Hint: Use UNION ALL instead of UNION to keep all rows including repeated names.
Solution
SELECT c.first_name FROM customers c
UNION ALL
SELECT e.first_name FROM employees e;
Exercise 3 Medium
Write a query to return all first_name values that appear in the customers table but do not appear in the employees table.
Hint: Use EXCEPT with customers as the first query and employees as the second. Select only first_name from each.
Solution
SELECT c.first_name FROM customers c
EXCEPT
SELECT e.first_name FROM employees e;
Exercise 4 Medium
Write a query to return all first_name values that appear in both the customers table and the employees table, sorted alphabetically by first_name.
Hint: Use INTERSECT between the two SELECT statements selecting first_name, then add ORDER BY first_name at the end.
Solution
SELECT c.first_name FROM customers c
INTERSECT
SELECT e.first_name FROM employees e
ORDER BY first_name;
Exercise 5 Medium
Write a query to return a combined list of first_name and last_name from both customers and employees, with no duplicates, sorted by last_name in ascending order.
Hint: Use UNION with both first_name and last_name selected from each table. Add ORDER BY last_name ASC at the end of the full statement.
Solution
SELECT c.first_name, c.last_name FROM customers c
UNION
SELECT e.first_name, e.last_name FROM employees e
ORDER BY last_name ASC;
Frequently Asked Questions

SET operators combine the results of two or more SELECT statements into a single result. The main ones are UNION, UNION ALL, INTERSECT, and EXCEPT.

UNION removes duplicate rows from the combined result. UNION ALL keeps all rows including duplicates and runs faster because it skips the deduplication step.

INTERSECT returns only the rows that appear in both query results. It is the overlap between two sets.

EXCEPT returns rows from the first query that do not appear in the second query result. It subtracts one set from another.