Lesson 11

NULL in SQL: What It Means and How to Handle It

Identify where NULLs appear in your data, understand how they behave in queries, and replace or filter them without breaking your results.

You pull a list of all orders and notice some rows have no shipped_date. You run an average on discount_amt and the result looks higher than expected. Both of these are NULL problems. NULLs appear in almost every real dataset and they behave in ways that silently distort your queries if you are not watching for them.

What NULL Actually Is

NULL is not a value. It is the absence of a value. It means the data is unknown, not applicable, or simply was never entered.

In the ecommerce database, NULL can appear in many places:

  • shipped_date and delivered_date on orders that have not shipped yet
  • notes on orders where no note was added
  • failure_reason in payments that succeeded
  • address_line2 for customers with no second address line
  • manager_id for the top-level employee with no manager
-- See where NULLs appear across key order columns
SELECT
    o.order_id,
    o.order_date,
    o.shipped_date,
    o.delivered_date,
    o.notes
FROM orders o
WHERE o.shipped_date IS NULL
   OR o.delivered_date IS NULL
ORDER BY o.order_date DESC
LIMIT 10;

IS NULL and IS NOT NULL

These are the only correct operators for checking NULL. Equality operators do not work.

-- Find customers with no phone number recorded
SELECT c.customer_id, c.first_name, c.last_name, c.phone
FROM customers c
WHERE c.phone IS NULL
ORDER BY c.last_name;
-- Find payments that have a failure reason recorded
SELECT pay.payment_id, pay.order_id, pay.method, pay.status, pay.failure_reason
FROM payments pay
WHERE pay.failure_reason IS NOT NULL
ORDER BY pay.payment_date DESC;
-- Employees who have no manager (top of the hierarchy)
SELECT e.employee_id, e.first_name, e.last_name, e.role
FROM employees e
WHERE e.manager_id IS NULL;
Key idea: WHERE col = NULL never returns any rows. It evaluates to NULL, which SQL treats as not true. Always use IS NULL or IS NOT NULL when testing for missing values.

How NULL Behaves in Expressions and Comparisons

Any arithmetic or string operation involving NULL returns NULL. This can silently produce null results in calculated columns.

-- discount_pct being NULL makes the whole expression NULL
SELECT
    oi.item_id,
    oi.unit_price,
    oi.discount_pct,
    oi.unit_price * (1 - oi.discount_pct) AS discounted_price
FROM order_items oi
LIMIT 10;
-- rows where discount_pct is NULL will show NULL in discounted_price
-- Use COALESCE to treat NULL discount as 0
SELECT
    oi.item_id,
    oi.unit_price,
    COALESCE(oi.discount_pct, 0) AS discount_pct,
    oi.unit_price * (1 - COALESCE(oi.discount_pct, 0)) AS discounted_price
FROM order_items oi
LIMIT 10;

How NULL Affects Aggregate Functions

COUNT(*) counts every row. COUNT(column) skips rows where that column is NULL. SUM(), AVG(), MIN(), and MAX() all ignore NULL values.

-- Compare COUNT(*) vs COUNT(column) to spot NULLs
SELECT
    COUNT(*)                  AS total_rows,
    COUNT(o.shipped_date)     AS rows_with_ship_date,
    COUNT(o.delivered_date)   AS rows_with_delivery_date,
    COUNT(o.notes)            AS rows_with_notes
FROM orders o;

The difference between total_rows and any column count tells you exactly how many NULLs exist in that column.

-- AVG ignores NULLs, which may not be what you want
SELECT
    AVG(o.discount_amt)                          AS avg_ignoring_nulls,
    SUM(o.discount_amt) / COUNT(*)               AS avg_treating_null_as_zero
FROM orders o;

If a NULL discount means no discount was applied, treating it as zero gives a more accurate average. The two calculations above will produce different results if discount_amt contains any NULLs.

Replacing NULL with COALESCE and IFNULL

COALESCE(a, b, c) returns the first non-null value from its arguments. IFNULL(a, b) is a SQLite shorthand that does the same thing with two arguments.

-- Replace NULL notes with a default label
SELECT
    o.order_id,
    o.status,
    COALESCE(o.notes, 'No notes') AS notes
FROM orders o
ORDER BY o.order_date DESC
LIMIT 10;
-- Replace NULL failure_reason in payments
SELECT
    pay.payment_id,
    pay.method,
    pay.status,
    IFNULL(pay.failure_reason, 'N/A') AS failure_reason
FROM payments pay
ORDER BY pay.payment_date DESC
LIMIT 10;
-- COALESCE with multiple fallbacks
-- Use address_line2 if present, otherwise fall back to address_line1
SELECT
    c.customer_id,
    c.first_name,
    COALESCE(c.address_line2, c.address_line1, 'No address') AS best_address
FROM customers c
LIMIT 10;

Common Mistakes

-- wrong
SELECT o.order_id, o.shipped_date
FROM orders o
WHERE o.shipped_date = NULL;
-- correct
SELECT o.order_id, o.shipped_date
FROM orders o
WHERE o.shipped_date IS NULL;

= NULL always evaluates to NULL and returns zero rows. Use IS NULL to correctly find rows with missing values.

---

-- wrong
SELECT COUNT(o.notes) AS total_orders
FROM orders o;
-- correct
SELECT COUNT(*) AS total_orders
FROM orders o;

COUNT(column) skips NULL values, so it undercounts when the column has missing entries. Use COUNT(*) when you want to count every row regardless of nulls.

---

-- wrong
SELECT o.order_id, o.discount_amt * 0.1 AS extra_discount
FROM orders o
WHERE o.discount_amt != 0;
-- correct
SELECT o.order_id, COALESCE(o.discount_amt, 0) * 0.1 AS extra_discount
FROM orders o
WHERE o.discount_amt != 0 OR o.discount_amt IS NULL;

!= 0 silently excludes rows where discount_amt is NULL because NULL != 0 evaluates to NULL. Decide whether NULLs should be included and handle them explicitly.

Quick Recap

  • NULL means unknown or missing. It is not zero, not an empty string, and not false. Any operation on NULL returns NULL.
  • Always use IS NULL or IS NOT NULL to test for missing values. Using = NULL or != NULL will never match any rows.
  • Aggregate functions like SUM() and AVG() ignore NULLs. Use COALESCE() or IFNULL() to substitute a default value when NULL should be treated as zero or another placeholder.
Exercise 1 Easy
Write a query that returns customer_id, first_name, last_name, and phone from the customers table for all customers where phone is NULL. Sort by last_name ascending.
Hint: Use WHERE c.phone IS NULL. Do not use = NULL.
Solution
SELECT c.customer_id, c.first_name, c.last_name, c.phone FROM customers c WHERE c.phone IS NULL ORDER BY c.last_name ASC;
Exercise 2 Medium
Write a query that returns order_id, order_date, shipped_date, and a column called ship_status. If shipped_date is not null, show 'Shipped'. If it is null, show 'Not Shipped'. Filter to orders placed in 2024. Sort by order_date ascending.
Hint: Use CASE WHEN o.shipped_date IS NOT NULL THEN ... ELSE ... END for the label. Filter the year with STRFTIME or a date range.
Solution
SELECT o.order_id, o.order_date, o.shipped_date, CASE WHEN o.shipped_date IS NOT NULL THEN 'Shipped' ELSE 'Not Shipped' END AS ship_status FROM orders o WHERE o.order_date >= '2024-01-01' AND o.order_date < '2025-01-01' ORDER BY o.order_date ASC;
Exercise 3 Hard
For each employee, count the total number of orders they are linked to and the number of those orders that have no notes recorded. Show first_name, last_name, total_orders, and orders_without_notes. Only include employees who have at least one order. Join employees and orders. Sort by orders_without_notes descending.
Hint: Join on employee_id. Use COUNT(*) for total orders and COUNT(CASE WHEN o.notes IS NULL THEN 1 END) for orders without notes. Filter with HAVING.
Solution
SELECT e.first_name, e.last_name, COUNT(*) AS total_orders, COUNT(CASE WHEN o.notes IS NULL THEN 1 END) AS orders_without_notes FROM employees e JOIN orders o ON e.employee_id = o.employee_id GROUP BY e.employee_id, e.first_name, e.last_name HAVING COUNT(*) >= 1 ORDER BY orders_without_notes DESC;
Frequently Asked Questions

NULL means the value is unknown or missing. It is not the same as zero, an empty string, or false. It represents the absence of any value.

Use IS NULL or IS NOT NULL. Using = NULL or != NULL never works because any comparison with NULL returns NULL, not TRUE or FALSE.

No. NULL = NULL evaluates to NULL, not TRUE. Two unknown values are not considered equal in SQL.

COUNT(*) counts all rows including NULLs. COUNT(column) skips NULLs. SUM, AVG, MIN, and MAX all ignore NULL values in their calculations.