You pull a list of all orders and notice some rows have no shipped_date. You run an average on discount_amt and the result looks higher than expected. Both of these are NULL problems. NULLs appear in almost every real dataset and they behave in ways that silently distort your queries if you are not watching for them.
What NULL Actually Is
NULL is not a value. It is the absence of a value. It means the data is unknown, not applicable, or simply was never entered.
In the ecommerce database, NULL can appear in many places:
shipped_dateanddelivered_dateon orders that have not shipped yetnoteson orders where no note was addedfailure_reasonin payments that succeededaddress_line2for customers with no second address linemanager_idfor the top-level employee with no manager
-- See where NULLs appear across key order columns
SELECT
o.order_id,
o.order_date,
o.shipped_date,
o.delivered_date,
o.notes
FROM orders o
WHERE o.shipped_date IS NULL
OR o.delivered_date IS NULL
ORDER BY o.order_date DESC
LIMIT 10;
IS NULL and IS NOT NULL
These are the only correct operators for checking NULL. Equality operators do not work.
-- Find customers with no phone number recorded
SELECT c.customer_id, c.first_name, c.last_name, c.phone
FROM customers c
WHERE c.phone IS NULL
ORDER BY c.last_name;
-- Find payments that have a failure reason recorded
SELECT pay.payment_id, pay.order_id, pay.method, pay.status, pay.failure_reason
FROM payments pay
WHERE pay.failure_reason IS NOT NULL
ORDER BY pay.payment_date DESC;
-- Employees who have no manager (top of the hierarchy)
SELECT e.employee_id, e.first_name, e.last_name, e.role
FROM employees e
WHERE e.manager_id IS NULL;
WHERE col = NULL never returns any rows. It evaluates to NULL, which SQL treats as not true. Always use IS NULL or IS NOT NULL when testing for missing values.How NULL Behaves in Expressions and Comparisons
Any arithmetic or string operation involving NULL returns NULL. This can silently produce null results in calculated columns.
-- discount_pct being NULL makes the whole expression NULL
SELECT
oi.item_id,
oi.unit_price,
oi.discount_pct,
oi.unit_price * (1 - oi.discount_pct) AS discounted_price
FROM order_items oi
LIMIT 10;
-- rows where discount_pct is NULL will show NULL in discounted_price
-- Use COALESCE to treat NULL discount as 0
SELECT
oi.item_id,
oi.unit_price,
COALESCE(oi.discount_pct, 0) AS discount_pct,
oi.unit_price * (1 - COALESCE(oi.discount_pct, 0)) AS discounted_price
FROM order_items oi
LIMIT 10;
How NULL Affects Aggregate Functions
COUNT(*) counts every row. COUNT(column) skips rows where that column is NULL. SUM(), AVG(), MIN(), and MAX() all ignore NULL values.
-- Compare COUNT(*) vs COUNT(column) to spot NULLs
SELECT
COUNT(*) AS total_rows,
COUNT(o.shipped_date) AS rows_with_ship_date,
COUNT(o.delivered_date) AS rows_with_delivery_date,
COUNT(o.notes) AS rows_with_notes
FROM orders o;
The difference between total_rows and any column count tells you exactly how many NULLs exist in that column.
-- AVG ignores NULLs, which may not be what you want
SELECT
AVG(o.discount_amt) AS avg_ignoring_nulls,
SUM(o.discount_amt) / COUNT(*) AS avg_treating_null_as_zero
FROM orders o;
If a NULL discount means no discount was applied, treating it as zero gives a more accurate average. The two calculations above will produce different results if discount_amt contains any NULLs.
Replacing NULL with COALESCE and IFNULL
COALESCE(a, b, c) returns the first non-null value from its arguments. IFNULL(a, b) is a SQLite shorthand that does the same thing with two arguments.
-- Replace NULL notes with a default label
SELECT
o.order_id,
o.status,
COALESCE(o.notes, 'No notes') AS notes
FROM orders o
ORDER BY o.order_date DESC
LIMIT 10;
-- Replace NULL failure_reason in payments
SELECT
pay.payment_id,
pay.method,
pay.status,
IFNULL(pay.failure_reason, 'N/A') AS failure_reason
FROM payments pay
ORDER BY pay.payment_date DESC
LIMIT 10;
-- COALESCE with multiple fallbacks
-- Use address_line2 if present, otherwise fall back to address_line1
SELECT
c.customer_id,
c.first_name,
COALESCE(c.address_line2, c.address_line1, 'No address') AS best_address
FROM customers c
LIMIT 10;
Common Mistakes
-- wrong
SELECT o.order_id, o.shipped_date
FROM orders o
WHERE o.shipped_date = NULL;
-- correct
SELECT o.order_id, o.shipped_date
FROM orders o
WHERE o.shipped_date IS NULL;
= NULL always evaluates to NULL and returns zero rows. Use IS NULL to correctly find rows with missing values.
---
-- wrong
SELECT COUNT(o.notes) AS total_orders
FROM orders o;
-- correct
SELECT COUNT(*) AS total_orders
FROM orders o;
COUNT(column) skips NULL values, so it undercounts when the column has missing entries. Use COUNT(*) when you want to count every row regardless of nulls.
---
-- wrong
SELECT o.order_id, o.discount_amt * 0.1 AS extra_discount
FROM orders o
WHERE o.discount_amt != 0;
-- correct
SELECT o.order_id, COALESCE(o.discount_amt, 0) * 0.1 AS extra_discount
FROM orders o
WHERE o.discount_amt != 0 OR o.discount_amt IS NULL;
!= 0 silently excludes rows where discount_amt is NULL because NULL != 0 evaluates to NULL. Decide whether NULLs should be included and handle them explicitly.
Quick Recap
NULLmeans unknown or missing. It is not zero, not an empty string, and not false. Any operation onNULLreturnsNULL.- Always use
IS NULLorIS NOT NULLto test for missing values. Using= NULLor!= NULLwill never match any rows. - Aggregate functions like
SUM()andAVG()ignore NULLs. UseCOALESCE()orIFNULL()to substitute a default value when NULL should be treated as zero or another placeholder.
customer_id, first_name, last_name, and phone from the customers table for all customers where phone is NULL. Sort by last_name ascending.WHERE c.phone IS NULL. Do not use = NULL.SELECT c.customer_id, c.first_name, c.last_name, c.phone FROM customers c WHERE c.phone IS NULL ORDER BY c.last_name ASC;
order_id, order_date, shipped_date, and a column called ship_status. If shipped_date is not null, show 'Shipped'. If it is null, show 'Not Shipped'. Filter to orders placed in 2024. Sort by order_date ascending.CASE WHEN o.shipped_date IS NOT NULL THEN ... ELSE ... END for the label. Filter the year with STRFTIME or a date range.SELECT o.order_id, o.order_date, o.shipped_date, CASE WHEN o.shipped_date IS NOT NULL THEN 'Shipped' ELSE 'Not Shipped' END AS ship_status FROM orders o WHERE o.order_date >= '2024-01-01' AND o.order_date < '2025-01-01' ORDER BY o.order_date ASC;
notes recorded. Show first_name, last_name, total_orders, and orders_without_notes. Only include employees who have at least one order. Join employees and orders. Sort by orders_without_notes descending.employee_id. Use COUNT(*) for total orders and COUNT(CASE WHEN o.notes IS NULL THEN 1 END) for orders without notes. Filter with HAVING.SELECT e.first_name, e.last_name, COUNT(*) AS total_orders, COUNT(CASE WHEN o.notes IS NULL THEN 1 END) AS orders_without_notes FROM employees e JOIN orders o ON e.employee_id = o.employee_id GROUP BY e.employee_id, e.first_name, e.last_name HAVING COUNT(*) >= 1 ORDER BY orders_without_notes DESC;
NULL means the value is unknown or missing. It is not the same as zero, an empty string, or false. It represents the absence of any value.
Use IS NULL or IS NOT NULL. Using = NULL or != NULL never works because any comparison with NULL returns NULL, not TRUE or FALSE.
No. NULL = NULL evaluates to NULL, not TRUE. Two unknown values are not considered equal in SQL.
COUNT(*) counts all rows including NULLs. COUNT(column) skips NULLs. SUM, AVG, MIN, and MAX all ignore NULL values in their calculations.