Why Missing Data Shows Up in Every Real Database
Your manager asks you to list all customers who have never logged in. Some rows in the customers table have no value in last_login because those customers registered but never came back. That absence is stored as NULL, and standard comparison operators will silently skip those rows if you are not careful. Learning to handle NULL correctly means your queries return complete, trustworthy results.
What NULL Actually Is
NULL is the database's way of saying "we don't have a value here." It is not zero. It is not an empty string. It is the absence of any value.
Because of this, any comparison using =, !=, <, or > against NULL returns neither true nor false. It returns unknown, which means the row gets filtered out silently.
=. The expression column = NULL always evaluates to unknown, not true. Always use IS NULL or IS NOT NULL.-- Customers who have never logged in
SELECT customer_id, first_name, last_name, email
FROM customers c
WHERE last_login IS NULL;
-- Customers who have logged in at least once
SELECT customer_id, first_name, last_name, last_login
FROM customers c
WHERE last_login IS NOT NULL;
IS NULL in Practice
Use IS NULL any time you want to find rows where a value is missing. In the ecommerce database, several columns are good candidates: address_line2, date_of_birth, gender, loyalty_tier, assigned_rep, and notes.
-- Orders that have no notes recorded
SELECT order_id, customer_id, order_date, status
FROM orders o
WHERE notes IS NULL;
-- Customers with no loyalty tier assigned yet
SELECT customer_id, first_name, last_name, email
FROM customers c
WHERE loyalty_tier IS NULL;
-- Payments where a failure reason exists (something went wrong)
SELECT payment_id, order_id, amount, failure_reason
FROM payments pay
WHERE failure_reason IS NOT NULL;
IS NOT NULL in Practice
IS NOT NULL keeps only the rows where a column has a real value. Use it when you need to work with complete data only.
-- Orders that have already shipped
SELECT order_id, customer_id, order_date, shipped_date
FROM orders o
WHERE shipped_date IS NOT NULL;
-- Products that have a brand listed
SELECT product_id, product_name, brand, price
FROM products p
WHERE brand IS NOT NULL;
Combining NULL Checks with Other Conditions
You can mix IS NULL and IS NOT NULL with AND, OR, and other filters just like any other condition.
-- Customers with no phone number and no date of birth on record
SELECT customer_id, first_name, last_name, email
FROM customers c
WHERE phone IS NULL AND date_of_birth IS NULL;
-- Active products with no brand listed
SELECT product_id, product_name, price
FROM products p
WHERE is_active = 1 AND brand IS NULL;
-- Orders that were placed but never delivered and have no shipped date
SELECT order_id, customer_id, order_date, status
FROM orders o
WHERE shipped_date IS NULL AND delivered_date IS NULL AND status != 'cancelled';
Counting NULLs with Aggregates
COUNT(column) only counts non-NULL values. COUNT(*) counts all rows including those with NULLs. This difference matters when you are auditing data quality.
-- How many customers are missing a phone number
SELECT COUNT(*) AS total_customers,
COUNT(phone) AS has_phone,
COUNT(*) - COUNT(phone) AS missing_phone
FROM customers c;
-- Per-order count of items where a discount was applied
SELECT o.order_id,
COUNT(oi.item_id) AS total_items,
COUNT(oi.discount_pct) AS items_with_discount
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY o.order_id;
Common Mistakes
-- wrong
SELECT * FROM customers c WHERE last_login = NULL;
This never returns any rows because comparing anything to NULL with = always evaluates to unknown.
-- correct
SELECT * FROM customers c WHERE last_login IS NULL;
---
-- wrong
SELECT * FROM orders o WHERE notes != '';
This filters out empty strings but does not remove NULL values. Rows where notes is NULL will still appear or disappear unexpectedly depending on context.
-- correct
SELECT * FROM orders o WHERE notes IS NOT NULL AND notes != '';
---
-- wrong
SELECT COUNT(notes) AS total FROM orders o;
If your intent is to count all orders, this undercounts because COUNT(column) skips NULL values.
-- correct
SELECT COUNT(*) AS total FROM orders o;
Quick Recap
NULLmeans a value is missing, not zero and not an empty string- Use
IS NULLto find rows where a column has no value - Use
IS NOT NULLto keep only rows that have a value in that column - Never use
= NULLor!= NULL, they will silently return wrong results COUNT(column)skips NULLs whileCOUNT(*)counts every row
assigned_rep on record. Return their customer_id, first_name, last_name, and email.SELECT customer_id, first_name, last_name, email FROM customers c WHERE assigned_rep IS NULL;
shipped_date is not null) but have not yet been delivered (delivered_date is null). Return order_id, customer_id, order_date, and shipped_date.SELECT order_id, customer_id, order_date, shipped_date FROM orders o WHERE shipped_date IS NOT NULL AND delivered_date IS NULL;
failure_reason, return the customer's first_name, last_name, email, and the count of failed payments as failed_payment_count. Order by failed_payment_count descending.SELECT c.first_name, c.last_name, c.email, COUNT(pay.payment_id) AS failed_payment_count FROM customers c JOIN orders o ON c.customer_id = o.customer_id JOIN payments pay ON o.order_id = pay.order_id WHERE pay.failure_reason IS NOT NULL GROUP BY c.customer_id, c.first_name, c.last_name, c.email ORDER BY failed_payment_count DESC;
NULL means the value is unknown or missing. It is not the same as zero, an empty string, or a space.
NULL is not a value you can compare with `=`. You must use `IS NULL` or `IS NOT NULL` instead.
Use `WHERE column IS NOT NULL` to exclude rows where that column has no value.
An empty string `''` is an actual value stored in the column. NULL means no value was stored at all.