Lesson 12

NOT NULL Constraint in SQL: Enforcing Required Values

Define which columns must always have a value and write queries that detect existing NULL violations in any table.

An order gets inserted without a customer_id. A product row lands in the database with no price. Both situations break downstream reports and are hard to detect after the fact. The NOT NULL constraint is how you stop these problems at the source by telling the database that certain columns are required, no exceptions.

What the NOT NULL Constraint Does

NOT NULL is a column-level constraint declared in the table schema. When a column has this constraint, the database refuses any INSERT or UPDATE that would leave that column without a value.

-- A table definition using NOT NULL on required columns
CREATE TABLE categories (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT    NOT NULL,
    description   TEXT,
    created_at    TEXT    NOT NULL
);

In this definition, category_name and created_at are required. description is optional and can be NULL. Any attempt to insert a row without category_name will be rejected immediately.

-- This INSERT would fail because category_name is NOT NULL
-- INSERT INTO categories (category_id, description, created_at)
-- VALUES (10, 'Optional desc', '2024-01-01');

-- This INSERT succeeds
INSERT INTO categories (category_id, category_name, created_at)
VALUES (10, 'Accessories', '2024-01-01');

Identifying Which Columns Should Be NOT NULL

Not every column needs to be required. The decision depends on whether the data makes sense without a value in that column.

For the ecommerce schema, here is how you would think through it:

| Column | Table | NOT NULL? | Reason | |--------|-------|-----------|--------| | customer_id | orders | Yes | Every order must belong to a customer | | order_date | orders | Yes | You need to know when the order was placed | | shipped_date | orders | No | Order may not have shipped yet | | email | customers | Yes | Required for account creation | | phone | customers | No | Customers may not provide a phone | | notes | orders | No | Notes are optional | | failure_reason | payments | No | Only present when a payment fails |

-- A more complete table definition showing mixed constraints
CREATE TABLE products (
    product_id    INTEGER PRIMARY KEY,
    product_name  TEXT    NOT NULL,
    category_id   INTEGER NOT NULL,
    price         REAL    NOT NULL,
    cost_price    REAL    NOT NULL,
    stock_qty     INTEGER NOT NULL DEFAULT 0,
    sku           TEXT    NOT NULL,
    brand         TEXT,
    is_active     INTEGER NOT NULL DEFAULT 1,
    created_at    TEXT    NOT NULL
);

brand is nullable here because not every product may have a known brand. Everything else is required.

Key idea: NOT NULL and DEFAULT work well together. If a column is required but has a sensible default, combine both constraints. stock_qty INTEGER NOT NULL DEFAULT 0 means the column can never be NULL and new products start with zero stock unless you specify otherwise.

Auditing Existing Tables for NULL Violations

Even with constraints defined on new data, you will often work with legacy tables that already contain NULLs in columns that should be required. Auditing for these is a standard data quality task.

-- Count NULLs in columns that should always have values in orders
SELECT
    COUNT(*)                                        AS total_rows,
    COUNT(*) - COUNT(o.customer_id)                 AS null_customer_id,
    COUNT(*) - COUNT(o.order_date)                  AS null_order_date,
    COUNT(*) - COUNT(o.status)                      AS null_status,
    COUNT(*) - COUNT(o.total_amount)                AS null_total_amount
FROM orders o;
-- Find the actual rows where a required column is NULL
SELECT o.order_id, o.order_date, o.customer_id, o.status, o.total_amount
FROM orders o
WHERE o.customer_id IS NULL
   OR o.order_date IS NULL
   OR o.status IS NULL
   OR o.total_amount IS NULL;
-- Same audit on products
SELECT
    COUNT(*)                            AS total_rows,
    COUNT(*) - COUNT(p.product_name)   AS null_product_name,
    COUNT(*) - COUNT(p.price)          AS null_price,
    COUNT(*) - COUNT(p.sku)            AS null_sku,
    COUNT(*) - COUNT(p.category_id)    AS null_category_id
FROM products p;

NOT NULL With DEFAULT Values

Pairing NOT NULL with DEFAULT is common for columns like flags, statuses, and timestamps where a sensible starting value exists.

CREATE TABLE payments (
    payment_id     INTEGER PRIMARY KEY,
    order_id       INTEGER NOT NULL,
    payment_date   TEXT    NOT NULL,
    amount         REAL    NOT NULL,
    method         TEXT    NOT NULL,
    status         TEXT    NOT NULL DEFAULT 'Pending',
    transaction_id TEXT,
    gateway        TEXT,
    failure_reason TEXT
);

status is required and defaults to 'Pending'. transaction_id, gateway, and failure_reason are all nullable because they may not be available at insert time.

-- Check current payment status distribution
SELECT pay.status, COUNT(*) AS count
FROM payments pay
GROUP BY pay.status
ORDER BY count DESC;

How SQLite Handles NOT NULL

SQLite enforces NOT NULL constraints at insert and update time. If you try to insert a NULL into a NOT NULL column, SQLite raises an error: NOT NULL constraint failed.

One important SQLite behaviour: INTEGER PRIMARY KEY columns are implicitly NOT NULL because they act as the row identifier. You do not need to declare NOT NULL on them separately.

-- Verify employees table has no NULLs in required identity columns
SELECT
    COUNT(*)                            AS total_employees,
    COUNT(*) - COUNT(e.first_name)      AS null_first_name,
    COUNT(*) - COUNT(e.last_name)       AS null_last_name,
    COUNT(*) - COUNT(e.email)           AS null_email,
    COUNT(*) - COUNT(e.role)            AS null_role,
    COUNT(*) - COUNT(e.hire_date)       AS null_hire_date
FROM employees e;

Common Mistakes

-- wrong
CREATE TABLE order_items (
    item_id    INTEGER PRIMARY KEY,
    order_id   INTEGER,
    product_id INTEGER,
    quantity   INTEGER,
    unit_price REAL
);
-- correct
CREATE TABLE order_items (
    item_id    INTEGER PRIMARY KEY,
    order_id   INTEGER NOT NULL,
    product_id INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    unit_price REAL    NOT NULL
);

Omitting NOT NULL on foreign key and financial columns allows orphaned or incomplete rows to enter the table silently, which breaks joins and aggregate calculations downstream.

---

-- wrong
SELECT p.product_name, p.price
FROM products p
WHERE p.price != NULL;
-- correct
SELECT p.product_name, p.price
FROM products p
WHERE p.price IS NOT NULL;

Even when auditing for NOT NULL violations, you cannot use != NULL. It always evaluates to NULL and returns no rows. Use IS NOT NULL to find rows that violate the constraint.

---

-- wrong
ALTER TABLE products ADD COLUMN weight_kg REAL NOT NULL;
-- correct
ALTER TABLE products ADD COLUMN weight_kg REAL NOT NULL DEFAULT 0.0;

Adding a NOT NULL column to an existing table fails unless you also provide a DEFAULT value. Without a default, the database has no value to assign to existing rows.

Quick Recap

  • NOT NULL is a schema-level constraint that rejects any INSERT or UPDATE leaving that column empty. It stops bad data from entering the table in the first place.
  • Pair NOT NULL with DEFAULT for columns like status flags and timestamps where a sensible starting value exists. This lets inserts succeed without requiring callers to always supply the value.
  • Use COUNT(*) - COUNT(column) to audit existing tables for NULL violations in columns that should be required.
Exercise 1 Easy
Write a query that finds all rows in the customers table where email is NULL. Show customer_id, first_name, last_name, and email. Sort by customer_id ascending.
Hint: Use WHERE c.email IS NULL. Remember that = NULL will not work.
Solution
SELECT c.customer_id, c.first_name, c.last_name, c.email FROM customers c WHERE c.email IS NULL ORDER BY c.customer_id ASC;
Exercise 2 Medium
Write a data quality audit query for the orders table. Return a single row showing total_rows, null_customer_id, null_order_date, null_status, and null_total_amount as separate columns. Each null count should show how many rows are missing a value in that column.
Hint: Use COUNT(*) - COUNT(column) for each nullable check. No GROUP BY needed since you want one summary row.
Solution
SELECT COUNT(*) AS total_rows, COUNT(*) - COUNT(o.customer_id) AS null_customer_id, COUNT(*) - COUNT(o.order_date) AS null_order_date, COUNT(*) - COUNT(o.status) AS null_status, COUNT(*) - COUNT(o.total_amount) AS null_total_amount FROM orders o;
Exercise 3 Hard
Find all products that are missing either a price, a cost_price, or a sku. For each such product, show product_id, product_name, category_name from the categories table, price, cost_price, and sku. Label a new column missing_fields that lists which fields are null as a comma-separated string using CASE expressions combined with string concatenation. Sort by product_id.
Hint: Join products and categories on category_id. Use TRIM with concatenation or nested CASE statements to build the missing_fields string. Filter with WHERE to only return rows with at least one NULL among the three columns.
Solution
SELECT p.product_id, p.product_name, cat.category_name, p.price, p.cost_price, p.sku, TRIM( CASE WHEN p.price IS NULL THEN 'price, ' ELSE '' END || CASE WHEN p.cost_price IS NULL THEN 'cost_price, ' ELSE '' END || CASE WHEN p.sku IS NULL THEN 'sku' ELSE '' END, ', ') AS missing_fields FROM products p JOIN categories cat ON p.category_id = cat.category_id WHERE p.price IS NULL OR p.cost_price IS NULL OR p.sku IS NULL ORDER BY p.product_id;
Frequently Asked Questions

NOT NULL is a column constraint that prevents NULL values from being inserted or updated into that column. The database will reject any operation that tries to leave that column empty.

In most databases you use ALTER TABLE to modify the column definition. SQLite has limited ALTER TABLE support and usually requires recreating the table to add or remove constraints.

NOT NULL rejects the row if no value is provided. DEFAULT supplies an automatic value when none is given, which satisfies a NOT NULL constraint without requiring the caller to supply the value.

It can, because the database knows the column will never be NULL and can skip certain null checks during query planning. The bigger benefit is data quality and predictable query results.