Every table in a real database needs a reliable way to identify each row. Without a primary key, you cannot safely join tables, update a specific record, or guarantee you are not working with duplicates. In the ecommerce schema, every table from customers to payments has a primary key column, and understanding why those columns were chosen that way is as useful as knowing the syntax.
What a PRIMARY KEY Enforces
A primary key combines two constraints in one declaration: NOT NULL and UNIQUE. Every value in the primary key column must exist and must be different from every other value in that column.
-- Single-column primary key on an integer ID
CREATE TABLE categories (
category_id INTEGER PRIMARY KEY,
category_name TEXT NOT NULL,
description TEXT,
created_at TEXT NOT NULL
);
category_id is now the unique identifier for every category row. The database will reject any insert that supplies a duplicate category_id or tries to leave it as NULL.
-- Verify primary keys are unique and not null by querying the table
SELECT cat.category_id, cat.category_name
FROM categories cat
ORDER BY cat.category_id;
Single-Column Primary Keys
The most common pattern is an integer surrogate key, an auto-incrementing number with no business meaning. Every table in the ecommerce schema uses this pattern.
CREATE TABLE customers (
customer_id INTEGER PRIMARY KEY,
first_name TEXT NOT NULL,
last_name TEXT NOT NULL,
email TEXT NOT NULL UNIQUE,
phone TEXT,
city TEXT,
country TEXT,
loyalty_tier TEXT,
created_at TEXT NOT NULL
);
In SQLite, declaring a column as INTEGER PRIMARY KEY makes it an alias for the internal rowid. SQLite automatically assigns the next available integer when you insert a row without supplying that column.
-- Insert without specifying customer_id; SQLite assigns it automatically
INSERT INTO customers (first_name, last_name, email, created_at)
VALUES ('Arun', 'Mehta', 'arun@example.com', '2024-03-10');
-- Retrieve the row using the auto-assigned primary key
SELECT c.customer_id, c.first_name, c.last_name, c.email
FROM customers c
WHERE c.email = 'arun@example.com';
Composite Primary Keys
A composite primary key uses two or more columns together as the unique identifier. Neither column needs to be unique on its own. Only the combination must be unique.
This is the right choice for junction tables and line-item tables where a single integer ID is not meaningful.
-- Composite primary key on order_items
-- The same product cannot appear twice in the same order
CREATE TABLE order_items (
item_id INTEGER PRIMARY KEY,
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER NOT NULL,
unit_price REAL NOT NULL,
line_total REAL NOT NULL,
UNIQUE (order_id, product_id)
);
Note that order_items still uses a surrogate item_id as its primary key and enforces the business rule separately with a UNIQUE constraint. An alternative would be to make (order_id, product_id) the primary key directly:
-- Pure composite primary key with no surrogate
CREATE TABLE order_items_alt (
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER NOT NULL,
unit_price REAL NOT NULL,
line_total REAL NOT NULL,
PRIMARY KEY (order_id, product_id)
);
Both approaches are valid. Surrogate keys are simpler to reference in foreign keys and application code. Composite primary keys make the uniqueness rule more explicit in the schema itself.
Natural Keys vs Surrogate Keys
A natural key is a column that already exists in your data and has a unique business meaning, like email for customers or sku for products. A surrogate key is a meaningless integer added purely to identify the row.
-- Natural key example: sku as the identifier for products
CREATE TABLE products_natural (
sku TEXT PRIMARY KEY,
product_name TEXT NOT NULL,
price REAL NOT NULL,
is_active INTEGER NOT NULL DEFAULT 1
);
-- Surrogate key example: product_id as the identifier
CREATE TABLE products_surrogate (
product_id INTEGER PRIMARY KEY,
sku TEXT NOT NULL UNIQUE,
product_name TEXT NOT NULL,
price REAL NOT NULL,
is_active INTEGER NOT NULL DEFAULT 1
);
The ecommerce schema uses surrogate keys everywhere. This is standard practice because natural keys can change over time. A customer might change their email. A product SKU might be reassigned. Surrogate integer IDs never change, which makes joins and foreign key references stable.
-- Look up a specific order using its surrogate primary key
SELECT o.order_id, o.order_date, o.status, o.total_amount
FROM orders o
WHERE o.order_id = 1042;
How SQLite Handles PRIMARY KEY
In SQLite, INTEGER PRIMARY KEY is special. It is an alias for the rowid and SQLite auto-assigns the next integer on insert. TEXT PRIMARY KEY or REAL PRIMARY KEY do not get this treatment and will not auto-increment.
For composite primary keys, you must use the table-level PRIMARY KEY (col1, col2) syntax. You cannot declare a composite key inline with the column definitions.
-- Correct composite primary key syntax in SQLite
CREATE TABLE employee_departments (
employee_id INTEGER NOT NULL,
department TEXT NOT NULL,
assigned_at TEXT NOT NULL,
PRIMARY KEY (employee_id, department)
);
Common Mistakes
-- wrong
CREATE TABLE payments (
payment_id INTEGER,
order_id INTEGER NOT NULL,
amount REAL NOT NULL,
status TEXT NOT NULL
);
-- correct
CREATE TABLE payments (
payment_id INTEGER PRIMARY KEY,
order_id INTEGER NOT NULL,
amount REAL NOT NULL,
status TEXT NOT NULL
);
Omitting PRIMARY KEY means the table has no enforced row identifier. You can end up with duplicate payment_id values and no reliable way to reference a specific payment.
---
-- wrong
CREATE TABLE order_items (
order_id INTEGER PRIMARY KEY,
product_id INTEGER PRIMARY KEY,
quantity INTEGER NOT NULL
);
-- correct
CREATE TABLE order_items (
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER NOT NULL,
PRIMARY KEY (order_id, product_id)
);
You cannot declare PRIMARY KEY twice inline on two separate columns. For a composite primary key you must use the table-level PRIMARY KEY (col1, col2) syntax at the end of the column list.
---
-- wrong
SELECT * FROM orders WHERE order_id = NULL;
-- correct
SELECT * FROM orders WHERE order_id IS NULL;
Primary key columns are NOT NULL by definition, so this query returns no rows regardless of what is in the table. In practice you would never search for a null primary key, but using = NULL anywhere in SQL is always wrong.
Quick Recap
- A primary key enforces both
NOT NULLandUNIQUEon the identifying column. Every table should have one so rows can always be uniquely referenced. - Surrogate integer keys are the most common pattern. In SQLite,
INTEGER PRIMARY KEYauto-increments by default so you do not need to supply the value on insert. - Composite primary keys cover a combination of columns and are declared at the table level using
PRIMARY KEY (col1, col2). Use them for junction and line-item tables where the pair of values defines uniqueness.
customer_id, first_name, last_name, and email from the customers table for the single customer with customer_id = 5. Use the primary key in your WHERE clause.c.customer_id = 5. A primary key lookup always returns at most one row.SELECT c.customer_id, c.first_name, c.last_name, c.email FROM customers c WHERE c.customer_id = 5;
order_id values exist in the payments table. An order should only have one payment record per transaction. Show order_id and a count called payment_count. Only return rows where the count is greater than 1. Sort by payment_count descending.GROUP BY pay.order_id with HAVING COUNT(*) > 1. This is a data quality check against an implied uniqueness rule.SELECT pay.order_id, COUNT(*) AS payment_count FROM payments pay GROUP BY pay.order_id HAVING COUNT(*) > 1 ORDER BY payment_count DESC;
customer_id, first_name, last_name, and the total number of orders they have placed as total_orders. Only include customers who have placed more than 2 orders. Join customers and orders using the primary and foreign key relationship. Sort by total_orders descending, then by customer_id ascending.c.customer_id = o.customer_id. Use GROUP BY on the customer primary key and name columns. Filter with HAVING COUNT(o.order_id) > 2.SELECT c.customer_id, c.first_name, c.last_name, COUNT(o.order_id) AS total_orders FROM customers c JOIN orders o ON c.customer_id = o.customer_id GROUP BY c.customer_id, c.first_name, c.last_name HAVING COUNT(o.order_id) > 2 ORDER BY total_orders DESC, c.customer_id ASC;
A primary key is a column or combination of columns that uniquely identifies each row in a table. It cannot contain NULL values and no two rows can share the same primary key value.
A table can only have one primary key, and it cannot contain NULLs. A unique key also enforces uniqueness but allows NULLs and a table can have multiple unique keys.
Yes. This is called a composite primary key. The combination of the columns must be unique across all rows, even if individual columns repeat.
In SQLite, INTEGER PRIMARY KEY already auto-increments by default. Adding AUTOINCREMENT is stricter and prevents reuse of deleted IDs, but it is rarely necessary for most applications.