Referential Integrity
Referential integrity ensures that relationships between tables remain consistent. Foreign keys are the primary mechanism for maintaining referential integrity, preventing orphaned records and ensuring that data relationships accurately reflect your business logic. This is one of the most powerful features of relational databases.
What is Referential Integrity?
Referential integrity means that a foreign key value must either match a value in the referenced table or be NULL. For example, every order must belong to a valid customer—you can't have an order for a customer that doesn't exist.
Real-World Analogy: Think of referential integrity like library cards. A book checkout record must reference a valid library card number. If someone loses their card, the library must decide: prevent deletion (RESTRICT), delete all their checkouts (CASCADE), or mark checkouts as anonymous (SET NULL).
Creating Foreign Keys
Foreign keys establish relationships between tables:
-- Parent table (referenced table)
CREATE TABLE customers (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL
);
-- Child table (referencing table)
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
order_number VARCHAR(50) UNIQUE NOT NULL,
customer_id INT NOT NULL,
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
total DECIMAL(10, 2) NOT NULL,
-- Foreign key constraint
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
);
-- Insert valid data
INSERT INTO customers (id, name, email) VALUES (1, 'John Doe', 'john@ex.com');
INSERT INTO orders (order_number, customer_id, total)
VALUES ('ORD-001', 1, 99.99); -- Works!
-- This fails: Cannot add or update child row
INSERT INTO orders (order_number, customer_id, total)
VALUES ('ORD-002', 999, 49.99); -- customer_id 999 doesn't exist
Adding Foreign Keys to Existing Tables
You can add foreign keys after table creation:
-- Add foreign key to existing table
ALTER TABLE orders
ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id);
-- Add multiple foreign keys
ALTER TABLE order_items
ADD CONSTRAINT fk_order_items_order
FOREIGN KEY (order_id)
REFERENCES orders(id),
ADD CONSTRAINT fk_order_items_product
FOREIGN KEY (product_id)
REFERENCES products(id);
Prerequisite: Before adding a foreign key, ensure that all existing values in the child column exist in the parent table, or the ALTER TABLE statement will fail.
ON DELETE Actions
The ON DELETE clause defines what happens when a referenced row is deleted:
RESTRICT (default): Prevent deletion if child rows exist
CASCADE: Automatically delete child rows
SET NULL: Set foreign key to NULL in child rows
NO ACTION: Same as RESTRICT (check deferred in some databases)
SET DEFAULT: Set to default value (not supported in InnoDB)
RESTRICT - Prevent Deletion
RESTRICT prevents deleting parent rows that have child records:
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT NOT NULL,
total DECIMAL(10, 2) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
ON DELETE RESTRICT -- This is the default
);
-- Insert data
INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@ex.com');
INSERT INTO orders (customer_id, total) VALUES (1, 100.00);
-- This fails: Cannot delete or update parent row
DELETE FROM customers WHERE id = 1;
-- Solution: Delete child rows first
DELETE FROM orders WHERE customer_id = 1;
DELETE FROM customers WHERE id = 1; -- Now this works
Use Case: RESTRICT is ideal when child records are important and shouldn't be automatically deleted. For example, preventing account deletion while active orders exist.
CASCADE - Automatic Deletion
CASCADE automatically deletes related child rows:
CREATE TABLE blog_posts (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT NOT NULL,
title VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
CONSTRAINT fk_posts_user
FOREIGN KEY (user_id)
REFERENCES users(id)
ON DELETE CASCADE
);
CREATE TABLE comments (
id INT PRIMARY KEY AUTO_INCREMENT,
post_id INT NOT NULL,
user_id INT NOT NULL,
comment TEXT NOT NULL,
CONSTRAINT fk_comments_post
FOREIGN KEY (post_id)
REFERENCES blog_posts(id)
ON DELETE CASCADE,
CONSTRAINT fk_comments_user
FOREIGN KEY (user_id)
REFERENCES users(id)
ON DELETE CASCADE
);
-- Insert data
INSERT INTO users (id, name, email) VALUES (1, 'Bob', 'bob@ex.com');
INSERT INTO blog_posts (id, user_id, title, content)
VALUES (1, 1, 'My Post', 'Content');
INSERT INTO comments (post_id, user_id, comment)
VALUES (1, 1, 'Great post!');
-- Delete user: automatically deletes all their posts and comments
DELETE FROM users WHERE id = 1;
-- Result: User, blog post, and comment all deleted
Caution: CASCADE can delete large amounts of data. In production, consider using soft deletes (marking records as deleted) instead of CASCADE for critical data.
SET NULL - Preserve Child Records
SET NULL keeps child records but removes the reference:
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT NULL, -- Must allow NULL
sales_rep_id INT NULL,
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
total DECIMAL(10, 2) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
ON DELETE SET NULL,
CONSTRAINT fk_orders_sales_rep
FOREIGN KEY (sales_rep_id)
REFERENCES employees(id)
ON DELETE SET NULL
);
-- Insert data
INSERT INTO customers (id, name, email) VALUES (1, 'Charlie', 'charlie@ex.com');
INSERT INTO employees (id, name) VALUES (10, 'Sales Rep');
INSERT INTO orders (customer_id, sales_rep_id, total)
VALUES (1, 10, 500.00);
-- Delete employee: order remains but sales_rep_id becomes NULL
DELETE FROM employees WHERE id = 10;
-- Order still exists with customer_id=1, sales_rep_id=NULL
Use Case: SET NULL is useful for optional relationships. For example, if a sales representative leaves, their orders remain for historical records but without the employee reference.
ON UPDATE Actions
Similar to ON DELETE, ON UPDATE defines behavior when a parent key is modified:
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT NOT NULL,
total DECIMAL(10, 2) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
ON DELETE CASCADE
ON UPDATE CASCADE -- If customer ID changes, update all orders
);
-- If you update a customer's ID, all their orders are updated automatically
UPDATE customers SET id = 100 WHERE id = 1;
-- All orders with customer_id=1 now have customer_id=100
Best Practice: Use auto-increment primary keys that never change. This makes ON UPDATE CASCADE unnecessary in most cases. Natural keys that might change (like email addresses) are better as UNIQUE columns, not primary keys.
Handling Orphaned Records
Orphaned records are child rows that reference non-existent parent rows:
-- Find orphaned orders (customers that no longer exist)
SELECT o.*
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.id
WHERE c.id IS NULL;
-- Clean up orphaned records
DELETE FROM orders
WHERE customer_id NOT IN (SELECT id FROM customers);
-- Or set to NULL if allowed
UPDATE orders
SET customer_id = NULL
WHERE customer_id NOT IN (SELECT id FROM customers);
Multi-Level Cascades
Cascades can propagate through multiple levels:
-- Three-level hierarchy
CREATE TABLE categories (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL
);
CREATE TABLE products (
id INT PRIMARY KEY AUTO_INCREMENT,
category_id INT NOT NULL,
name VARCHAR(255) NOT NULL,
CONSTRAINT fk_products_category
FOREIGN KEY (category_id)
REFERENCES categories(id)
ON DELETE CASCADE
);
CREATE TABLE order_items (
id INT PRIMARY KEY AUTO_INCREMENT,
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
CONSTRAINT fk_order_items_product
FOREIGN KEY (product_id)
REFERENCES products(id)
ON DELETE CASCADE
);
-- Delete a category: deletes products, which deletes order_items
DELETE FROM categories WHERE id = 5;
-- All products in category 5 deleted
-- All order_items for those products deleted
Be Careful: Multi-level cascades can have far-reaching effects. Always test cascade behavior in development before deploying to production.
Circular References
Handle circular references carefully:
-- Scenario: Employees table where each employee has a manager (also an employee)
CREATE TABLE employees (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
manager_id INT NULL,
CONSTRAINT fk_employees_manager
FOREIGN KEY (manager_id)
REFERENCES employees(id)
ON DELETE SET NULL -- When manager deleted, subordinates remain
);
-- Insert CEO (no manager)
INSERT INTO employees (id, name, manager_id) VALUES (1, 'CEO', NULL);
-- Insert manager reporting to CEO
INSERT INTO employees (id, name, manager_id) VALUES (2, 'Manager', 1);
-- Insert employee reporting to manager
INSERT INTO employees (id, name, manager_id) VALUES (3, 'Employee', 2);
-- Delete manager: employee remains but manager_id becomes NULL
DELETE FROM employees WHERE id = 2;
Viewing Foreign Keys
Query information schema to see foreign key relationships:
-- View all foreign keys in database
SELECT
TABLE_NAME,
COLUMN_NAME,
CONSTRAINT_NAME,
REFERENCED_TABLE_NAME,
REFERENCED_COLUMN_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE REFERENCED_TABLE_SCHEMA = 'your_database'
AND REFERENCED_TABLE_NAME IS NOT NULL;
-- View foreign key constraints with actions
SELECT
CONSTRAINT_NAME,
TABLE_NAME,
REFERENCED_TABLE_NAME,
UPDATE_RULE,
DELETE_RULE
FROM information_schema.REFERENTIAL_CONSTRAINTS
WHERE CONSTRAINT_SCHEMA = 'your_database';
-- See table creation with foreign keys
SHOW CREATE TABLE orders;
Dropping Foreign Keys
Remove foreign key constraints when needed:
-- Drop foreign key constraint
ALTER TABLE orders
DROP FOREIGN KEY fk_orders_customer;
-- Drop and recreate with different action
ALTER TABLE orders
DROP FOREIGN KEY fk_orders_customer,
ADD CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
ON DELETE CASCADE;
Real-World Example: Complete E-Commerce Schema
Here's a comprehensive schema with proper referential integrity:
CREATE TABLE customers (
id INT PRIMARY KEY AUTO_INCREMENT,
email VARCHAR(255) UNIQUE NOT NULL,
name VARCHAR(255) NOT NULL
);
CREATE TABLE products (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
price DECIMAL(10, 2) NOT NULL,
stock INT NOT NULL
);
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT NOT NULL,
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
status ENUM('pending', 'processing', 'shipped', 'delivered') DEFAULT 'pending',
total DECIMAL(10, 2) NOT NULL,
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
ON DELETE RESTRICT -- Can't delete customers with orders
);
CREATE TABLE order_items (
id INT PRIMARY KEY AUTO_INCREMENT,
order_id INT NOT NULL,
product_id INT NOT NULL,
quantity INT NOT NULL,
price DECIMAL(10, 2) NOT NULL,
CONSTRAINT fk_order_items_order
FOREIGN KEY (order_id)
REFERENCES orders(id)
ON DELETE CASCADE, -- Delete items when order deleted
CONSTRAINT fk_order_items_product
FOREIGN KEY (product_id)
REFERENCES products(id)
ON DELETE RESTRICT -- Can't delete products in existing orders
);
CREATE TABLE reviews (
id INT PRIMARY KEY AUTO_INCREMENT,
product_id INT NOT NULL,
customer_id INT NULL, -- Allow NULL for deleted customers
rating INT NOT NULL,
comment TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT fk_reviews_product
FOREIGN KEY (product_id)
REFERENCES products(id)
ON DELETE CASCADE, -- Delete reviews when product deleted
CONSTRAINT fk_reviews_customer
FOREIGN KEY (customer_id)
REFERENCES customers(id)
ON DELETE SET NULL, -- Keep review when customer deleted
CONSTRAINT chk_rating_range CHECK (rating BETWEEN 1 AND 5)
);
Practice Exercise:
Scenario: Design a course enrollment system with proper referential integrity.
Requirements:
- Students, courses, and instructors tables
- Enrollments table linking students to courses
- Each course has an instructor
- If a course is deleted, delete its enrollments
- If a student is deleted, delete their enrollments
- If an instructor is deleted, prevent deletion if they have courses
Solution:
CREATE TABLE students (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
enrollment_date DATE DEFAULT (CURRENT_DATE)
);
CREATE TABLE instructors (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
department VARCHAR(100)
);
CREATE TABLE courses (
id INT PRIMARY KEY AUTO_INCREMENT,
title VARCHAR(255) NOT NULL,
instructor_id INT NOT NULL,
credits INT NOT NULL,
CONSTRAINT fk_courses_instructor
FOREIGN KEY (instructor_id)
REFERENCES instructors(id)
ON DELETE RESTRICT -- Can't delete instructors with courses
);
CREATE TABLE enrollments (
id INT PRIMARY KEY AUTO_INCREMENT,
student_id INT NOT NULL,
course_id INT NOT NULL,
enrollment_date DATE DEFAULT (CURRENT_DATE),
grade VARCHAR(2),
CONSTRAINT fk_enrollments_student
FOREIGN KEY (student_id)
REFERENCES students(id)
ON DELETE CASCADE, -- Delete enrollments when student deleted
CONSTRAINT fk_enrollments_course
FOREIGN KEY (course_id)
REFERENCES courses(id)
ON DELETE CASCADE, -- Delete enrollments when course deleted
CONSTRAINT uq_student_course UNIQUE (student_id, course_id)
);
Soft Deletes vs Foreign Key Cascades
Consider soft deletes for important data:
-- Instead of CASCADE DELETE, use soft deletes
CREATE TABLE customers (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255) NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
deleted_at TIMESTAMP NULL, -- NULL = active, timestamp = deleted
INDEX idx_deleted_at (deleted_at)
);
-- "Delete" a customer (soft delete)
UPDATE customers SET deleted_at = NOW() WHERE id = 1;
-- Query active customers only
SELECT * FROM customers WHERE deleted_at IS NULL;
-- Restore a customer
UPDATE customers SET deleted_at = NULL WHERE id = 1;
Summary
In this lesson, you learned:
- Referential integrity ensures consistent relationships between tables
- Foreign keys enforce referential integrity constraints
- ON DELETE RESTRICT prevents deletion of referenced rows
- ON DELETE CASCADE automatically deletes related child rows
- ON DELETE SET NULL preserves child rows but removes references
- Multi-level cascades can propagate through entire hierarchies
- Use meaningful foreign key constraint names
- Consider soft deletes for important data instead of CASCADE
Congratulations! You've completed Module 5: Transactions & Data Integrity. You now understand how to maintain data consistency through transactions, isolation levels, locks, constraints, and referential integrity—essential skills for any database professional!