MySQL & Database Design

Referential Integrity

13 min Lesson 30 of 40

Referential Integrity

Referential integrity ensures that relationships between tables remain consistent. Foreign keys are the primary mechanism for maintaining referential integrity, preventing orphaned records and ensuring that data relationships accurately reflect your business logic. This is one of the most powerful features of relational databases.

What is Referential Integrity?

Referential integrity means that a foreign key value must either match a value in the referenced table or be NULL. For example, every order must belong to a valid customer—you can't have an order for a customer that doesn't exist.

Real-World Analogy: Think of referential integrity like library cards. A book checkout record must reference a valid library card number. If someone loses their card, the library must decide: prevent deletion (RESTRICT), delete all their checkouts (CASCADE), or mark checkouts as anonymous (SET NULL).

Creating Foreign Keys

Foreign keys establish relationships between tables:

-- Parent table (referenced table) CREATE TABLE customers ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, email VARCHAR(255) UNIQUE NOT NULL ); -- Child table (referencing table) CREATE TABLE orders ( id INT PRIMARY KEY AUTO_INCREMENT, order_number VARCHAR(50) UNIQUE NOT NULL, customer_id INT NOT NULL, order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP, total DECIMAL(10, 2) NOT NULL, -- Foreign key constraint CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ); -- Insert valid data INSERT INTO customers (id, name, email) VALUES (1, 'John Doe', 'john@ex.com'); INSERT INTO orders (order_number, customer_id, total) VALUES ('ORD-001', 1, 99.99); -- Works! -- This fails: Cannot add or update child row INSERT INTO orders (order_number, customer_id, total) VALUES ('ORD-002', 999, 49.99); -- customer_id 999 doesn't exist

Adding Foreign Keys to Existing Tables

You can add foreign keys after table creation:

-- Add foreign key to existing table ALTER TABLE orders ADD CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id); -- Add multiple foreign keys ALTER TABLE order_items ADD CONSTRAINT fk_order_items_order FOREIGN KEY (order_id) REFERENCES orders(id), ADD CONSTRAINT fk_order_items_product FOREIGN KEY (product_id) REFERENCES products(id);
Prerequisite: Before adding a foreign key, ensure that all existing values in the child column exist in the parent table, or the ALTER TABLE statement will fail.

ON DELETE Actions

The ON DELETE clause defines what happens when a referenced row is deleted:

RESTRICT (default): Prevent deletion if child rows exist CASCADE: Automatically delete child rows SET NULL: Set foreign key to NULL in child rows NO ACTION: Same as RESTRICT (check deferred in some databases) SET DEFAULT: Set to default value (not supported in InnoDB)

RESTRICT - Prevent Deletion

RESTRICT prevents deleting parent rows that have child records:

CREATE TABLE orders ( id INT PRIMARY KEY AUTO_INCREMENT, customer_id INT NOT NULL, total DECIMAL(10, 2) NOT NULL, CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE RESTRICT -- This is the default ); -- Insert data INSERT INTO customers (id, name, email) VALUES (1, 'Alice', 'alice@ex.com'); INSERT INTO orders (customer_id, total) VALUES (1, 100.00); -- This fails: Cannot delete or update parent row DELETE FROM customers WHERE id = 1; -- Solution: Delete child rows first DELETE FROM orders WHERE customer_id = 1; DELETE FROM customers WHERE id = 1; -- Now this works
Use Case: RESTRICT is ideal when child records are important and shouldn't be automatically deleted. For example, preventing account deletion while active orders exist.

CASCADE - Automatic Deletion

CASCADE automatically deletes related child rows:

CREATE TABLE blog_posts ( id INT PRIMARY KEY AUTO_INCREMENT, user_id INT NOT NULL, title VARCHAR(255) NOT NULL, content TEXT NOT NULL, CONSTRAINT fk_posts_user FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE ); CREATE TABLE comments ( id INT PRIMARY KEY AUTO_INCREMENT, post_id INT NOT NULL, user_id INT NOT NULL, comment TEXT NOT NULL, CONSTRAINT fk_comments_post FOREIGN KEY (post_id) REFERENCES blog_posts(id) ON DELETE CASCADE, CONSTRAINT fk_comments_user FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE ); -- Insert data INSERT INTO users (id, name, email) VALUES (1, 'Bob', 'bob@ex.com'); INSERT INTO blog_posts (id, user_id, title, content) VALUES (1, 1, 'My Post', 'Content'); INSERT INTO comments (post_id, user_id, comment) VALUES (1, 1, 'Great post!'); -- Delete user: automatically deletes all their posts and comments DELETE FROM users WHERE id = 1; -- Result: User, blog post, and comment all deleted
Caution: CASCADE can delete large amounts of data. In production, consider using soft deletes (marking records as deleted) instead of CASCADE for critical data.

SET NULL - Preserve Child Records

SET NULL keeps child records but removes the reference:

CREATE TABLE orders ( id INT PRIMARY KEY AUTO_INCREMENT, customer_id INT NULL, -- Must allow NULL sales_rep_id INT NULL, order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP, total DECIMAL(10, 2) NOT NULL, CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE SET NULL, CONSTRAINT fk_orders_sales_rep FOREIGN KEY (sales_rep_id) REFERENCES employees(id) ON DELETE SET NULL ); -- Insert data INSERT INTO customers (id, name, email) VALUES (1, 'Charlie', 'charlie@ex.com'); INSERT INTO employees (id, name) VALUES (10, 'Sales Rep'); INSERT INTO orders (customer_id, sales_rep_id, total) VALUES (1, 10, 500.00); -- Delete employee: order remains but sales_rep_id becomes NULL DELETE FROM employees WHERE id = 10; -- Order still exists with customer_id=1, sales_rep_id=NULL
Use Case: SET NULL is useful for optional relationships. For example, if a sales representative leaves, their orders remain for historical records but without the employee reference.

ON UPDATE Actions

Similar to ON DELETE, ON UPDATE defines behavior when a parent key is modified:

CREATE TABLE orders ( id INT PRIMARY KEY AUTO_INCREMENT, customer_id INT NOT NULL, total DECIMAL(10, 2) NOT NULL, CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE CASCADE ON UPDATE CASCADE -- If customer ID changes, update all orders ); -- If you update a customer's ID, all their orders are updated automatically UPDATE customers SET id = 100 WHERE id = 1; -- All orders with customer_id=1 now have customer_id=100
Best Practice: Use auto-increment primary keys that never change. This makes ON UPDATE CASCADE unnecessary in most cases. Natural keys that might change (like email addresses) are better as UNIQUE columns, not primary keys.

Handling Orphaned Records

Orphaned records are child rows that reference non-existent parent rows:

-- Find orphaned orders (customers that no longer exist) SELECT o.* FROM orders o LEFT JOIN customers c ON o.customer_id = c.id WHERE c.id IS NULL; -- Clean up orphaned records DELETE FROM orders WHERE customer_id NOT IN (SELECT id FROM customers); -- Or set to NULL if allowed UPDATE orders SET customer_id = NULL WHERE customer_id NOT IN (SELECT id FROM customers);

Multi-Level Cascades

Cascades can propagate through multiple levels:

-- Three-level hierarchy CREATE TABLE categories ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(100) NOT NULL ); CREATE TABLE products ( id INT PRIMARY KEY AUTO_INCREMENT, category_id INT NOT NULL, name VARCHAR(255) NOT NULL, CONSTRAINT fk_products_category FOREIGN KEY (category_id) REFERENCES categories(id) ON DELETE CASCADE ); CREATE TABLE order_items ( id INT PRIMARY KEY AUTO_INCREMENT, order_id INT NOT NULL, product_id INT NOT NULL, quantity INT NOT NULL, CONSTRAINT fk_order_items_product FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE CASCADE ); -- Delete a category: deletes products, which deletes order_items DELETE FROM categories WHERE id = 5; -- All products in category 5 deleted -- All order_items for those products deleted
Be Careful: Multi-level cascades can have far-reaching effects. Always test cascade behavior in development before deploying to production.

Circular References

Handle circular references carefully:

-- Scenario: Employees table where each employee has a manager (also an employee) CREATE TABLE employees ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, manager_id INT NULL, CONSTRAINT fk_employees_manager FOREIGN KEY (manager_id) REFERENCES employees(id) ON DELETE SET NULL -- When manager deleted, subordinates remain ); -- Insert CEO (no manager) INSERT INTO employees (id, name, manager_id) VALUES (1, 'CEO', NULL); -- Insert manager reporting to CEO INSERT INTO employees (id, name, manager_id) VALUES (2, 'Manager', 1); -- Insert employee reporting to manager INSERT INTO employees (id, name, manager_id) VALUES (3, 'Employee', 2); -- Delete manager: employee remains but manager_id becomes NULL DELETE FROM employees WHERE id = 2;

Viewing Foreign Keys

Query information schema to see foreign key relationships:

-- View all foreign keys in database SELECT TABLE_NAME, COLUMN_NAME, CONSTRAINT_NAME, REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME FROM information_schema.KEY_COLUMN_USAGE WHERE REFERENCED_TABLE_SCHEMA = 'your_database' AND REFERENCED_TABLE_NAME IS NOT NULL; -- View foreign key constraints with actions SELECT CONSTRAINT_NAME, TABLE_NAME, REFERENCED_TABLE_NAME, UPDATE_RULE, DELETE_RULE FROM information_schema.REFERENTIAL_CONSTRAINTS WHERE CONSTRAINT_SCHEMA = 'your_database'; -- See table creation with foreign keys SHOW CREATE TABLE orders;

Dropping Foreign Keys

Remove foreign key constraints when needed:

-- Drop foreign key constraint ALTER TABLE orders DROP FOREIGN KEY fk_orders_customer; -- Drop and recreate with different action ALTER TABLE orders DROP FOREIGN KEY fk_orders_customer, ADD CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE CASCADE;

Real-World Example: Complete E-Commerce Schema

Here's a comprehensive schema with proper referential integrity:

CREATE TABLE customers ( id INT PRIMARY KEY AUTO_INCREMENT, email VARCHAR(255) UNIQUE NOT NULL, name VARCHAR(255) NOT NULL ); CREATE TABLE products ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, price DECIMAL(10, 2) NOT NULL, stock INT NOT NULL ); CREATE TABLE orders ( id INT PRIMARY KEY AUTO_INCREMENT, customer_id INT NOT NULL, order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP, status ENUM('pending', 'processing', 'shipped', 'delivered') DEFAULT 'pending', total DECIMAL(10, 2) NOT NULL, CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE RESTRICT -- Can't delete customers with orders ); CREATE TABLE order_items ( id INT PRIMARY KEY AUTO_INCREMENT, order_id INT NOT NULL, product_id INT NOT NULL, quantity INT NOT NULL, price DECIMAL(10, 2) NOT NULL, CONSTRAINT fk_order_items_order FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE, -- Delete items when order deleted CONSTRAINT fk_order_items_product FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE RESTRICT -- Can't delete products in existing orders ); CREATE TABLE reviews ( id INT PRIMARY KEY AUTO_INCREMENT, product_id INT NOT NULL, customer_id INT NULL, -- Allow NULL for deleted customers rating INT NOT NULL, comment TEXT, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP, CONSTRAINT fk_reviews_product FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE CASCADE, -- Delete reviews when product deleted CONSTRAINT fk_reviews_customer FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE SET NULL, -- Keep review when customer deleted CONSTRAINT chk_rating_range CHECK (rating BETWEEN 1 AND 5) );

Practice Exercise:

Scenario: Design a course enrollment system with proper referential integrity.

Requirements:

  1. Students, courses, and instructors tables
  2. Enrollments table linking students to courses
  3. Each course has an instructor
  4. If a course is deleted, delete its enrollments
  5. If a student is deleted, delete their enrollments
  6. If an instructor is deleted, prevent deletion if they have courses

Solution:

CREATE TABLE students ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, email VARCHAR(255) UNIQUE NOT NULL, enrollment_date DATE DEFAULT (CURRENT_DATE) ); CREATE TABLE instructors ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, email VARCHAR(255) UNIQUE NOT NULL, department VARCHAR(100) ); CREATE TABLE courses ( id INT PRIMARY KEY AUTO_INCREMENT, title VARCHAR(255) NOT NULL, instructor_id INT NOT NULL, credits INT NOT NULL, CONSTRAINT fk_courses_instructor FOREIGN KEY (instructor_id) REFERENCES instructors(id) ON DELETE RESTRICT -- Can't delete instructors with courses ); CREATE TABLE enrollments ( id INT PRIMARY KEY AUTO_INCREMENT, student_id INT NOT NULL, course_id INT NOT NULL, enrollment_date DATE DEFAULT (CURRENT_DATE), grade VARCHAR(2), CONSTRAINT fk_enrollments_student FOREIGN KEY (student_id) REFERENCES students(id) ON DELETE CASCADE, -- Delete enrollments when student deleted CONSTRAINT fk_enrollments_course FOREIGN KEY (course_id) REFERENCES courses(id) ON DELETE CASCADE, -- Delete enrollments when course deleted CONSTRAINT uq_student_course UNIQUE (student_id, course_id) );

Soft Deletes vs Foreign Key Cascades

Consider soft deletes for important data:

-- Instead of CASCADE DELETE, use soft deletes CREATE TABLE customers ( id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(255) NOT NULL, email VARCHAR(255) UNIQUE NOT NULL, deleted_at TIMESTAMP NULL, -- NULL = active, timestamp = deleted INDEX idx_deleted_at (deleted_at) ); -- "Delete" a customer (soft delete) UPDATE customers SET deleted_at = NOW() WHERE id = 1; -- Query active customers only SELECT * FROM customers WHERE deleted_at IS NULL; -- Restore a customer UPDATE customers SET deleted_at = NULL WHERE id = 1;

Summary

In this lesson, you learned:

  • Referential integrity ensures consistent relationships between tables
  • Foreign keys enforce referential integrity constraints
  • ON DELETE RESTRICT prevents deletion of referenced rows
  • ON DELETE CASCADE automatically deletes related child rows
  • ON DELETE SET NULL preserves child rows but removes references
  • Multi-level cascades can propagate through entire hierarchies
  • Use meaningful foreign key constraint names
  • Consider soft deletes for important data instead of CASCADE
Congratulations! You've completed Module 5: Transactions & Data Integrity. You now understand how to maintain data consistency through transactions, isolation levels, locks, constraints, and referential integrity—essential skills for any database professional!