MySQL & Database Design

Database Design Review & Best Practices

13 min Lesson 40 of 40

Database Design Review & Best Practices

Congratulations on reaching the final lesson! In this comprehensive review, we'll consolidate everything you've learned about database design, examine common anti-patterns to avoid, explore scaling strategies, and prepare you for real-world database challenges.

Core Design Principles Recap

The fundamental principles that guide all good database design:

1. Normalization First, Denormalization When Needed
✓ Start with 3NF to eliminate data anomalies
✓ Denormalize only when performance requires it
✓ Document why you denormalized

2. Data Integrity is Non-Negotiable
✓ Use foreign keys to enforce relationships
✓ Apply appropriate constraints (NOT NULL, UNIQUE, CHECK)
✓ Choose correct data types and sizes
✓ Implement cascading rules thoughtfully

3. Performance Through Design
✓ Index strategically (not every column)
✓ Design for common query patterns
✓ Partition large tables by time or key ranges
✓ Pre-aggregate when querying raw data is slow

4. Security By Design
✓ Use prepared statements (prevent SQL injection)
✓ Apply principle of least privilege
✓ Encrypt sensitive data
✓ Audit critical operations

5. Scalability Planning
✓ Design for growth from day one
✓ Use appropriate primary key types (auto-increment vs UUID)
✓ Plan archiving strategy early
✓ Consider read replicas for read-heavy apps

Design Pattern Summary

Review of key patterns covered in this tutorial:

E-Commerce Pattern:
- Products with variations
- Shopping cart as user-product junction
- Orders as snapshots of cart state
- Inventory tracking with transactions
- Denormalized product details in order_items

Social Media Pattern:
- Follow relationships with status
- Polymorphic likes (posts + comments)
- Nested comments with recursive queries
- Activity feed generation
- Notification system

CMS Pattern:
- Hierarchical taxonomies (categories)
- Multi-language content with translation tables
- Revision history for version control
- Flexible meta fields
- Scheduled publishing

Multi-Tenant Pattern:
- Shared database with tenant_id isolation
- Separate databases per tenant
- Feature flags per tenant
- Usage tracking for billing

Time-Series Pattern:
- Time-based partitioning
- Pre-aggregated summary tables
- Star schema for analytics (OLAP)
- ETL from OLTP to OLAP

Common Anti-Patterns to Avoid

Learn from these common mistakes:

1. EAV (Entity-Attribute-Value) Anti-Pattern
❌ BAD: Generic key-value storage
CREATE TABLE object_values (
    object_id INT,
    attribute_name VARCHAR(100),
    attribute_value TEXT
);

✓ GOOD: Properly typed columns
CREATE TABLE products (
    id INT PRIMARY KEY,
    name VARCHAR(255),
    price DECIMAL(10,2),
    stock INT
);

Why It's Bad: No data type validation, poor performance,
complex queries, can't use foreign keys

2. Storing Delimited Lists
❌ BAD: Comma-separated tags
CREATE TABLE posts (
    id INT PRIMARY KEY,
    tags VARCHAR(500) -- 'php,mysql,web'
);

✓ GOOD: Junction table
CREATE TABLE post_tags (
    post_id INT,
    tag_id INT,
    PRIMARY KEY (post_id, tag_id)
);

3. Using Generic Type Column
❌ BAD: Polymorphic with string type
CREATE TABLE comments (
    commentable_type VARCHAR(50), -- 'Post', 'Photo'
    commentable_id INT
);

✓ BETTER: Separate tables or true polymorphic with ENUM
CREATE TABLE comments (
    commentable_type ENUM('post', 'photo'),
    commentable_id INT,
    INDEX idx_commentable (commentable_type, commentable_id)
);

4. Premature Denormalization
❌ BAD: Denormalizing before knowing query patterns
✓ GOOD: Normalize first, profile queries, then denormalize

5. Using VARCHAR for Everything
❌ BAD: VARCHAR for dates, numbers, booleans
✓ GOOD: Use correct types (DATE, INT, BOOLEAN)

6. Missing Indexes on Foreign Keys
❌ BAD: Foreign key without index
FOREIGN KEY (user_id) REFERENCES users(id)

✓ GOOD: Always index foreign keys
INDEX idx_user (user_id),
FOREIGN KEY (user_id) REFERENCES users(id)

7. No Soft Deletes for Audit Requirements
❌ BAD: Hard delete important records
DELETE FROM orders WHERE id = 123;

✓ GOOD: Soft delete with timestamp
UPDATE orders SET deleted_at = NOW() WHERE id = 123;

Critical: Anti-patterns are seductive because they seem to work initially. They cause problems at scale. Always question "simple" solutions.

Index Strategy Review

Indexing best practices:

When to Create Indexes:
✓ Primary keys (automatic)
✓ Foreign keys (always!)
✓ Columns in WHERE clauses
✓ Columns in JOIN conditions
✓ Columns in ORDER BY
✓ Columns in GROUP BY

When NOT to Create Indexes:
✗ Small tables (< 1000 rows)
✗ Columns rarely queried
✗ High write, low read tables
✗ Boolean columns (low cardinality)
✗ Columns that change frequently

Composite Index Guidelines:
1. Most selective column first
2. Equality before range
3. Order matters: (a,b,c) ≠ (c,b,a)
4. Can use leftmost prefix: (a,b,c) covers (a) and (a,b)

Example:
INDEX idx_user_status_created (user_id, status, created_at)

-- Can use index:
WHERE user_id = 1
WHERE user_id = 1 AND status = 'active'
WHERE user_id = 1 AND status = 'active' ORDER BY created_at

-- Cannot use index:
WHERE status = 'active'
WHERE created_at > '2024-01-01'

Query Optimization Checklist

Before Optimization:
1. Use EXPLAIN to understand execution plan
2. Check for full table scans (type = ALL)
3. Verify indexes are being used
4. Look for filesort and temporary tables
5. Profile slow queries (slow query log)

Common Optimizations:
✓ Add appropriate indexes
✓ Rewrite subqueries as JOINs
✓ Use LIMIT for pagination
✓ Avoid SELECT * (specify columns)
✓ Use covering indexes when possible
✓ Cache frequently accessed data
✓ Partition very large tables
✓ Use read replicas for reporting

Query Performance Targets:
Excellent:  < 10ms
Good:       < 100ms
Acceptable: < 500ms
Poor:       > 1000ms

Scaling Strategies

How to scale your database as your application grows:

Vertical Scaling (Scale Up):
- Add more CPU, RAM, faster disks
- Easiest approach, works until hardware limits
- Eventually hits ceiling
- Cost increases exponentially

Horizontal Scaling (Scale Out):

1. Read Replicas
   - Master handles writes
   - Replicas handle reads
   - Good for read-heavy applications
   - Eventual consistency

2. Sharding (Partitioning Data)
   - Split data across multiple databases
   - By customer (tenant_id)
   - By geography (region)
   - By hash (user_id % num_shards)
   - Complex application logic required

3. Database Clusters
   - Master-master replication
   - Multi-region deployments
   - Automatic failover
   - Use managed services (RDS, Aurora, etc.)

Caching Strategies:
1. Application-level caching (Redis, Memcached)
2. Query result caching
3. Object caching
4. Full-page caching
5. CDN for static assets

When to Scale:
- CPU consistently > 70%
- Memory consistently > 80%
- Disk I/O bottlenecks
- Query response times increasing
- Connection pool exhaustion

Schema Migration Strategy

Safe Migration Practices:

1. Always backup before migrations
2. Test migrations on staging first
3. Use transactions for safety (when possible)
4. Have rollback plan ready
5. Schedule during low-traffic periods
6. Monitor closely after deployment

Zero-Downtime Migrations:

Step 1: Add new column (nullable)
ALTER TABLE users ADD COLUMN new_email VARCHAR(255);

Step 2: Populate new column
UPDATE users SET new_email = email;

Step 3: Update application to write to both
-- Deploy application code

Step 4: Make new column NOT NULL
ALTER TABLE users MODIFY new_email VARCHAR(255) NOT NULL;

Step 5: Drop old column
ALTER TABLE users DROP COLUMN email;

Step 6: Rename new column
ALTER TABLE users CHANGE new_email email VARCHAR(255) NOT NULL;

Version Control Your Schema:
- Use migration tools (Laravel Migrations, Flyway, Liquibase)
- Track schema version in database
- Each migration should be reversible
- Never edit existing migrations

Security Best Practices

SQL Injection Prevention:
❌ NEVER: String concatenation
$sql = "SELECT * FROM users WHERE id = " . $_GET['id'];

✓ ALWAYS: Prepared statements
$stmt = $pdo->prepare("SELECT * FROM users WHERE id = ?");
$stmt->execute([$id]);

Access Control:
✓ Create role-specific database users
✓ Grant minimum required privileges
✓ Never use root in application
✓ Use SSL/TLS for connections
✓ Rotate passwords regularly

Data Protection:
✓ Encrypt sensitive data at rest
✓ Hash passwords (bcrypt, argon2)
✓ Encrypt backups
✓ Mask sensitive data in logs
✓ Comply with regulations (GDPR, HIPAA)

Audit Logging:
✓ Log authentication attempts
✓ Track data modifications
✓ Record admin actions
✓ Include IP, timestamp, user
✓ Store logs securely, separately

Testing Your Database Design

Database Testing Strategies:

1. Unit Tests (Test individual queries)
   - Verify constraints work
   - Test foreign key cascades
   - Validate triggers and procedures

2. Integration Tests (Test transactions)
   - Verify ACID properties
   - Test concurrent operations
   - Check isolation levels

3. Performance Tests (Load testing)
   - Simulate high concurrency
   - Test with production-like data volume
   - Monitor query times under load

4. Data Integrity Tests
   - Verify no orphaned records
   - Check referential integrity
   - Validate calculated fields

Test Data Generation:
- Use realistic volumes
- Include edge cases
- Test with NULL values
- Use faker libraries for realistic data

Documentation Best Practices

What to Document:

1. Entity Relationship Diagrams (ERD)
2. Table descriptions and purposes
3. Column data types and constraints
4. Foreign key relationships
5. Index strategies and reasoning
6. Denormalization decisions
7. Partitioning strategy
8. Migration history
9. Known limitations
10. Backup/recovery procedures

Use Database Comments:
CREATE TABLE orders (
    id BIGINT COMMENT 'Primary key',
    total DECIMAL(10,2) COMMENT 'Order total in USD'
) COMMENT = 'Customer orders table';

Tools for Documentation:
- dbdiagram.io for ERDs
- phpMyAdmin for schema browsing
- MySQL Workbench for reverse engineering
- SchemaSpy for automatic documentation

Monitoring and Maintenance

What to Monitor:
1. Query performance (slow query log)
2. Connection count
3. Replication lag
4. Disk space usage
5. Index usage statistics
6. Table size growth
7. Lock contention
8. CPU and memory usage

Regular Maintenance Tasks:
1. OPTIMIZE TABLE (defragment)
2. ANALYZE TABLE (update statistics)
3. Check for unused indexes
4. Archive old data
5. Verify backups are working
6. Review and rotate logs
7. Update MySQL version
8. Review security patches

Backup Strategy:
✓ Automated daily backups
✓ Test restore procedures
✓ Store backups off-site
✓ Keep multiple versions
✓ Document recovery steps

Real-World Lessons

Lessons from Production Systems:

1. Start Simple
   Don't over-engineer. Build what you need now,
   refactor as requirements become clear.

2. Premature Optimization is Evil
   Profile first, optimize second. Don't guess
   where bottlenecks are.

3. Data Grows Faster Than You Think
   Plan for 10x growth. Yesterday's "big enough"
   is today's bottleneck.

4. Migrations Are Risky
   Have a rollback plan. Test thoroughly.
   Automate when possible.

5. Security Matters From Day One
   You can't bolt on security later.
   Use prepared statements, always.

6. Monitor Everything
   You can't fix what you don't measure.
   Set up alerts before problems occur.

7. Documentation Saves Lives
   Future you (and your team) will thank you.
   Document why, not just what.

8. Backups Mean Nothing Without Tested Restores
   Test your disaster recovery plan regularly.

Next Steps & Advanced Topics

Continue your database journey:

Advanced Topics to Explore:

1. Database Internals
   - B-Tree and hash index structures
   - Query optimizer internals
   - Storage engines (InnoDB, MyISAM)

2. Advanced Replication
   - Multi-master replication
   - Group replication
   - Cross-region replication

3. NoSQL Databases
   - When to use NoSQL vs SQL
   - MongoDB, Cassandra, Redis
   - Polyglot persistence

4. NewSQL Databases
   - CockroachDB, TiDB, YugabyteDB
   - Distributed SQL at scale

5. Cloud Database Services
   - AWS RDS, Aurora, DynamoDB
   - Google Cloud SQL, Spanner
   - Azure SQL Database

6. Database DevOps
   - Infrastructure as Code
   - Automated testing
   - CI/CD for databases

7. Data Warehousing
   - Kimball methodology
   - Snowflake, BigQuery, Redshift
   - Real-time analytics

8. Graph Databases
   - Neo4j, Amazon Neptune
   - When relationships are complex

Pro Tip: The best way to master database design is to build real projects. Start with a personal project, make mistakes, refactor, and learn from the experience.

Final Challenge:

Design a Complete System:

Choose one of these projects and design the complete database:

Online Learning Platform - Courses, lessons, quizzes, certificates, progress tracking
Task Management System - Projects, tasks, subtasks, comments, time tracking
Hotel Booking System - Hotels, rooms, bookings, pricing, availability
Healthcare System - Patients, doctors, appointments, medical records

Requirements:

Create complete ERD
Write all CREATE TABLE statements
Design appropriate indexes
Include sample queries for main features
Document design decisions
Plan for scaling to 1M users

Evaluation Criteria:

Is data normalized appropriately?
Are foreign keys and constraints used correctly?
Are indexes well-designed?
Can the design scale?
Is security considered?
Is the design well-documented?

Conclusion

You've completed the MySQL & Database Design tutorial! Let's recap what you've mastered:

Module 1: Database Design Principles
✓ ERD creation and normalization (1NF, 2NF, 3NF, BCNF)
✓ Data types and column design
✓ Naming conventions and best practices

Module 2: Advanced Queries
✓ Subqueries, CTEs, and window functions
✓ Complex JOINs and set operations
✓ String/date functions and aggregations

Module 3: Indexes & Performance
✓ Index types and strategies
✓ Query optimization with EXPLAIN
✓ Full-text search

Module 4: Advanced Database Objects
✓ Views, stored procedures, functions
✓ Triggers and events
✓ Information schema queries

Module 5: Transactions & Data Integrity
✓ ACID properties and isolation levels
✓ Locks and deadlock prevention
✓ Constraints and referential integrity

Module 6: Security & User Management
✓ User privileges and roles
✓ Security best practices
✓ Backup and recovery strategies

Module 7: Real-World Database Design
✓ E-commerce, social media, CMS patterns
✓ Multi-tenant architecture
✓ Time-series and analytics databases

You now have the knowledge to design, optimize, and scale production-grade databases. Remember: great database design is an iterative process. Start simple, measure performance, and refactor as needed.

Keep Learning: Join database communities, read MySQL documentation, practice with real projects, and never stop learning. The database field is constantly evolving with new techniques and technologies.

Thank you for completing this tutorial! Best of luck in your database design journey!

Previous Lesson Time-Series & Analytics Database Design

Back to Course Fullstack Web Development

Back to Course

Database Design Review & Best Practices

Database Design Review & Best Practices

Core Design Principles Recap

Design Pattern Summary

Common Anti-Patterns to Avoid

Index Strategy Review

Query Optimization Checklist

Scaling Strategies

Schema Migration Strategy

Security Best Practices

Testing Your Database Design

Documentation Best Practices

Monitoring and Maintenance

Real-World Lessons

Next Steps & Advanced Topics

Final Challenge:

Conclusion

Tutorial Complete!