Choosing a Database
Choosing a Database
Every system design interview, every architecture review, and every greenfield project eventually arrives at the same question: which database should we use? The answer is rarely obvious, and the wrong choice can cost months of painful migration work. This lesson gives you a practical, repeatable decision framework you can apply immediately — grounded in real trade-offs and concrete examples.
Step 1 — Understand Your Data Model
Before you think about scale or availability, ask what shape your data is. The mismatch between data shape and database model is the single most common cause of painful rewrites.
- Strongly relational data with complex joins — user accounts, orders, inventory, financial transactions. Each entity references others through foreign keys. Use a relational (SQL) database:
PostgreSQLorMySQL. - Hierarchical or semi-structured documents — product catalogs, CMS content, user profiles with arbitrary fields. The schema varies per record. Use a document store:
MongoDB,Couchbase. - Key-based lookups with no complex queries — session storage, feature flags, rate-limit counters, leaderboards. Use a key-value store:
Redis,DynamoDB. - Time-ordered measurements — metrics, IoT sensor readings, application logs. Use a time-series database:
InfluxDB,TimescaleDB,Prometheus. - Deeply connected relationships — social graphs, recommendation engines, fraud detection. Use a graph database:
Neo4j,Amazon Neptune. - Analytical queries over large historical datasets — reporting, business intelligence, data warehousing. Use a columnar / OLAP store:
Redshift,BigQuery,Snowflake,ClickHouse.
Step 2 — Characterise Your Access Patterns
Data shape tells you the family; access patterns narrow the choice within that family. Ask:
- Read-heavy or write-heavy? A social media feed is read-heavy (1,000 reads per write). An IoT pipeline collecting telemetry is write-heavy (100 writes per read). PostgreSQL, Cassandra, and DynamoDB all handle each direction differently.
- What is the latency requirement? Sub-millisecond lookups (shopping cart, session) need an in-memory store like Redis. A financial report that runs once a day can tolerate seconds on Redshift.
- What queries do you need? Full-text search, geospatial queries, graph traversal — if your query pattern is exotic, pick a database that treats it as a first-class citizen rather than bolting it on later.
- What is the read/write ratio and the QPS? A small read replica on PostgreSQL handles 10,000 read QPS. Cassandra is designed to saturate 100,000+ write QPS across a cluster. Choosing PostgreSQL for the second workload will work — until it won't.
Step 3 — Know Your Consistency and Availability Requirements
The CAP theorem states that a distributed system can provide at most two of three guarantees: Consistency, Availability, and Partition tolerance. In practice, network partitions happen, so the real trade-off is CP vs AP:
- CP systems sacrifice availability during a partition to ensure every read returns the latest write. Use for financial transactions, inventory management, anything where stale data causes real-world harm. Examples:
PostgreSQL,CockroachDB,Spanner. - AP systems stay available during a partition and offer eventual consistency. Use for user timelines, shopping carts, view counters — places where briefly stale data is acceptable. Examples:
Cassandra,DynamoDB,CouchDB.
Step 4 — Estimate Volume and Velocity
Raw numbers drive the decision more than any principle. Use envelope math:
- Under 10 TB, under 10,000 QPS — a well-tuned single PostgreSQL primary with read replicas handles almost any web application. Do not add complexity you do not need.
- 10,000–100,000 write QPS, eventual consistency acceptable — Cassandra or DynamoDB. Both are designed for this range and partition horizontally with minimal ops overhead.
- 100+ TB analytical data, complex aggregations — a columnar OLAP warehouse: Snowflake, BigQuery, or Redshift. Row-oriented databases like PostgreSQL perform orders of magnitude worse for full-table scans.
- Sub-millisecond read latency at any scale — Redis (in-memory). Disk-based databases, even SSDs, cannot compete on latency for hot data.
Step 5 — Factor in Operational Realities
The best database for your data model is worthless if your team cannot operate it. Ask:
- Team expertise — a team of SQL experts will ship faster on PostgreSQL than on Cassandra even if Cassandra is technically a better fit. Expertise compounds over time.
- Managed vs self-hosted — AWS RDS, Google Cloud SQL, PlanetScale, and Supabase remove most operational burden from relational databases. DynamoDB, Cosmos DB, and Firestore are fully managed NoSQL options. Self-hosting Cassandra or Elasticsearch requires dedicated infrastructure engineers.
- Ecosystem and tooling — PostgreSQL has decades of tooling: pgAdmin, pg_dump, logical replication, extensions (PostGIS for geo, pgvector for vector search). MongoDB has a mature driver ecosystem. Niche databases may lack ORM support or migration tooling.
- Cost — DynamoDB charges per read/write capacity unit. BigQuery charges per byte scanned. PostgreSQL on RDS charges per instance-hour. Model your expected workload through each pricing model before committing.
The Polyglot Persistence Pattern
At scale, no single database is the right answer for every workload in a system. The industry-standard solution is polyglot persistence — using the best-fit database for each distinct access pattern within the same application. Amazon's product detail page, for example, reads from a relational database for pricing/inventory, a document store for the product description, a key-value cache for the session, a search index for the search bar, and a graph database for the "customers also bought" recommendations — all within a single page load.
The cost of polyglot persistence is operational complexity: more systems to monitor, more failure modes, more data synchronization to manage. This is why startups often start with a single PostgreSQL instance — premature polyglot persistence is as harmful as premature optimization.
A Practical Decision Checklist
- Draw your data model — identify entities and their relationships.
- List your top 3 most frequent query patterns.
- Estimate peak QPS, data volume, and acceptable latency (p99).
- Decide on consistency requirements — CP or AP?
- Ask what your team already knows how to operate.
- Consider whether a managed cloud service removes operational burden.
- Model the cost at your expected scale before committing.
Database selection is a multi-dimensional trade-off. There is no universally correct answer — only the answer that best fits your data shape, access patterns, consistency requirements, volume, and team. Master these five dimensions and you will rarely choose wrong.