Databases & Storage

Indexing

18 min Lesson 4 of 10

Indexing

Every table in a production database eventually grows large enough that a full scan becomes painfully slow. Scanning 50 million rows to find one user by email is not an option when your API must respond in under 20 ms. Indexes are the primary tool databases use to find rows without reading every page on disk. Understanding how they work — and what they cost — is essential for any engineer designing systems at scale.

What Is an Index?

An index is a separate data structure, maintained alongside your table, that maps column values to the physical location (page and row offset) of the matching rows. Think of it as the index at the back of a textbook: instead of reading every page to find "B-tree", you look up the entry and jump straight to the right page.

Without an index, the database performs a full table scan — it reads every row in the table to evaluate the WHERE clause. With an index on the searched column, it performs an index seek — it navigates the index structure in O(log n) steps and fetches only the matching rows.

Full scan vs index seek Without Index — Full Table Scan Row 1 — user_id=1, email=alice@… Row 2 — user_id=2, email=bob@… Row 3 — user_id=3, email=carol@… Row 4 — user_id=4, email=dave@… ✓ Row 5 — user_id=5, email=eve@… ⋯ reads ALL rows ⋯ O(n) — scans every row With Index — Index Seek B-Tree Root Node Leaf: dave@ → Row 4 Row 4 — dave@ ✓ Navigates ~log₂(n) tree levels O(log n) — jumps directly
Without an index the database reads every row (O(n)); with a B-tree index it navigates a few levels to the matching row (O(log n)).

The B-Tree: How Most Indexes Work

The B-tree (balanced tree) is the data structure behind the default index in PostgreSQL, MySQL (InnoDB), Oracle, and SQL Server. It is self-balancing, meaning every leaf is at the same depth no matter how many rows are inserted or deleted. A table with 50 million rows has a B-tree roughly 4–5 levels deep — so finding any single row takes at most 4–5 page reads.

A B-tree has three node types:

  • Root node — the single entry point at the top of the tree.
  • Internal nodes — route lookups toward the correct subtree by comparing key values.
  • Leaf nodes — store the actual index key plus a pointer to the heap row (row ID / page + slot). Leaf nodes are linked in a doubly-linked list so range scans (e.g. WHERE created_at BETWEEN …) can walk forward without returning to the root.
B-tree index structure Root Node 50 | 200 | 600 Internal 10 | 30 | 45 Internal 100 | 150 | 180 Internal 300 | 450 | 580 Leaf 1 → row ptr 5 → row ptr Leaf 15 → row ptr 22 → row ptr Leaf 60 → row ptr 90 → row ptr Leaf 210 → row ptr 250 → row ptr Leaf 620 → row ptr 700 → row ptr ← Leaf nodes linked for range scans →
A B-tree index: root and internal nodes route lookups; leaf nodes hold key-to-row-pointer mappings and are linked for efficient range scans.

Types of Indexes

Beyond the standard B-tree, databases offer several specialized index types:

  • Hash index — stores an unsorted hash map of key → row. Extremely fast for exact-match lookups (O(1)), but useless for range queries or sorting. PostgreSQL supports them; MySQL's InnoDB uses B-trees even when you declare a hash index.
  • Composite (multi-column) index — a B-tree built on two or more columns. Crucially, queries must use the leftmost prefix of the index. An index on (last_name, first_name) helps a query filtering on last_name alone, but not one filtering on first_name alone.
  • Covering index — an index that contains all columns a query needs, so the database never has to read the heap table at all. The query is satisfied entirely from the index, which is a major win for high-read-rate tables.
  • Partial index — an index over a subset of rows (e.g. WHERE status = 'active'). Far smaller and faster when you only ever query active records.
  • Full-text index — inverted index built for keyword search inside text columns. Different engine from B-trees; used by MATCH … AGAINST in MySQL or tsvector / @@ in PostgreSQL.

The Write and Space Trade-Off

Indexes are not free. Every index you add imposes three costs:

  1. Extra disk space — a B-tree on a 50 M-row table with a VARCHAR(255) email column can easily consume several gigabytes.
  2. Write overhead — on every INSERT, UPDATE, or DELETE, the database must update every index that covers the changed columns. A table with 8 indexes takes roughly 8× more work per write compared to a table with no indexes. On a heavily-written table (e.g. a time-series events table ingesting 100k rows/sec) excessive indexes can saturate I/O.
  3. Lock contention — index pages can become hotspots, especially the right-most leaf of a monotonically increasing sequence (the "right-hand side" problem in B-trees).
Over-indexing is a real problem. A common anti-pattern is to add an index for every query ever written. On write-heavy tables this dramatically degrades throughput. Always measure. Use EXPLAIN ANALYZE (PostgreSQL) or EXPLAIN (MySQL) to confirm an index is actually used before adding it, and periodically audit unused indexes and drop them.

Clustered vs. Non-Clustered Indexes

A clustered index determines the physical order of rows on disk. In InnoDB (MySQL's default engine) the primary key is always the clustered index — the table is the B-tree. Accessing a row by primary key requires a single tree traversal. Secondary indexes in InnoDB store the primary key value as the row pointer, so a secondary index lookup does two traversals: one into the secondary index to get the PK, then one into the clustered index to fetch the full row.

In PostgreSQL, all indexes are logically non-clustered (heap-based). You can run CLUSTER to physically reorder the table once, but the order drifts over time.

Choose primary keys carefully. In InnoDB, UUIDs (random 128-bit values) as primary keys fragment the clustered B-tree because new rows insert randomly across all leaf pages. Sequential integer keys (AUTO_INCREMENT, BIGSERIAL) or time-ordered UUIDs (UUIDv7) keep inserts at the rightmost leaf, reducing page splits and write amplification.

Practical Guidelines

  • Index every foreign key column — unindexed FK columns cause full scans on every join.
  • Index columns used in WHERE, JOIN ON, and ORDER BY clauses of frequent, slow queries.
  • Prefer composite indexes over multiple single-column indexes when queries always filter on a fixed combination of columns.
  • Use partial indexes for low-cardinality filtered sets (e.g. WHERE processed = false — only a fraction of rows).
  • Monitor the query planner: if it chooses a full scan despite an existing index, the table statistics may be stale — run ANALYZE to refresh them.
Start with EXPLAIN. Before adding any index, run EXPLAIN ANALYZE <your query> in PostgreSQL (or EXPLAIN FORMAT=JSON in MySQL) and look at the actual rows scanned, the chosen plan node, and the execution time. This 30-second habit saves hours of guesswork and prevents unnecessary indexes from polluting your schema.

Summary

Indexes are the single most impactful tuning tool in relational databases. A B-tree index turns an O(n) table scan into an O(log n) seek by maintaining a balanced, sorted data structure with leaf-level row pointers. The cost is extra disk space and write amplification — every index must be updated on mutations. The art of indexing is finding the smallest set of indexes that satisfies your read patterns without crippling your write throughput. Always validate with the query planner before and after changes.