Normalization: 1NF, 2NF, 3NF
Normalization: 1NF, 2NF, 3NF
Redundancy is the silent enemy of a well-designed database. When the same fact is stored in multiple places, updates become risky, deletions destroy information unintentionally, and inserts fail until unrelated data is available. Normalization is the disciplined, step-by-step process of eliminating that redundancy by restructuring tables so that each fact lives in exactly one place.
There are many normal forms, but in practice the vast majority of business systems need only the first three: 1NF, 2NF, and 3NF. Together they remove three distinct categories of redundancy. We will walk through all three using a single running example drawn from an online bookstore.
The Starting Point: An Unnormalized Table
Imagine the bookstore captures each order in a spreadsheet-style table called OrderData:
Notice several problems: Sara Khalid / Riyadh appears twice (redundancy), the AuthorNames column holds multiple values in one cell (a violation we will fix first), and the book title depends only on the ISBN — not on which order it appears in. These are exactly the anomalies each normal form targets.
First Normal Form (1NF): Atomic Values, No Repeating Groups
Rule: Every cell must hold a single, indivisible (atomic) value. No arrays, no comma-separated lists, no repeating column groups (Author1, Author2, Author3 …). Each row must be uniquely identifiable by a primary key.
The AuthorNames column in row 4 holds two values — that violates 1NF. Fix it by splitting multi-valued data into separate rows, or into a related table. We also make the composite key explicit: (OrderID, BookISBN) uniquely identifies each line item.
Second Normal Form (2NF): No Partial Dependencies
Rule: The table must already be in 1NF, and every non-key attribute must depend on the whole primary key — not just part of it. This rule only applies when the primary key is composite (made up of more than one column).
In our 1NF table the composite key is (OrderID, BookISBN, AuthorID). Look at BookTitle: it depends only on BookISBN, not on OrderID or AuthorID. That is a partial dependency. Similarly, CustomerName and CustomerCity depend only on OrderID. Move each partially dependent group into its own table:
Now no non-key column "sneaks out" of its full key. The redundant repetition of Clean Code / 39.99 across rows 1001 and 1002 is gone — that fact lives once in Books.
Third Normal Form (3NF): No Transitive Dependencies
Rule: The table must already be in 2NF, and every non-key attribute must depend directly on the primary key — not on another non-key attribute. When attribute B determines attribute C, but B is itself a non-key column, that is a transitive dependency.
Look at the Orders table: if we add a CityRegion (e.g., "Central" for Riyadh, "Gulf" for Dubai), that column depends on CustomerCity, not directly on OrderID. The chain is: OrderID → CustomerCity → CityRegion. Move it out:
The Full Normalization Journey — Diagram
The diagram below shows the three stages as separate tables, with arrows indicating which foreign keys link them. Read it as a summary of where each fact ends up after normalization.
Spotting Each Violation Quickly
Analysts use a simple mental checklist when reviewing a table design:
- 1NF violation: Can you see a comma-separated list, a set of numbered columns (Phone1, Phone2 …), or a cell that clearly holds more than one piece of data? Fix it first.
- 2NF violation: Is the primary key composite? If yes, check every non-key column: does it need all parts of the key to be meaningful, or only some? Move partial-dependent columns out.
- 3NF violation: Can you draw a chain like Key → ColumnA → ColumnB? If ColumnB depends on ColumnA (a non-key), not directly on the key, move ColumnB out with ColumnA as its new key.
Update, Insertion, and Deletion Anomalies
Normalization is not an academic exercise — it prevents three real operational failures:
- Update anomaly: A book\'s price appears in 50 order rows. A price change must be applied 50 times; miss one and the data is inconsistent. After normalization, the price lives in
Booksonce. - Insertion anomaly: In the unnormalized table you cannot record a new book unless an order exists for it (because
OrderIDis part of the key). After normalization, you insert intoBooksindependently. - Deletion anomaly: If you delete the only order for a book, you lose the book record entirely. After normalization, deleting an order touches
OrderLinesonly;Booksis untouched.
Summary
Normalization moves a table through three levels of quality. 1NF enforces atomic values and a clear primary key. 2NF removes partial dependencies so that every non-key column truly needs the entire composite key. 3NF removes transitive dependencies so that non-key columns depend directly on the key, not on each other. After these three steps, each business fact lives in exactly one place — making your data model robust, consistent, and easy to maintain.