JPA, Hibernate & the EntityManager
Before writing a single mapping annotation you need a clear mental model of the three-layer stack: the JPA specification, the Hibernate implementation, and the EntityManager API that sits between your code and the database. Confusing these layers is the root cause of most beginner mistakes — wrong imports, surprising behaviour, and hard-to-find bugs.
JPA: the Specification
JPA (Jakarta Persistence, formerly the Java Persistence API) is a standard: a set of interfaces, annotations, and rules defined in the jakarta.persistence package. It contains essentially no persistence logic of its own; it only defines what an ORM must look like. The key types you interact with every day (@Entity, @Id, EntityManager, TypedQuery) are all JPA interfaces or annotations.
Because JPA is just a spec, any compliant library can implement it. The two most common implementations are Hibernate (by far the most popular) and EclipseLink. As long as your application code imports only from jakarta.persistence.* and relies only on behaviour the spec defines, switching providers is in principle a dependency swap rather than a code change.
Always import from jakarta.persistence.*. If you see javax.persistence.* in a tutorial, it is targeting an older stack (Spring Boot 2 / Hibernate 5). Do not mix the two: the annotations look identical but are different classes, and a provider built for one namespace will not detect annotations from the other.
Hibernate: the Implementation
Hibernate is the concrete library that actually does the work: it scans your classes for JPA annotations, generates SQL, manages the first-level cache, translates exceptions, and integrates with connection pools. It also provides a set of Hibernate-specific extensions (annotations in org.hibernate.annotations.* and the Session API) that go beyond the JPA spec.
A pragmatic rule: prefer JPA annotations for everything the spec covers, and reach for Hibernate-specific extensions only when JPA cannot express what you need (e.g., custom SQL types, batch-fetch tuning). This keeps your code portable and easier to reason about.
In Spring Boot 3, spring-boot-starter-data-jpa pulls in Hibernate 6 automatically, so you do not need to declare Hibernate as a direct dependency. The starter also configures a DataSource (backed by the HikariCP connection pool), an EntityManagerFactory, and Spring's transaction management, all driven by your application.properties.
The EntityManagerFactory and the EntityManager
The JPA runtime is bootstrapped through two objects:
- EntityManagerFactory (EMF): a heavy, thread-safe object created once per application (one per persistence unit). It reads your mapping metadata, optionally validates or generates the schema (depending on configuration), and prepares SQL templates. Creating it is expensive, so never create one per request.
- EntityManager (EM): a lightweight, non-thread-safe object representing a single unit of work. It owns the persistence context (the first-level cache). You obtain a new EM for each request or transaction, use it, then close it.
In a Spring Boot application you almost never touch the EMF or create EMs manually. Spring's container manages them and injects an EM proxy into your beans via @PersistenceContext (or indirectly through Spring Data repositories).
Core EntityManager Operations
The EM exposes a small, orthogonal API. Understanding what each method does to the database and to the persistence context is critical for avoiding surprise queries and performance issues.
- persist(entity): schedules an INSERT. The entity transitions to the managed state. The SQL is deferred until flush (usually at transaction commit).
- find(Class, id): returns the managed entity for a given primary key, or null if none exists. Checks the first-level cache before hitting the database.
- getReference(Class, id): returns a proxy (a lazy placeholder) without touching the database. Useful for setting foreign keys without loading the full object. Throws EntityNotFoundException on first field access if the row does not exist.
- merge(entity): copies the state of a detached entity into a managed copy and returns the managed copy. The original object stays detached. Required when you pass entities across transaction or session boundaries.
- remove(entity): schedules a DELETE. The entity must be managed (pass it through find first if needed).
- flush(): synchronises the persistence context with the database immediately (executes pending SQL) without committing the transaction.
- detach(entity) / clear(): removes one or all entities from the persistence context, making them detached. Useful in batch processing to free memory.
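To make the deferred SQL and the first-level cache concrete, here is a minimal sketch in plain Java. It is a toy model, not Hibernate's implementation and not the real jakarta.persistence API: Customer, ToyDatabase, and ToyEntityManager are hypothetical names invented for this illustration, and the "SQL" is just strings in a log. It only mirrors the behaviour described in the list above: persist queues work, find consults the cache first, flush executes pending statements (including UPDATEs found by comparing against a snapshot), and merge returns a different, managed instance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entity with one mutable column (name is assumed non-null).
final class Customer {
    final long id;
    String name;
    Customer(long id, String name) { this.id = id; this.name = name; }
}

// Hypothetical in-memory "database": id -> name, plus a log of executed statements.
final class ToyDatabase {
    final Map<Long, String> rows = new HashMap<>();
    final List<String> sqlLog = new ArrayList<>();
}

// Toy model of an EntityManager. This is NOT how Hibernate is implemented.
final class ToyEntityManager {
    private final ToyDatabase db;
    // First-level cache: one managed instance per primary key.
    private final Map<Long, Customer> managed = new LinkedHashMap<>();
    // State of each managed entity as last synchronised with the database.
    private final Map<Long, String> snapshots = new HashMap<>();

    ToyEntityManager(ToyDatabase db) { this.db = db; }

    // persist: the entity becomes managed; having no snapshot marks it as a pending INSERT.
    void persist(Customer c) { managed.put(c.id, c); }

    // find: cache first, then the database; null if there is no such row.
    Customer find(long id) {
        Customer cached = managed.get(id);
        if (cached != null) return cached;
        db.sqlLog.add("SELECT " + id);
        String name = db.rows.get(id);
        if (name == null) return null;
        Customer loaded = new Customer(id, name);
        managed.put(id, loaded);
        snapshots.put(id, name);
        return loaded;
    }

    // merge: copy the detached object's state onto the managed instance and return that instance.
    Customer merge(Customer detached) {
        Customer target = find(detached.id);
        if (target == null) {
            target = new Customer(detached.id, detached.name);
            persist(target);
        }
        target.name = detached.name;
        return target;
    }

    // flush: run pending INSERTs and dirty-checked UPDATEs now; no commit is modelled.
    void flush() {
        for (Customer c : managed.values()) {
            String snapshot = snapshots.get(c.id);
            if (snapshot == null) {
                db.sqlLog.add("INSERT " + c.id);
            } else if (!snapshot.equals(c.name)) {
                db.sqlLog.add("UPDATE " + c.id);
            } else {
                continue; // unchanged since the last synchronisation
            }
            db.rows.put(c.id, c.name);
            snapshots.put(c.id, c.name);
        }
    }

    // clear: every managed entity becomes detached; unflushed changes are dropped.
    void clear() { managed.clear(); snapshots.clear(); }
}

final class ToyEntityManagerDemo {
    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        ToyEntityManager em = new ToyEntityManager(db);

        Customer ada = new Customer(1L, "Ada");
        em.persist(ada);                          // queued, nothing executed yet
        System.out.println(db.sqlLog);            // []
        System.out.println(em.find(1L) == ada);   // true: served from the cache
        em.flush();                               // executes INSERT 1
        ada.name = "Ada L.";                      // change a managed entity
        em.flush();                               // dirty check finds it: UPDATE 1
        em.clear();                               // ada is now detached
        ada.name = "Ada Lovelace";                // no longer tracked
        em.flush();                               // nothing to do
        Customer managedCopy = em.merge(ada);     // SELECT 1, then state is copied over
        em.flush();                               // UPDATE 1
        System.out.println(managedCopy != ada);   // true: the original stays detached
        System.out.println(db.sqlLog);            // [INSERT 1, UPDATE 1, SELECT 1, UPDATE 1]
    }
}
```

The real API differs in many details (transactions, identifier generation, cascades), so treat the statement log as a way to reason about when SQL runs, not as a prediction of the exact statements a provider emits.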
Entity Lifecycle States
Every JPA entity exists in one of four states relative to an EntityManager:
- New (Transient): created with new, not associated with any EM. Changes are not tracked.
- Managed: associated with an EM and the persistence context. All changes are automatically detected and flushed.
- Detached: was managed, but the EM was closed or detach() was called. Changes are not tracked; use merge() to re-attach.
- Removed: scheduled for deletion; DELETE will fire on flush.
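The four states can be read as a small state machine. The sketch below is one hedged way to write it down in plain Java. EntityState, Operation, and Lifecycle are hypothetical names, not part of JPA, and the transition rules are a simplified reading of the specification's lifecycle rules. In particular, the exception thrown for persisting a detached instance is a stand-in: a real provider reports that case with its own persistence exception, possibly only at flush time.

```java
// The four lifecycle states named above.
enum EntityState { NEW, MANAGED, DETACHED, REMOVED }

// A subset of EntityManager operations. merge() is left out because it does not change
// the state of the instance passed in; it returns a separate managed copy instead.
enum Operation { PERSIST, REMOVE, DETACH }

final class Lifecycle {
    // Returns the state of an instance after the operation has been applied to it.
    static EntityState apply(EntityState state, Operation op) {
        switch (op) {
            case PERSIST:
                if (state == EntityState.DETACHED) {
                    // Simplification: stands in for the provider's own exception type.
                    throw new IllegalStateException("detached instance: use merge() instead");
                }
                // NEW and REMOVED become managed; MANAGED stays managed.
                return EntityState.MANAGED;
            case REMOVE:
                if (state == EntityState.DETACHED) {
                    throw new IllegalArgumentException("cannot remove a detached instance");
                }
                // Only a managed instance is scheduled for deletion; otherwise a no-op.
                return state == EntityState.MANAGED ? EntityState.REMOVED : state;
            case DETACH:
                // A new instance was never in the persistence context, so nothing changes.
                return state == EntityState.NEW ? EntityState.NEW : EntityState.DETACHED;
            default:
                throw new AssertionError("unknown operation " + op);
        }
    }
}

final class LifecycleDemo {
    public static void main(String[] args) {
        EntityState s = EntityState.NEW;
        s = Lifecycle.apply(s, Operation.PERSIST);
        System.out.println(s); // MANAGED
        s = Lifecycle.apply(s, Operation.DETACH);
        System.out.println(s); // DETACHED
        try {
            Lifecycle.apply(s, Operation.REMOVE);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This is why the remove(entity) rule says the entity must be managed: a detached instance has to be brought back into a persistence context (for example through find or merge) before it can be scheduled for deletion.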
Accessing an uninitialised lazy association on a detached entity throws LazyInitializationException. The fix is to load the data while the entity is still managed: either use a JOIN FETCH query, call Hibernate.initialize(), or adopt a DTO projection.
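The mechanics behind LazyInitializationException can be modelled without a database. The following is a toy sketch in plain Java, not Hibernate code: ToySession and LazyRef are hypothetical names, and an IllegalStateException stands in for the real exception. It shows why loading "while the entity is still managed" matters: a lazy value that was initialised before the session closed remains usable, while one that was never touched can no longer be loaded.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an open or closed persistence context.
final class ToySession {
    private boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

// Toy lazy reference: loads its value on first access, but only while the session is open.
final class LazyRef<T> {
    private final ToySession session;
    private final Supplier<T> loader;
    private T value;
    private boolean initialized;

    LazyRef(ToySession session, Supplier<T> loader) {
        this.session = session;
        this.loader = loader;
    }

    // Similar in spirit to Hibernate.initialize(): force the load right now.
    void initialize() { get(); }

    T get() {
        if (!initialized) {
            if (!session.isOpen()) {
                // Real Hibernate throws LazyInitializationException in this situation.
                throw new IllegalStateException("cannot load lazy value: session is closed");
            }
            value = loader.get();
            initialized = true;
        }
        return value;
    }
}

final class LazyRefDemo {
    public static void main(String[] args) {
        ToySession session = new ToySession();
        LazyRef<String> loadedInTime = new LazyRef<>(session, () -> "orders of customer 1");
        LazyRef<String> neverLoaded = new LazyRef<>(session, () -> "orders of customer 2");

        loadedInTime.initialize(); // load while the session is still open
        session.close();           // from here on, everything is "detached"

        System.out.println(loadedInTime.get()); // works: the value was already loaded
        try {
            neverLoaded.get();
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

A JOIN FETCH query and a DTO projection reach the same goal by a different route: both bring the needed data back in the original query, so nothing is left to load lazily after the persistence context is gone.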
The persistence.xml and Spring Boot Auto-configuration
Traditional JPA requires a META-INF/persistence.xml file to define the persistence unit. Spring Boot replaces this entirely with auto-configuration driven by application.properties.
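As a rough sketch, an application.properties for a local setup might look like the fragment below. The property keys are standard Spring Boot ones, but every value is a placeholder assumption: the PostgreSQL URL, database name, and credentials are invented for illustration, and the matching JDBC driver would have to be on the classpath.

```properties
# Placeholder connection settings: replace with your own database.
spring.datasource.url=jdbc:postgresql://localhost:5432/appdb
spring.datasource.username=app_user
spring.datasource.password=change-me

# Check the entity mappings against the existing schema at startup without altering tables.
spring.jpa.hibernate.ddl-auto=validate

# Log the generated SQL while learning; usually switched off in production.
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.format_sql=true
```

Anything under spring.jpa.properties.* is passed through to Hibernate as a native setting, which is how provider-specific options are configured without a persistence.xml.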
Spring Boot automatically scans for @Entity classes in the package of your @SpringBootApplication class and its sub-packages, so no explicit listing is required (use @EntityScan if your entities live elsewhere). The EntityManagerFactory bean is set up by HibernateJpaAutoConfiguration through a LocalContainerEntityManagerFactoryBean.
Summary
JPA defines the contract; Hibernate fulfils it. The EntityManager is your primary API: obtain one per transaction (Spring handles this), use persist/find/merge/remove to manage entity lifecycle, and let dirty checking handle UPDATEs automatically. In the next lesson you will write your first @Entity class and see exactly how Hibernate maps its fields to a database table.