Entity Relationships & Associations

Solving N+1: Join Fetch & Entity Graphs

18 min Lesson 8 of 13

Solving N+1: Join Fetch & Entity Graphs

In the previous lesson you saw how lazy loading triggers a separate SQL query for every item in a list — the N+1 select problem. This lesson covers the two main tools Hibernate and JPA give you to fix it: JPQL JOIN FETCH and Entity Graphs. Both approaches instruct the persistence provider to load the association in a single, efficient query rather than N additional ones. The choice between them is mainly about where you want to express that intent: in a query string or in a metadata annotation.

Approach 1 — JPQL JOIN FETCH

A standard JPQL join (JOIN o.items i) filters rows but does not initialise the items collection on the returned Order objects. Adding the FETCH keyword changes that: Hibernate performs an SQL INNER JOIN and uses the result to populate the collection in memory, all within one round-trip.

// Without fetch — lazy proxy, N+1 risk @Query("SELECT o FROM Order o JOIN o.items i WHERE i.productId = :pid") List<Order> findByProduct(@Param("pid") Long productId); // With FETCH — items collection populated in one query @Query("SELECT DISTINCT o FROM Order o JOIN FETCH o.items WHERE o.customer.id = :cid") List<Order> findWithItems(@Param("cid") Long customerId);

The DISTINCT keyword is important here. A SQL join duplicates the parent row for every child row. Without DISTINCT, Hibernate would return the same Order object multiple times in the list — once per OrderItem. DISTINCT tells Hibernate to deduplicate the result set in memory after hydrating the entities.

SQL DISTINCT vs JPQL DISTINCT: In Hibernate 6, DISTINCT in JPQL does not add DISTINCT to the SQL by default (it would prevent index use and force a sort). It only deduplicates in the Java object graph. If you want to suppress the SQL-level duplicate too, set the query hint HINT_PASS_DISTINCT_THROUGH to false.

Fetching Multiple Levels Deep

You can chain JOIN FETCH clauses to hydrate associations at multiple levels in a single query:

@Query(""" SELECT DISTINCT o FROM Order o JOIN FETCH o.customer c JOIN FETCH o.items i JOIN FETCH i.product p WHERE o.status = :status """) List<Order> findOrdersWithDetails(@Param("status") OrderStatus status);

This produces one SQL statement with three joins. Every Order in the result will have its customer, items, and each item's product fully initialised — no lazy proxies anywhere in the graph.

Do not JOIN FETCH two collections simultaneously. If Order had both items and vouchers as collections and you fetched both in a single query, Hibernate would raise a MultipleBagFetchException (when using List) or produce a Cartesian product that multiplies row counts. Fetch at most one collection per query; use a second query or an entity graph subgraph for the second collection.

Approach 2 — Entity Graphs

Entity graphs let you describe the fetch plan as metadata — either statically with annotations or dynamically at runtime — and then apply it to any query or find() call. The graph is decoupled from the JPQL string, which keeps queries reusable across different loading scenarios.

Static Named Entity Graph

import jakarta.persistence.*; @Entity @Table(name = "orders") @NamedEntityGraph( name = "Order.withItems", attributeNodes = { @NamedAttributeNode("items"), @NamedAttributeNode("customer") } ) public class Order { @Id @GeneratedValue(strategy = GenerationType.IDENTITY) private Long id; @ManyToOne(fetch = FetchType.LAZY) private Customer customer; @OneToMany(mappedBy = "order", fetch = FetchType.LAZY) private List<OrderItem> items = new ArrayList<>(); // getters / setters ... }

Apply the named graph in a Spring Data repository using the @EntityGraph annotation:

import org.springframework.data.jpa.repository.*; import org.springframework.data.repository.query.Param; public interface OrderRepository extends JpaRepository<Order, Long> { // Applies "Order.withItems" graph — loads items + customer in one query @EntityGraph("Order.withItems") List<Order> findByStatus(OrderStatus status); // Works with custom JPQL too @EntityGraph("Order.withItems") @Query("SELECT o FROM Order o WHERE o.customer.id = :cid") List<Order> findByCustomerId(@Param("cid") Long customerId); }

Dynamic Entity Graphs at Runtime

When you need to vary the fetch plan programmatically — for example, loading deeper subgraphs for an admin endpoint but a shallow graph for a public API — build the graph in code using the EntityManager:

import jakarta.persistence.*; import java.util.List; @Service @Transactional(readOnly = true) public class OrderService { @PersistenceContext private EntityManager em; public List<Order> findOrdersWithFullDetails(Long customerId) { EntityGraph<Order> graph = em.createEntityGraph(Order.class); graph.addAttributeNodes("customer"); Subgraph<OrderItem> itemsGraph = graph.addSubgraph("items"); itemsGraph.addAttributeNodes("product"); // fetch item.product too return em.createQuery( "SELECT DISTINCT o FROM Order o WHERE o.customer.id = :cid", Order.class) .setParameter("cid", customerId) .setHint("jakarta.persistence.fetchgraph", graph) .getResultList(); } }
fetchgraph vs loadgraph: Use the hint key "jakarta.persistence.fetchgraph" to make only the attributes in the graph EAGER (everything else stays LAZY). Use "jakarta.persistence.loadgraph" to load the graph attributes as EAGER in addition to any attributes that are already mapped EAGER. In practice, fetchgraph gives you the most explicit control.

Spring Data Shorthand — Inline Attribute Paths

For simple cases you do not need a @NamedEntityGraph on the entity at all. Spring Data JPA lets you specify attribute paths directly in the @EntityGraph annotation:

// No @NamedEntityGraph needed — attribute paths declared inline @EntityGraph(attributePaths = {"items", "items.product", "customer"}) List<Order> findAll();

This is the most concise form and works well when the fetch plan is tied to a single repository method. Prefer named graphs when the same plan is reused across multiple methods or repositories.

JOIN FETCH vs Entity Graphs — Trade-offs

  • JOIN FETCH is explicit and visible directly in the query. It is the right choice when the association is always required by that query, and when you want to control filtering (e.g., WHERE on a join attribute).
  • Entity Graphs keep queries clean and reusable. The same findByStatus method can be called with or without the graph just by adding or removing the annotation, without rewriting JPQL.
  • Both produce an SQL JOIN. The generated SQL is effectively equivalent; the difference is organisational.
Verify with SQL logging. Always enable spring.jpa.show-sql=true (or a proper log appender for org.hibernate.SQL) during development and confirm you see exactly one query, not N+1. Better yet, use p6spy or Hypersistence Optimizer to assert query counts in tests.

Pagination Pitfall with JOIN FETCH

Mixing JOIN FETCH with Spring Data pagination (Pageable) on a collection association is dangerous. Hibernate cannot push the pagination into SQL (it cannot know how many root entities a limited row count corresponds to) so it fetches all rows into memory and paginates there. You will see the Hibernate warning: "HHH90003004: firstResult/maxResults specified with collection fetch; applying in memory!"

The recommended pattern is the two-query approach: a first paged query fetches only the root entity IDs, and a second query uses those IDs with JOIN FETCH or an entity graph to load the full data:

// Step 1 — paged ID query (no fetch join, efficient SQL LIMIT/OFFSET) @Query( value = "SELECT o.id FROM Order o WHERE o.status = :status", countQuery = "SELECT COUNT(o) FROM Order o WHERE o.status = :status" ) Page<Long> findIdsByStatus(@Param("status") OrderStatus status, Pageable pageable); // Step 2 — fetch full data for those IDs (no Pageable, safe to JOIN FETCH) @Query("SELECT DISTINCT o FROM Order o JOIN FETCH o.items WHERE o.id IN :ids") List<Order> findWithItemsByIds(@Param("ids") List<Long> ids);

Summary

JOIN FETCH and entity graphs are complementary tools that solve the same root cause: unnecessary round-trips caused by lazy loading inside a loop. Use JOIN FETCH when the query logic and the fetch plan are inseparable. Use entity graphs when you want to keep queries generic and vary the fetch plan at the call site. Always confirm your fix with SQL logging, and watch for the pagination pitfall when fetching collections. With these techniques you can eliminate virtually all N+1 problems in a Hibernate-backed Spring Boot application.