Profiling Tools
Benchmarking tells you how fast your code runs. Profiling tells you why it runs that way. A profiler attaches to a running JVM and continuously samples or instruments it, giving you a live breakdown of CPU time, heap allocations, thread states, garbage collection, and lock contention — exactly the data you need to fix a problem you have already measured.
In this lesson we cover the three essential profiling tools for production-grade Java work: Java Flight Recorder (JFR), VisualVM, and the art of reading a heap dump.
Java Flight Recorder
JFR is a production-safe, low-overhead profiler built into the JVM itself. It was open-sourced as part of OpenJDK in Java 11 and ships with every JDK since then — no extra download, no agent, no licence fee. The overhead is typically under 1 % for most workloads, which makes it safe to run continuously in production.
JFR works by recording events. Every JVM subsystem (GC, JIT, class loading, socket I/O, thread locks, file I/O, and more) fires events with timestamps and metadata. JFR stores those events in a highly efficient binary format in a ring buffer and flushes them to a .jfr file on request. You then open that file offline in JDK Mission Control (JMC) to analyse it.
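To make the event model concrete, here is a minimal sketch of an application-defined event written through the jdk.jfr API and read back from the resulting file. The event name demo.OrderProcessed, its orderId field, and the class names are hypothetical and exist only for this illustration; JVM subsystems emit their own built-in events in the same spirit.

```java
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class JfrEventSketch {

    // Hypothetical application event; name and field are made up for this example.
    @Name("demo.OrderProcessed")
    @Label("Order Processed")
    static class OrderProcessed extends Event {
        @Label("Order ID")
        long orderId;
    }

    // Commits `count` events during a recording, dumps the recording to a
    // temporary .jfr file, and returns how many of those events can be read back.
    static long recordAndCount(int count) {
        try (Recording recording = new Recording()) {
            recording.enable(OrderProcessed.class);
            recording.start();
            for (int i = 0; i < count; i++) {
                OrderProcessed event = new OrderProcessed();
                event.begin();       // timestamp the start of the event
                event.orderId = i;   // attach metadata
                event.commit();      // end the event and hand it to JFR
            }
            recording.stop();

            Path file = Files.createTempFile("events-", ".jfr");
            recording.dump(file);
            long seen = 0;
            for (RecordedEvent recorded : RecordingFile.readAllEvents(file)) {
                if (recorded.getEventType().getName().equals("demo.OrderProcessed")) {
                    seen++;
                }
            }
            Files.delete(file);
            return seen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("events read back: " + recordAndCount(3));
    }
}
```

Custom events like this show up in JMC next to the built-in ones, which is one way to line up application-level activity with GC or lock behaviour in the same recording.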
Starting a recording
There are three ways to start a JFR recording.
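1. At JVM startup, with the -XX:StartFlightRecording option on the java command line (best for catching problems from the very beginning).
2. On demand via jcmd, using its JFR.start command against the process ID, while the app is running (no restart needed).
3. Programmatically inside the application, through the jdk.jfr API.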
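The third option could look roughly like the sketch below. It assumes the predefined settings names default and profile that ship with the JDK; the class name, the simulated workload, and the temporary file prefix are placeholders for this example, not part of any real application.

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;

class JfrStartSketch {

    // Records for roughly `millis` milliseconds using a predefined settings
    // file ("default" or "profile") and returns the path of the .jfr file.
    static Path record(String settings, long millis) {
        try (Recording recording = new Recording(Configuration.getConfiguration(settings))) {
            recording.setName("sketch-" + settings);
            recording.start();
            simulateWork(millis);        // stand-in for the real application workload
            recording.stop();
            Path file = Files.createTempFile("sketch-", ".jfr");
            recording.dump(file);        // write the collected events to disk
            return file;
        } catch (IOException | ParseException e) {
            throw new IllegalStateException("could not create the recording", e);
        }
    }

    // Allocates small arrays until the deadline so the recording has some activity.
    private static long simulateWork(long millis) {
        long deadline = System.nanoTime() + millis * 1_000_000L;
        long total = 0;
        while (System.nanoTime() < deadline) {
            total += new byte[4096].length;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("recording written to " + record("default", 500));
    }
}
```

The resulting file opens in JDK Mission Control exactly like one produced by the startup flag or by jcmd.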
The default profile is designed to have near-zero overhead; it records GC events, a few I/O events, and coarse-grained samples. The profile setting raises the level of detail: CPU sampling at 10 ms intervals, more detailed object allocation profiling, and lock profiling. That is still very low overhead, but not zero. Use default in production continuously; switch to profile for targeted investigations.
Reading a JFR file in JDK Mission Control
Download JMC from jdk.java.net/jmc (it is a separate download from the JDK itself). Open a .jfr file and explore:
- Automated Analysis — JMC scans the recording and flags anomalies (high GC pause, lock contention hotspot, suspicious allocations). Start here.
- Method Profiling — a flame graph / call tree of sampled CPU time. The hottest frames are where the CPU time goes, so they are the first place to look for a bottleneck.
- Memory — allocation profiling shows which call sites allocate the most, and heap live-set over time.
- Threads — thread-state timeline: green = running, yellow = waiting on lock, purple = sleeping. Wide yellow bands mean contention.
- Garbage Collections — every GC pause, its type, duration, heap before and after, and the trigger.
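JMC is the usual way to read a recording, but the same data can also be pulled out in code with the jdk.jfr.consumer API, which is handy for scripted checks. The sketch below lists the garbage collections in a file; it assumes the built-in jdk.GarbageCollection event with its name, cause, and sumOfPauses fields, and the helper sampleRecording() exists only to give the example something to read.

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;

class JfrGcSummary {

    // Returns one line per jdk.GarbageCollection event found in the recording.
    static List<String> gcLines(Path jfrFile) {
        List<String> lines = new ArrayList<>();
        try (RecordingFile file = new RecordingFile(jfrFile)) {
            while (file.hasMoreEvents()) {
                RecordedEvent event = file.readEvent();
                if (!"jdk.GarbageCollection".equals(event.getEventType().getName())) {
                    continue;
                }
                Object collector = event.getValue("name");
                Object cause = event.getValue("cause");
                long pauseMicros = event.getDuration("sumOfPauses").toNanos() / 1_000;
                lines.add(collector + " | cause=" + cause + " | sumOfPauses=" + pauseMicros + " us");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Produces a small recording that contains at least one requested collection.
    static Path sampleRecording() {
        try (Recording recording = new Recording(Configuration.getConfiguration("default"))) {
            recording.start();
            System.gc();   // request a collection so the file has something to show
            recording.stop();
            Path file = Files.createTempFile("gc-sample-", ".jfr");
            recording.dump(file);
            return file;
        } catch (IOException | ParseException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        gcLines(sampleRecording()).forEach(System.out::println);
    }
}
```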
Run a continuous recording (maxsize=250m, no duration) and dump it on demand when an incident occurs: you get the last few minutes of profiling data right up to the problem without having planned ahead. This is far more useful than attaching a profiler after the fact.
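The same always-on pattern can be sketched from inside the application: start a recording with a size cap and no duration, then dump a snapshot when something goes wrong. The class name BlackBoxRecorder and the incident- file prefix are assumptions made for this example; the 250 MB cap mirrors the maxsize value mentioned above.

```java
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.text.ParseException;

class BlackBoxRecorder implements AutoCloseable {

    private final Recording recording;

    private BlackBoxRecorder(Recording recording) {
        this.recording = recording;
    }

    // Starts a recording with no duration, capped at roughly 250 MB of disk data.
    static BlackBoxRecorder start() {
        try {
            Recording r = new Recording(Configuration.getConfiguration("default"));
            r.setName("black-box");
            r.setToDisk(true);                  // keep data in the disk repository
            r.setMaxSize(250L * 1024 * 1024);   // oldest data is discarded beyond this
            r.start();
            return new BlackBoxRecorder(r);
        } catch (IOException | ParseException e) {
            throw new IllegalStateException("could not start the recording", e);
        }
    }

    // Writes what has been recorded so far to a new file; the recording keeps running.
    Path dumpTo(Path directory) {
        Path target = directory.resolve("incident-" + System.currentTimeMillis() + ".jfr");
        try {
            recording.dump(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    @Override
    public void close() {
        recording.close();
    }

    public static void main(String[] args) {
        try (BlackBoxRecorder recorder = start()) {
            Path dumped = recorder.dumpTo(Path.of(System.getProperty("java.io.tmpdir")));
            System.out.println("snapshot written to " + dumped);
        }
    }
}
```

In a real service, dumpTo would typically be wired to an alert handler or an admin endpoint rather than called from main.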
VisualVM
VisualVM is a free, GUI-based profiler that connects to a local or remote JVM over JMX. It is installed separately from the JDK (visualvm.github.io) and is the most accessible starting point for developers who want a visual, real-time view of a running process.
What VisualVM gives you at a glance:
- Overview — JVM flags, system properties, uptime, PID.
- Monitor — live CPU %, heap used/committed, thread count, class count. Instantly shows whether you have a heap growth trend or a CPU spike.
- Threads — live thread timeline, thread dump on demand. Deadlocks are detected and highlighted.
- Sampler — low-overhead CPU and memory sampling. CPU sampling shows a hot method table; memory sampling shows allocation by class. Use the Sampler for quick investigations without stopping the world.
- Profiler — instrumentation mode (every method entry/exit is counted). More accurate but adds overhead — avoid on production traffic.
- Heap Dump — trigger or import a heap dump and browse object counts, retained sizes, and reference chains.
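Much of what the Monitor and Threads tabs display is also available to any Java program through the standard platform MXBeans in java.lang.management. The sketch below reads a few of those values from the current JVM; it is only an illustration of the kind of data involved, not a description of how VisualVM is implemented, and the class and method names are made up for this example.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

class JvmMonitorSnapshot {

    // Bytes of heap currently in use.
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Number of live threads, daemon and non-daemon.
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    // Number of threads currently stuck in a deadlock (0 when there is none).
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();   // null when no deadlock exists
        return ids == null ? 0 : ids.length;
    }

    // One-line summary similar in spirit to a monitoring dashboard row.
    static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        return String.format("heap used=%d MiB, committed=%d MiB, threads=%d, classes=%d, deadlocked=%d",
                heap.getUsed() >> 20, heap.getCommitted() >> 20,
                liveThreads(), classes.getLoadedClassCount(), deadlockedThreadCount());
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```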
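To connect to a remote process, start the JVM with remote JMX enabled through the com.sun.management.jmxremote system properties, listening on a known port (9010 in this example).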
In VisualVM, choose File → Add JMX Connection and enter host:9010.
To avoid exposing the JMX port directly, forward it through an SSH tunnel: ssh -L 9010:localhost:9010 user@host, then connect to localhost:9010 in VisualVM.
Reading a Heap Dump
A heap dump is a snapshot of all live objects in the JVM heap at a single point in time. It is the definitive tool for diagnosing memory leaks: everything on the heap is visible, along with reference chains that explain why objects are alive.
Capturing a heap dump
Enable -XX:+HeapDumpOnOutOfMemoryError in production. If the app ever OOMs you get the dump automatically, which is often the only chance to know what was consuming the heap at that exact moment.
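From the command line, a dump is commonly taken with jcmd <pid> GC.heap_dump <file> or jmap -dump:live,format=b,file=<file> <pid>. It can also be triggered from inside the process, as in the sketch below, which assumes a HotSpot-based JVM (it relies on com.sun.management.HotSpotDiagnosticMXBean). The class name and the heap- file prefix are placeholders for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

class HeapDumpSketch {

    // Writes a heap dump of the current JVM into `directory` and returns its path.
    // The target file must not exist yet and needs the .hprof extension.
    static Path dumpLiveHeap(Path directory) {
        Path target = directory.resolve("heap-" + System.currentTimeMillis() + ".hprof");
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            // true = dump only live (reachable) objects, which keeps the file smaller
            diagnostics.dumpHeap(target.toString(), true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        Path dump = dumpLiveHeap(Path.of(System.getProperty("java.io.tmpdir")));
        System.out.println("heap dump written to " + dump);
    }
}
```

Taking a dump pauses the application while the heap is written, so on a production system it is better reserved for incidents than run on a schedule.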
Analysing the dump
Open the .hprof file in Eclipse Memory Analyzer Tool (MAT) or VisualVM's Heap Dump viewer. The key concepts to understand:
- Shallow size — the memory used by the object itself (its fields). Useful for understanding object layout.
- Retained size — the memory that would be freed if this object were garbage collected, i.e., the object plus everything exclusively reachable through it. This is what matters for leak analysis.
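- Dominator tree — object X dominates object Y when every path from the GC roots to Y passes through X. In the dominator tree each object's parent is its immediate dominator, so an object's retained size is the total shallow size of its subtree. The top of the dominator tree shows the biggest memory consumers; these are usually where leaks originate.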
- GC roots — the starting points from which the GC traces reachability: static fields, thread stacks, JNI references. An object is alive because a chain of references leads to it from a GC root. The leak is the link that should have been cleared.
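The difference between shallow and retained size is easiest to see on a toy object graph. The sketch below is not how MAT works internally; it simply computes retained size from the definition above: whatever stops being reachable from the GC roots once the object is removed. All object names and byte sizes are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RetainedSizeSketch {

    private final Map<String, Long> shallow = new HashMap<>();
    private final Map<String, List<String>> refs = new HashMap<>();
    private final Set<String> roots = new LinkedHashSet<>();

    // Declares an object with its shallow size and the objects it references.
    RetainedSizeSketch object(String name, long shallowBytes, String... references) {
        shallow.put(name, shallowBytes);
        refs.put(name, List.of(references));
        return this;
    }

    // Marks an object as directly referenced from a GC root (e.g. a static field).
    RetainedSizeSketch root(String name) {
        roots.add(name);
        return this;
    }

    // All objects reachable from the roots, pretending `removed` does not exist.
    private Set<String> reachable(String removed) {
        Set<String> seen = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(removed) || !seen.add(current)) {
                continue;
            }
            pending.addAll(refs.getOrDefault(current, List.of()));
        }
        return seen;
    }

    // Retained size = shallow sizes of everything that becomes unreachable
    // when `name` is removed from the graph (including `name` itself).
    long retainedSize(String name) {
        Set<String> freed = reachable(null);
        freed.removeAll(reachable(name));
        return freed.stream().mapToLong(shallow::get).sum();
    }

    static RetainedSizeSketch example() {
        return new RetainedSizeSketch()
                .object("cache", 64, "entryA", "entryB")
                .object("entryA", 32)
                .object("entryB", 32, "shared")
                .object("shared", 100)
                .object("session", 16, "shared")
                .root("cache")
                .root("session");
    }

    public static void main(String[] args) {
        RetainedSizeSketch heap = example();
        System.out.println("cache   retained=" + heap.retainedSize("cache"));    // 128
        System.out.println("entryB  retained=" + heap.retainedSize("entryB"));   // 32
        System.out.println("shared  retained=" + heap.retainedSize("shared"));   // 100
        System.out.println("session retained=" + heap.retainedSize("session"));  // 16
    }
}
```

Here cache has a shallow size of 64 bytes but retains 128, because it dominates both entries. It does not retain shared, which stays reachable through session, so in the dominator tree shared sits directly under the root rather than under cache.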
In MAT, run the Leak Suspects Report first. It auto-detects accumulator objects (collections or caches that hold tens of thousands of entries) and traces them back to the GC root. You then look at the reference chain to understand which component holds the reference and why it was never released.
Heap dumps are large: the uncompressed .hprof file is roughly the size of the heap in use. MAT needs roughly 1 to 1.5 times that as its own Java heap to analyse it. Use the 64-bit MAT binary and give it at least -Xmx6g if analysing a large dump. VisualVM's built-in heap viewer works better for smaller dumps (under 1 GB).
Choosing the Right Tool
- JFR + JMC — always-on production profiling, accurate timing, minimal overhead, rich event ecosystem. First choice for production investigations.
- VisualVM — fast, visual, great for local development and quick checks on staging. Easy to share findings as screenshots.
- Heap dump + MAT/VisualVM — diagnosing memory leaks and understanding live object graphs. Not a real-time tool; used reactively.
A mature performance workflow combines all three: run JFR continuously, use VisualVM for live exploration during development, and have a heap dump ready to capture automatically on OOM.