GIST NOTES 14 - Java Garbage Collection
[DISCLAIMER: This is solely for non-commercial use. I don't claim ownership of this content. This is the crux of all my readings, studies, and analysis. Some of it is excerpted from famous books on the subject. Some of it is my own reflection on experiments with hand-written code samples, using an IDE or Notepad.
I've created this mainly to reduce an entire book to a few pages of critical content that we should never forget. Even after years, you don't need to read the entire book again to get back its philosophy. I hope these notes will help you replay the entire book in your mind once again.]
>a technique new in the J2SE 5.0 release that combines (1) automatic selection of garbage
collector, heap sizes, and HotSpot JVM (client or server) based on the platform and operating system on which the application is running, and (2) dynamic garbage collection tuning based on user-specified desired behavior. This technique is referred to as ergonomics.
A number of choices must be made when designing or selecting a garbage collection algorithm:
• Serial versus Parallel
With serial collection, only one thing happens at a time. For example, even when multiple CPUs are
available, only one is utilized to perform the collection. When parallel collection is used, the task of
garbage collection is split into parts and those subparts are executed simultaneously, on different
CPUs. The simultaneous operation enables the collection to be done more quickly, at the expense of
some additional complexity and potential fragmentation.
• Concurrent versus Stop-the-world
When stop-the-world garbage collection is performed, execution of the application is completely
suspended during the collection. Alternatively, one or more garbage collection tasks can be executed
concurrently, that is, simultaneously, with the application. Typically, a concurrent garbage collector
does most of its work concurrently, but may also occasionally have to do a few short stop-the-world
pauses. Stop-the-world garbage collection is simpler than concurrent collection, since the heap is
frozen and objects are not changing during the collection. Its disadvantage is that it may be
undesirable for some applications to be paused. Correspondingly, the pause times are shorter when
garbage collection is done concurrently, but the collector must take extra care, as it is operating over
objects that might be updated at the same time by the application. This adds some overhead to
concurrent collectors that affects performance and requires a larger heap size.
• Compacting versus Non-compacting versus Copying
After a garbage collector has determined which objects in memory are live and which are garbage, it
can compact the memory, moving all the live objects together and completely reclaiming the
remaining memory. After compaction, it is easy and fast to allocate a new object at the first free
location. A simple pointer can be utilized to keep track of the next location available for object
allocation. In contrast with a compacting collector, a non-compacting collector releases the space
utilized by garbage objects in-place, i.e., it does not move all live objects to create a large reclaimed
region in the same way a compacting collector does. The benefit is faster completion of garbage
collection, but the drawback is potential fragmentation. In general, it is more expensive to allocate
from a heap with in-place deallocation than from a compacted heap. It may be necessary to search the
heap for a contiguous area of memory sufficiently large to accommodate the new object. A third
alternative is a copying collector, which copies (or evacuates) live objects to a different memory area.
The benefit is that the source area can then be considered empty and available for fast and easy
subsequent allocations, but the drawback is the additional time required for copying and the extra
space that may be required.
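The allocation trade-off between a compacted heap and a heap with in-place deallocation can be sketched as a toy simulation. This is only an illustration under stated assumptions, not HotSpot code: the class names AllocationDemo, BumpAllocator, and FreeListAllocator are hypothetical, and "addresses" are plain integer offsets into an imaginary heap.

```java
import java.util.ArrayList;
import java.util.List;

final class AllocationDemo {

    /** Compacted heap (toy model): a single pointer marks the next free offset. */
    static final class BumpAllocator {
        private final int capacity;
        private int top = 0; // next free offset

        BumpAllocator(int capacity) { this.capacity = capacity; }

        /** Returns the start offset of the new object, or -1 if it does not fit. */
        int allocate(int size) {
            if (size <= 0 || top + size > capacity) return -1;
            int start = top;
            top += size; // "bump" the pointer past the new object
            return start;
        }
    }

    /** Non-compacted heap (toy model): free space is a list of holes, searched first-fit. */
    static final class FreeListAllocator {
        private final List<int[]> holes = new ArrayList<>(); // each hole is {start, length}

        void addHole(int start, int length) { holes.add(new int[] {start, length}); }

        int totalFree() {
            int sum = 0;
            for (int[] h : holes) sum += h[1];
            return sum;
        }

        /** Searches for the first hole that is large enough; returns -1 if none is. */
        int allocate(int size) {
            if (size <= 0) return -1;
            for (int i = 0; i < holes.size(); i++) {
                int[] h = holes.get(i);
                if (h[1] >= size) {
                    int start = h[0];
                    h[0] += size;
                    h[1] -= size;
                    if (h[1] == 0) holes.remove(i);
                    return start;
                }
            }
            return -1; // enough total space may exist, but no single hole fits
        }
    }

    public static void main(String[] args) {
        FreeListAllocator fragmented = new FreeListAllocator();
        fragmented.addHole(0, 30);
        fragmented.addHole(50, 30);
        System.out.println("free=" + fragmented.totalFree()
                + " allocate(40)=" + fragmented.allocate(40));

        BumpAllocator compacted = new BumpAllocator(100);
        System.out.println("allocate(40)=" + compacted.allocate(40));
    }
}
```

In the fragmented heap, 60 units are free in total, yet a request for 40 fails because no single hole is large enough. The compacted heap satisfies the same request with one addition.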
Several metrics are utilized to evaluate garbage collector performance, including:
• Throughput: the percentage of total time not spent in garbage collection, considered over long
periods of time.
• Garbage collection overhead: the complement of throughput, that is, the percentage of total time spent in
garbage collection.
• Pause time: the length of time during which application execution is stopped while garbage
collection is occurring.
• Frequency of collection: how often collection occurs, relative to application execution.
• Footprint: a measure of size, such as heap size.
• Promptness: the time between when an object becomes garbage and when the memory becomes
available.
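Throughput and garbage collection overhead are simple ratios, so they can be worked out from two numbers: time spent in collection and total elapsed time. The sketch below is a rough approximation, not a precise measurement. It reads the accumulated collection time from the standard java.lang.management beans; the class name GcOverhead and its methods are hypothetical, and the reported time is approximate (for concurrent collectors it may include time that did not pause the application).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcOverhead {

    /** Percentage of elapsed time spent in GC; 50 ms out of 1000 ms gives 5.0. */
    static double overheadPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) return 0.0;
        return 100.0 * gcMillis / elapsedMillis;
    }

    /** Throughput is whatever is left after GC overhead. */
    static double throughputPercent(long gcMillis, long elapsedMillis) {
        return 100.0 - overheadPercent(gcMillis, elapsedMillis);
    }

    /** Sums the approximate accumulated collection time reported by every collector. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) total += t;
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("GC overhead: %.2f%%, throughput: %.2f%%%n",
                overheadPercent(gc, uptime), throughputPercent(gc, uptime));
    }
}
```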
Memory in the Java HotSpot virtual machine is organized into three generations: a young generation, an old
generation, and a permanent generation. Most objects are initially allocated in the young generation. The old
generation contains objects that have survived some number of young generation collections, as well as some
large objects that may be allocated directly in the old generation. The permanent generation holds objects that the JVM finds convenient to have the garbage collector manage, such as objects describing classes and methods, as well as the classes and methods themselves.
The young generation consists of an area called Eden plus two smaller survivor spaces, as shown in Figure 2. Most objects are initially allocated in Eden. (As mentioned, a few large objects may be allocated directly in the old generation.) The survivor spaces hold objects that have survived at least one young generation collection and have thus been given additional chances to die before being considered “old enough” to be promoted to the old generation. At any given time, one of the survivor spaces (labeled From in the figure) holds such objects, while the other is empty and remains unused until the next collection.
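One way to see these areas on a running JVM is to list its memory pools through the standard java.lang.management API. This is a small sketch; the class name MemoryPools is hypothetical. Pool names are chosen by the JVM and vary with the JVM version and the selected collector, so the pools printed may not map one-to-one onto the three generations described here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

final class MemoryPools {

    /** One line per memory pool: type (HEAP or NON_HEAP), pool name, and current usage. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            String used = (usage == null) ? "n/a" : (usage.getUsed() / 1024) + " KB";
            lines.add(pool.getType().name() + " | " + pool.getName() + " | used " + used);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```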
>young generation collection is known as minor collection
>young generation fills up - minor collection(collection on young generation alone) is triggered
>old generation fills up - major collection(collection on all generations) is triggered
>old generation collection algorithm is used on both old and permanent generations (compaction occurs separately for these two generations)
>young generation collection algorithm is used always on young generation except when old generation doesn't have space to accommodate new promoted objects from young gen. In this case, old generation algorithm(not CMS) is used on the entire heap
>CMS - Concurrent Mark Sweep Collector
>For multithreaded applications, allocation operations need to be multithread-safe. If global locks were used to ensure this, then allocation into a generation would become a bottleneck and degrade performance. Instead, the HotSpot JVM has adopted a technique called Thread-Local Allocation Buffers (TLABs). This improves multithreaded allocation throughput by giving each thread its own buffer (i.e., a small portion of the
generation) from which to allocate. Since only one thread can be allocating into each TLAB, allocation can take place quickly by utilizing the bump-the-pointer technique, without requiring any locking. Only infrequently, when a thread fills up its TLAB and needs to get a new one, must synchronization be utilized. Several techniques to minimize space wastage due to the use of TLABs are employed. For example, TLABs are sized by the allocator to waste less than 1% of Eden, on average. The combination of the use of TLABs and linear allocations using the bump-the-pointer technique enables each allocation to be efficient, only requiring around 10 native instructions.
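The TLAB idea can be sketched as a toy model: each thread bumps a pointer inside its own buffer without locking, and only takes a lock when it needs a fresh buffer. This is not how HotSpot implements TLABs; the names TlabDemo, Eden, and Tlab, the 1024-unit buffer size, and the 16-unit object size are all assumptions made for the illustration.

```java
final class TlabDemo {

    /** Shared young-generation space (toy model); handing out a buffer is the only locked step. */
    static final class Eden {
        private final int capacity;
        private int top = 0;
        private int refills = 0;

        Eden(int capacity) { this.capacity = capacity; }

        /** Returns the start offset of a fresh buffer, or -1 if Eden is exhausted. */
        synchronized int carve(int size) {
            if (top + size > capacity) return -1;
            int start = top;
            top += size;
            refills++;
            return start;
        }

        synchronized int refills() { return refills; }
    }

    /** Buffer owned by exactly one thread, so the fast path needs no synchronization. */
    static final class Tlab {
        private final Eden eden;
        private final int tlabSize;
        private int top = 0;
        private int end = 0; // empty until the first refill

        Tlab(Eden eden, int tlabSize) { this.eden = eden; this.tlabSize = tlabSize; }

        int allocate(int size) {
            if (size <= 0 || size > tlabSize) return -1; // toy model: large objects not handled
            if (top + size > end) {                      // slow path: buffer is full
                int start = eden.carve(tlabSize);
                if (start < 0) return -1;
                top = start;
                end = start + tlabSize;
            }
            int address = top;                           // fast path: bump the pointer
            top += size;
            return address;
        }
    }

    /** Runs the given number of threads and returns how many locked refills occurred. */
    static int run(int threads, int allocationsPerThread) {
        Eden eden = new Eden(1 << 20);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Tlab tlab = new Tlab(eden, 1024);
                for (int n = 0; n < allocationsPerThread; n++) {
                    tlab.allocate(16);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return eden.refills();
    }

    public static void main(String[] args) {
        System.out.println("locked refills for 2560 allocations: " + run(4, 640));
    }
}
```

With these numbers each buffer holds 64 objects, so 4 threads making 640 allocations each take the lock only 40 times for 2560 allocations.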
>tenured - promoted to old gen
These days, many Java applications run on machines with a lot of physical memory and multiple CPUs. The
parallel collector, also known as the throughput collector, was developed in order to take advantage of available CPUs rather than leaving most of them idle while only one does garbage collection work.
>Serial collectors always use stop-the-world method
>Parallel means gc runs on multiple CPUs
>Concurrent means GC runs along with user application
>Parallel collectors need not be concurrent, they can settle for stop-the-world method
>Young Generation Collection Using the Parallel Collector
The parallel collector uses a parallel version of the young generation collection algorithm utilized by the
serial collector. It is still a stop-the-world and copying collector, but performing the young generation collection in parallel, using many CPUs, decreases garbage collection overhead and hence increases
application throughput. The main difference between the two is that pause times are shorter with the parallel collector.
>young generation collector is called copying collector because, live objects are moved from 'Eden' to 'Survivor' and 'From Survivor' to 'To Survivor' spaces
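The movement of live objects between Eden, the two survivor spaces, and the old generation can be sketched as a toy simulation. This is an illustration under simplifying assumptions, not the real algorithm: liveness is a flag set by hand (a real collector traces reachability), survivor space overflow is ignored, and the names MinorCollectionDemo, Obj, and tenuringThreshold are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MinorCollectionDemo {

    static final class Obj {
        final String name;
        boolean live = true; // set by hand in this toy model
        int age = 0;         // number of minor collections survived

        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    MinorCollectionDemo(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live objects out of Eden and From, then swaps the survivor roles. */
    void minorCollect() {
        evacuate(eden);
        evacuate(from);
        eden.clear(); // everything left behind is garbage
        from.clear();
        List<Obj> tmp = from; // the filled To space becomes the next From space
        from = to;
        to = tmp;
    }

    private void evacuate(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) continue;
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o); // old enough: promote (tenure)
            } else {
                to.add(o);  // give it another chance to die young
            }
        }
    }

    public static void main(String[] args) {
        MinorCollectionDemo heap = new MinorCollectionDemo(2);
        heap.allocate("a");
        heap.allocate("b").live = false;
        heap.minorCollect();
        System.out.println("survivors=" + heap.from.size() + " old=" + heap.old.size());
        heap.minorCollect();
        System.out.println("survivors=" + heap.from.size() + " old=" + heap.old.size());
    }
}
```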
>Old Generation Collection Using the Parallel Collector
Old generation garbage collection for the parallel collector is done using the same serial mark-sweep-compact collection algorithm as the serial collector.
When to Use the Parallel Collector
Applications that can benefit from the parallel collector are those that run on machines with more than
one CPU and do not have pause time constraints, since infrequent, but potentially long, old generation
collections will still occur. Examples of applications for which the parallel collector is often appropriate
include those that do batch processing, billing, payroll, scientific computing, and so on.
You may want to consider choosing the parallel compacting collector (described next) over the parallel
collector, since the former performs parallel collections of all generations, not just the young generation.
>Parallel Collector Selection
In the J2SE 5.0 release, the parallel collector is automatically chosen as the default garbage collector on
server-class machines (defined in Section 5). On other machines, the parallel collector can be explicitly
requested by using the -XX:+UseParallelGC command line option.
>Parallel Compacting Collector Selection
If you want the parallel compacting collector to be used, you must select it by specifying the
command line option -XX:+UseParallelOldGC.
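To check which collector a JVM actually ended up with after passing such an option, the standard java.lang.management API can report the active collectors and the -XX options the JVM was started with. This is a small sketch; the class name ActiveCollectors is hypothetical, and the collector names returned are chosen by the JVM and differ between versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {

    /** Names of the collectors the running JVM reports, typically one per generation. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** The -XX options passed on the command line, for example -XX:+UseParallelGC. */
    static List<String> gcFlags() {
        List<String> flags = new ArrayList<>();
        for (String arg : ManagementFactory.getRuntimeMXBean().getInputArguments()) {
            if (arg.startsWith("-XX:")) flags.add(arg);
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println("collectors: " + collectorNames());
        System.out.println("-XX flags:  " + gcFlags());
    }
}
```

Running it once with no options and once with -XX:+UseParallelGC shows whether the explicit request changed the collectors in use.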
>Concurrent Mark-Sweep (CMS) Collector
For many applications, end-to-end throughput is not as important as fast response time. Young generation
collections do not typically cause long pauses. However, old generation collections, though infrequent, can
impose long pauses, especially when large heaps are involved. To address this issue, the HotSpot JVM includes a collector called the concurrent mark-sweep (CMS) collector, also known as the low-latency collector.
>Serial-Copying to Parallel to Parallel-Compacting to Concurrent-Mark-Sweep; that is all about the flavors of gc collectors
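>If your application or environmental characteristics are such that a different collector than the default is
warranted, explicitly request that collector via one of the following command line options: -XX:+UseSerialGC (serial), -XX:+UseParallelGC (parallel), -XX:+UseParallelOldGC (parallel compacting), -XX:+UseConcMarkSweepGC (CMS).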
>The size of the heap will oscillate as the garbage collector tries to satisfy competing goals, even if the
application has reached a steady state. The pressure to achieve a throughput goal (which may require a larger heap) competes with the goals for a maximum pause time and a minimum footprint (which both may require a smaller heap).
>Refer whitepaper for CMS
>7 Tools to Evaluate Garbage Collection Performance
Various diagnostic and monitoring tools can be utilized to evaluate garbage collection performance. This section provides a brief overview of some of them. For more information, see the “Tools and Troubleshooting” links in Section 9.
-XX:+PrintGCDetails Command Line Option
One of the easiest ways to get initial information about garbage collections is to specify the command line
option -XX:+PrintGCDetails. For every collection, this results in the output of information such as the
size of live objects before and after garbage collection for the various generations, the total available space for each generation, and the length of time the collection took.
-XX:+PrintGCTimeStamps Command Line Option
This outputs a timestamp at the start of each collection, in addition to the information that is output if the
command line option -XX:+PrintGCDetails is used. The timestamps can help you correlate garbage
collection logs with other logged events.
jmap is a command line utility included in the Solaris™ Operating Environment and Linux (but not Windows)
releases of the Java Development Kit (JDK™). It prints memory-related statistics for a running JVM or core file. If it is used without any command line options, then it prints the list of shared objects loaded, similar to what the Solaris pmap utility outputs. For more specific information, the -heap, -histo, or -permstat options can be used.
The -heap option is used to obtain information that includes the name of the garbage collector,
algorithm-specific details (such as the number of threads being used for parallel garbage collection), heap
configuration information, and a heap usage summary.
The -histo option can be used to obtain a class-wise histogram of the heap. For each class, it prints the number of instances in the heap, the total amount of memory consumed by those objects in bytes, and the fully qualified class name. The histogram is useful when trying to understand how the heap is used.
Configuring the size of the permanent generation can be important for applications that dynamically generate
and load a large number of classes (Java Server Pages™ and web containers, for example). If an application loads “too many” classes, then an OutOfMemoryError is thrown.
The -permstat option to the jmap command can be used to get statistics for the objects in the permanent generation.
The jstat utility uses the built-in instrumentation in the HotSpot JVM to provide information on performance
and resource consumption of running applications. The tool can be used when diagnosing performance issues, and in particular issues related to heap sizing and garbage collection. Some of its many options can print statistics regarding garbage collection behavior and the capacities and usage of the various generations.
HPROF: Heap Profiler
HPROF is a simple profiler agent shipped with JDK 5.0. It is a dynamically-linked library that interfaces to the JVM using the Java Virtual Machine Tools Interface (JVM TI). It writes out profiling information either to a file or to a socket in ASCII or binary format. This information can be further processed by a profiler front-end tool.
HPROF is capable of presenting CPU usage, heap allocation statistics, and monitor contention profiles. In
addition, it can output complete heap dumps and report the states of all the monitors and threads in the Java
virtual machine. HPROF is useful when analyzing performance, lock contention, memory leaks, and other issues.
See Section 9 for a link to HPROF documentation.
HAT: Heap Analysis Tool
The Heap Analysis Tool (HAT) helps debug unintentional object retention. This term is used to describe an object that is no longer needed but is kept alive due to references through some path from a live object. HAT provides a convenient means to browse the object topology in a heap snapshot that is generated using HPROF. The tool allows a number of queries, including “show me all reference paths from the rootset to this object.” See Section 9 for a link to HAT documentation.
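A classic example of unintentional object retention is an array-backed stack that forgets to clear a slot when an element is popped. The sketch below is an illustration, not taken from the whitepaper: the names RetentionDemo and ArrayStack and the clearOnPop switch are hypothetical. It does not rely on the collector running; it only checks whether the stack still holds a reference, which is exactly the kind of path from a live object that keeps a dead object alive.

```java
import java.util.Arrays;

final class RetentionDemo {

    static final class ArrayStack {
        private Object[] elements = new Object[4];
        private int size = 0;
        private final boolean clearOnPop;

        ArrayStack(boolean clearOnPop) { this.clearOnPop = clearOnPop; }

        void push(Object o) {
            if (size == elements.length) {
                elements = Arrays.copyOf(elements, size * 2);
            }
            elements[size++] = o;
        }

        Object pop() {
            if (size == 0) throw new IllegalStateException("stack is empty");
            Object o = elements[--size];
            if (clearOnPop) {
                elements[size] = null; // drop the obsolete reference
            }
            return o;
        }

        /** True if the backing array still references the object, even after it was popped. */
        boolean stillReferences(Object o) {
            for (Object e : elements) {
                if (e == o) return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Object payload = new Object();

        ArrayStack leaky = new ArrayStack(false);
        leaky.push(payload);
        leaky.pop();
        System.out.println("leaky stack retains popped object: " + leaky.stillReferences(payload));

        ArrayStack fixed = new ArrayStack(true);
        fixed.push(payload);
        fixed.pop();
        System.out.println("fixed stack retains popped object: " + fixed.stillReferences(payload));
    }
}
```

In a heap snapshot, the leaky version would show a reference path from the stack's backing array to the popped object, even though the program no longer needs it.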
Java 1.7 G1 Garbage Collector
Java 1.7 might have a new garbage collection strategy by default. It is called G1, which is short for Garbage First. It was launched experimentally in Java 1.6 update 14, with the aim of replacing the regular Concurrent Mark Sweep collector and improving performance.
G1 is considered "server centric", with the following attributes.
G1 uses parallelism, which is widely available in today's hardware. It is designed to make use of all available CPUs, using their combined processing power to speed up garbage collection.
G1's concurrency lets Java threads keep running while some heap operations are performed, which minimizes the work done during stop-the-world pauses.
Another feature that plays a key role in garbage collection performance is treating young objects (newly created) and old objects (which have lived for some time) differently. G1 focuses mainly on young objects, since they are the most likely to be reclaimable.
Heap compaction is done to eliminate fragmentation problems.
G1 can be more predictable when compared to CMS.
Features of G1 Garbage Collector
A single contiguous heap is split into same-sized regions, with no physical separation between younger and older regions.
G1 uses evacuation pauses. Evacuation pauses are done in parallel, using all the available processors.
G1 uses a pause prediction model to meet user-defined pause time targets.
Like CMS, G1 also periodically performs a concurrent marking phase.
Unlike CMS, G1 does not perform a concurrent sweeping phase.
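The combination of same-sized regions and a pause time target can be illustrated with a toy selection routine: prefer the regions holding the most garbage, and stop adding regions once the predicted evacuation cost would exceed the pause budget. This is only a simplified greedy sketch of the "garbage first" idea, not G1's actual algorithm or prediction model. The names GarbageFirstSketch, Region, and chooseCollectionSet are hypothetical, and so is the cost model (evacuation time proportional to the live data that must be copied).

```java
import java.util.ArrayList;
import java.util.List;

final class GarbageFirstSketch {

    static final class Region {
        final int id;
        final long liveKb;    // data that would have to be copied out
        final long garbageKb; // space that would be reclaimed

        Region(int id, long liveKb, long garbageKb) {
            this.id = id;
            this.liveKb = liveKb;
            this.garbageKb = garbageKb;
        }
    }

    /** Picks region ids, most garbage first, while the predicted pause fits the budget. */
    static List<Integer> chooseCollectionSet(List<Region> regions,
                                             long pauseBudgetMicros,
                                             long microsPerLiveKb) {
        List<Region> byGarbage = new ArrayList<>(regions);
        byGarbage.sort((a, b) -> Long.compare(b.garbageKb, a.garbageKb));

        List<Integer> chosen = new ArrayList<>();
        long predictedMicros = 0;
        for (Region r : byGarbage) {
            long cost = r.liveKb * microsPerLiveKb; // hypothetical cost model
            if (predictedMicros + cost <= pauseBudgetMicros) {
                chosen.add(r.id);
                predictedMicros += cost;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 100, 900));
        regions.add(new Region(1, 800, 200));
        regions.add(new Region(2, 300, 700));
        // 5000 microsecond budget, 10 microseconds per live KB (both made-up numbers)
        System.out.println(chooseCollectionSet(regions, 5000, 10));
    }
}
```

With these made-up numbers, regions 0 and 2 are chosen: they hold the most garbage and cost 4000 microseconds together, while region 1 is mostly live data and would push the pause past the budget.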