Heapothesys - an Open-Source GC Latency Benchmark by Amazon Corretto

The Amazon Corretto team has introduced Heapothesys, a collection of JVM garbage collection (GC) workloads designed for the application developer to compare alternative GC algorithms and configuration choices, and to detect GC performance and latency regressions.

Heapothesys, pronounced /hɪˈpɒθɪsɪs/, comprises two utilities: HyperAlloc, an open-source GC latency benchmark that synthesizes a workload for Java applications to study its effect on GC latency; and Extremem, a test workload that evaluates the strengths and weaknesses of the various approaches to pauseless garbage collection. The Extremem workload commingles the allocation and deallocation of heap objects with the execution of business logic. This news article covers HyperAlloc in detail.

HyperAlloc simulates fundamental application characteristics to create and test GC load scenarios defined by allocation rate, heap occupancy and JVM flags. Using the resulting JVM pauses, developers may produce their own reference points to study GC boundaries within their applications.

Inspiration for HyperAlloc came from HeapFragger, a utility created by Gil Tene, CTO and co-founder at Azul Systems, that induces heap fragmentation. Although not as feature-rich as HeapFragger, HyperAlloc focuses on accurately predicting the resulting GC allocation rates by evaluating the two primary factors directly responsible for GC stress: heap allocation rate and heap occupancy. Heap occupancy is defined as the predetermined total size of persistent live objects found by the object scanning process during garbage collection. These two factors, among others, may be configured through command-line arguments: -a for the heap allocation rate (default: 1024 MB/sec) and -h for the heap occupancy (default: 64 MB).
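
To make the two factors concrete, the following is a minimal sketch of a workload that keeps a fixed live set reachable (heap occupancy) while producing short-lived garbage at a target rate (heap allocation rate). It is not HyperAlloc's source code; the class name, method names and the 1 MB chunk size are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not HyperAlloc's implementation.
final class AllocationSketch {
    static final int CHUNK_BYTES = 1024 * 1024; // assumed 1 MB allocation unit

    // Builds the long-lived "heap occupancy": occupancyMb chunks that stay reachable.
    static List<byte[]> buildLiveSet(int occupancyMb) {
        List<byte[]> live = new ArrayList<>(occupancyMb);
        for (int i = 0; i < occupancyMb; i++) {
            live.add(new byte[CHUNK_BYTES]);
        }
        return live;
    }

    // Allocates short-lived chunks at roughly rateMbPerSec for durationMillis
    // and returns the number of megabytes allocated.
    static long allocateAtRate(int rateMbPerSec, long durationMillis) {
        long start = System.nanoTime();
        long allocatedMb = 0;
        while (true) {
            long elapsedNanos = System.nanoTime() - start;
            if (elapsedNanos >= durationMillis * 1_000_000L) {
                break;
            }
            // Megabytes that should have been allocated by now at the target rate.
            long targetMb = elapsedNanos * rateMbPerSec / 1_000_000_000L;
            if (allocatedMb < targetMb) {
                byte[] garbage = new byte[CHUNK_BYTES];
                garbage[0] = 1; // touch the chunk; it is unreachable after this iteration
                allocatedMb++;
            } else {
                try {
                    Thread.sleep(1); // ahead of schedule, so back off briefly
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return allocatedMb;
    }

    public static void main(String[] args) {
        List<byte[]> live = buildLiveSet(64);  // mirrors the -h default of 64 MB
        long mb = allocateAtRate(1024, 2_000); // mirrors the -a default of 1024 MB/sec
        System.out.println("live MB: " + live.size() + ", allocated MB: " + mb);
    }
}
```

A single thread may not reach high target rates, so the figure returned here is what was actually achieved rather than what was requested.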

JVM pauses may be measured with a utility such as jHiccup, an open-source tool designed to measure an application's pauses and stalls, also known as "hiccups," associated with the underlying Java runtime platform.

jHiccup may be started in several ways:

  • run as a Java agent (e.g., java -javaagent:jHiccup.jar -jar <filename>.jar)
  • injected into a running application (e.g., jHiccup -p <pid>)
  • run using a convenient wrapper command for an existing Java application (e.g., jHiccup java <myApp> ...)

Like HyperAlloc, jHiccup supports its own set of command-line arguments.
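
The idea behind measuring a "hiccup" can be sketched in a few lines: a thread sleeps for a fixed interval and records how much longer than that interval it actually took to wake up. The sketch below only illustrates this principle under that assumption; it is not jHiccup's source code, and the class and method names are hypothetical.

```java
// Hypothetical illustration of the measurement idea; not jHiccup's implementation.
final class HiccupSketch {
    // Excess wake-up delay in nanoseconds, never negative.
    static long hiccupNanos(long observedNanos, long expectedNanos) {
        return Math.max(0L, observedNanos - expectedNanos);
    }

    // Sleeps intervalMillis per sample and returns the largest excess delay observed.
    static long maxHiccupNanos(int samples, long intervalMillis) {
        long expectedNanos = intervalMillis * 1_000_000L;
        long worst = 0;
        for (int i = 0; i < samples; i++) {
            long before = System.nanoTime();
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            long observed = System.nanoTime() - before;
            worst = Math.max(worst, hiccupNanos(observed, expectedNanos));
        }
        return worst;
    }

    public static void main(String[] args) {
        long worst = maxHiccupNanos(1_000, 1);
        System.out.println("max hiccup (ms): " + worst / 1_000_000.0);
    }
}
```

Because the measuring thread does no work of its own, any excess delay it sees comes from the platform underneath it, including GC pauses, rather than from application logic.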

Getting Started

Consider the following template on how to configure the JVM, jHiccup and HyperAlloc with their respective command-line arguments:

java -Xms<bytes> -Xmx<bytes> <GC options> <other JVM options> -Xlog:gc:<GC log file> -javaagent:<jHiccup directory>/jHiccup.jar='-a -d 0 -i 1000 -l <jHiccup log file>' -jar <HyperAlloc directory>/HyperAlloc-1.0.jar -a <MB/sec> -h <MB> -d <seconds> -c <true/false> -l <CSV output file>

Now consider the following two concrete examples using the template:

java -Xms1g -Xmx4g -Xlog:gc:gc.log -javaagent:jHiccup.jar='-d 0 -l hiccups.hlog' -jar HyperAlloc-1.0.jar -a 65536 -h 128 -d 300 -l output.csv

java -Xms1g -Xmx4g -Xlog:gc:gc.log -javaagent:jHiccup.jar='-d 0 -l hiccups.hlog' -jar HyperAlloc-1.0.jar -a 131072 -h 128 -d 300 -l output.csv

Each of these measurements starts the JVM with an initial heap size of 1 GB (-Xms1g), allows the heap to grow to 4 GB (-Xmx4g) and writes GC logging to gc.log (-Xlog:gc:gc.log). HyperAlloc is configured to run for 300 seconds (-d 300) with a heap occupancy of 128 MB (-h 128), at heap allocation rates of 65536 MB/sec and 131072 MB/sec, respectively.
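
When comparing several allocation rates, it can help to generate the command lines rather than edit them by hand. The sketch below assembles the same arguments as the two examples for an arbitrary rate. The class name and the per-rate file names (gc-<rate>.log and so on) are assumptions added here so that each run keeps its own output files; only flags that appear in the examples are used.

```java
import java.util.List;

// Hypothetical helper; the per-rate file names are an assumption for illustration.
final class SweepSketch {
    // Returns the argument vector for one measurement run.
    static List<String> command(int allocMbPerSec, int occupancyMb, int seconds) {
        String tag = String.valueOf(allocMbPerSec);
        return List.of(
            "java", "-Xms1g", "-Xmx4g",
            "-Xlog:gc:gc-" + tag + ".log",
            // A single argument: the shell quotes from the examples are not needed here.
            "-javaagent:jHiccup.jar=-d 0 -l hiccups-" + tag + ".hlog",
            "-jar", "HyperAlloc-1.0.jar",
            "-a", tag,
            "-h", String.valueOf(occupancyMb),
            "-d", String.valueOf(seconds),
            "-l", "output-" + tag + ".csv");
    }

    public static void main(String[] args) {
        for (int rate : new int[] {65536, 131072}) {
            // Printed for inspection only (not shell-quoted); the list could
            // instead be handed to ProcessBuilder to launch the run.
            System.out.println(String.join(" ", command(rate, 128, 300)));
        }
    }
}
```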

Output and Data Analysis

After the two measurements have been completed, the generated output includes: gc.log logging relevant GC information; hiccups.hlog logging JVM pauses from jHiccup; and output.csv logging all HyperAlloc parameters used in the measurement including the default values not specified on the command line.

The results may be plotted with HistogramLogAnalyzer, a utility that plots log files in a histogram log format (hlog) produced by jHiccup, Cassandra Stress and other tools that support the hlog file format.

[Plot: HistogramLogAnalyzer output for the measurement with a heap allocation rate of 65536 MB/sec]

[Plot: HistogramLogAnalyzer output for the measurement with a heap allocation rate of 131072 MB/sec]

Note the significant difference in maximum latency time using the higher heap allocation rate.

jHiccup provides its own utility to process hlog files. For each of the two measurements:

<jHiccup directory>/jHiccupLogProcessor -i hiccups.hlog -o out.log

This produces out.log and out.log.hgrm. The latter may be imported into HdrHistogram, an online High Dynamic Range (HDR) histogram plotting tool that accepts files in hgrm format. While this method requires an extra step to produce output in hgrm format, multiple files may be imported to produce an overlay plot of the two measurements.

The Amazon Corretto team will be working on enhancing Heapothesys to better model and predict additional application behaviors. Developers are encouraged to collaborate on this project and track issues and ideas on their issue list.
