The Challenges in Java Benchmarking

| by Steven Haines Follow 0 Followers on Aug 09, 2008. Estimated reading time: 2 minutes |
Brent Boyer, a programmer at Elliptic Group, published a two-part article series on IBM's DeveloperWorks entitled Robust Java Benchmarking that explores the challenges in capturing effective Java benchmarks. He begins by discussing various JVM idiosyncrasies and optimizations in current compilers that might negatively affect performance testing. For example, he shows that if a complicated section of code computes a value that is never used, that an aggressive compiler might optimize out the section of code and hence the benchmark will not include it. To illustrate this point, one of his code snippets consistently runs in 4.9 seconds on his computer, but if he omits a println that displays the result, it consistently runs in 0.08 seconds. He also points out that the granularity of time measurements varies on different operating systems so benchmarking exercises need to be aware of this when interpreting the results. He shows that System.currentTimeMillis() is not a good measurement of elapsed time (as compared to System.nanoTime()) as it only has a 15ms accuracy on Windows XP (but it does have a 1ms accuracy in Linux with a 2.6 kernel.)

After addressing these peculiar behaviors, Boyer reviews more relevant problems that typical benchmarks might not consider, such as JVM caching and resource reclamations (e.g. garbage collection and object finalization.) He proposes that the only way to effectively avoid these problems is to "warm up" the code until it reaches a steady state. The warm up process can be time consuming and challenging because some JVMs can execute a method 10,000 times before it triggers a compilation (it will remain interpreted until then.) Once the code is in a steady state then the benchmark must run it several times and compute a statistical analysis of the results.

In addition to describing the problem, Boyer proposes the solution of adopting a benchmarking framework, one of which he has written and made available. Using his framework he shows the differences between accessing data contained within data structures (raw arrays, ArrayLists, Vectors, HashMap, TreeMap, and so forth) with varying number of contained elements. Boyer's analysis yields two interesting observations: (1) his benchmarking framework is able to report mean access times when those times are as quick as a few nanoseconds and (2) the behavior of some data structures is surprising under different loads. One peculiarity was in the behavior of the ConcurrentHashMaps when compared to a TreeMap: the CurrentHashMaps performed considerably better that the TreeMap with 1024 elements, but only marginally better with 1024x1024 elements. This is unexpected because hash maps have constant time search whereas trees have log(n) search. Regardless of the unexpected results, the article is worth reading and Boyer's concerns are worth considering when benchmarking Java code. 

Rate this Article

Adoption Stage

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Tell us what you think

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread
Community comments

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread


Login to InfoQ to interact with what matters most to you.

Recover your password...


Follow your favorite topics and editors

Quick overview of most important highlights in the industry and on the site.


More signal, less noise

Build your own feed by choosing topics you want to read about and editors you want to hear from.


Stay up-to-date

Set up your notifications and don't miss out on content that matters to you