Article: Do Java 6 threading optimizations actually work? - Part II

by Srini Penchikala on Jun 30, 2008

Designing and running a highly concurrent, multi-threaded Java application requires that the JVM's threading optimizations be configured and applied correctly. These optimizations include biased locking, lock elision (which depends on escape analysis), method inlining, loop-invariant hoisting, and dead code elimination.
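To make the lock elision case concrete, here is a minimal sketch that is not taken from Jeroen's article. The class and method names are hypothetical. It contrasts a `StringBuffer` that never leaves its method, which is the kind of object escape analysis can prove thread-local, with one passed in by the caller. Whether the JIT actually removes the locks depends on the JVM version and flags in use.

```java
public class LockElisionSketch {

    // The StringBuffer is created here and never leaves this method, so no
    // other thread can contend for its monitor. Every append() is
    // synchronized, which makes these locks candidates for elision once
    // escape analysis proves the object is thread-local.
    static String joinLocal(String a, String b, String c) {
        StringBuffer sb = new StringBuffer();
        sb.append(a);
        sb.append(b);
        sb.append(c);
        return sb.toString();
    }

    // Here the buffer is supplied by the caller and may be shared with other
    // threads. Seen in isolation, this method gives the JIT no proof that the
    // locks are redundant, so elision does not apply; the back-to-back
    // appends on one monitor are instead the pattern lock coarsening targets.
    static String joinShared(StringBuffer shared, String a, String b) {
        shared.append(a);
        shared.append(b);
        return shared.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinLocal("a", "b", "c"));
        System.out.println(joinShared(new StringBuffer("x"), "y", "z"));
    }
}
```

Both methods return the same kind of result either way; the optimizations change only how much locking work is done, not the program's behavior.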

In part 2 of the "Java 6 threading optimizations" article series, author Jeroen Borgers examines various threading optimizations and the JVM arguments used to manage them. He also discusses factors such as On Stack Replacement (OSR), heap management, and lock object data size, which can significantly affect the performance of multi-threaded Java applications.

Jeroen compares the performance of a sample application running on a single-core versus multi-core machine and on different operating systems like Windows, Linux, and Mac OS X. With the help of code examples, the author explains why locking costs are different on different processors. The discussion also includes a number of tools and techniques used to validate the benchmarking results of the performance tests.

He comments on the significance of JDK version and processor type on the application performance as follows:

If you need to run a multi-threaded application on a multi-core machine and you care about performance then clearly you need to be continuously upgrading to the latest revision of the version of the JDK you're working with. Many, but not all, of the optimizations are being back ported from the latest version to previous ones. You will need to ensure that all of the threading optimizations are enabled. In the 6.0 they are enabled by default. However in the 5.0 you will need to explicitly set them on the command line.

Finally, Jeroen concludes the article by saying:

Fortunately lock-coarsening and in particular biased-locking did appear to have significant effect on benchmark performance. I was hoping that escape analysis combined with lock elision would have a much greater influence than it did. This technology works but only in a very limited number of cases. To be fair, escape analysis is still in its infancy and it will take time to mature this complex tool.

Read the article: Do Java 6 threading optimizations actually work? - Part II


Thanks for a great article by Tobias Hill

I have quite recent experience of something (sort of) related to your processor core observations. I was doing a scale-out of a poker platform where I discovered that isolating every logical cluster instance to just one of the cores on the physical machine gave much better overall performance and throughput in the cluster. Typically each 8-core server would have 8 logical cluster nodes, each on a separate core. For this I wrote a small bootstrapper in C++ which set the right affinity on the JVM process when launching it.
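The C++ bootstrapper itself is not shown in the comment. As a rough Java analogue, the sketch below builds a launch command that pins one JVM to one core using the Linux `taskset` utility. This is an assumption for illustration only: the class name, the jar name `cluster-node.jar`, and the use of `taskset` are not from the comment, and other operating systems need a different mechanism.

```java
import java.util.ArrayList;
import java.util.List;

public class PinnedJvmLauncher {

    // Builds a command line that starts one JVM restricted to a single core.
    // "taskset -c <core>" is the Linux (util-linux) tool for CPU affinity.
    static List<String> pinnedCommand(int core, String jarPath) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("taskset");
        cmd.add("-c");
        cmd.add(Integer.toString(core));
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(jarPath);
        return cmd;
    }

    public static void main(String[] args) {
        // One logical cluster node per available core, as in the comment.
        int cores = Runtime.getRuntime().availableProcessors();
        for (int core = 0; core < cores; core++) {
            // "cluster-node.jar" is a hypothetical placeholder.
            List<String> cmd = pinnedCommand(core, "cluster-node.jar");
            System.out.println(cmd);
            // To actually launch the process on Linux:
            // new ProcessBuilder(cmd).inheritIO().start();
        }
    }
}
```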

Biased locking delay by Ismael Juma


The biased locking delay is 4 seconds and can be overridden with -XX:BiasedLockingStartupDelay=0 according to this message:

There are other messages in the thread that discuss biased locking, java.util.concurrent locks and atomics that are also worth reading.
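When benchmarking, it can help to confirm that the delay override really reached the JVM. The following is a minimal sketch with hypothetical class and method names. It reads the JVM's input arguments through the standard `RuntimeMXBean` and reports the `-XX:BiasedLockingStartupDelay` value if one was passed. It only inspects the command line and does not prove that biased locking is active. The flag belongs to the HotSpot JVMs of the Java 6 era, and newer JVMs may no longer accept it.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class BiasedLockingFlagCheck {

    static final String DELAY_FLAG = "-XX:BiasedLockingStartupDelay=";

    // Returns the delay (in milliseconds) given on the command line,
    // or -1 if the flag was not passed. If the flag appears more than
    // once, the last occurrence is reported.
    static long startupDelayFrom(List<String> jvmArgs) {
        long delay = -1;
        for (String arg : jvmArgs) {
            if (arg.startsWith(DELAY_FLAG)) {
                delay = Long.parseLong(arg.substring(DELAY_FLAG.length()));
            }
        }
        return delay;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application.
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        long delay = startupDelayFrom(jvmArgs);
        if (delay < 0) {
            System.out.println("No startup delay flag passed; the JVM default applies.");
        } else {
            System.out.println("Biased locking startup delay set to " + delay + " ms.");
        }
    }
}
```

Run it with, for example, `java -XX:BiasedLockingStartupDelay=0 BiasedLockingFlagCheck` on a JVM that supports the flag.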


See Brent Boyer's work on Micro Benchmarks by Bernd Eckenfels

See also:

which describes some of the basics of those statistics and offers a framework to automate some of the fine-tuning for Micro Benchmarks.
