InfoQ

Java 6 Hotspot Performance

Posted by Charles Humble on May 07, 2008 04:05 PM

Community: Java
Topics: Compilers, Performance & Scalability

Using the JDK6 u10 b14 debug build, Sun Microsystems' Kohsuke Kawaguchi examined the assembly code produced by the Hotspot JIT in a recent blog post. The post highlights just how far Java optimization has come.

Kawaguchi focuses on two main areas. The first is loop unrolling, a technique in which the body of a loop is replicated several times within each iteration, so that the loop counter update and the exit test run less often. This saves time by reducing the number of overhead instructions that the processor has to execute. The JIT combines this with warm-up and cool-down sections around the unrolled loop, and Kawaguchi's commentary highlights the fact that the compiler has removed a redundant array index check from the fast part of the loop. Moreover, the resulting assembly code demonstrates how much processor-specific optimization is going on. Kawaguchi notes, for instance, that the following code:

private static byte[] foo() {
    byte[] buf = new byte[256];
    for( int i=0; i<buf.length; i++ )
        buf[i] = 0;
    return buf;
}

results in assembly that uses the R8-R15 general-purpose registers, which are specific to the AMD64 architecture.
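
To make the idea of unrolling concrete, here is a minimal hand-written sketch in Java. It is not Hotspot's output, which is machine code. The class name, the method names, the fill value parameter and the unroll factor of four are all assumptions chosen for illustration. The warm-up section mentioned above is not modelled; only the unrolled main loop and a cool-down loop for the leftover elements are shown.

```java
import java.util.Arrays;

public class UnrollSketch {
    // Straightforward loop, in the shape of the article's listing
    // (with a hypothetical fill value so the result is observable).
    static byte[] fillSimple(int length, byte value) {
        byte[] buf = new byte[length];
        for (int i = 0; i < buf.length; i++) {
            buf[i] = value;
        }
        return buf;
    }

    // Hand-unrolled equivalent: four stores per iteration, so the
    // counter update and exit test run once per four elements.
    // The factor of four is an assumption, not what Hotspot necessarily picks.
    static byte[] fillUnrolled(int length, byte value) {
        byte[] buf = new byte[length];
        int i = 0;
        int limit = buf.length - 3; // last index where i + 3 is still in range
        for (; i < limit; i += 4) {
            buf[i] = value;
            buf[i + 1] = value;
            buf[i + 2] = value;
            buf[i + 3] = value;
        }
        // Cool-down loop: the remaining zero to three elements.
        for (; i < buf.length; i++) {
            buf[i] = value;
        }
        return buf;
    }

    public static void main(String[] args) {
        byte[] a = fillSimple(10, (byte) 7);
        byte[] b = fillUnrolled(10, (byte) 7);
        System.out.println(Arrays.equals(a, b)); // prints "true"
    }
}
```

In Java source each store still carries its own bounds check; the benefit Kawaguchi describes comes from the JIT proving the indices safe and dropping those checks from the fast part of the loop.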

The second area Kawaguchi's blog examines is the optimization of locks. Whilst uncontended lock acquisition has been steadily improving in Java for some time, contended acquisition has remained problematic. Work in this area is still ongoing, but Kawaguchi's post does highlight several areas that have improved.
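
As an illustration of the kind of lock optimization a JIT can apply, the sketch below hand-codes lock coarsening, one of the techniques documented for Hotspot in Java 6. This summary does not say which lock optimizations Kawaguchi's listings actually exhibit, so treat the example as an assumption about the category rather than a reproduction of his findings. The `Counter` class and both method names are hypothetical.

```java
public class LockCoarseningSketch {
    // Hypothetical class whose methods each take the object's monitor.
    static final class Counter {
        private int value;
        synchronized void increment() { value++; }
        synchronized int get() { return value; }
    }

    // As written by the developer: three separate acquire/release pairs
    // on the same monitor, back to back.
    static void bumpThreeTimes(Counter c) {
        c.increment();
        c.increment();
        c.increment();
    }

    // Hand-coarsened version: the monitor is taken once around all three
    // calls. The inner synchronized methods re-enter a lock the thread
    // already holds, which is cheaper than acquiring it afresh. A JIT that
    // coarsens locks would merge the regions without any source change.
    static void bumpThreeTimesCoarsened(Counter c) {
        synchronized (c) {
            c.increment();
            c.increment();
            c.increment();
        }
    }

    public static void main(String[] args) {
        Counter a = new Counter();
        Counter b = new Counter();
        bumpThreeTimes(a);
        bumpThreeTimesCoarsened(b);
        System.out.println(a.get() == b.get()); // prints "true"
    }
}
```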

The article shows a number of other features of the Hotspot compiler, including how aggressive the inlining is - James Gosling notes in a corresponding blog post that "even storage allocation & initialization gets inlined". Part of the reason this level of aggression is possible is that the JVM can make potentially unsafe optimizations, knowing that it can undo them later if necessary. Charles Nutter provides a good explanation of this in a post on attending the Lang.NET conference earlier in the year. His post also highlights the relevance of this work to JRuby and, by implication, to any language which targets the JVM.

"The JVM has over time had varying capabilities to dynamically optimize and reoptimize code...and perhaps most importantly, to dynamically *deoptimize* code when necessary. Deoptimization is very exciting when dealing with performance concerns, since it means you can make much more aggressive optimizations--potentially unsafe optimizations in the uncertain future of an entire application--knowing you'll be able to fall back on a tried and true safe path later on. So you can do things like inlining entire call paths once you've seen the same path a few times. You can do things like omitting synchronization guards until it becomes obvious they're needed. And you can change the set of optimizations applied after the fact...in essence, you can safely be "wrong" and learn from your mistakes at runtime. This is the key reason why Java now surpasses C and C++ in specific handcrafted benchmarks and why it should eventually be able to exceed C and C++ in almost all benchmarks. And it's a key reason why we're able to get acceptable performance out of JRuby having done far less work than Microsoft has done on IronPython and the DLR."

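The speculate-then-fall-back pattern Nutter describes can be modelled in plain Java. The sketch below is a conceptual analogy only: a real JVM discards the compiled code and resumes in the interpreter when a speculation fails, whereas here the "deoptimization" is just a branch to the ordinary virtual call. All class, method and field names are hypothetical.

```java
public class GuardedInlineSketch {
    interface Shape { double area(); }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    // Counts how often the speculation failed and the safe path ran.
    static int fallbacks = 0;

    // What the developer writes: a virtual call per element.
    static double totalAreaPlain(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area();
        }
        return total;
    }

    // Model of what a JIT might do after only ever seeing Square here:
    // guard on the type, inline the body, keep a safe path for surprises.
    static double totalAreaSpeculative(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            if (s.getClass() == Square.class) {   // cheap type guard
                Square sq = (Square) s;
                total += sq.side * sq.side;       // body of Square.area(), inlined
            } else {
                fallbacks++;                      // speculation was wrong
                total += s.area();                // tried and true safe path
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] mixed = { new Square(2), new Square(3), new Circle(1) };
        System.out.println(totalAreaPlain(mixed) == totalAreaSpeculative(mixed));
        System.out.println(fallbacks); // one Circle, so one fallback
    }
}
```

The guard is what makes the aggressive inlining safe: being "wrong" costs one extra comparison and a slower call, never an incorrect result.
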
It has always been theoretically possible that a language compiled at run time, like Java, would ultimately surpass the performance of a statically compiled language, since the JIT is able to make optimizations based on the hardware it is actually running on. The increasing use of processor-specific optimization in Java is a particularly exciting development in this regard. For the developer targeting the Java platform, a significant plus point is that with each new release of the JVM, code performance improves without any changes needing to be made to the application's source.

2 comments

  1. corrected listing...

    May 8, 2008 4:29 AM by Mirko Jahn

    As you can see in the mentioned post, the correct syntax for the method should look like the following:

    private static byte[] foo() {
        byte[] buf = new byte[256];
        for( int i=0; i<buf.length; i++ )
            buf[i] = 0;
        return buf;
    }
    
    (just a problem with the interpretation of the < sign in the post.)

  2. Re: corrected listing...

    May 8, 2008 1:27 PM by Charles Humble

    Many thanks for pointing this out - have corrected the post.
    Charles
