InfoQ

Java 6 Hotspot Performance

Posted by Charles Humble on May 07, 2008

Sections: Development, Architecture & Design
Topics: Java, Performance & Scalability, Compilers

Using the JDK6 u10 b14 debug build, Sun Microsystems' Kohsuke Kawaguchi examined the assembly code produced by the Hotspot JIT in a recent blog post. The post highlights just how far Java optimization has come.

Kawaguchi focuses on two main areas. The first is loop unrolling, a technique in which the body of a loop is replicated several times so that each pass through the loop does the work of several iterations. This saves time by reducing the number of overhead instructions, such as the counter increment and the exit test, that the processor has to execute. The JIT combines the unrolled loop with a warm-up and a cool-down section, and Kawaguchi's commentary highlights the fact that the compiler has removed a redundant array index check from the fast part of the loop. Moreover, the resulting assembly code demonstrates how much processor-specific optimization is going on. Kawaguchi notes, for instance, that the following code:

    private static byte[] foo() {
        byte[] buf = new byte[256];
        for (int i = 0; i < buf.length; i++)
            buf[i] = 0;
        return buf;
    }

results in assembly that uses the R8-R15 general-purpose registers, which are specific to the AMD64 (x86-64) architecture.
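
To make the transformation concrete, here is a rough Java-level sketch of what unrolling does to a fill loop like the one above. It is written by hand for illustration and is not the code HotSpot generates: the unroll factor of 4, the class name UnrollSketch and the method fillUnrolled are all assumptions, and the warm-up section that the JIT emits is left out.

```java
import java.util.Arrays;

final class UnrollSketch {

    // Hand-unrolled fill. The factor of 4 is an arbitrary choice for this
    // sketch; the factor the JIT picks depends on the JVM and the hardware.
    static byte[] fillUnrolled(int length, byte value) {
        byte[] buf = new byte[length];
        int i = 0;
        // Largest multiple of 4 that does not exceed length.
        int limit = length - (length % 4);

        // Main ("fast") loop: four stores for each counter increment
        // and exit test, instead of one.
        for (; i < limit; i += 4) {
            buf[i] = value;
            buf[i + 1] = value;
            buf[i + 2] = value;
            buf[i + 3] = value;
        }

        // Cool-down loop: handles the 0 to 3 elements left over.
        for (; i < length; i++) {
            buf[i] = value;
        }
        return buf;
    }

    public static void main(String[] args) {
        byte[] buf = fillUnrolled(10, (byte) 7);
        System.out.println(Arrays.toString(buf));
    }
}
```

Writing loops this way in Java source is rarely worthwhile, since the JIT already applies the transformation where it judges it profitable. The sketch is only meant to show why fewer loop-control instructions are executed per element.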

The second area Kawaguchi's blog examines is the optimization of locks. Whilst uncontended lock acquisition has been steadily improving in Java for some time, contended acquisition has remained problematic. Work in this area is still ongoing, but Kawaguchi's post does highlight several areas that have improved.
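
The difference between the two cases can be illustrated with a small counter guarded by a synchronized method. This is only a sketch: LockSketch and runWithThreads are hypothetical names, and the example shows what uncontended and contended acquisition look like in source code rather than measuring how fast either path is.

```java
final class LockSketch {
    private long count;

    // Every call acquires and releases the monitor of this object.
    synchronized void increment() {
        count++;
    }

    synchronized long get() {
        return count;
    }

    // With threads == 1 the lock is never contended; with more threads the
    // workers compete for the same monitor on every increment.
    static long runWithThreads(int threads, final int incrementsPerThread) {
        final LockSketch counter = new LockSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < incrementsPerThread; i++) {
                        counter.increment();
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("uncontended: " + runWithThreads(1, 1000000));
        System.out.println("contended:   " + runWithThreads(4, 250000));
    }
}
```

Both runs perform the same number of increments and must produce the same total. What differs is how often a thread finds the monitor already held, which is the case the article describes as still problematic.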

The article shows a number of other features of the Hotspot compiler, including how aggressive the inlining is: James Gosling notes in a corresponding blog post that "even storage allocation & initialization gets inlined". Part of the reason this level of aggression is possible is that the JVM is able to make potentially unsafe optimizations where necessary. Charles Nutter provides a good explanation of this in a post on attending the Lang.NET conference earlier in the year. His post also highlights the relevance of this work to JRuby, and by implication to any language which targets the JVM.

"The JVM has over time had varying capabilities to dynamically optimize and reoptimize code...and perhaps most importantly, to dynamically *deoptimize* code when necessary. Deoptimization is very exciting when dealing with performance concerns, since it means you can make much more aggressive optimizations--potentially unsafe optimizations in the uncertain future of an entire application--knowing you'll be able to fall back on a tried and true safe path later on. So you can do things like inlining entire call paths once you've seen the same path a few times. You can do things like omitting synchronization guards until it becomes obvious they're needed. And you can change the set of optimizations applied after the fact...in essence, you can safely be "wrong" and learn from your mistakes at runtime. This is the key reason why Java now surpasses C and C++ in specific handcrafted benchmarks and why it should eventually be able to exceed C and C++ in almost all benchmarks. And it's a key reason why we're able to get acceptable performance out of JRuby having done far less work than Microsoft has done on IronPython and the DLR."
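
A small program can set up the kind of situation Nutter describes. The sketch below is an assumption-laden illustration, not something taken from his post: DeoptSketch, Shape, Square and Rect are made-up names, and whether a particular JVM actually inlines the call in the first phase and discards that code in the second depends on its version and settings.

```java
final class DeoptSketch {

    interface Shape {
        int area();
    }

    static final class Square implements Shape {
        private final int side;

        Square(int side) {
            this.side = side;
        }

        public int area() {
            return side * side;
        }
    }

    static final class Rect implements Shape {
        private final int width;
        private final int height;

        Rect(int width, int height) {
            this.width = width;
            this.height = height;
        }

        public int area() {
            return width * height;
        }
    }

    // The call to s.area() is the interesting call site: while only one
    // implementation has been seen, a JIT may choose to inline it.
    static long totalArea(Shape[] shapes, int rounds) {
        long sum = 0;
        for (int r = 0; r < rounds; r++) {
            for (Shape s : shapes) {
                sum += s.area();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Phase 1: the call site only ever sees Square.
        Shape[] onlySquares = { new Square(2), new Square(3) };
        System.out.println(totalArea(onlySquares, 100000));

        // Phase 2: a second implementation appears, so any assumption that
        // area() always means Square.area() no longer holds.
        Shape[] mixed = { new Square(2), new Rect(2, 5) };
        System.out.println(totalArea(mixed, 100000));
    }
}
```

The results are correct in both phases regardless of what the JIT decides, which is the point of deoptimization. On HotSpot, running with -XX:+PrintCompilation is one way to look for compiled methods being reported as "made not entrant" when an earlier assumption is dropped, although the exact output varies between releases.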

It has always been theoretically possible that a just-in-time compiled language like Java would ultimately surpass the performance of a statically compiled language, since it is able to make optimizations at run time based on the available hardware, and the increasing use of processor-specific optimization in Java is a particularly exciting development in this regard. For the developer targeting the Java platform, a significant plus point is that with each new release of the JVM, code performance improves without any changes needing to be made to the application's source.

  1. corrected listing...

    by Mirko Jahn

    As you can see in the mentioned post, the correct syntax for the method should look like the following:


    private static byte[] foo() {
        byte[] buf = new byte[256];
        for (int i = 0; i < buf.length; i++)
            buf[i] = 0;
        return buf;
    }

    (just a problem with the interpretation of the < sign in the post.)

  2. Re: corrected listing...

    by Charles Humble

    Many thanks for pointing this out. I have corrected the post.

    Charles
