Collaboration: At the Extremities of Extreme
Jason Ayers shares the observations he made watching a team of developers collaborating in real time on the same code base, pushing XP, pair programming and continuous integration to their extremes.
Posted by Charles Humble on May 07, 2008
Kawaguchi focuses on two main areas. The first is loop unrolling, a technique in which the instructions executed in each iteration of a loop are replicated to form a single sequence. This saves time by reducing the number of overhead instructions, such as the counter increment and the loop-condition branch, that the processor has to execute. The JIT combines the unrolled loop with a warm-up and a cool-down section, and Kawaguchi's commentary highlights the fact that the compiler has removed a redundant array index check from the fast part of the loop. Moreover, the resulting assembly code demonstrates how much processor-specific optimization is going on. Kawaguchi notes, for instance, that the following code:
private static byte[] foo() {
    byte[] buf = new byte[256];
    for( int i=0; i<buf.length; i++ )
        buf[i] = 0;
    return buf;
}
results in assembly using the R8-R15 general-purpose registers specific to the AMD64 architecture.
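To make the idea of unrolling concrete, here is a minimal source-level sketch with an unroll factor of four. It is only an illustration: HotSpot performs the transformation on machine code, not on Java source, and the unroll factor, the `UnrollSketch` class, its method names and the `value` parameter are all assumptions rather than anything taken from Kawaguchi's listing. The remainder loop plays roughly the role of the cool-down; the warm-up is omitted.

```java
import java.util.Arrays;

public class UnrollSketch {
    // Straightforward loop, shaped like foo() above but with a fill value
    // so that the result can be checked.
    static byte[] fillSimple(int n, byte value) {
        byte[] buf = new byte[n];
        for (int i = 0; i < buf.length; i++) {
            buf[i] = value;
        }
        return buf;
    }

    // Hand-unrolled equivalent (hypothetical; a JIT does this on machine code).
    static byte[] fillUnrolled(int n, byte value) {
        byte[] buf = new byte[n];
        int i = 0;
        // Largest multiple of 4 that fits in the array.
        int limit = buf.length - (buf.length % 4);
        // Main loop: four stores per iteration, so the increment and the
        // loop test run a quarter as often.
        for (; i < limit; i += 4) {
            buf[i] = value;
            buf[i + 1] = value;
            buf[i + 2] = value;
            buf[i + 3] = value;
        }
        // Remainder ("cool-down") loop: handles the last 0 to 3 elements.
        for (; i < buf.length; i++) {
            buf[i] = value;
        }
        return buf;
    }

    public static void main(String[] args) {
        byte[] a = fillSimple(258, (byte) 1);
        byte[] b = fillUnrolled(258, (byte) 1);
        System.out.println(Arrays.equals(a, b));
    }
}
```

Both methods produce the same array for any length; the unrolled version simply spends fewer instructions on loop bookkeeping per element written.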
The second area Kawaguchi's blog examines is the set of optimizations made around locks. While uncontended lock acquisition has been steadily improving in Java for some time, contended acquisition has remained problematic. Work in this area is still ongoing, but Kawaguchi's post highlights several areas that have improved.
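The difference between the two cases can be sketched as follows. This is an assumed illustration, not code from Kawaguchi's post: the `LockSketch` class and its methods are hypothetical. With one thread the monitor is never contended; with several threads they compete for the same monitor. The sketch only shows that both cases produce correct counts; how much each costs depends on the JVM and the hardware.

```java
public class LockSketch {
    private long count = 0;

    // Every call acquires this object's monitor.
    synchronized void increment() {
        count++;
    }

    synchronized long get() {
        return count;
    }

    // Runs `threads` workers, each incrementing a shared counter
    // `perThread` times, and returns the final count.
    static long runThreads(int threads, int perThread) {
        LockSketch counter = new LockSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // One thread: the lock is never contended.
        System.out.println(runThreads(1, 400_000));
        // Four threads: the same monitor is contended.
        System.out.println(runThreads(4, 100_000));
    }
}
```

Timing the two calls on a given machine gives a rough feel for the gap the article describes, although a serious measurement would need a proper benchmarking harness.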
The article shows a number of other features of the HotSpot compiler, including how aggressive the inlining is: James Gosling notes in a corresponding blog post that "even storage allocation & initialization gets inlined". Part of the reason this level of aggression is possible is that the JVM can make potentially unsafe optimizations and back them out later if necessary. Charles Nutter provides a good explanation of this in a post on attending the Lang.NET conference earlier in the year. His post also highlights the relevance of this work to JRuby and, by implication, to any language that targets the JVM.
"The JVM has over time had varying capabilities to dynamically optimize and reoptimize code...and perhaps most importantly, to dynamically *deoptimize* code when necessary. Deoptimization is very exciting when dealing with performance concerns, since it means you can make much more aggressive optimizations--potentially unsafe optimizations in the uncertain future of an entire application--knowing you'll be able to fall back on a tried and true safe path later on. So you can do things like inlining entire call paths once you've seen the same path a few times. You can do things like omitting synchronization guards until it becomes obvious they're needed. And you can change the set of optimizations applied after the fact...in essence, you can safely be "wrong" and learn from your mistakes at runtime. This is the key reason why Java now surpasses C and C++ in specific handcrafted benchmarks and why it should eventually be able to exceed C and C++ in almost all benchmarks. And it's a key reason why we're able to get acceptable performance out of JRuby having done far less work than Microsoft has done on IronPython and the DLR."
It has always been theoretically possible that a language like Java, which is compiled at run time, would ultimately surpass the performance of an ahead-of-time compiled language, since it can make optimizations at run time based on the available hardware; the increasing use of processor-specific optimization in Java is a particularly exciting development in this regard. For the developer targeting the Java platform, a significant plus is that with each new release of the Java runtime and its compiler, code performance improves without any changes to the application's source.
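The kind of speculative optimization Nutter describes can be sketched with a call site that sees only one receiver type while it is hot and a second type later. This is an assumed illustration: the `DeoptSketch` class and its nested types are hypothetical, and whether a particular JVM actually inlines `area()` and later deoptimizes depends on its version and settings. On HotSpot, running with `-XX:+PrintCompilation` may show the affected method being marked "made not entrant".

```java
public class DeoptSketch {
    interface Shape {
        int area();
    }

    static final class Square implements Shape {
        private final int side;
        Square(int side) { this.side = side; }
        public int area() { return side * side; }
    }

    static final class Rect implements Shape {
        private final int w;
        private final int h;
        Rect(int w, int h) { this.w = w; this.h = h; }
        public int area() { return w * h; }
    }

    // Holds the warm-up result so the warm-up work is not trivially dead code.
    static long sink;

    // The call to s.area() is the call site a JIT may speculate on.
    static long totalArea(Shape[] shapes) {
        long sum = 0;
        for (Shape s : shapes) {
            sum += s.area();
        }
        return sum;
    }

    static long warmThenMix() {
        // Phase 1: only Square is ever seen, so a JIT may inline
        // Square.area() on the assumption that nothing else shows up.
        Shape[] squares = new Shape[1000];
        for (int i = 0; i < squares.length; i++) {
            squares[i] = new Square(2);
        }
        long warm = 0;
        for (int r = 0; r < 20_000; r++) {
            warm += totalArea(squares);
        }
        sink = warm;

        // Phase 2: a Rect arrives. If the JIT speculated, it has to fall
        // back to a safe path; the result is correct either way.
        Shape[] mixed = { new Square(2), new Rect(2, 3) };
        return totalArea(mixed);
    }

    public static void main(String[] args) {
        System.out.println(warmThenMix());
    }
}
```

The point of the sketch is that the program's answer never depends on the speculation: the optimization can be "wrong" about future types and the runtime can still recover.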
As you can see in the mentioned post, the correct syntax for the method should look like the following:
private static byte[] foo() {
    byte[] buf = new byte[256];
    for( int i=0; i<buf.length; i++ )
        buf[i] = 0;
    return buf;
}
(just a problem with the interpretation of the < sign in the post.)
Many thanks for pointing this out - have corrected the post.
Charles