InfoQ

InfoQ

News

My Bookmarks

Login or Register to enable bookmarks for unlimited time.

The content has been bookmarked!

There was an error bookmarking this content! Please retry.

Sun's Garbage First Collector Largely Eliminates Low Latency/High Throughput Tradeoff

Posted by Charles Humble on Apr 15, 2009

Sections
Architecture & Design,
Development,
Operations & Infrastructure
Topics
Performance & Scalability ,
Java ,
Deployment / Datacenter
Tags
Concurrency ,
JVM ,
GarbageCollection

Sun's Garbage First garbage collector (hereafter referred to by its nickname G1) is the new low latency garbage collector planned to replace CMS in the Hotspot JVM. It is a server-style collector, targeted at multi-processor machines with large amounts of memory. There are two major differences between CMS and G1. The first is that G1 is a compacting collector. Compacting, a process by which live objects are moved over free memory space towards one end of the heap so that the other becomes one contiguous free area of memory, is important in long running applications because it is inevitable that the heap will fragment over time. G1 compacts sufficiently to completely avoid the use of fine-grain free lists for allocation, which considerably simplifies parts of the collector and mostly eliminates potential fragmentation issues. As well as compacting, G1 offers more predictable garbage collection pauses than can be obtained with the CMS collector and allows users to set their desired pause targets. This strong determinism gives G1 some of the characteristics of a true real-time collector but it isn't genuinely hard real-time since factors like OS scheduling still mean that the pauses cannot be guaranteed. It is however considerably easier for developers to use than the Java real-time products since existing code is able to take advantage of the improved performance that it offers without needing to make any code-level changes. G1 uses a number of interesting techniques, based on global marking information and other metrics, to prioritise regions for collection according to their GC efficiency. A previous InfoQ article provides more of the technical details.

In a recent podcast, James Gosling highlights the importance of G1 for certain kinds of large-scale Java applications, such as financial exchanges, which are characterised by large amounts of live heap data and considerable thread-level parallelism, and are often run on high-end multi-core processors:

"...the deep hidden secret about many of these Java apps is that they don't really use databases. Instead of databases they use huge amounts of RAM and push the garbage collector like mad because they cannot afford to touch the disk ever. When you are doing many, many thousands of transactions per second it's all about keeping everything in RAM, using hash tables, getting as many cores focused on the transactions as possible, and they usually have big issues about transaction latency."

Gosling goes on to talk about the trade-offs between throughput and determinism. Typically a garbage collector is optimised for one or the other. A garbage collector optimised for throughput is ideally suited to tasks like long running batch jobs, where pauses for the garbage collector are less of an issue than getting the entire batch run to complete as rapidly as possible. Conversely, if you are working on an interactive system such as a web application, then a low latency garbage collector is generally the best choice. Gosling makes the point that these trade-offs also exist elsewhere in the JVM and that generally the JVM is optimised for throughput. Indeed:

"It happens everywhere where there are algorithms that reorganise anything. So take a hash table - everyone thinks that a hash table is constant time insertion and constant time removal, which is false. It's constant time insertion until it has to re-hash and then that one insertion will take a long time."

By allowing users to specify an explicit target that garbage collection consume no more than x ms of any y ms time slice, G1 can try to keep collection pauses as small and infrequent as necessary for the application, but not so low as to decrease throughput or increase footprint unnecessarily. Given that garbage collectors are one of the areas where the throughput/low latency trade-off is most visible, G1 should offer significant benefits to Java Enterprise developers. It is available in the early access release of Java 6 update 14 and Sun's Hotspot team are extremely keen to get feedback and bug reports from early adopters.

No comments

Watch Thread Reply

Educational Content

10 tips on how to prevent business value risk

One category of risk that project teams need to ensure they address is business value failure – delivering a product that fails to provide value for the business investor.

Interview: Software Systems Architecture: Working With Stakeholders Using Viewpoints and Perspectives

InfoQ spoke to the authors of Software Systems Architecture on a couple of new topics, the System Context viewpoint and Agile, which have been added to the second edition.

Beauty Is in the Eye of the Beholder

Alex Papadimoulis discusses ugly code, where it comes from, how to avoid it, and how to get rid of it.

Architecting Visa for Massive Scale and Continuous Innovation

John Davies examines Visa’s architecture and shows how enterprises have architected complex integrations incorporating Hadoop, memcached, Ruby on Rails, and others to deliver innovative solutions.

Max Protect: Scalability and Caching at ESPN.com

Sean Comerford unveils ESPN.com’s architecture, what components are used and why, and the current changes the website goes through.

The Seven Deadly Sins of Enterprise Agile Adoption

Are there repeated patterns of failure on Enterprise Agile Enablement efforts? Sanjiv and Arlen discuss Seven Deadly Sins to avoid when adopting Agile in an enterprise.

Questions for an Enterprise Architect

Erik Dörnenburg answers: What is Enterprise and Evolutionary Architecture?, discussing 4 issues: Turning strategy into execution, Ensuring conformance, Where do the architects sit? Buying or building?

Wrap Your SQL Head Around Riak MapReduce

Sean Cribbs explains what Map-Reduce and Riak are, why and how to use Map-Reduce with Riak, and how to convert SQL queries into their Map-Reduce equivalents.