Sun's Garbage First Collector Largely Eliminates Low Latency/High Throughput Tradeoff
Sun's Garbage First garbage collector (hereafter referred to by its nickname, G1) is the new low-latency garbage collector planned to replace CMS in the HotSpot JVM. It is a server-style collector, targeted at multi-processor machines with large amounts of memory. There are two major differences between CMS and G1.
The first is that G1 is a compacting collector. Compaction, a process by which live objects are moved over free memory space towards one end of the heap so that the other end becomes one contiguous free area of memory, is important in long-running applications because the heap inevitably fragments over time. G1 compacts sufficiently to completely avoid the use of fine-grain free lists for allocation, which considerably simplifies parts of the collector and mostly eliminates potential fragmentation issues.
The second is that G1 offers more predictable garbage collection pauses than can be obtained with the CMS collector, and allows users to set their desired pause targets. This predictability gives G1 some of the characteristics of a true real-time collector, but it isn't genuinely hard real-time, since factors like OS scheduling still mean that the pauses cannot be guaranteed. It is, however, considerably easier for developers to use than the Java real-time products, since existing code can take advantage of the improved performance without any code-level changes. G1 uses a number of interesting techniques, based on global marking information and other metrics, to prioritise regions for collection according to their GC efficiency. A previous InfoQ article provides more of the technical details.
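As a rough illustration of the compaction idea described above, the sketch below models the heap as an array of slots in which `null` means free space. It is a toy, not how HotSpot or G1 is implemented: the class name `CompactionSketch` and the method `compact` are invented for this example, and a real collector must also update every reference to each object it moves.

```java
import java.util.Arrays;

/** Toy model of sliding compaction; NOT the HotSpot or G1 implementation. */
final class CompactionSketch {

    /**
     * Slides live slots (non-null entries) towards index 0, preserving their
     * order, and returns the index of the first free slot after compaction.
     */
    static int compact(Object[] heap) {
        int free = 0; // next slot to fill at the "live" end of the heap
        for (int scan = 0; scan < heap.length; scan++) {
            if (heap[scan] != null) {
                if (scan != free) {
                    heap[free] = heap[scan]; // move the live object down
                    heap[scan] = null;       // its old slot becomes free
                }
                free++;
            }
        }
        return free;
    }

    public static void main(String[] args) {
        // A fragmented heap: live objects interleaved with free slots.
        Object[] heap = {"a", null, "b", null, null, "c", null, "d"};
        int firstFree = compact(heap);
        System.out.println(Arrays.toString(heap));
        System.out.println("contiguous free slots: " + (heap.length - firstFree));
    }
}
```

After compaction all free slots sit together at one end, so in this model a new allocation only needs to advance the returned index rather than search a free list.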
In a recent podcast, James Gosling highlights the importance of G1 for certain kinds of large-scale Java applications, such as financial exchanges, which are characterised by large amounts of live heap data and considerable thread-level parallelism, and are often run on high-end multi-core processors:
"...the deep hidden secret about many of these Java apps is that they don't really use databases. Instead of databases they use huge amounts of RAM and push the garbage collector like mad because they cannot afford to touch the disk ever. When you are doing many, many thousands of transactions per second it's all about keeping everything in RAM, using hash tables, getting as many cores focused on the transactions as possible, and they usually have big issues about transaction latency."
Gosling goes on to talk about the trade-offs between throughput and determinism. Typically a garbage collector is optimised for one or the other. A garbage collector optimised for throughput is ideally suited to tasks like long-running batch jobs, where pauses for the garbage collector are less of an issue than getting the entire batch run to complete as rapidly as possible. Conversely, if you are working on an interactive system such as a web application, then a low-latency garbage collector is generally the best choice. Gosling makes the point that the same trade-off exists elsewhere in the JVM, which is generally optimised for throughput:
"It happens everywhere where there are algorithms that reorganise anything. So take a hash table - everyone thinks that a hash table is constant time insertion and constant time removal, which is false. It's constant time insertion until it has to re-hash and then that one insertion will take a long time."
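Gosling's hash table example can be made concrete with a toy table that counts the work done by each insertion. The sketch below is an assumption-laden illustration, not `java.util.HashMap`: the class `RehashCostSketch`, its initial capacity of 8, its 0.75 load factor and its linear probing are all choices made for this example.

```java
/** Toy open-addressing hash set of ints that reports the work done per insertion. */
final class RehashCostSketch {
    private int[] keys = new int[8];
    private boolean[] used = new boolean[8];
    private int size = 0;

    /**
     * Inserts a key and returns the number of slot writes performed:
     * 1 for an ordinary insertion, more when the table had to be rehashed,
     * 0 if the key was already present.
     */
    int insert(int key) {
        int writes = 0;
        // Grow before the load factor would exceed 0.75.
        if ((size + 1) * 4 > keys.length * 3) {
            writes += rehash();
        }
        if (place(keys, used, key)) {
            size++;
            writes++;
        }
        return writes;
    }

    /** Linear probing; returns false if the key is already in the table. */
    private static boolean place(int[] ks, boolean[] us, int key) {
        int i = Math.floorMod(key, ks.length);
        while (us[i]) {
            if (ks[i] == key) {
                return false;
            }
            i = (i + 1) % ks.length;
        }
        ks[i] = key;
        us[i] = true;
        return true;
    }

    /** Doubles the capacity and re-inserts every key; returns how many keys moved. */
    private int rehash() {
        int[] newKeys = new int[keys.length * 2];
        boolean[] newUsed = new boolean[newKeys.length];
        int moved = 0;
        for (int i = 0; i < keys.length; i++) {
            if (used[i]) {
                place(newKeys, newUsed, keys[i]);
                moved++;
            }
        }
        keys = newKeys;
        used = newUsed;
        return moved;
    }

    public static void main(String[] args) {
        RehashCostSketch table = new RehashCostSketch();
        for (int n = 1; n <= 13; n++) {
            int writes = table.insert(n * 31);
            System.out.println("insert #" + n + " cost " + writes + " slot write(s)");
        }
    }
}
```

With these parameters most insertions cost a single write, but the 7th and 13th trigger a resize and pay for moving every existing key as well. Average cost stays low, which is good for throughput, while the occasional expensive insertion is exactly the kind of latency spike Gosling describes.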
By allowing users to specify an explicit target that garbage collection consume no more than x ms of any y ms time slice, G1 can try to keep collection pauses as short and infrequent as the application requires, but no shorter or rarer than that, since over-achieving the target would decrease throughput or increase footprint unnecessarily. Given that garbage collectors are one of the areas where the throughput/low-latency trade-off is most visible, G1 should offer significant benefits to Java Enterprise developers. It is available in the early access release of Java 6 update 14, and Sun's HotSpot team are extremely keen to get feedback and bug reports from early adopters.
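To show what a target of "no more than x ms of any y ms time slice" means in practice, the sketch below checks a recorded list of pauses against such a target. It is only an illustration of the constraint, not of how G1 schedules its work: the class `PauseTargetSketch`, the `Pause` type and the sample numbers are invented for this example, and the pauses are assumed not to overlap.

```java
import java.util.Arrays;
import java.util.List;

/** Checks recorded GC pauses against an "x ms in any y ms slice" target. */
final class PauseTargetSketch {

    /** One pause, in milliseconds measured from an arbitrary origin. */
    static final class Pause {
        final long startMs;
        final long durationMs;

        Pause(long startMs, long durationMs) {
            this.startMs = startMs;
            this.durationMs = durationMs;
        }
    }

    /** Total pause time that falls inside the window [windowStart, windowEnd]. */
    static long pausedWithin(List<Pause> pauses, long windowStart, long windowEnd) {
        long total = 0;
        for (Pause p : pauses) {
            long from = Math.max(p.startMs, windowStart);
            long to = Math.min(p.startMs + p.durationMs, windowEnd);
            if (to > from) {
                total += to - from;
            }
        }
        return total;
    }

    /**
     * Returns true if no slice of length sliceMs contains more than maxPauseMs
     * of pause time. The worst slice always either starts where a pause starts
     * or ends where a pause ends, so only those slices are examined.
     */
    static boolean meetsTarget(List<Pause> pauses, long maxPauseMs, long sliceMs) {
        for (Pause p : pauses) {
            long end = p.startMs + p.durationMs;
            if (pausedWithin(pauses, p.startMs, p.startMs + sliceMs) > maxPauseMs) {
                return false;
            }
            if (pausedWithin(pauses, end - sliceMs, end) > maxPauseMs) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical pause log: three 10 ms pauses.
        List<Pause> pauses = Arrays.asList(
                new Pause(0, 10), new Pause(30, 10), new Pause(100, 10));
        System.out.println(meetsTarget(pauses, 15, 50)); // false: 20 ms of pause falls in [0, 50]
        System.out.println(meetsTarget(pauses, 20, 50)); // true
    }
}
```

The first two pauses are each short, yet together they break a 15 ms in 50 ms target. This is why the target is expressed over a time slice rather than as a limit on individual pauses.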