Go 1.6 Will Make Its Garbage Collector Faster
While Go 1.5 is still relatively new on the block, the Go team is already working on improving its new low-pause, concurrent garbage collector, which aims to make Go better suited to new application fields, Google engineers Austin Clements and Rick Hudson say.
Go 1.5 was the first release to sport the new garbage collector (GC), which replaced the old “stop-the-world” GC and addressed its latency problem. The new GC limits its activity to under 10ms in each 50ms time slot under heavy load, thus allowing Go programs to run a few percent faster in the general case. In more extreme cases, the pause could go down from 300ms to 4ms.
With Go 1.6, the objective is to stabilize the GC and improve it in a number of areas:
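Pause figures like these can be checked on a given program with the standard library. The following is a minimal sketch, not a benchmark: the helper name `maxPause` and the workload in `main` are made up for illustration, while `runtime.GC` and `debug.ReadGCStats` are the standard calls.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

// maxPause forces n collections and returns the longest pause the
// runtime recorded for them (hypothetical helper for this sketch).
func maxPause(n int) time.Duration {
	for i := 0; i < n; i++ {
		runtime.GC() // blocks until the collection has finished
	}
	var stats debug.GCStats
	debug.ReadGCStats(&stats)

	// stats.Pause holds the pause history, most recent first,
	// so the first n entries belong to the collections forced above.
	var longest time.Duration
	for i, p := range stats.Pause {
		if i >= n {
			break
		}
		if p > longest {
			longest = p
		}
	}
	return longest
}

func main() {
	// Assumed workload: roughly 64 MB of live data for the GC to mark.
	live := make([][]byte, 0, 1024)
	for i := 0; i < 1024; i++ {
		live = append(live, make([]byte, 64<<10))
	}
	fmt.Println("longest pause:", maxPause(5))
	runtime.KeepAlive(live)
}
```

Running a program with the `GODEBUG=gctrace=1` environment variable set gives a per-cycle view of the same information without changing the code.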
State coordination: a major bottleneck that the Go 1.5 GC has inherited from Go 1.4 is its centralized GC coordinator, a single goroutine that dispatches the work to be done to worker goroutines. One way to fix this is to replace the centralized coordinator with a decentralized state machine. This change would additionally make it possible to redesign the mark completion barrier, which has grown messy and performs poorly, say Clements and Hudson.
Credit system: Go 1.5 uses a credit system in two different areas: to ensure that sweeping is completed between the end of a GC cycle and the next heap trigger, and to ensure that scanning is completed between a heap trigger and the point where the subsequent heap goal is reached. One suggested way to improve the credit system is to make it always operate “in the black”, so that allocation debt is not carried over into the next GC cycle.
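One way to picture the scanning half of this scheme is a pacer that ties scan progress to heap growth between the trigger and the goal. The sketch below is a toy model under stated assumptions: the `pacer` type, its fields, and the simple proportional rule are invented for illustration and are one reading of what operating “in the black” could mean, not the runtime's actual algorithm. Every allocation pays for its share of scan work before the heap grows, so the model never runs a debt.

```go
package main

import "fmt"

// pacer is a toy model of scan credit between heap trigger and heap goal.
type pacer struct {
	trigger  int64 // heap size at which the GC cycle starts, in bytes
	goal     int64 // heap size by which scanning must be done, in bytes
	scanWork int64 // total scan work expected this cycle, in bytes
	heap     int64 // current heap size, in bytes
	scanned  int64 // scan work completed so far, in bytes
}

// target returns the scan work that must be complete once the heap
// has grown to the given size (assumed proportional rule).
func (p *pacer) target(heap int64) int64 {
	runway := p.goal - p.trigger
	grown := heap - p.trigger
	if grown >= runway {
		return p.scanWork
	}
	// Round up so the pacer is never a fraction of a unit behind.
	return (p.scanWork*grown + runway - 1) / runway
}

// alloc grows the heap by n bytes, paying any scan work up front.
func (p *pacer) alloc(n int64) {
	if t := p.target(p.heap + n); t > p.scanned {
		p.scanned = t // stand-in for the allocating goroutine doing the scan
	}
	p.heap += n
}

// inTheBlack reports whether completed work covers the heap growth so far.
func (p *pacer) inTheBlack() bool {
	return p.scanned >= p.target(p.heap)
}

// simulate allocates chunk-sized objects (chunk must be positive) from
// trigger to goal and reports the scan work done and whether the pacer
// stayed in the black after every allocation.
func simulate(trigger, goal, scanWork, chunk int64) (int64, bool) {
	p := &pacer{trigger: trigger, goal: goal, scanWork: scanWork, heap: trigger}
	ok := true
	for p.heap < goal {
		p.alloc(chunk)
		ok = ok && p.inTheBlack()
	}
	return p.scanned, ok
}

func main() {
	scanned, ok := simulate(4<<20, 8<<20, 3<<20, 4<<10)
	fmt.Println(scanned == 3<<20, ok) // prints "true true"
}
```

Because work is paid before the allocation rather than after, all scan work is finished by the time the heap reaches its goal and nothing is left over for the following cycle.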
Mark termination: the mark termination phase is, according to Clements and Hudson, the main contributor to pause time in Go 1.5. The goal here is to try to ensure that most applications can run under the 10ms threshold that Go 1.5 already attains in many cases. The changes required to this end fall into either the easy or the medium complexity category. They include moving finalizer scanning from the mark termination phase to the concurrent scan, which should save ~1ms per GB of heap, and removing an expensive count loop, which accounts for the other half of the mark termination phase for large heaps.
Sweep and scavenge: the sweeper is an area where some programs spend significant time, according to the Google engineers, and giving it some attention could bring several performance improvements. A fairly radical approach would be to eliminate the sweeper altogether. A less radical approach would aim to free large objects eagerly, at the very end of the GC phase, and to enable the scavenger on all systems, regardless of their physical page size.
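The effect of returning memory to the operating system can be observed from inside a program. The following is a rough sketch that uses only standard calls (`runtime.ReadMemStats` and `debug.FreeOSMemory`); the helper names and the 64 MB object size are arbitrary choices for illustration, and the numbers printed will vary by platform and Go version.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// heapUsage returns how many bytes of heap are idle and how many
// bytes have been returned to the OS (hypothetical helper).
func heapUsage() (idle, released uint64) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapIdle, m.HeapReleased
}

// allocateAndDrop allocates one large object of n bytes, touches it so
// that it is backed by physical memory, and lets it become garbage.
func allocateAndDrop(n int) {
	big := make([]byte, n)
	for i := 0; i < len(big); i += 4096 {
		big[i] = 1
	}
}

func main() {
	allocateAndDrop(64 << 20)
	idle, released := heapUsage()
	fmt.Printf("before: idle=%d released=%d\n", idle, released)

	// Force a collection and ask the runtime to return memory right
	// away instead of waiting for the background scavenger.
	debug.FreeOSMemory()
	idle, released = heapUsage()
	fmt.Printf("after:  idle=%d released=%d\n", idle, released)
}
```

Comparing `HeapIdle` with `HeapReleased` shows how much freed memory the runtime is still holding on to, which is the gap the scavenger is meant to close.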
The above provides just an overview of the changes that are being planned. To get full details, you can read the original document, which furthermore provides links to the GitHub issues documenting the rationale for each change and the suggested approach.