Challenges Performing Background Compilation in V8
By doing the optimizing compilation in a separate thread, the application is not only more responsive but also 27% faster on a Nexus 5 in the Mandreel test from the Octane 2.0 benchmark suite, according to Guo.
InfoQ ran some tests on Chrome 33 with concurrent recompilation (--js-flags="--concurrent-recompilation") and without it (--js-flags="--no-concurrent-recompilation"), and noticed the following performance improvements in the Octane 2.0 benchmarks. The figures are averages over 5 consecutive runs, with the browser restarted between runs:
|Octane 2.0 (all 17 tests)||
Higher improvements were noticed for 2D and 3D physics engines, while for the entire Octane suite of benchmarks we got a 7% improvement.
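For readers who want to repeat this kind of comparison, the arithmetic can be sketched as below. This is a minimal illustration, not the script InfoQ used: the function name `relative_improvement` is hypothetical, and the scores in the usage lines are placeholders, not measured results. It assumes higher scores are better, as is the case for Octane.

```python
from statistics import mean


def relative_improvement(baseline_scores, candidate_scores):
    """Return the percent change of the mean candidate score over the
    mean baseline score. Assumes higher scores are better."""
    baseline = mean(baseline_scores)
    candidate = mean(candidate_scores)
    return (candidate - baseline) * 100.0 / baseline


# Placeholder scores from five hypothetical runs per configuration.
without_flag = [1000, 1000, 1000, 1000, 1000]
with_flag = [1100, 1100, 1100, 1100, 1100]
print(f"{relative_improvement(without_flag, with_flag):.1f}% improvement")
```

Averaging over several runs, with a browser restart between them, reduces the influence of a single unusually fast or slow run on the reported percentage.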
We asked Guo why optimizing compilation was not moved to a background thread when Crankshaft was released in December 2010. Stressing that he was not speaking for Google and was not with the team at that time, Guo said that improvements are made based on actual need.
Without attempting to be exhaustive, Guo also told us about some of the challenges that had to be dealt with in order to implement background compilation in V8:
- As every computer scientist can tell you, multithreading is hard to get right. Good test coverage is hard to get. Bugs may be hard or impossible to reproduce due to the inherent non-deterministic behavior. Having a good set of test cases, using invariants guarded by assertions, fuzz testing and last but not least Canary test coverage can give much confidence that it's correct. Kudos to the ThreadSanitizer team btw.
- V8 has a relocating GC, meaning whenever GC kicks in, objects may be moved, so references to them have to be updated. That could very well happen while a compile job is underway. If object references kept by the compile job are not updated, we end up with invalid memory accesses.
- Execution continues during concurrent compilation. That means that the state of the VM and object content and layout can change arbitrarily. Assumptions made upon those facts at the start of the compile job may not hold at the end any longer. The code produced at the end may not even be valid. Running it would cause bugs and crashes. This has to be dealt with correctly.
- In fact, having the background thread accessing the heap at any time will very likely lead to race conditions. We avoid that by gathering all necessary information for the compile job upfront.
- Finding the correct time to kick off a compilation job in the background thread is tricky: there is just no way to foresee for sure whether investing time in optimizing a piece of code is worthwhile, and whether it should have been done earlier to reap the benefits. Formulating a heuristic solution to take care of that is even harder. A lot of fine tuning was necessary, and it is still work in progress.
- The life cycle of a piece of source code has already been complicated, with it going through interconnected states, like being lazily parsed, compiled for the first time using the fast compiler, then optimized by the optimizing compiler, then maybe deoptimized (if assumptions made at compile time break later on), etc. With concurrent compilation, a couple of new states are added to this life cycle. Keeping track of all of them and ensuring that transitioning between them is bug-free and efficient is non-trivial. Unexpected corner cases may cause problems.
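Two of the points above, gathering the inputs for a compile job upfront and checking that its assumptions still hold once it finishes, can be illustrated with a toy model. The sketch below is not V8's implementation: the names `Function`, `CompileSnapshot`, `CompileJob` and the `assumptions_version` counter are all hypothetical, and the relocating GC is not modeled at all. It only shows the general pattern of letting the background thread work on an immutable snapshot and having the main thread discard a stale result.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Function:
    """Hypothetical stand-in for a function tracked by the VM."""
    name: str
    source: str
    assumptions_version: int = 0  # bumped when earlier assumptions break
    optimized_code: Optional[str] = None


@dataclass(frozen=True)
class CompileSnapshot:
    """Immutable copy of everything the background job needs."""
    name: str
    source: str
    assumptions_version: int


class CompileJob:
    def __init__(self, function: Function) -> None:
        # Main thread: copy the inputs upfront so the worker never
        # reads live VM state while execution continues.
        self.function = function
        self.snapshot = CompileSnapshot(
            function.name, function.source, function.assumptions_version
        )
        self.result: Optional[str] = None
        self._thread = threading.Thread(target=self._run)

    def start(self) -> None:
        self._thread.start()

    def _run(self) -> None:
        # Background thread: touches only the snapshot.
        self.result = f"optimized({self.snapshot.source})"

    def finish(self) -> bool:
        """Main thread: wait for the job, then install the result only
        if the assumptions recorded at the start are still valid."""
        self._thread.join()
        if self.function.assumptions_version != self.snapshot.assumptions_version:
            return False  # stale result, discard it
        self.function.optimized_code = self.result
        return True


fn = Function("add", "a + b")
job = CompileJob(fn)
job.start()
fn.assumptions_version += 1  # execution continued and broke an assumption
print(job.finish(), fn.optimized_code)
```

In this toy model a single version counter stands in for every assumption the compiler might make. A real engine would track far more detail, but the design choice is the same: the check and the installation happen on the main thread, so they cannot race with running code.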
According to Guo, “V8 is under active development and being steadily improved”. That can be seen in the live performance chart maintained by Dart, where V8 jumped 30% in the DeltaBlue benchmark on Feb. 11th. That improvement resulted from compiler optimizations and is not related to background compilation.