Azul ReadyNow! Seeks to Eliminate JVM Warm-up
JVM maker Azul Systems has announced the release of ReadyNow!, a package of features designed to obviate the need for "warming up" the Java Virtual Machine. Warm-up is the practice of exercising an application before it is needed, so that the JVM has had enough time to learn which code is hot and the JIT compiler has compiled that code to optimized machine code. ReadyNow! ships with the latest version of Azul's Zing runtime for Java, version 5.9.
The company says that ReadyNow! is particularly well-suited for use in the financial markets, where applications require peak system performance at critical moments like market open and during other events such as when there are unpredicted surges in volume or trading behavior.
A little background: A well-known feature of the Java platform is that as a Java application launches and executes, the JVM compiles the application's bytecode into executable machine code. As the application continues to run, the JVM evaluates the execution history of the application and recompiles important parts of the code to improve performance. Consequently, an application's performance improves over time.
A common practice among application teams that require high performance from the start is to "warm up" the application by feeding it simulated "mock" data. According to Azul, such practices are risky and may not produce the desired optimizations.
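The background above can be observed with a small experiment. The following is a minimal sketch, not taken from Azul or from any ReadyNow! API: the class `WarmupDemo` and its methods are hypothetical names chosen for illustration. It times repeated batches of the same method; on a typical JIT-compiling JVM the early batches tend to be slower than the later ones, although the exact numbers depend on the JVM, its flags, and the hardware.

```java
public class WarmupDemo {
    // Hypothetical hot method: simple arithmetic that a JIT compiler can optimize.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Runs one batch of calls and returns the elapsed time in nanoseconds.
    static long timeBatch(int calls, int n) {
        long sink = 0;
        long start = System.nanoTime();
        for (int c = 0; c < calls; c++) {
            sink += sumOfSquares(n);
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the work cannot be discarded as dead code.
        if (sink == 42) {
            System.out.println("unexpected sink value");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Each batch does identical work; only the JVM's knowledge of the code changes.
        for (int batch = 1; batch <= 10; batch++) {
            System.out.printf("batch %d: %d ns%n", batch, timeBatch(10_000, 1_000));
        }
    }
}
```

Feeding mock data to a real application is the same idea at a larger scale, which is why the quality of the warm-up depends on how closely the mock traffic resembles production traffic.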
InfoQ spoke with Azul co-founder and CTO Gil Tene about ReadyNow!
InfoQ: Optimization without warm-up seems like an oxymoron. Can you discuss how the JIT can do its job without the benefit of warm-up?
Tene: We spent the last year analyzing the causes. We've gone through this with customers and suggested possible solutions, and when we analyzed these with our customers, we learned what would work and what wouldn't. One ReadyNow! trick is taking care of speculative optimization: where a traditional JVM will optimize one path, we optimize both sides and take a 1% hit early rather than during a critical time window. We have added ‘aggressive class loading’: when we see classes in scope we load them but don’t initialize them, so if the code does go through that path it doesn't have to first load the class. And we see that frequently a lot of time is spent loading classes. Now, we could initialize classes early, but initialization order has semantics, so that needs to be controlled. We are giving people control by providing APIs so that developers can give hints to the JIT compiler about what to initialize and what to optimize. The application code can say "we want you to compile this now", or "we want you to throw everything away and recompile now", for example when initialization code was optimized for the wrong case and now we want to re-optimize.
There is an interesting scenario usually observed in algorithmic trading, where you have algorithms that watch the market at high rates but trade infrequently. So the code you care about most is rarely executed. And this is precisely the code that needs to run fast. You want to be able to tell the compiler this is the code you care about, but it doesn’t run often. So this is an example of ReadyNow!'s ‘compiler directives’.
All of this is already shipping. But in the coming months we are shipping our Holy Grail of warm-up. Let's say you had a nice hot day yesterday, and you hit a lot of things and your code was really sailing. So why not use yesterday's optimizations for today's runs? So we’re building that. This is not the end-all solution; it works nicely for certain apps as long as the code doesn’t change. For example, a matching engine would be a sensible use case for this, because it is fairly stable, whereas an algorithmic trading system where the code is changing from day to day would have the wrong optimizations, so in that case yesterday's optimizations may not work.
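The distinction between loading a class and initializing it, which the answer above relies on, exists in standard Java. The sketch below uses only the standard `Class.forName(String, boolean, ClassLoader)` call with `initialize` set to `false`; it does not use or imitate any ReadyNow! API, and the names `LazyInitDemo`, `RarePath`, and `loadOnly` are hypothetical. It illustrates why loading early is safe with respect to initialization order: the static initializer of `RarePath` does not run until the class is first used.

```java
public class LazyInitDemo {
    // Set by RarePath's static initializer so we can observe when it runs.
    static boolean initialized = false;

    // Hypothetical stand-in for a class on a rarely executed code path.
    static class RarePath {
        static {
            LazyInitDemo.initialized = true;
        }

        static String handle() {
            return "handled";
        }
    }

    // Loads the named class WITHOUT running its static initializer.
    static Class<?> loadOnly(String binaryName) {
        try {
            return Class.forName(binaryName, false, LazyInitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("class not found: " + binaryName, e);
        }
    }

    public static void main(String[] args) {
        // Build the nested class's binary name so the demo works in any package.
        String name = LazyInitDemo.class.getName() + "$RarePath";

        Class<?> loaded = loadOnly(name);
        System.out.println("loaded " + loaded.getSimpleName()
                + ", initialized so far: " + initialized);

        // First real use triggers initialization, in the order the program dictates.
        RarePath.handle();
        System.out.println("after first use, initialized: " + initialized);
    }
}
```

In this sketch the application chooses what to preload; according to the interview, the Zing runtime does the loading on its own when it sees classes in scope.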
InfoQ: But if code is changing how can you ever optimize?
Tene: That’s what ReadyNow! is; you need to add some optimization warm-up code to your system, so the deoptimization will go away, and we make sure not to deoptimize today, even for fresh code. Let's say we don’t remember anything, so the Holy Grail thing won’t work. Nonetheless our aggressive class loading, aggressive initialization, aggressive compiler API, these will still work regardless of history.
InfoQ: What do you mean by deoptimization?
Tene: You know, warming up the JVM is hard, because the JIT compiler does some very interesting optimizations. For example, it continually monitors the executing code and makes an assumption that the code that didn’t yet run will never run. Unfortunately, when people warm up trading systems, they run through some code that doesn’t exercise everything, so the code is incorrectly optimized for these mock runs. But then the JVM sees something changed in the execution pattern and so has to regress and run interpreted for a while to gather new metrics. This stage is referred to as deoptimization.
There are other examples as well of deoptimization. For example, classes that never ran now have to be loaded into the running JVM, and loading classes takes time.
The horror stories you hear are from people who figure out that anything besides authentic trades will cause this deoptimization, so they will execute a real trade and then cancel it. Imagine that risk. It's one of those things where they can route it outside so it never really gets to market. But if anything goes wrong you could have a million-dollar mistake. And you know that whatever can go wrong will go wrong, so eventually you are going to get burned.
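The mock-data problem described in the answer above can be sketched in a few lines. This is an illustrative example only, with hypothetical names (`DeoptDemo`, `Order`, `MockOrder`, `RealOrder`); it is not code from Azul or its customers. The method `total` is first exercised with a single implementation of `Order`, which invites a JIT compiler to specialize the call site for that type, and is then handed a second implementation that the warm-up never showed it.

```java
import java.util.Arrays;

public class DeoptDemo {
    interface Order {
        long cost();
    }

    // The only type seen during the mock warm-up phase.
    static final class MockOrder implements Order {
        public long cost() {
            return 1;
        }
    }

    // The type that shows up once real traffic starts.
    static final class RealOrder implements Order {
        public long cost() {
            return 2;
        }
    }

    // Hot method with a virtual call the JIT may specialize for one receiver type.
    static long total(Order[] orders) {
        long sum = 0;
        for (Order o : orders) {
            sum += o.cost();
        }
        return sum;
    }

    // Builds an array of n references to the same order.
    static Order[] fill(Order order, int n) {
        Order[] orders = new Order[n];
        Arrays.fill(orders, order);
        return orders;
    }

    public static void main(String[] args) {
        Order[] mock = fill(new MockOrder(), 1_000);
        Order[] real = fill(new RealOrder(), 1_000);
        long sink = 0;

        // Warm-up phase: the call site only ever sees MockOrder.
        for (int i = 0; i < 20_000; i++) {
            sink += total(mock);
        }
        // "Production" phase: a new receiver type can invalidate the specialized code.
        for (int i = 0; i < 20_000; i++) {
            sink += total(real);
        }
        System.out.println("checksum: " + sink);
    }
}
```

On a HotSpot-based JVM, running this with the `-XX:+PrintCompilation` flag may show `total` being compiled, invalidated, and compiled again around the switch between the two phases. Whether and when that happens depends on the JVM version and its settings, so the sketch demonstrates the mechanism rather than a guaranteed outcome.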
InfoQ: You mentioned that this was in response to customer requests?
Tene: At Azul we frequently find ourselves in low-latency Java systems, and this is the number one unsolicited feature request we receive. I have a personal counter for feature requests, and this one is up to 22, meaning 22 unrelated people have asked me to solve this trading warm-up problem.
InfoQ: Thanks for speaking to us. Would you like to make any parting comments?
Tene: Yes, there is one other interesting note: when people ask us for this Holy Grail, my first question is "why do you want that?" They usually answer that they restart the systems every night, and don’t want to warm up. So I ask them "your algorithms don’t change for two weeks, why not just keep the code running?" And they respond "because of GC delays!"
With us, our pauseless GC doesn’t have that problem. So we are retraining people not to restart their system. So now they can restart when they need to restart and not for some operational reason.
Anatole Tresch Mar 03, 2015