Starting Fast: Investigating Java's Static Compilation Landscape

Summary

Dan Heidinga discusses how to start a Java application faster, and how Graal Substrate VM, Quarkus, Project Leyden, and others can help with that.

Bio

Dan Heidinga wears two hats: first as an Eclipse OpenJ9 project lead, and the second as a Red Hat employee. Fortunately, both of those hats let him hack on the OpenJ9 JVM which he's been doing since 2007. Along the way he's been part of the Expert Groups for multiple JSRs.

About the conference

QCon Plus is a virtual conference for senior software engineers and architects that covers the trends, best practices, and solutions leveraged by the world's most innovative software organizations.

Transcript

Heidinga: We're here to talk about starting fast, investigating Java's static compilation landscape. When I think landscape, one of the things that comes to mind, at least for me, is Bon Echo Provincial Park. Bon Echo Provincial Park is a place I've been going to since long before I was born. My family's always gone there for their vacations over the summer. One of the things that strikes you when you get to this park is a beautiful lake and then this majestic cliff that rises up across the lake, staring at you. If you look into it, that cliff was formed by tremendous amounts of pressure. It's related to the glaciers that went across the area, pushing down on the land and the land pushing back up, and massive amounts of erosion as they melted, resulting in a very deep lake and a very big cliff, which is quite beautiful.

Major Pressure: Java Is Dynamic

The reason I recount the story is because we're looking at the static compilation landscape in Java. What we see there is that Java itself has been formed by some tremendous pressures, both those of design decisions from the beginning and changes in the ecosystem and the landscape. What are some of those pressures? The first major pressure is that Java is dynamic. This has been one of Java's claims to fame since the beginning: everything is dynamic. When you want a class, you load it dynamically. You ask a classloader to load something by name, or you start talking about a new name in your source file, and that name ends up being resolved into a class at runtime. When you want to call a method, that method gets resolved at runtime. Fields, the same. None of this is statically bound.

In a language like C or C++, all of these lookups happen at compile time; they don't happen at runtime. Java is able to do this all at runtime, allowing different class implementations to be swapped in, allowing changes to be made, allowing different JARs to all be connected together. It means that there's enough metadata kicking around that you can do a whole bunch of introspection. You can reflect over which methods are available, which fields are available, what the superclasses are, which interfaces are implemented. There are all kinds of questions you can ask about the hierarchy and shape of the classes that you're running. Then, of course, you can get function pointers, or rather method handles, to those fields and methods, and you can execute them that way as well. While all this is running, Java agents can come along. Java agents can be configured to modify and hook just about any event. They can change classes that have already been loaded. They can change classes as they're being loaded. They can hook on to other events and operate on a system. There's a whole lot of dynamic behavior occurring, never mind all of the dynamic compilation happening.
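As a minimal sketch of the runtime lookups described above: the example below loads a class by name, calls a method on it reflectively, and then calls a method through a method handle. The class name `DynamicLookupDemo` and its helper methods are illustrative assumptions, not something shown in the talk; only standard JDK APIs are used.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

final class DynamicLookupDemo {

    // Loads a class by name at run time, creates an instance with its
    // no-argument constructor, and calls a no-argument public method on it.
    static Object callReflectively(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className);        // class resolved by name
            Object instance = type.getDeclaredConstructor().newInstance();
            Method method = type.getMethod(methodName);      // method resolved by name
            return method.invoke(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("lookup failed for " + className, e);
        }
    }

    // Looks up String.length() as a method handle and invokes it.
    static int lengthViaMethodHandle(String text) {
        try {
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            return (int) length.invokeExact(text);
        } catch (Throwable t) {
            throw new IllegalStateException("method handle call failed", t);
        }
    }

    public static void main(String[] args) {
        // Nothing here names ArrayList in a way the compiler can bind statically.
        System.out.println(callReflectively("java.util.ArrayList", "size")); // prints 0
        System.out.println(lengthViaMethodHandle("dynamic"));                // prints 7
    }
}
```

Because the class and method names are plain strings, a tool that wants to know ahead of time what the program will load cannot tell from the code alone. That is the property the static approaches later in the talk have to work around.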

Addressing the Slow-Start Reputation

Java has historically had a reputation for being slow. Originally, that reputation covered all of Java, but it's switched, and now people complain about how slow Java is to start. Despite all of the wonder and all of the majestic things it's been given, there's a complaint about being slow to start. Java has done a lot of work to address this over the years. If we look back to Java 1 and the early days of Java, we have a system that started out interpreted. JIT compilers were introduced to get good performance, and those JIT compilers took some time at startup to compile the code so you could get good peak performance. That's how the complaint went from slow in general to slow to start. JVMs introduced faster start modes, like HotSpot's -client or OpenJ9's -Xquickstart, which spent less time collecting profile data and got into JITed code faster, hoping to get your system up and running, even if it cost some peak performance.

Then along came tiered compilation, being able to move from quicker, cheaper compiles into better compiles while still having good profiling data, just to get you out of the interpreter sooner to get you starting faster. The more time we spend interpreting, the more time we spend being slow to start. Java has done a lot of work in this area. Then in the ecosystem, as well, there's been work that goes on to recognize the idea of second runs. Java is built on the idea that everything is a first run. We start our application. We load our classes. We start to use those classes to resolve fields and things. Each time we do it, the answer may be different because somebody has changed some of the classes. Really, at deployment time, it's usually not. It's usually exactly the same. Being able to recognize that allows you to reuse the class data that you've already loaded from previous runs. Both HotSpot and OpenJ9 do this: HotSpot with CDS, and OpenJ9 with shared classes. OpenJ9 even goes further and saves away compiled JIT code to be able to reuse it in subsequent runs. Then, of course, there are also tools being developed, like the jaotc tool, to be able to statically compile your code ahead of time.

Ecosystem Shift Puts Pressure On Startup

Java has already been working to address that slow startup. Now the ecosystem's shifted, and that's put increasing pressure on startup. We've had this shift to the cloud. I'm sure we've all talked about it and thought about how people are moving to the cloud and how that actually works for them. It brings microservices deployments, so rather than one big deployment, there's a bunch of them. Each one of those has to deal with startup. They're each getting scaled, too. It's not just one of them running: as load comes in, I'm running more instances of the same application, and each one of those instances has to start up.

CI/CD has also changed the way that we deploy things. Instead of deploying once a quarter, or whatever the old deployment cadence may have been, lots of places are now deploying multiple times a day, which means that startup happens way more often than it used to. It's become a bigger and increasingly important part of the application's runtime. We take that even a step further with serverless, where we have a return to the old days of the CGI-bin model of development, where you start a process every time you want to service a request. That's conceptually how it works, even if it's not actually how it's implemented. That's the mental model. It means that startup becomes increasingly important to being able to serve a request, because it's part of that request time.

Typical Startup Time Contributors

If we look at the JDK, and we look at applications, there are three parts that contribute to startup. There's the JDK itself bringing up your Java heap, starting to do the classloading, verification, running class initializers, resolving fields and methods, and then doing all the interpretation and profiling, and finally, the JIT compilation that's happening. That's just to get the application started. There's a lot of classes to be loaded. Then the framework comes along, and it continues to load classes. In addition to that, it has to do a bunch of classpath scanning to find what annotations you've added to figure out which beans to configure, to figure out which IOC frameworks need to be configured, and how all those configuration files have to be parsed and figured out, wired together, generating code for proxies, and beans, and whatever else needs to be done. Reflectively wiring all these pieces together, before finally starting to actually load the application, which then has to find its own configuration, continuing to load classes, and finally starting to respond to requests. There's a lot of work in each of these phases, just to get an app up and running.
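One rough way to see how these phases add up is to compare the JVM's own recorded start time with the clock at a few points in the program. The sketch below uses the standard `RuntimeMXBean`; the class `StartupTimer` and the method `initializeApplication` are hypothetical stand-ins for real framework and application startup work.

```java
import java.lang.management.ManagementFactory;

final class StartupTimer {

    // Milliseconds between the JVM's recorded start time and now.
    static long millisSinceJvmStart() {
        long jvmStart = ManagementFactory.getRuntimeMXBean().getStartTime();
        return System.currentTimeMillis() - jvmStart;
    }

    // Hypothetical stand-in for framework and application initialization.
    static void initializeApplication() {
        java.util.regex.Pattern.compile("[a-z]+").matcher("startup").matches();
    }

    public static void main(String[] args) {
        long atMain = millisSinceJvmStart();   // JDK phase: JVM launch up to main
        initializeApplication();               // framework and application phases
        long atReady = millisSinceJvmStart();

        System.out.println("JVM start to main: " + atMain + " ms");
        System.out.println("main to ready:     " + (atReady - atMain) + " ms");
    }
}
```

The numbers depend entirely on the machine, the JVM, and what the initialization actually does, so this is a measuring aid rather than a benchmark.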

Classloading and Class Initialization Startup

If we just look at the classloading and the class initialization part of this, we see that to get the JVM to main, there are three classloaders and about 500 classes, and about 160 of those have a class initializer method. To get our framework initialized, for a lot of frameworks, we're talking 100-plus classloaders, we're talking thousands of classes, and thousands of class initializers. Then the application pushes things to the next level. It's still hundreds of classloaders. It's still thousands of classes, and still thousands of class initializers. It's just the next level of work done there as well. We've got a lot of classes loading and initializing, and that dominates our startup time.
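To check figures like these on your own JVM, the standard `ClassLoadingMXBean` reports how many classes are currently loaded. The following is a small sketch; the class name `LoadedClassCounter` is an assumption for illustration, and the counts it prints will differ between JDK versions and vendors.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.ConcurrentSkipListMap;

final class LoadedClassCounter {

    // Number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        int atMain = loadedClassCount();

        // Touch a class the program has not used yet, so the JVM has to
        // load and initialize it (and anything it depends on).
        new ConcurrentSkipListMap<String, String>().put("key", "value");

        int afterUse = loadedClassCount();
        System.out.println("Classes loaded on reaching main: " + atMain);
        System.out.println("Classes loaded after first use:  " + afterUse);
    }
}
```

Running the same measurement at the end of framework initialization and again once the application is ready to serve requests gives a per-phase view of class loading.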

Selected JEPs from Java 8 to Java 15

If we look at Java, though, there's already been a lot of work to address parts of this. If we take a selected set of the JEPs from Java 8 to 15, we get this list. I've picked this list because it highlights some themes that I want to talk about. I do recognize that selecting JEPs always misses out the large amount of work that's gone in that's not covered by a JEP. It's enough to prove the point that we're trying to make here.

We see this set of themes. We see the introduction of ahead of time: things being done ahead of time, in this case, compilation. We see a push around making it easier to distribute packages of Java. That's the jlink tool, statically linked JNI libraries, and a packaging tool. We see delayed evaluation. Going back to Java 8, with lambda expressions, there's the use of invokedynamic and bootstrap methods to be able to generate classes at runtime as needed. Then we start to see that used more widely, in string concatenation, and it will likely be used as part of pattern matching when that comes along. There's a shift here to be able to define classes in ways that are cheaper, and to be able to talk about them symbolically at runtime, rather than having to actually load them. Then we see some work on recognizing second runs by optimizing the CDS archives, or class data sharing archives, including the ability to store objects into those archives.

We see a big push from Java to continue on its path of delayed evaluation, of being able to do more at runtime but pushing that runtime work out, and also to be able to move some of that work before runtime. This is really what we're seeing. We're making initialization lazier. For one example of this, just look at the discussions around lazy static finals, and the draft JEP for that, and moving stuff out of clinit. We're seeing this push to make the language more dynamic, more flexible, and to move that work to happen as late as possible. We're also seeing a trend, mostly from the ecosystem and the tools, to do work before runtime.
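To make "moving stuff out of clinit" concrete, here is a sketch using the long-standing initialization-on-demand holder idiom. It is not the lazy static final feature mentioned above; it only shows the existing language rule that a class initializer runs on first use. The names `LazyInitDemo` and `ExpensiveHolder` are made up for the example.

```java
final class LazyInitDemo {

    // Records which initialization steps have actually run, in order.
    static final StringBuilder LOG = new StringBuilder();

    // The holder's static initializer runs only when VALUE is first read,
    // not when LazyInitDemo itself is initialized.
    static final class ExpensiveHolder {
        static final String VALUE = compute();

        private static String compute() {
            LOG.append("[holder initialized]");
            return "expensive";
        }
    }

    // First call triggers ExpensiveHolder's class initializer; later calls do not.
    static String value() {
        return ExpensiveHolder.VALUE;
    }

    public static void main(String[] args) {
        LOG.append("[main started]");
        System.out.println(value());   // prints "expensive"
        System.out.println(LOG);       // prints "[main started][holder initialized]"
    }
}
```

If `VALUE` were declared directly in `LazyInitDemo`, the expensive computation would run as part of that class's initializer before `main`, whether or not the value was ever needed.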

Call for Discussion: New Project Leyden

This is good. In particular, we've seen the features that are coming in that are driving the ecosystem towards delaying things. I'll leave most of that off, and we'll look at the tooling ecosystem and the tooling part of the story around moving work before runtime. One of the ways that we see this starting to take shape in the OpenJDK project is this new project coming out called Project Leyden. This was announced and approved. Its goal is to address the slow startup time, the slow time to peak performance and the large footprint of Java applications. It does this by creating a closed world application and compiling that to a static image. This is really cool. This allows you to do ahead of time compilation, ahead of time class initialization, and to give you a closed world that runs just your application. We don't have this yet. Exactly what it is hasn't been completely decided because the expert group involved in it hasn't been formed, and the group of people working on it have not decloaked themselves. We know what the end goal is. There's still a lot to be figured out.

We should note that introducing this may come at a cost. The standard goal is to keep introducing new features into Java, and those often come with more dynamic behavior, whether that's Project Valhalla coming along being very dynamic in what it's doing, or Loom coming along, or whatever those features may be. The cost is that we're now dealing with supporting both the normal JVM mode and this static mode. As the number of supported configurations goes up, the speed of delivery is likely to go down. No free lunch.

GraalVM

If we're interested in that static image for Java, we can try that out today with GraalVM. GraalVM has a subproject called SubstrateVM, which is able to create native images from Java. These are these closed world static images. The way this works is, instead of doing classloading at runtime, all of the classes that could be loaded are found at native image build time. Those classes are loaded and, where safe, their class initializers are run at build time. What's generated out of this is all of the compiled code for all of the methods that are actually used, a copy of the Java heap (an initial heap), and just enough of the metadata to support the pieces that your application needs. What you get out is this single executable, this native image that you can then use to run your application. We've moved all this work that previously happened at runtime into build time.

Subsetting Java

In order to do that, you have to accept some limitations. Really, what this ends up with is that the subset of Java that's usable in the native image is less than what you can use in the full Java spec. Java allows unlimited reflection, and in a native image that's now limited to the pieces that have been declared as reflection-capable ahead of time. There are also limitations around other features in Java, whether that's invokedynamic or method handles, serialization, the security manager, and other large portions of Java that work slightly differently in this world. That's ok. This gives you the option to say, I really want this fast startup, and to get that I'm willing to give up some level of features. It's really about a choice that you as a user make.

Checkpoint and Restore - CRIU

It's not the only game in town. It's something really cool, and it does give you a very fast startup. There's another way to achieve that without giving up the capabilities of Java, and that's called CRIU, Checkpoint/Restore In Userspace. It's a project that allows you to take a snapshot of a running Linux process and save that away. Then you can restore from that snapshot and run from that point later on. You can basically do your startup once, save it away, and then reuse that over and over again. If we look at how that works, we have our normal build time. We start our runtime. There we take a checkpoint, which we save away, and then we can restore and continue that runtime. We're able to save all the time that's enclosed in that checkpoint. There are limitations around this as well, around how and where you can deploy it, making sure that process IDs and the like all match, and that the appropriate files are all still there and haven't changed. It works reasonably well restoring inside a container. It's a little harder to do restoring outside of a container.

Another way to look at this is that we can take that snapshot at different points. We could checkpoint after running main, or after getting our framework initialized, or maybe even after getting our whole application up and ready to serve requests. Maybe we've actually served some number of requests before we're ready to take that snapshot. There's a whole realm of opportunities here still to be explored. There are two projects I'm aware of that are actively trying to explore this. One of them is Coordinated Restore at Checkpoint (CRaC), which is looking at adding APIs so that the application can cooperate with the snapshot, so that only the appropriate state is snapshotted. OpenJ9 is taking a look at this space as well. What it's doing is trying to move that snapshot and restore mechanism into the JVM, so that it becomes a JVM feature instead of a Linux feature. Both of these are looking at making it easier to do that snapshot to get fast startup. Both are likely to come at a cost in footprint. You won't get the small package, you'll get a larger package. It will start, though, and it will start very quickly.
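To illustrate what "the application cooperating with the snapshot" could look like, here is a purely hypothetical sketch. The `SnapshotParticipant` interface and the `Coordinator` class are invented for this example and are not the API of CRIU, CRaC, or OpenJ9; the idea is only that resources get a callback to release external state before a checkpoint and to reacquire it after a restore.

```java
import java.util.ArrayList;
import java.util.List;

final class CheckpointHooksDemo {

    // Hypothetical callback interface; not taken from any real project.
    interface SnapshotParticipant {
        void beforeCheckpoint();
        void afterRestore();
    }

    // A resource whose external state (say, a socket) should not be
    // captured inside a snapshot.
    static final class Connection implements SnapshotParticipant {
        private boolean open = true;

        boolean isOpen() { return open; }

        @Override public void beforeCheckpoint() { open = false; } // release external state
        @Override public void afterRestore()     { open = true;  } // reacquire it
    }

    // Hypothetical coordinator. A real mechanism would take the process
    // snapshot between prepareForCheckpoint() and completeRestore().
    static final class Coordinator {
        private final List<SnapshotParticipant> participants = new ArrayList<>();

        void register(SnapshotParticipant participant) { participants.add(participant); }

        void prepareForCheckpoint() {
            participants.forEach(SnapshotParticipant::beforeCheckpoint);
        }

        void completeRestore() {
            // Notify in reverse registration order, mirroring the teardown order.
            for (int i = participants.size() - 1; i >= 0; i--) {
                participants.get(i).afterRestore();
            }
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        Connection connection = new Connection();
        coordinator.register(connection);

        coordinator.prepareForCheckpoint();
        System.out.println("open at checkpoint: " + connection.isOpen()); // false
        coordinator.completeRestore();
        System.out.println("open after restore: " + connection.isOpen()); // true
    }
}
```

The design question such hooks address is the one raised above: files, process IDs, and connections that were valid when the snapshot was taken may not be valid where and when it is restored.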

Frameworks

That's not the only game in town, though. Those are pieces that are happening at the JVM and at the ecosystem level. There are also changes coming in the frameworks. Frameworks are looking at the techniques that they can use to work with GraalVM and saying, if we change the way our applications work so that we favor doing this work at build time rather than at runtime, if we do our classpath scanning then, if we generate the code we need then, if we avoid reflection, we could do all this work once at build time rather than doing it on every execution. Yes, it makes things slightly less dynamic, but that's actually a good thing for getting fast startup. It moves the code into places where it's easy for the JVM to optimize it, and to make use of things like CDS archives or shared classes to give you that faster startup. It also makes these frameworks cooperate well with GraalVM.
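The contrast between wiring at runtime and wiring at build time can be sketched in a few lines. In the example below, `wireReflectively` finds and instantiates a class by name on every start, while `GENERATED` stands in for code that a framework's build step might emit. All names here (`WiringDemo`, `Greeter`, `EnglishGreeter`, `GENERATED`) are hypothetical and do not come from Quarkus, Micronaut, or any other framework.

```java
import java.util.Map;
import java.util.function.Supplier;

final class WiringDemo {

    interface Greeter {
        String greet(String name);
    }

    static final class EnglishGreeter implements Greeter {
        @Override public String greet(String name) { return "Hello, " + name; }
    }

    // Run-time wiring: the implementation is located by name and created
    // reflectively each time the application starts.
    static Greeter wireReflectively(String className) {
        try {
            return (Greeter) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot wire " + className, e);
        }
    }

    // Build-time wiring: a stand-in for generated code. The constructor
    // reference is an ordinary, statically visible call with no reflection.
    static final Map<String, Supplier<Greeter>> GENERATED =
            Map.of("english", EnglishGreeter::new);

    static Greeter wireStatically(String key) {
        return GENERATED.get(key).get();
    }

    public static void main(String[] args) {
        Greeter reflective = wireReflectively(EnglishGreeter.class.getName());
        Greeter generated = wireStatically("english");
        System.out.println(reflective.greet("runtime"));    // prints "Hello, runtime"
        System.out.println(generated.greet("build time"));  // prints "Hello, build time"
    }
}
```

Both paths produce the same object, but only the second one is visible to a JIT compiler or a closed world analysis as a plain call, which is why it suits both faster JVM startup and native image generation.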

We see two frameworks in particular that I'm aware of that are really opting into this approach. Those are the Quarkus framework and Micronaut. The changes they've made mean that they start very quickly in a JVM. They tend to use less memory when they do that, because they've already moved a bunch of things out of startup time, and they're no longer doing that classpath scanning. They've generated the code they need to be able to just operate on that data. It means that these frameworks are compatible with GraalVM's native image generation, which enables even faster startup and even smaller packages. We're seeing changes in frameworks start to take advantage of this ecosystem change as well.

Final Pressure

Where things go is really up to you. It's up to you to say which is most important to you. Do you really care about the full set of Java features? Do you really care about being able to use things immediately, when they come out? Then you're going to want to just stick to your JVM and start using CDS, and shared classes, and those kinds of approaches. Otherwise, you're going to be looking at, can you fit inside the container that SubstrateVM wants you to fit in? Are you willing to give up some of that dynamic behavior to get there? What frameworks are you going to base your application on? Are you going to opt into frameworks that make it easy for you to run on GraalVM? Are you going to opt into ones that move work into build time, or are you going to stick with ones that do it at runtime? The final pressure that determines what solution succeeds in this space is really up to you, the developer.

 

Recorded at:

Apr 09, 2021
