The OpenJDK project Coordinated Restore at Checkpoint (CRaC) drastically reduces the startup time of a Java application and its time to peak performance. It does so by taking a memory snapshot at runtime and restoring it in later executions. Azul, the creator of CRaC, has shipped an OpenJDK 17 distribution with built-in support for CRaC. Micronaut and Quarkus already support CRaC, and Spring Framework will do so in November 2023.
Azul provided benchmark numbers from 2022 for the time to first operation of Java applications. A Micronaut application dropped from 1s to 46ms with CRaC, a Quarkus application from 1s to 53ms, and a Spring Boot application from 3.9s to 38ms.
CRaC exists because of the "long-term pain points of Java's slow startup time, slow time to peak performance, and large footprint", as Java Language Architect Mark Reinhold stated in April 2020. CRaC addresses two of them: the slow startup time and the slow time to peak performance. The GraalVM Native Image Ahead-of-Time (AOT) compiler addresses all three pain points, but at the price of more constraints and a potentially more expensive troubleshooting process.
It's rather apparent that loading a snapshot of a Java application will cut down its startup time: everything is already initialized, and all necessary objects are already created. What is less obvious is that this can also solve the slow "time to peak" challenge: the JIT compiler also stores profiling data and machine code in Java objects. So, taking a snapshot after the JIT compiler has created the optimal machine code (e.g., after a couple of hours or even days) means immediately starting with optimized performance when restoring that snapshot.
CRaC only works on the Linux operating system at this time because it relies on the Linux feature Checkpoint/Restore In Userspace (CRIU). On other operating systems, CRaC has a no-op implementation for creating and loading snapshots.
CRaC requires all files and network connections to be closed before taking a snapshot and then re-opened after restoring it. That's why CRaC requires support in the Java runtime and the framework. The CRaC API also allows Java applications to act before a snapshot is taken and after it's restored. Applications can take a snapshot through a CRaC API call or with the Java utility jcmd. The Azul OpenJDK loads a snapshot with the -XX:CRaCRestoreFrom command-line option.
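The close-before-snapshot and re-open-after-restore pattern described above can be sketched in plain Java. This is a minimal simulation under stated assumptions, not the CRaC API itself: CheckpointAware, ConnectionHolder, and SimulatedContext are hypothetical stand-ins so that the example runs on any JDK, and no snapshot is actually written. In the real org.crac library, the corresponding callback interface is Resource, whose beforeCheckpoint and afterRestore methods additionally take a Context argument and may throw exceptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the CRaC callback interface (org.crac.Resource).
// Simplification: the real callbacks take a Context argument and declare "throws Exception".
interface CheckpointAware {
    void beforeCheckpoint();
    void afterRestore();
}

// Hypothetical resource that owns a connection-like handle.
class ConnectionHolder implements CheckpointAware {
    private final List<String> events;
    private boolean open;

    ConnectionHolder(List<String> events) { this.events = events; }

    void open()  { open = true;  events.add("open"); }
    void close() { open = false; events.add("close"); }
    boolean isOpen() { return open; }

    // Files and network connections must be closed before the snapshot is taken.
    @Override public void beforeCheckpoint() { close(); }

    // After the snapshot is restored, the handle is opened again.
    @Override public void afterRestore() { open(); }
}

// Hypothetical stand-in for the runtime side: it only calls the hooks in sequence.
// Notification ordering rules of the real runtime are not modeled here.
class SimulatedContext {
    private final List<CheckpointAware> resources = new ArrayList<>();

    void register(CheckpointAware resource) { resources.add(resource); }

    void checkpointRestore(List<String> events) {
        for (CheckpointAware r : resources) r.beforeCheckpoint();
        events.add("snapshot"); // placeholder for the point where a snapshot would be taken
        for (CheckpointAware r : resources) r.afterRestore();
    }
}

class CracLifecycleSketch {
    static List<String> run() {
        List<String> events = new ArrayList<>();
        ConnectionHolder connection = new ConnectionHolder(events);
        connection.open();

        SimulatedContext context = new SimulatedContext();
        context.register(connection);
        context.checkpointRestore(events);
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [open, close, snapshot, open]
    }
}
```

With the real library, an application would implement org.crac.Resource and register the instance through Core.getGlobalContext().register(...). Frameworks with CRaC support typically do this on the application's behalf for the resources they manage.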
The Spring team has a sample Spring Boot project with CRaC using the first milestone release of Spring Framework 6.1. The CRaC team created a sample Quarkus project with CRaC.
Simon Ritter, deputy CTO at Azul, was kind enough to answer questions about CRaC.
InfoQ: You have worked on CRaC for several years. How do you feel now that CRaC ships in an Azul OpenJDK distribution and is also embraced by Java frameworks?
Simon Ritter: Yes, this has been a project Azul has been working on for some time. It was great when it was accepted as a project under OpenJDK. We are very happy that we have now produced a distribution that includes CRaC, which is production-ready and supported, rather than just being a proof-of-concept.
InfoQ: What are Azul's plans for supporting CRaC in Java versions other than 17, such as the upcoming Java 21 LTS release?
Ritter: Since we have done the work for JDK 17, moving to newer JDKs will be straightforward. We anticipate having this available soon after the release of JDK 21 (although we do not have a confirmed date for this).
InfoQ: To your knowledge, what other Java distributions plan to ship CRaC support?
Ritter: We are not aware of any other distributions with plans to support CRaC at this time.
InfoQ: AWS uses CRaC to speed up Java code in its Lambda serverless offering ("AWS Lambda SnapStart"). What other planned uses of CRaC in cloud platforms are you aware of?
Ritter: We are not aware of any other cloud-specific offerings at the moment.
InfoQ: How do you define success for CRaC? And how do you measure it?
Ritter: CRaC is one solution to the issue of startup times for JVM-based applications. Success for this project will certainly be where people use it in production to improve performance. We can measure this by how many people are downloading our Azul Zulu builds of OpenJDK with CRaC included. Personally, I would see the ultimate success for CRaC to be included in the mainstream OpenJDK project. We're some way off from that, though!
InfoQ: Let's say an application is run in various configurations – different parameters, heap sizes, garbage collectors, or even processor architectures (such as x64 vs. ARM). How portable is the snapshot file between these configurations?
Ritter: Since CRaC takes a snapshot of a running application, portability is very narrow. You need to restore the snapshot on the same architecture, so trying to restore an x64 image on an ARM-based machine will fail. Even using x86, you will need to make sure that the microarchitecture is compatible. For example, a checkpoint made on a Haswell x64 processor will not run on an older Sandy Bridge processor but should run on a newer Ice Lake processor.
When you restore from an image, you do not specify any command line parameters for things like GC, heap size, etc., as these are already included in the checkpoint. The only way to change these parameters is after the restore using tools like jcmd.
InfoQ: How portable is a snapshot file between different versions of an application in an identical runtime configuration?
Ritter: A checkpoint is made of a running application. So restoring it simply restarts the same application. There is no ability to change the application version, and CRaC is not intended for this.
InfoQ: What are the best practices for creating and managing snapshot files?
Ritter: We haven't really developed any specific best practices for this yet.
InfoQ: CRaC only works on Linux. But most developers use either Windows or macOS. What are your CRaC debugging and troubleshooting recommendations for these developers?
Ritter: The simplest way to use CRaC on either Windows or Mac (at least for development) is to use something like a Docker image or a virtual Linux machine.
InfoQ: CRaC is an OpenJDK project. What are the prospects of including CRaC in a future Java version?
Ritter: We are definitely some way from this. CRaC has been designed to be independent of the underlying OS, but, at the moment, we are using Linux CRIU to simplify persisting the JVM state. To run on Windows or Mac, equivalent functionality would need to be developed, which is a non-trivial task.
Even if we had CRaC running cross-platform, we would still need to convince the OpenJDK architects to add it to the mainstream project. As this is quite a significant feature, that will take both time and effort. There are other projects, like Leyden, that are evaluating alternative strategies for solving this problem.
InfoQ: What's one thing most people don't know about CRaC but should?
Ritter: Maybe it's not that they don't know, but it's worth reiterating: When you restore an image, you get the same level of performance you had before, as all JIT-compiled code is included in the snapshot.
InfoQ: You sometimes mention that perhaps CRaC should get a new name, as it shares its pronunciation with a dangerous drug. It seems that ship has sailed – or has it?
Ritter: I think the name will stick. I tend to use the issue more as a joke when I do these presentations. I don't think anyone will really get confused between our JVM technology and a drug. :-)
More information can be found on the CRaC project website.