
Quarkus and GraalVM: Booting Hibernate at Supersonic Speed, Subatomic Size


Summary

Sanne Grinovero discusses how Quarkus was created, how it works and how it’s able to get complex libraries such as Hibernate ORM compatible with GraalVM native images.

Bio

Sanne Grinovero is a Quarkus co-founder and has been a Hibernate team member for 10+ years, with a strong interest in scalability and performance improvements, integration with NoSQL, Lucene and caching technologies. Sanne leads the Hibernate project in his role at Red Hat to implement various improvements and optimizations.

About the conference

Software is changing the world. QCon empowers software development by facilitating the spread of knowledge and innovation in the developer community. A practitioner-driven conference, QCon is designed for technical team leads, architects, engineering directors, and project managers who influence innovation in their teams.

Transcript

Grinovero: I have a lot of very interesting things to share so I will start right away and not waste much time. I hope to fit it all in. First, let me introduce myself, I'm Sanne Grinovero. I'm Dutch, I'm Italian, I'm living in London and I came here because I was invited to introduce this to you, so I'm very glad for that, thank you. I work for Red Hat where I'm in the middleware research and development area, so mostly Java. I'm known to lead the Hibernate team. I have been working for 10 years on Hibernate now and more recently we started this Quarkus thing, which started a bit like, "Hi, can you improve bootstrap times because clouds work better if you boot faster," and many other ideas.

While working on all this, I have been contributing to several other open-source projects. That's a little list; I started contributing a bit to GraalVM as well now because I got really interested in the potential of this project. What am I going to talk about? I'll try to introduce you to GraalVM and native images, what they are and what the problems and benefits are with them, and that's where we get to Quarkus. Then I'll do a little demo showing you, hopefully, how nice it is to code on this platform. I really hope to be able to explain in a bit more detail how it actually works behind the scenes, so what the tricks are and how we actually get this stuff working.

Native Image

Let me start with native image. This is a term I've been throwing around, but it's not always clear. What is a native image? Let's start with a very quick demo on that. First thing, if you want to try this at home, it's really simple. You will need to download the GraalVM distribution, and it looks like a JDK. You point your environment variables, like JAVA_HOME, to where you extracted it, and you probably want to put it on your path as well because it has some additional binaries in there which are useful. Then what you do is, you build your application as usual, but then there is an additional phase called native image, which can convert a JAR into a native image, which is a platform-specific, highly optimized binary that runs straight away.

Let's see this directly. I have a demo here, which is extremely simple. You all know Hello World, as a main class here, which is printing Hello World. Let's make a native image out of this. The first thing I do is, of course, I need to compile my file. Initially, we had this Main.java and now we have the Main.class file as well. Now we need the JAR from this, 'cfe,' the name of the application I want to run, what's the main entry point, and which classes to include. Ok, let me see if we created a JAR - it's there. Now, in Java, I run this like that, and it's printing, "Hello World!" Nothing fancy or special about this so far.

We can also do this: native-image -jar app.jar. This is using the GraalVM compiler to compile this code and all the code of the JDK to build a highly optimized binary. Let me just show you what this looks like. There is this app binary now, you see this is an executable and it's 2.4 megabytes. This is a Linux ELF binary, so it's linking directly to some other libraries and I can run it like that, "Hello World!" Another benefit is this doesn't require a JDK. You just take this binary, drop it in a Docker container and that's your application. It's ready to roll, ready to go.
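The steps of this demo can be sketched as a single source file. The class is named HelloWorld here purely for illustration (the demo uses a class called Main), and the commands in the comments assume a GraalVM distribution whose native-image tool is on the PATH.

```java
// HelloWorld.java - a minimal sketch of the demo; the class name is illustrative.
//
// Build and run on the JVM:
//   javac HelloWorld.java
//   jar cfe app.jar HelloWorld HelloWorld.class
//   java -jar app.jar
//
// Build a native executable (assumes GraalVM's native-image is on the PATH):
//   native-image -jar app.jar
//   ./app
final class HelloWorld {

    // Kept in its own method so the message can be checked without capturing stdout.
    static String greeting() {
        return "Hello World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

In `jar cfe`, the flags mean create, write to the named file, and record the given class as the entry point, which is why the JAR name and the main class follow in that order.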

That's a native image. The question is, of course, "Ok, that's Hello World. How do I get it running for more complex applications?" Well, you need to learn about some of the limitations. It cannot just convert any application and make it executable.

GraalVM

Let's talk a second about GraalVM. GraalVM is a large project which has several interesting tools in there; it has Truffle components, which I'm not going to talk about today. It's also integrated in OpenJDK so you can actually run the Graal compiler within the JVM to optimize your code instead of C2. Chris Thalinger explained how to do that earlier in this room. We are going to focus now on the SubstrateVM component, which is using the GraalVM compiler to compile your application classes, the JDK classes that your application is using and these bits from the SubstrateVM, which are like the implementation of some components that cannot be ported otherwise. That's mangled together, extremely optimized by the compiler and you get this executable out of that.

A big thing about this is, it needs to run this static analysis, so this is ahead-of-time compilation. You're not doing just-in-time compilation like we used to in Java. It needs to compile everything, which means it needs to also see everything. That brings us to this closed world assumption like here; when you're building the application, you need to have the code there and the compiler needs to be able to see all the code and the possible flows that are going to be executed.

There are great benefits from this approach, mostly like dead code elimination. For example, in the demo I ran before, all the components in the JDK - and I know we have modules, but modules are very [inaudible 00:06:15]. This is looking at method by method, field by field. Are you actually going to need this field in your final image? If not, it's thrown out, it's not being included. That's keeping the memory costs very low, the bootstrap times very low and just the size on this, all the resources pushed down to the minimum. We really like this aspect of aggressive dead code elimination, which comes from a static analysis.

It’s a strong suit, but it's also a very weird way to look at the Java platform because we're not used to this, and all the libraries we're running and all the platforms we're running make assumptions on the fact that it is a very dynamic and rich platform that you can do stuff, generate code at run time and do interesting things.

These limitations mean you cannot have dynamic classloading. In practice, if you try to get a reference to a class loader, or to the current context class loader, you'll get a null reference back, which means most of the libraries out there will throw some random null pointer exceptions, because nobody writing that code expected that it could return null, but now it's returning null when it's compiled.
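A minimal sketch of the defensive style this implies for library code, assuming (as described above) that a class loader lookup may return null. The class and method names are hypothetical, and the fallbacks shown are not a claim about what a native image returns; the point is only that the null case is handled explicitly instead of surfacing as a NullPointerException later.

```java
// Hypothetical helper: never assumes a class loader lookup is non-null.
final class ClassLoaderLookup {

    /** Returns the first available class loader, or null if none is available. */
    static ClassLoader resolve() {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader that defined this class.
            loader = ClassLoaderLookup.class.getClassLoader();
        }
        if (loader == null) {
            // Last resort: the system class loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader;
    }

    public static void main(String[] args) {
        ClassLoader loader = resolve();
        // Callers still check for null rather than dereferencing blindly.
        System.out.println(loader == null ? "no class loader available" : "class loader available");
    }
}
```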

This also implies you cannot deploy jars and wars at runtime. You started with Tomcat years ago and then you produce a war and then you load these kinds of multiple wars, like plugins of your containers, and you load these things dynamically. That's not possible because this is violating the principle that the compiler needs to be able to see all the code so that it can actually prune the dead code. It cannot know what it can remove if you are allowed to introduce additional code which might invoke methods, including JDK methods, or use constants which it actually removed upfront.

Another thing as well is that the management extensions and the tooling interface are not available, which means no agents, which means no JRebel, no Byteman, no profilers, no tracers. This is killing a whole area of very interesting tools that we are used to, and no Java debugger. You cannot use the JVM debugger to connect to one of these native images because if you didn't tell the compiler that you're going to use these kinds of things, the infrastructure to support this stuff has been thrown out to save more memory, save more disk. Now, of course, some of these things can have flags that you enable at build time and that make the compiler behave differently so that some of these things are not thrown out, because you might want to do them later, but you have to tell it explicitly.

Other areas are like "no security manager", and we're like, "Finally," because that's very complex to handle, but also, why do you need the security manager if nobody can actually load new code in there and there is no classloader? No finalize() support, which is great because it's been deprecated for years and now you just cannot use it, so you'll [inaudible 00:09:30]. This is a bit of a sore point; InvokeDynamic and MethodHandles have only very limited support, so if you're doing crazy optimizations based on InvokeDynamic and MethodHandles, there's a good chance that this will not compile to native either. Why is that? Because you are essentially generating code at runtime, and that's a violation of the closed world principle.

Reflection - if you allow reflection, the compiler cannot see which fields you are going to read, which methods you are going to invoke, and so on. Reflection is not allowed, except of course if you explicitly tell the compiler, "Look, when you're compiling this, can you keep in mind that I will invoke reflectively this one constructor of this class?" If you tell it that upfront, it will just keep the infrastructure in the image so that this is then going to work at runtime. But if you forget one of these things, it's not going to work.
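A small sketch of the kind of reflective call being described: a constructor looked up by class name at runtime. The Fruit class and the helper names are hypothetical. On a regular JVM this runs as-is; for a native image, the class and constructor reached this way would additionally have to be registered with the compiler at build time, as explained above.

```java
import java.lang.reflect.Constructor;

final class ReflectiveInstantiation {

    /** Hypothetical entity-like class with a no-argument constructor. */
    static final class Fruit {
        String name = "apple";

        Fruit() {
        }
    }

    /** Instantiates a class by name; the compiler cannot see statically which class this will be. */
    static Object instantiate(String className) {
        try {
            Class<?> type = Class.forName(className);
            Constructor<?> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            // This is the failure mode to expect when a reflective need was not registered.
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Object fruit = instantiate(Fruit.class.getName());
        System.out.println(fruit.getClass().getSimpleName());
    }
}
```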

Of course, dynamic proxies, aspect-oriented programming, interceptors created at runtime, loading resources or using JNI, Unsafe memory access, all of these things are not available either unless you hint to the compiler, "Look, I have specific needs. I need this method to work in these conditions," and then it can fold this information into the optimization decisions. It's an opt-in thing, you have to tell it.

The static initialization is another interesting aspect. This is a very special behavior. Every block which is static in your code, which also means simple constants - you have a "public static final string" and then there is something there; that's a static initializer. You know that when your application is initializing this class, that string is created at that point. What's happening here is that this string is actually created during the compilation of your application. It's not just strings; it's every more complex static block that you might have. Every more complex object that you might be initializing is run, the code is run within the context of the build, so the compiler is running it.

It's actually checking what kind of code you are including in your static initializers, because some things are not legal, and we'll get back to that later. Essentially these things that we look at, they look like constants when you're reading the code. We know they're not really constants because we have the reflection API. You can say, "I said that's a final field, but let's make it writeable again. Remove the protections because my library needs to do something so I'm going to change it again." In the JVM, all these things are allowed and they kind of have back doors. You cannot allow these back doors to run in GraalVM because that would mutate the constants, which are strong assumptions that the compiler again is going to use to optimize your code like crazy.

In a way, I prefer this; if it's a constant, it really is a constant, and the optimizations behind that are quite strong. What happens is, these objects that you created during the build will take a snapshot of that memory, and that's compiled into the native and that's already initialized, ready to go. This goes in the constant pool of the binary.

Among the things that are not allowed to run in the static initializer, if you try to open file handles, sockets, or start background threads, there are many things that need a timer that started in the background to do some periodic things; if you do anything like that, the GraalVM compiler actually sees through that, and that's a violation of the rules.

Of course, you don't want to start these timers within the compiler. You'd want to start them in the end application. In these cases, to be fair, this is changing right now in [inaudible 00:13:35], so I'm not sure in what direction we're going, but it seems the idea is that rather than failing the compilation, it will automatically detect that these blocks should not be run during the build, and they will be run later. This is a change from how the current GraalVM latest release behaves, which would just fail to build. You're not allowed to do this in static initialization, and you can explicitly opt in to deferred initialization for a specific list of classes.

These three are illegal, but you also have to be careful with other things. If you're taking the current timestamp in your constant, you want to know when this application was booted, that timestamp is actually going to have the timestamp of when this application was built, maybe on a different machine, maybe years before it was actually started on this new machine. You also don't want to capture environment variables or things like that, system-dependent constants. If you're doing optimizations based on the number of CPU cores your system has, you might be capturing the number of cores that the system has when it compiled your application, which is not maybe what you really meant to look at. You need to be careful at what these static blocks are containing.
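A minimal sketch of this pitfall, with hypothetical names. The two static fields are computed when the class is initialized, which per the description above would happen during the native image build, so they would hold the build machine's values. The two methods ask the system each time they are called, which is usually what was meant.

```java
final class BuildTimeCapture {

    // Captured once, at class initialization. If that happens during the
    // native image build, these are the build time and the build machine's cores.
    static final long INITIALIZED_AT = System.currentTimeMillis();
    static final int CORES_AT_INITIALIZATION = Runtime.getRuntime().availableProcessors();

    /** Evaluated on every call, so it reflects the machine the application runs on. */
    static int currentCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Evaluated on every call, so it reflects the actual clock at run time. */
    static long now() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        System.out.println("captured at initialization: " + INITIALIZED_AT);
        System.out.println("read at run time:           " + now());
        System.out.println("cores at initialization:    " + CORES_AT_INITIALIZATION);
        System.out.println("cores at run time:          " + currentCores());
    }
}
```

On a regular JVM both pairs of values are practically identical, which is exactly why this kind of bug tends to show up only after compiling to native.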

I mentioned you need to disable some things, like JMX, which is not supported. Let's look at some examples from the Hibernate library, which is what we are going into. In Hibernate, you can enable the management bean; we have some configuration attributes that allow you to look at statistics, like what's my slowest query, what's the right query ratio and all kinds of other metrics that are registered in this management bean if you enable these in configuration.

At a very high level the code will look like this. If it's enabled, then we register some things on the management bean. This code will not compile because the registerJMX() method will do something which invokes this API from the management beans, which is a violation of your reachability. You're reaching into code which is not implemented, so there is no way to compile this. The compiler cannot see that this flag is maybe off in your specific configuration, because what will this binary do when you actually enable this in your configuration? It needs to know what to do in this case. The correct way to disable the feature is to have code that looks like this. You have to have a constant which blocks any flow of code from going into that method. The compiler sees this, and since reflection is not allowed, this is a real constant. This is always off. This method will never be invoked; it's not even compiling it. The whole code is going to disappear. The trick is you need to make sure that it looks like that, that's what we're getting at.
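The slides are not part of this transcript, so the following is only a sketch of the two shapes being contrasted, with hypothetical class, field and method names (only registerJMX() is named in the talk). The first variant depends on a value known only at run time, so the management API stays reachable; the second is guarded by a compile-time constant, so no code flow can reach it.

```java
import java.lang.management.ManagementFactory;

final class JmxToggle {

    /** A real constant: always off, so registerJmx() is unreachable through this guard. */
    static final boolean JMX_ENABLED = false;

    /** Problematic shape: the flag comes from configuration, so registerJmx() stays reachable. */
    static boolean bootWithRuntimeFlag(boolean enabledInConfiguration) {
        if (enabledInConfiguration) {
            registerJmx();
            return true;
        }
        return false;
    }

    /** Shape described as correct: a constant blocks any flow into registerJmx(). */
    static boolean bootWithConstant() {
        if (JMX_ENABLED) {
            registerJmx();
            return true;
        }
        return false;
    }

    /** Touches the management API, which is what makes this method a problem for native images. */
    static void registerJmx() {
        ManagementFactory.getPlatformMBeanServer().getMBeanCount();
    }

    public static void main(String[] args) {
        // Only the constant-guarded path is used from the entry point.
        System.out.println("JMX registered: " + bootWithConstant());
    }
}
```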

You need to adapt your code to make sure that the code flow that you're having is really legal code, not having these assumptions that are no longer valid. It's not that much work for your library. The big problem is you need to compile all your dependencies as well, because everything is going to be compiled and optimized into the single binary that's representing your application. It's not just your code. That's really all of your code, your dependencies and the dependencies of your dependencies. Everything that you have on the classpath, including the JDK, is included into this. That's a lot of code that you probably need to verify is behaving as you're expecting.

What's the impact on Hibernate? I'm taking Hibernate here first off, because I ported Hibernate to Quarkus myself, so I can answer very deep questions about this. What we learned about porting Hibernate to GraalVM can be applied to all the other libraries. It's interesting to discuss Hibernate because pretty much all the illegal things that we are seeing here, it does them somewhere. We had to figure out how to have an alternative plan to make sure that you can have all the benefits of Hibernate, no trade-offs, and still compile your application to a native.

Hibernate ORM & GraalVM

Let me switch to a mind map here. This is pretty much the steps that we needed to do. Let me just select some, we don't have to discuss all of them. For example, resource loading; you might have an import SQL script that you want to be imported in your database when your application is run. This is mostly done for development, but some people do it in production as well. You need to have this resource included in the binary, and that's what the Quarkus Hibernate extension will do. If it sees that you're importing this, by just parsing your configuration, it will include it as a resource in there, easy.

No support for JMX, you've seen that before. It's just disabled in this specific code, which is generating the bootstrap of your application. It's making sure that the compiler can see that this is really off, and there is no way for you to enable it. Same for the security manager, and then of course, there were some libraries that were starting threads in the background on class initialization. That's a bit dodgy anyway, so I just patched it to make sure that doesn't happen again, and that's resolved.

Reflection - you might assume Hibernate does a lot of reflection. In fact, it doesn't have to. There are alternative ways that it can enhance mostly for performance reasons, it can enhance access to your entities. It turns out that using reflection in GraalVM is actually super-efficient because the GraalVM, as soon as you tell it, "I'm going to do this," it's not really using the same code, it's just shortcutting the whole operation. What's missing here is that you need to register the reflective needs of your application. All the entities you have, all the constructors of these entities, the accessors of these collections, the getters, setters, all these methods that the framework needs to invoke or read, need to be registered in a configuration file for the compiler, which is to say, "All of this stuff needs to have reflective access."

If you are not using Quarkus, you can use a JSON file and just list all of that stuff in there, but it's very tedious. What Quarkus is doing is literally, "I know your application." You're not allowed to mutate it later, so you're not going to deploy additional entities later. You're not going to instrument it, so we're just looking at your entities on the classpath. That's the list of things that we automatically then register into the compiler. There is a callback from the GraalVM compiler into the Quarkus framework, which then delegates to all the frameworks that are integrated in the Quarkus build, and each of them can list, "I'm going to search for JPA entities. These are the classes for which I will need to have reflective access." Then you're good to go. The compiler knows this and it's folding this information in.

Another thing we had to do was on dependencies; all the Hibernate dependencies have been converted to run fine on GraalVM. A big one was I needed to convert even the JDBC drivers. The PostgreSQL driver had some issues initially. First off, it does reflection internally. We patched this to not do that anymore. It uses phantom references, which used to be another thing that's not really allowed within GraalVM, but that's actually out of date. Now phantom references work fine within GraalVM, but there was a patch for that.

Then you discover interesting things, like the MariaDB driver; when it's authenticating to your database, it's actually initializing JavaFX in background. Did you know that? I had no idea, so every time I ever connected to MariaDB or MySQL, you have additional memory consumed by your JDK because it's initializing all of these Swing, Java 2D, and JavaFX classes just because in the security options, one of the ways to connect to get the password is to open a dialogue box and show you the dialogue box. Even if you're not using that, that code is still being initialized and triggers the JVM class initialization in a chain. That's just an example of all the things we had to do to get there.

Hibernate also needs ANTLR to do the parsing of your queries. It needs a transaction manager - the Narayana transaction manager, which is now probably better known as the JBoss transaction manager, was also converted to have a Quarkus extension and work there. XML parsers need to work, a connection pool, and so on.

Let's go to the more interesting things. Hibernate cannot create proxies to do lazy initialization of your entities, and it cannot enhance your entities at runtime. If you are very familiar with Hibernate, you might know there are also plugins for Maven, and Gradle, and Ant to do class enhancement of your entities at build time. That's what you're using. Technically all your entities are being enhanced before the compilation phase to native. The interesting fact is you don't have to set up these tools anymore. What we do here with Quarkus is, we already set up all the tools automatically, so that when all the classes are inspected for your application, we know, "There are JPA entities here, let me enhance them." Then, it's the enhanced version of the class that's being put in the JAR with all the other enhancements that Quarkus is generating, and that's then later compiled to native code. There is no dynamic classloading, there are no proxies, there is no runtime bytecode generation actually happening in your application because it was already done before that.

Another aspect of this is you don't really need to do classpath scanning. This is probably the slowest component of booting Hibernate. When you're starting it, it needs to find out which are your entities in your model. To find them, we need to look for JPA annotations in all your dependencies to figure out where they are. This stuff can be done at build time. Then there is no need to repeat it every time you're starting the same application. Since we are talking here about microservices or immutable applications, and not servers that are behaving like a container in which you can dynamically add or remove code, everything is known during the build.

As you are looking at your code and you know which entities you have, so can the compiler look and see what are the entities you have, and that's a constant. As soon as we have these constants, these constants are literally generated as static initializer constants in classes which are dumped on disk, and this is additional code that's included in the compilation of your application.
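As an illustration only, a generated class of this kind could look like the sketch below. Every name here, including the org.acme.Fruit entity, is hypothetical and not taken from the actual Quarkus output; the point is that the runtime code reads a constant recorded at build time instead of scanning the classpath on every boot.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class EntityDiscovery {

    /** Hypothetical stand-in for a class generated at build time. */
    static final class GeneratedEntityIndex {
        // The result of scanning for JPA annotations, frozen as a constant.
        static final List<String> ENTITY_CLASS_NAMES =
                Collections.unmodifiableList(Arrays.asList("org.acme.Fruit"));

        private GeneratedEntityIndex() {
        }
    }

    /** At run time no scanning happens; the list is simply read back. */
    static List<String> entities() {
        return GeneratedEntityIndex.ENTITY_CLASS_NAMES;
    }

    public static void main(String[] args) {
        System.out.println("Entities known at build time: " + entities());
    }
}
```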

When we have these constants, these can actually optimize the size of the whole application like crazy. If I can see that in your entities, you're always using a specific ID generation strategy and you're not using any other ID generation, then these other ID generation strategies, which are in the Hibernate library, are getting removed. If we see that you're compiling your application to connect to a Postgres database, all the code that we have to support MariaDB, Oracle databases, and all the other stuff is removed. That's because you can now rely on these constants after the build.

This area is also interesting. What we're doing is, we can start Hibernate within the compilation. When you're compiling and bootstrapping Hibernate up to the point that it needs to connect to the database, we don't want it to connect to the database because you are compiling. Maybe you don't even have the passwords to your production database, or it's not reachable from this machine. What we did is split the phases within the framework so that we can get to a point in which all the metadata about your application has been computed. We have all the entities. We have them all enhanced and all the code that's ready to initialize it has been run. Then we take a snapshot of that and that's what's included in the binary.

All that work doesn't have to happen every time the application is run, because it's already done. The only thing that's missing is, "Now you really can connect to the database and we get going." The same approach can be applied to all the other frameworks. The important thing is that you can reorganize the code into this phase, which is not what Java developers usually do. Then, we snapshot that and that's the state of your application when it's meant to run.

What Is Quarkus?

This is the introduction to Quarkus. What is it really? It's a bit of a toolkit and a framework to start your application. We're focusing on Java applications of course, and a little bit on Kotlin as well, so exploring it. It looks interesting; so far there is an extension for Kotlin too, if you want to play with that. It's really designed to be really light and aiming for cloud environments. It's designed upfront thinking about the problems of GraalVM.

What I really like is these limitations of GraalVM; they become almost like a strong point. Since we can really rely on these constants and the dynamic aspect is gone, then you can optimize even further for everything. This is what it looks like in terms of process. You have your application which gets compiled, then the Quarkus build plugins, that are for Maven and that are for Gradle, they work pretty much to inspect your application. I can see if you're using Hibernate or not and which entities they are, and we can enhance them, apply some additional magic. That's then a very highly optimized JAR representing that application which can run on the JVM. In fact, it runs on the JVM consuming far less memory than everything we had before, because most of the work was done at build time.

Since we're moving into this idea that one application goes in one JVM and it's not really changing - when you want to have a change, you just build it over and maybe you create a new container and replace the old one - it means that all of this work that we are used to doing during the initialization of the framework can really be moved into the phase in which you are creating the application. It can go the GraalVM way and build the native executable as well.

The interesting thing is, like I said before, you cannot use the Java debugger on the native executable, but it's the same application that you can run in JVM mode as well. If you have a problem with your business logic, you will just run on the JVM and you work as usual. It's just consuming less memory and booting in way less time. It works via these extensions, and each of the frameworks that we're supporting now needs one extension to support it. All the stuff I showed you on the mind map for Hibernate, we did that for several other libraries, and you might want to add some for additional libraries that you need. They do multiple things; one of them is to make sure that the library becomes compatible with the native image of GraalVM, but that's not the only goal. Quarkus is very interesting even if you are not planning to run it as a binary - we'll see some more benefits now, like the development live reload capabilities - and it just takes way less memory in JVM mode.

These are some of the libraries that are supported now. There are actually many more, but these are highlights. You might see Kafka is quite good now, Hibernate, RESTEasy, Undertow is here too and it's not on the list, Netty, Vert.x and we're heavily focused on Kubernetes going for OpenShift, and Infinispan, Prometheus, the whole thing, and there are many more coming here.

Another goal of Quarkus as a platform is ease of use; it's going to try to give you APIs for imperative and reactive code which look similar. You can mix imperative and reactive coding; things like RESTEasy and Vert.x can be used reactively, but I'm not going to talk much about this today. With this goal of being container first, we see the size of this becomes really small, the boot time is much faster, which means you can really scale up your instances in the cloud on Kubernetes without worrying too much about all the time it takes to boot the whole JVM and the warmup of the JVM, because everything was precompiled.

You can have your full application started in milliseconds. We're looking at memory consumption, but we're not really looking at heap sizes anymore. That's not really interesting in Kubernetes. What we're looking at is the total resident set size consumed by your application. What is that? That literally is the sum of all the memory regions that your application is consuming. It's not just a heap; it's your heap, of course, but also all the other costs that the JVM has when starting. The more threads you have, they all need a stack, that's one region. Then you have the metadata of all the classes that you are including. The fewer classes you have, the less metadata. All the dead code elimination helps there too.

The compilation, the just-in-time processes, and all these things that need to happen in the JVM, they are consuming additional memory that you don't really necessarily see in the heaps. If you're breaching your RSS limit on the cloud, your application gets killed, so it's important for us today to focus more on RSS than heap consumption, even if consumption matters as well. This is how we measure memory - we'll see that in a second. How much memory does it consume? In a REST application compiled to native, the total memory consumption is about 13 megabytes. This is with one-megabyte heap. That gives you an idea that you really need to look at the whole thing, not just at the heap, which is just one.

The whole memory consumption on the same application running on Quarkus on the JVM is about 74 megabytes. If you try to beat that with any other framework out there today, you're unlikely to get below 140 megabytes. We can say that both of these are running on the JVM traditional, not GraalVM, so the savings in memory are very strong. With more complex applications, which are using Hibernate as well, which include a caching library, connection pool, a transaction manager running, the JDBC driver and a lot more, the memory consumption is still very low and super-competitive even compared to other frameworks.

Startup time - let's just skip this, but let's keep to the time it takes to run the same REST demo. In native, it will start in about 14 milliseconds. On the JVM, it will start in less than one second. In a different cloud stack, it will be at least four seconds. With JPA enabled, it gets a bit slower; do you know why? The boot itself doesn't actually get much slower at all, but it's also connecting to the database and filling the pool. It's multiple connections being authenticated to the database, creating the schema and all these things. You can boot it in about 50 milliseconds, including Hibernate and the transaction manager, everything. On the JVM, it's 2 seconds; on a traditional stack, it's getting close to 9 or 10 seconds with this specific demo we have.

Developer’s Joy

There's this comic on the website, it's pretty good. It feels a bit like coding in PHP. You're making code changes, and you just reload in the browser and the code is there already. How do we do that? In practice, since it is becoming so light and fast, we can actually reboot the whole application from scratch. When you're refreshing, everything will reboot and there is no drawback. Let's have a look at this small demo here, just have a look at the POM file, and how it works. I'm importing the Quarkus [inaudible 00:36:08] and then I have some extensions - the ORM extension, the connection pool, RESTEasy and the MariaDB driver. I have MariaDB running here, in Docker, and this application has one configuration file, that's the Quarkus configuration file. It has mostly just data source properties, telling Hibernate that it needs to drop and recreate the database every time it's restarted.

Also, we want to log SQL statements, but just to see what it's doing. We have default import SQL statements, you might be familiar with this. Then, there is a page and then there are just two classes in this application. This is a RESTEasy entry point, so we're exposing a REST API to load all the fruits and then load the single fruit, create a fruit, update a fruit, and that's it. Note that you don't need anything else. Since we can inspect your application, then we can infer what you're needing out of this and nothing else is needed.

Then there is one entity, the fruit entity. This is using autogeneration for IDs, it has a unique name and that's it. There is an Entity here and a named query. Let me just show you what it looks like. I have some terminals here, "mvn package DskipTest." Let's have it as a package first. It just runs a build, it'll create a JAR file. It ran those tests, it connected to my MariaDB locally and verified we are good to go. Let's start Quarkus in development mode, it's running the tests again, and then it started. This is what it looks like. There are apple, banana, and cherry in the database. We can remove some and we can add some new fruits in the database.

Let me just show you this table as well. Let me remove the sequence because it's not there yet. There is a fruit table here. "Show create table Fruit." Of course, it has an ID, which is the primary key, and a name, and there is a unique index on name. Now, if I want to go and make some changes here, we can say the "unique" needs to go. Let me just switch it here and then I go here, I refresh this. You see the lowest terminal had some noise here. It did a hot-replace here and if we now switch to this one, there is a new fruit here. The uniqueness is gone, so live changes on your entities are included.

That's when you make it public, public String. Let me add the field. I refresh here, it's restarting. I look at the table again and then your column is there. It's much more fun to work with this, and it works with anything in your application. If I want to, let's say, add something here, "pineapple," column four. I save this, I refresh, and pineapple is there. You can change really anything except the dependencies, because changing dependencies means you need to rebuild everything. This is much safer than existing tools, which try to connect and use agents to replace some components because it was too heavy to reboot. Now it's very light and we can reboot everything, so it's the same as killing an application and starting it over again. That's the LiveReload capability.
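The idea of restarting everything on refresh, instead of patching a running application, can be sketched in plain Java. This is only an illustration under stated assumptions: `DevModeRunner`, `App`, and the source fingerprint are hypothetical names, and the real Quarkus development mode is more involved than this.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of "restart everything on change"; not the Quarkus implementation.
final class DevModeRunner {

    // Minimal application lifecycle assumed for this sketch.
    interface App {
        String handle(String request);
        void stop();
    }

    private final LongSupplier sourceFingerprint; // e.g. newest modification time of the sources
    private final Supplier<App> bootstrap;        // boots a fresh application from scratch
    private long bootedFingerprint;
    private App current;
    private int restarts;

    DevModeRunner(LongSupplier sourceFingerprint, Supplier<App> bootstrap) {
        this.sourceFingerprint = sourceFingerprint;
        this.bootstrap = bootstrap;
        this.bootedFingerprint = sourceFingerprint.getAsLong();
        this.current = bootstrap.get();
    }

    // Called for every incoming request: restart first if the sources changed.
    synchronized String handle(String request) {
        long now = sourceFingerprint.getAsLong();
        if (now != bootedFingerprint) {
            current.stop();            // discard the old application entirely
            current = bootstrap.get(); // boot a new one; nothing is patched in place
            bootedFingerprint = now;
            restarts++;
        }
        return current.handle(request);
    }

    synchronized int restarts() {
        return restarts;
    }

    public static void main(String[] args) {
        long[] sourceVersion = {1L};
        DevModeRunner runner = new DevModeRunner(
                () -> sourceVersion[0],
                () -> new App() {
                    @Override public String handle(String request) { return "handled " + request; }
                    @Override public void stop() { System.out.println("stopping old application"); }
                });
        System.out.println(runner.handle("GET /fruits"));
        sourceVersion[0] = 2L; // simulate a saved source change
        System.out.println(runner.handle("GET /fruits"));
        System.out.println("restarts: " + runner.restarts());
    }
}
```

The design choice mirrors the argument in the talk: once a full boot is cheap, a clean restart avoids the partial-state problems of replacing individual components.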

The other alternative is we can build a native image. This is going to take some time. We just have a few minutes left and I think it will take four minutes to build. It runs the tests first, it has created the JAR version and now this is starting the native image phase of the Quarkus build. Let's have a look at the log here. It's invoking native-image with a ton of special flags, but that's not all that Quarkus is doing. Quarkus also generated, in previous phases, a bunch of bytecode, which is then stored on disk, and these are all the callbacks that the compiler is running. This is the compiler; here it is booting Hibernate. It has selected the dialect and it has started the transaction manager and XNIO and some other technologies. They have been booted within the JVM context of the compiler phase.

In a traditional app server, when you're deploying something, there is a lot of stuff that needs to happen. Take again the Hibernate example; you need to parse, say, the persistence XML file. It's cheap to parse a single file but before you can do that, you have to initialize all the classes of the JVM that enable the XML parser subsystem. That actually takes a lot of time, the first time you do it.

Within Quarkus, the XML file is not really parsed at runtime; it was parsed before, which means the XML parser implementation of the JVM is not included in your final image because it's not really needed anymore. It's the same for anything related to annotation lookups. We're not really reading your annotations at runtime because they have been read before, at build time, and the same goes for validating that your model is fine and all of these things. Since we run Hibernate within the compiler, an invalid model would have failed the build. You would have got feedback about invalid code already.

All that we do is then record it into bytecode, and you get the static initializer or a main depending on what the specific framework needs to do. Let's go back a second to see if that was done- it's still compiling. It's a very heavy process and it takes a lot of memory, which we can probably see here. See, this is a poor dual-core machine with hyper-threading, but it's taking my four hyper-threaded CPUs to the extreme and I don't have more than that memory so it's trying to use all of it.
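The split between work done at build time and work done at runtime can be illustrated with a small plain-Java sketch. Everything in it is an assumption made for illustration: `BuildTimeRecording`, `BootPlan`, and the `key = value` descriptor format are hypothetical, and where Quarkus records the result as generated bytecode, this sketch simply hands over an immutable object.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: all names are illustrative, none are Quarkus APIs.
final class BuildTimeRecording {

    // The recorded outcome of the expensive work; the only thing the runtime sees.
    static final class BootPlan {
        final Map<String, String> settings;
        final List<String> entityNames;

        BootPlan(Map<String, String> settings, List<String> entityNames) {
            this.settings = Map.copyOf(settings);
            this.entityNames = List.copyOf(entityNames);
        }
    }

    // Build phase: parse and validate. Invalid input fails here, before anything ships.
    static BootPlan buildPhase(List<String> descriptorLines, List<String> entityNames) {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String line : descriptorLines) {
            int eq = line.indexOf('=');
            if (eq <= 0) {
                throw new IllegalStateException("Invalid descriptor line: " + line);
            }
            settings.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        if (!settings.containsKey("dialect")) {
            throw new IllegalStateException("No dialect configured");
        }
        return new BootPlan(settings, entityNames);
    }

    // Runtime phase: no parsing and no scanning, it only consumes the recorded plan.
    static String runtimePhase(BootPlan plan) {
        return "booted " + plan.entityNames.size()
                + " entities, dialect=" + plan.settings.get("dialect");
    }

    public static void main(String[] args) {
        BootPlan plan = buildPhase(
                List.of("dialect = MariaDB", "log-sql = true"),
                List.of("Fruit"));
        System.out.println(runtimePhase(plan));
    }
}
```

Because `runtimePhase` never touches the parser, the parsing code and its dependencies are only needed while building, which is the property the talk relies on to trim the final image.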

Let me finish with the slides. The architecture within Quarkus works with these main components. There is the GraalSDK, which allows us to have these callbacks into the compiler and enable flags, or make sure that specific classes are initialized later, before, or with a special flag. Gizmo is a new library we created to generate this bytecode that's being dumped on disk and then compiled later. Jandex is an indexer, it's a super fast and efficient way of finding your annotations without initializing those classes. With Jandex, we scan and have a very good picture of what your application is meant to do and which annotations are there, which technologies are being used without needing to initialize it all.
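Finding annotations without initializing classes can be demonstrated with the JDK alone. Note the swap: this sketch uses plain reflection on a class that is already loaded, whereas Jandex builds its index from class files without loading them at all, so this is an analogue and not how Jandex works. `Persistent` and `Fruit` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// JDK-only analogue (not Jandex): reading annotations does not run static initializers.
final class AnnotationScanDemo {

    // Hypothetical marker annotation standing in for something like an entity marker.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Persistent { }

    // Set by Fruit's static initializer, so we can observe when initialization happens.
    static boolean initialized = false;

    @Persistent
    static final class Fruit {
        static {
            AnnotationScanDemo.initialized = true;
        }

        static void touch() { }
    }

    // Inspects metadata only; the class literal and annotation lookup do not initialize the class.
    static boolean isPersistent(Class<?> type) {
        return type.isAnnotationPresent(Persistent.class);
    }

    public static void main(String[] args) {
        System.out.println("annotated: " + isPersistent(Fruit.class));
        System.out.println("initialized after scan: " + initialized);
        Fruit.touch(); // invoking a static method is what triggers initialization
        System.out.println("initialized after use: " + initialized);
    }
}
```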

You have seen hot reload go into action in milliseconds as well. That's not in native mode. Consider that during a hot reload, we need to also rescan your whole application again. That's fast because of these other technologies. Then there are extensions for all these other things; every different library has its own extension. The purpose is both to work around the limitations of Graal, or let's say, take advantage of the limitations of Graal, but also to optimize it for JVM mode. All the code needs to be split very cleanly between what's being run during the build, that's one thing, and what's being run at your runtime. They are different, to the point that we have different dependency sets. The dependencies that you need at runtime are a much-trimmed version of what you actually have during the build, because if you don't need it later, but just for the build, then it can go.

Build success. That took 3 minutes 38. What do we have now? We have the binary in my target. You see this? We have the JAR, which is the normal executable, let me start that one first. That's the Quarkus application ready to run in a JVM. "-jar demo2, runner.jar" and it connected to the database, created the new schema and all that in less than two seconds on my very poor laptop. If we do the same but without Java, we can run that other executable we have, runner up. That's six milliseconds boot time, and this is the same application. This is using RESTEasy, Undertow, the transaction manager, Hibernate, a connection to the database and everything else, but I think the most interesting part really is this code. This code is following the standards and the libraries you are already used to.

There is not much for you to learn for, "How do I get working with this technology?" This is the usual RESTEasy with the standard annotations, and this is totally just the standard JPA and we can just convert everything to the native thing. Let's just test this application. You've seen here it's logging the queries, I can make those same changes as before. I can't live reload now because this is a binary. It was highly optimized to do what it is doing. How much memory is this taking? This is consuming a total of less than 40 megabytes, but that's total RSS, it's not just heap. In fact, how much heap is this using? It's using one-megabyte heap total for a total of 40 megabytes application. That's the same use of application.
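The distinction between heap and total RSS can be checked from inside a JVM process. The sketch below reports the heap currently in use and prints the process ID so the resident set size can be read from the operating system, for example with `ps -o rss= -p <pid>` on Linux. `HeapReport` is a hypothetical helper, and it shows JVM-mode numbers, not those of the native binary in the demo.

```java
// Hypothetical helper: reports heap in use; RSS has to come from the operating system.
final class HeapReport {

    // Bytes of heap currently occupied (allocated heap minus free heap).
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        double usedMb = usedHeapBytes() / (1024.0 * 1024.0);
        System.out.printf("Heap in use: %.1f MB%n", usedMb);
        // Heap is only one part of RSS; thread stacks, code, and metadata also count.
        System.out.println("PID for OS tools: " + ProcessHandle.current().pid());
    }
}
```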

Questions and Answers

Participant 1: I think you mentioned at the beginning, when you start adding new libraries and dependencies, who is going to do the work of adapting to Quarkus, say, for a JSON library or something else?

Grinovero: It depends on how complex the library is. I did the work on Hibernate because I know Hibernate very well internally. Obviously, that was a difficult task; you wouldn't have done that in a day. Also, I don't expect most libraries to do all this crazy stuff that a framework like Hibernate is doing, right? For many libraries, if they're just using reflection, it's just a couple of flags. If they're doing weird things during static initialization, then there is a single flag to say, "That class needs to be initialized at runtime." It really depends on what the library is doing. I think the good news is the GraalVM compiler is extremely thorough. It has to be; it's analyzing all the code flow paths, so it knows every possible action of this library. It will just fail the build, telling you, "There is this illegal thing that I found in this method over there," and you get an error and then you will need to make an extension for Quarkus, or ask somebody to make an extension for Quarkus, pretty much.
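To make the reflection case concrete, here is a minimal sketch of the pattern that typically needs attention. The class to instantiate is chosen by name at run time, so a closed-world static analysis cannot tell which class must be kept unless it is registered for reflection. `ReflectiveLookup`, `Greeter`, and `PoliteGreeter` are hypothetical names; on a regular JVM the code runs as is.

```java
// Hypothetical sketch of a library that picks an implementation by name at run time.
final class ReflectiveLookup {

    interface Greeter {
        String greet(String name);
    }

    static final class PoliteGreeter implements Greeter {
        PoliteGreeter() { }

        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // The name is not a compile-time constant here, so the target class is invisible
    // to static analysis; in a native image it would have to be registered explicitly.
    static Greeter load(String className) throws ReflectiveOperationException {
        Class<?> type = Class.forName(className);
        return (Greeter) type.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        String className = args.length > 0 ? args[0] : PoliteGreeter.class.getName();
        System.out.println(load(className).greet("Quarkus"));
    }
}
```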

Recorded at:

Sep 11, 2019

Sanne Grinovero
