
Solving Fat JAR Woes at HubSpot

by Matt Raible on Aug 22, 2016 · Estimated reading time: 3 minutes


Spring Boot 1.4 and Dropwizard 1.0 were both released at the end of July, and both are based on fat JARs. As adoption of such frameworks and of microservices increases, fat JARs are becoming a more common deployment mechanism.

A fat JAR is a technique for packaging a Java application together with all of its dependencies into a single bundle for execution. It is used by many Java microservice frameworks, including Spring Boot and Dropwizard; there's even a Fat JAR Eclipse Plug-In.
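As a reference point, a typical fat JAR build uses the maven-shade-plugin mentioned later in the article; a minimal configuration looks roughly like this (the version number is illustrative for the period):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.3</version>
  <executions>
    <execution>
      <!-- Repackage the application and all of its dependencies into one JAR -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

Because the shaded JAR contains every dependency class file, each build produces and uploads the full bundle even when only the application's own code changed — the overhead HubSpot set out to eliminate.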

For organizations with a few microservices, the bandwidth used by fat JARs is hardly noticeable. However, when you get into thousands of microservices, bandwidth usage can become an issue.

Earlier this summer, HubSpot described the issues it encountered with fat JARs as a deployment technique: problems with the maven-shade-plugin, and the inefficiency of packaging 100,000 tiny files into a single JAR. HubSpot also noted massive duplication of dependency JARs, since its 1,000-plus applications are constantly building and deploying.

They experimented with the maven-dependency-plugin to reduce the bloat, but their efforts didn't reduce the size of the generated build artifacts.

To cure its fat JAR pain, HubSpot created SlimFast, a Maven plugin that creates a build artifact containing only the project's own classes. It piggybacks on the deploy phase and uploads each of the application's dependencies to Amazon Simple Storage Service (S3) individually. Using this plugin, HubSpot reports 60% faster build times and a 99% reduction in storage used for build artifacts.
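SlimFast's actual implementation differs, but the general "skinny JAR" idea can be approximated with stock Maven plugins: keep dependencies out of the artifact and reference them from the manifest, so only the project's own classes are packaged (a sketch, not SlimFast's exact configuration; the main class name is hypothetical):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifest>
        <!-- Hypothetical application entry point -->
        <mainClass>com.example.app.Main</mainClass>
        <!-- List dependency JARs on the manifest Class-Path
             instead of bundling them into the artifact -->
        <addClasspath>true</addClasspath>
        <!-- Dependencies are expected next to the JAR at deploy time -->
        <classpathPrefix>lib/</classpathPrefix>
      </manifest>
    </archive>
  </configuration>
</plugin>
```

The deploy system is then responsible for placing the dependency JARs under `lib/` on the application server — in HubSpot's case, by downloading each one from S3 and caching it on disk.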

The graph below shows how much faster builds have been for them after using SlimFast.

HubSpot build times graph

To learn more about HubSpot's fat JAR issues, InfoQ interviewed Staff Software Engineer Jonathan Haber.

InfoQ: Are the fat JAR problems you experienced mostly due to continuous integration and deployment?

Jonathan Haber: Yes, I think the issues we ran into are largely caused by our style of development. We have lots of small teams pushing code, building, and deploying hundreds of times per day. Because of our small units of build, it would often take longer to create and upload the fat JAR than to actually compile and test the code. On the other hand, if you have a monolith that takes 20+ minutes to build, then the overhead of a fat JAR probably isn't very noticeable. But I think more companies are moving to this faster, lighter style of development and may run into the same challenges.

InfoQ: Do you think alternate packaging techniques like SlimFast's should be native to frameworks like Spring Boot and Dropwizard?

Haber: Because this approach requires integration with the build and deploy system, my feeling is that it's too opinionated to include in something like Spring Boot or Dropwizard. However, one way to handle this would be to put the SlimFast plugin in a Maven profile activated by an environment variable. That way the build system could indicate that it supports this feature, otherwise it falls back to using a fat JAR.
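The fallback scheme Haber describes maps directly onto Maven's property-based profile activation; a sketch (the profile id and environment-variable name are hypothetical):

```xml
<profiles>
  <profile>
    <id>slimfast</id>
    <!-- Active only when the build system exports USE_SLIMFAST;
         otherwise the default fat JAR packaging applies -->
    <activation>
      <property>
        <name>env.USE_SLIMFAST</name>
      </property>
    </activation>
    <build>
      <plugins>
        <!-- SlimFast plugin configuration would go here -->
      </plugins>
    </build>
  </profile>
</profiles>
```

Running `mvn package` on a developer laptop would then produce a fat JAR, while a CI system that sets the variable gets the skinny artifact.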

InfoQ: If cloud providers (e.g. Heroku, CloudFoundry, etc.) adopted a similar technique to reduce duplication of JARs among applications, could they save a lot of money on bandwidth?

Haber: I'm not sure what savings are achievable, but I think it would be possible to use a similar strategy. However, we have the advantage of all of our apps using the same versions for 3rd party libraries and having a huge amount of overlap in terms of libraries used. For cloud providers, their users will depend on a much wider array of libraries across all different versions, so if you wanted to cache dependencies on the application servers it would take up a huge amount of space. But if you didn't, you'd lose out on a lot of the speed/bandwidth savings. That's not to say there aren't savings to be had, I just think the implementation would need to be more sophisticated than our approach. Another issue you run into is that usually these cloud providers are just running Maven with the POM supplied by the user so they don't have much control over the build lifecycle to add these types of optimizations.

InfoQ: Are there any additional improvements you'd like to see in fat JAR applications?

Haber: I'm not sure if this is on the list for Java 9, but if Java could handle nested JARs it would make it a lot easier to build and run a fat JAR. Tools like Spring Boot and One-JAR do a good job of working around this limitation, but they add complexity and can never be completely transparent.


Close, but no cigar by Will Hartung

They're doing this the wrong way.

The answer is "simple". It's a "maven aware" class loader.

Out of the box, maven bundles the pom in the META-INF of a Java jar.

Maven already has code to "run" an application based on its pom, standing up the class path before executing the prescribed main routine.

That code needs to "simply" be ported as a stand-alone class loader, invocable by the main method. It will essentially do what maven does - it will reify the dependencies, use the classic local maven repository, build up an internal class path based on those dependencies, and then fire off whatever method the developer desires.

MavenClassloader.run(MyAppClassWithMainMethod.class, args);


Worst case, when the jar starts up, it downloads the internet into a local repository.

But, that happens once, and rarely. As the jars are deployed over and over, the repository normalizes with the code and eventually there's little to do but start the code up.

You can stage pre-loaded repositories in containers that are brought up to date automatically by the jars as necessary.

Now, the jar you upload is wafer thin, and it leverages all of the tooling everyone is already using.
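[A minimal sketch of the "maven-aware" class loader the commenter proposes. All names are hypothetical; a real implementation would use Maven's resolver APIs to reify transitive dependencies, whereas this only maps explicit coordinates onto the standard local-repository layout and loads them with a URLClassLoader.]

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class MavenClassloader {

    // Maven's standard local-repository layout:
    // <groupId as dirs>/<artifactId>/<version>/<artifactId>-<version>.jar
    static String coordinateToPath(String coordinate) {
        String[] parts = coordinate.split(":");
        String group = parts[0].replace('.', '/');
        String artifact = parts[1];
        String version = parts[2];
        return group + "/" + artifact + "/" + version + "/"
                + artifact + "-" + version + ".jar";
    }

    // Build a class loader over the jars for the given coordinates,
    // assuming they are already present in the local repository.
    static ClassLoader forDependencies(Path localRepo, List<String> coordinates)
            throws Exception {
        List<URL> urls = new ArrayList<>();
        for (String c : coordinates) {
            urls.add(localRepo.resolve(coordinateToPath(c)).toUri().toURL());
        }
        return new URLClassLoader(urls.toArray(new URL[0]));
    }

    public static void main(String[] args) {
        System.out.println(coordinateToPath("com.google.guava:guava:19.0"));
        // prints com/google/guava/guava/19.0/guava-19.0.jar
    }
}
```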

Re: Close, but no cigar by Bruce Wayne

It's perhaps reasonable to not want to run an assemble (sort-of) job in production, where build once, deploy many (same artefact) is the philosophy..

Questions! by Bruce Wayne

What I fail to understand is the 1000s of micro-services bit: why are there so many micro-services?
What is the bandwidth gain that's being talked about? If there is an artefact server mirroring some internet maven repos within the same environment, that could potentially reduce bandwidth dramatically, couldn't it? Unless I've misunderstood the bandwidth concerns.
Looks like an invented problem to me!

Re: Close, but no cigar by Will Hartung

Well then you're at an impasse, aren't you? The artifacts need to get to the machines somehow. So they either have to already be out there, or you have to drag them with you each time. It is straightforward to add a flag to the command that simply tells the process to satisfy its requirements before "running it for real", but for whatever reason, folks itch at that too.

This is no different than pre-staging all of the jars, setting the class path up, and firing off the code. But we did that back in 1999, and had to manage all of that. It was awful and brittle and nasty back then.

This is a natural extension to what Maven can already do, using tools and infrastructure everyone already has.

Of course, you don't have to do anything at all, you can simply run mvn directly using the exec plugin, but folks cry and whine about that as well.

Re: Close, but no cigar by Jonathan Haber

Hey Will, thanks for the feedback. The way I think about it is that before we would upload a single large artifact at build time and download a single artifact at deploy time. Now, we upload lots of small artifacts at build time and download lots of small artifacts at deploy time. These are functionally equivalent and you'll notice in either case there's nothing happening at runtime; by that point all of this work is already done. So my philosophical objection to your suggestion is that we don't have Maven installed on our application servers and we like it that way. We let our Mesos scheduler, Singularity, handle artifact downloads for us at deploy time in a language agnostic manner. Previously we gave it a single S3 URL to download (fat JAR), but now we give it a list of S3 URLs (the app plus its dependencies) and it handles downloading these artifacts, caching on disk, retrying, verifying checksums, signatures, etc. Since this was already implemented in Singularity, it's hardly any more complexity at deploy time and it ended up being really easy to integrate into our deploy process. By the time the app starts up, all of its dependencies are guaranteed to be present (if an S3 download failed, the deploy would have failed) so it's totally transparent to the application.

I also have some more pragmatic concerns, such as Maven not using locking or atomic operations when interacting with the local repository, so if we had concurrent deploys accessing the local repository we would run into issues. Also, we use snapshot versions extensively for our internal dependencies and our Nexus instance is configured to only keep the latest snapshot to keep disk space under control. But to make sure our dependencies aren't shifting at deploy time, SlimFast always uses resolved snapshot versions. So if we were fetching artifacts via Maven at runtime, these resolved snapshots would quickly point at purged artifacts and the app would fail to start up (ie, my app depends on artifact A snapshot version 1, artifact A builds again and publishes snapshot version 2 to Nexus which causes snapshot version 1 to get purged, now my app can't start because it can't fetch artifact A snapshot version 1 from Nexus). Additionally, for dependencies that come from 3rd party repositories there's also concerns that those repositories could disappear at any time. This is acceptable when it causes a build time failure, but if we fetched via Maven at runtime it could cause a serious outage.

Re: Questions! by Jonathan Haber

Hey Bruce, thanks for the questions! We have lots of microservices mainly because of the breadth of our product and our team structure. The HubSpot product is extremely broad: our customers can manage their leads, segment them into lists, build automated marketing workflows, score them with predictive lead scoring, build their website, run their blog, send emails, publish to various social media networks, view analytics across all of these channels, and much much more. Each of these pieces of the product is owned by a different product team, and each product team owns the potentially dozens of microservices that power that piece of the product. For example on the email side, there's an API to send an email, there's an API to record email opens and clicks, there are jobs to handle bounces and spam reports, there's an API to fetch statistics about an email send for display to the customer (open percentage, click percentage, etc.), and many more. These aren't integrated into a single service because they have very different performance characteristics and reliability needs. For example, the click tracking service needs to respond within a few ms and emphasizes availability and eventual consistency because if it's down the email links won't work. This is very different from the API that sends emails, which may take a few seconds to complete all of the SMTP operations and favors consistency over availability to make sure we never double-send the same email.

On the bandwidth side, the bandwidth savings are from not uploading a fat JAR on build and downloading it on deploy. Instead, we only upload the dependencies that have changed, which are usually none, and download the dependencies that aren't already cached on the application server, again usually none. Previously, the transfers were happening within the same environment so the bandwidth was free, but the real problem was the time being wasted. It could take ~30 seconds to assemble the fat JAR, another 30 seconds to upload it to S3, and another 30 seconds to download it at deploy time. Now all of these times are measured in milliseconds. For some background, the metrics the HubSpot platform optimizes for are developer time, productivity, and happiness. If we found a way to save 30 seconds for our developers on builds and deploys but it used 3x the bandwidth and storage we would do it in a heartbeat. It just so happened that this approach saves bandwidth and storage in addition to time.
