Solving Fat JAR Woes at HubSpot

Spring Boot 1.4 and Dropwizard 1.0 were both released at the end of July, both based on fat JARS. As adoption of such frameworks and microservices increases, fat JARs are becoming a more common deployment mechanism.

Fat JARs is a technique for packaging all dependencies of a Java application into a single bundle for execution, and is used by many Java microservice frameworks, including Spring Boot and Dropwizard. There's even a Fat JAR Eclipse Plug-In.

For organizations with a few microservices, the bandwidth used by fat JARs is hardly noticeable. However, when you get into thousands of microservices, bandwidth usage can become an issue.

Earlier this summer, HubSpot cited issues with Fat JARs as a deployment technique experiencing problems with the maven-shade-plugin, and efficiency problems when packaging 100,000 tiny files as a JAR. They also mentioned a large duplication of dependency JARs stemming from their 1,000 plus applications constantly building and deploying.

They experimented with the maven-dependency-plugin to reduce the bloat, but their efforts didn't reduce the size of the generated build artifacts.

To cure their fat JAR pain, HubSpot created the SlimFast plugin for Maven, which creates a build artifact containing only the specific projects classes. It piggybacks on the deploy phase and uploads all of the application's dependencies to Amazon Simple Storage Service (S3) individually. Using this plugin HubSpot reportedly realized 60% faster build times and a 99% increase in storage capacity.

The graph below shows how much faster builds have been for them after using SlimFast.

Hubspot Build Times Graph

To learn more about HubSpot's fat JAR issues, InfoQ interviewed Staff Software Engineer Jonathan Haber.

InfoQ: Are the fat JAR problems you experienced mostly due to continuous integration and deployment?

Jonathan Haber: Yes, I think the issues we ran into are largely caused by our style of development. We have lots of small teams pushing code, building, and deploying hundreds of times per day. Because of our small units of build, it would often take longer to create and upload the fat JAR than to actually compile and test the code. On the other hand, if you have a monolith that takes 20+ minutes to build, then the overhead of a fat JAR probably isn't very noticeable. But I think more companies are moving to this faster, lighter style of development and may run into the same challenges.

InfoQ: Do you think alternate packaging techniques like SlimFast provides should be native to frameworks like Spring Boot and Dropwizard?

Haber: Because this approach requires integration with the build and deploy system, my feeling is that it's too opinionated to include in something like Spring Boot or Dropwizard. However, one way to handle this would be to put the SlimFast plugin in a Maven profile activated by an environment variable. That way the build system could indicate that it supports this feature, otherwise it falls back to using a fat JAR.

InfoQ: If cloud providers (e.g. Heroku, CloudFoundry, etc.) adopted a similar technique to reduce duplication of JARs among applications, could they save a lot of money on bandwidth?

Haber: I'm not sure what savings are achievable, but I think it would be possible to use a similar strategy. However, we have the advantage of all of our apps using the same versions for 3rd party libraries and having a huge amount of overlap in terms of libraries used. For cloud providers, their users will depend on a much wider array of libraries across all different versions, so if you wanted to cache dependencies on the application servers it would take up a huge amount of space. But if you didn't, you'd lose out on a lot of the speed/bandwidth savings. That's not to say there aren't savings to be had, I just think the implementation would need to be more sophisticated than our approach. Another issue you run into is that usually these cloud providers are just running Maven with the POM supplied by the user so they don't have much control over the build lifecycle to add these types of optimizations.

InfoQ: Are there any additional improvements you'd like to see in fat JAR applications?

Haber: I'm not sure if this is on the list for Java 9, but if Java could handle nested JARs it would make it a lot easier to build and run a fat JAR. Tools like Spring Boot and One-JAR do a good job of working around this limitation, but they add complexity and can never be completely transparent.

InfoQ Software Architects' Newsletter

Login with:

Don't have an InfoQ account?

InfoQ Article Contest

Rate this Article

This content is in the Java topic

Related Topics:

Related Editorial

Related Sponsored Content

Popular across InfoQ

The InfoQ Newsletter