InfoQ

The Massive, Monolithic JDK should become Modular

Posted by Dionysios G. Synodinos on Nov 26, 2008

Sections: Development, Enterprise Architecture
Topics: Runtimes, Platforms, Java
Tags: Java SE, JDK

Mark Reinhold, Principal Engineer at Sun Microsystems, has been arguing that it would be “cool” for the Sun JDK to be modular. He makes a good case that complexity is hurting the platform and that the Java Kernel and Quickstarter features in the JDK 6u10 release address only the symptoms of the JDK’s long-term interconnected growth.

Mark starts off by explaining how the JDK grew so large:

The JDK is big, too, though not (yet) as big as space.

It’s big because over the last thirteen years the Java SE platform has grown from a small system originally intended for embedded devices into a rich collection of libraries serving a wide variety of needs across a broad range of environments.

It’s incredibly handy to have such a large and capable Swiss-Army knife at one’s disposal, but size is not without its costs.

He continues by explaining the costs that come with this size:

The JDK is big—and it’s also deeply interconnected. It has been built, on the whole, as a monolithic software system. In this mode of development it’s completely natural to take advantage of other parts of the platform when writing new code or even just improving old code, relying upon the flexible linking mechanism of the Java virtual machine to make it all work at runtime.

Over the years, however, this style of development can lead to unexpected connections between APIs—and between their implementations—leading in turn to increased startup time and memory footprint. A trivial command-line “Hello, world!” program, e.g., now loads and initializes over 300 separate classes, taking around 100ms on a recent desktop machine despite yet more heroic engineering efforts such as class-data sharing. The situation is even worse, of course, for larger applications.
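
The class-loading figure quoted above can be checked on any given JVM. The following is a minimal sketch, not taken from Reinhold's post: it uses the standard `java.lang.management` API to report how many classes are loaded by the time a trivial program has printed its greeting. The class name `LoadedClassCount` is illustrative, and because querying the management bean loads some classes of its own, the number it prints will be somewhat higher than for a bare "Hello, world!".

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

final class LoadedClassCount {
    // Returns the number of classes currently loaded in this JVM.
    // Note: obtaining the MXBean itself triggers some extra class loading.
    static int loadedClasses() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Hello, world!");
        System.out.println("Classes loaded so far: " + loadedClasses());
    }
}
```

Running any program with the standard `java -verbose:class` option lists each class as it is loaded, which gives a more direct view without the overhead of the management API.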

Mark doesn’t seem to think that the Java Kernel and Quickstarter features in the JDK 6u10 release are enough:

The Java Kernel and Quickstarter features in the JDK 6u10 release do improve download time and (cold) startup time, at least for Windows users. These techniques really just address the symptoms of long-term interconnected growth, however, rather than the underlying cause.

The modular JDK

The most promising way to improve the key metrics of download time, startup time, and memory footprint is to attack the root problem head-on: Divide the JDK into a set of well-specified and separate, yet interdependent, modules.

He goes on to describe how this modularization would benefit the platform:

The process of restructuring the JDK into modules would force all of the unexpected interconnections out into the open where they can be analyzed and, in many cases, either hidden or eliminated. This would, in turn, reduce the total number of classes loaded and thereby improve both startup time and memory footprint.

If we had a modular JDK then at download time we could deliver just those modules required to start a particular application, rather than the entire JRE. The Java Kernel is a first step toward this kind of solution; a further advantage of having well-specified modules is that the download stream could be customized, in advance, to the particular needs of the application at hand.
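
The idea of delivering "just those modules required" amounts to computing the transitive closure of an application's module dependencies. The sketch below illustrates that computation under stated assumptions: the module names (`app`, `net`, `gui`, `base`) and the `requires` map are hypothetical and do not reflect any actual JDK module layout or the Java Kernel's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ModuleClosure {
    // Returns the root module plus every module reachable through "requires" edges.
    static Set<String> requiredModules(String root, Map<String, List<String>> requires) {
        Set<String> needed = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            String module = pending.pop();
            // add() returns false for modules already visited, so cycles terminate.
            if (needed.add(module)) {
                for (String dep : requires.getOrDefault(module, List.of())) {
                    pending.push(dep);
                }
            }
        }
        return needed;
    }

    public static void main(String[] args) {
        // Hypothetical module graph; the names are illustrative only.
        Map<String, List<String>> requires = Map.of(
            "app", List.of("base", "net"),
            "net", List.of("base"),
            "gui", List.of("base"),
            "base", List.of());
        System.out.println(requiredModules("app", requires));
    }
}
```

In this example the `gui` module is never reached from `app`, so a download stream built from the result would leave it out.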

Weijun, commenting on the original post, argues that the monolithic nature of the JDK is the result of Java lacking a proper way to manage dependencies:

The JDK is big because Java never specified any industrial way of managing software dependencies.

Thus the only way to deploy reliably the Java stack was to bundle it in a huge monolithic monster.

Incidentally, this does not work for third parties, only for Sun and the JDK. The worst consequence of this lack of dependency management is not the bloated JDK but all the unmanageable apps which have been produced over time, with hardcoded classpath strings and massive forking (because if you can't manage deps and update them independently you may as well fork the private copy of the jars you're forced to bundle with your apps).

You don't have to look much farther to understand why Java only ever thrived in J2EE servers (that provided a bit of the management lacking in the base Java platform).

GeekyCoder argues that a modular JDK is probably not a top priority for most developers:

While it may be "cool", I doubt that this is a top priority for most Java developers.

I have this bad feeling that you might be swayed by a handful of positive responses to your blog, without any idea of whether they're representative of the Java developer community as a whole.

Even just fixing the bugs that have the most votes would be a better strategy than just working on what's "cool".

I suppose "listening to your customers" is just one of those old-fashioned, closed-source ideas that has been thrown overboard. But look at the bright side...you now have two random developers who say they're willing to help. Good luck with that.

Similarly, Michael B seems to think that enterprise users don’t care about a modular JDK:

A modular JDK (or rather JRE) is completely irrelevant to enterprise users. I think enterprise users prefer the "blob" as it is now, because modules mean dependencies and that smells like "DLL hell". The blob is easy to distribute and patch, that's what counts. Also, Java has good upward compatibility: Not only "write once, run everywhere", but also "write once, run forever", which means great return on investment (ROI). This is one of the reasons people prefer Java over .NET. MS strategy has always been: "Here's the new version of insert-favorite-hype-here, please retrain and port all your apps." MS technologies are way too short lived. The Java platform is already modular: The modules are third-party libraries that my app needs. I like the way Sun only makes libraries part of the platform when they have become mature enough. Robustness and reliability are key to Java's success. So please go back now and fix those remaining bugs. I really liked that aspect of the 6u10 release!

You can find more information on the Java Platform or specifically on the Java SE right here on InfoQ.

Dionysios G. Synodinos is a Web Engineer and a freelance consultant, focusing on Web technologies.

10 comments


  1. Good Writing.

    by Ashwanth Fernando

    That was excellent writing, especially the part about a modular JDK not appealing to JEE developers, because JEE developers simply do not care about initialization time. Developers run a hideous bulk of heavy-lifting init code in a context listener instance that executes for 2 to 3 minutes in some cases. The truth really is that the customer does not care about the initialization time. For JEE apps, the time taken to service each client request (whether from the consumer or an admin) matters more than the initialization time.
    A modular JDK really makes more sense for Java ME apps, where every millisecond of performance on resource-constrained hardware really matters.

  2. Harmony has already done this.

    by Neil Bartlett

    Yes the JDK can and should be modularised, and I'm very glad that somebody like Reinhold is talking about it. But I'm curious why there was no mention of Apache Harmony which has already done exactly this: it has separated the monolithic JRE libraries into modules using OSGi to describe the dependencies.

    That means if you don't need one or more of those modules (e.g. you're building a server app and you don't need Swing/AWT) you can simply remove them. There are runtime benefits but also distribution benefits, i.e. you can distribute applications with a slimmed-down copy of the JRE that contains only the modules your application needs. In contrast, the "consumer JRE" from Sun simply gives us a bootstrapping device that downloads the same old cruft in the background.

    Now, the modularisation is not perfect because the JRE APIs are somewhat intertwined. For example java.lang depends on java.net and vice versa, so they cannot be separated. Unfortunately fixing this problem would require rewriting a lot of those APIs which would break compatibility for almost every existing Java application. Still, Harmony has done an amazing job of modularising the JRE while remaining compatible with standard Java.

  3. Leonardoavs

    by Leonardo Vargas

    DLL Hell is produced by the lack of modularity in COM+. Modularity is not only grouping classes into one package; modularity is versioning, the coexistence of different versions of the same library in the same virtual machine. I am a developer, and I hate Java when I find that a library needs references but I don't know where those references are, or when different versions of a library cannot coexist. In many cases I have applications that need the same library but in different versions, and that is not possible in the JVM, so I need modularity.

  4. Re: Leonardoavs

    by andrew mcveigh

    > DLL Hell is produced by the lack of modularity in COM+

    DLL Hell comes about because of a single global registry. Lots of different apps want a particular version of a DLL registered with the registry, and this is how the problem occurs. Hence the recent move to local, app-specific registries.

    COM+ is actually very modular. It's a component model with explicit separation between interfaces and components.

  5. Re: Leonardoavs

    by Neil Bartlett

    I'm not familiar with COM+, but the original DLL Hell stems from a single flat location (C:\Windows\System32) in which "modules" were stored with no version information, no dependency information, and often no meaningful name. It was common for an application installer to drop a DLL in there that had the same name as an existing DLL but was actually a different version, thereby breaking some other application.

    It's my understanding that Microsoft largely solved these problems with .NET Assemblies. But, I'm not a .NET developer so I don't want to comment on that.

    In Java today we have JAR Hell, which is comparable to the original DLL Hell, though it tends to affect individual applications rather than the whole O/S. First, we have a single flat location, known as the "classpath". Second we have deployment artefacts (JARs) that contain no dependency information, no versioning and often no meaningful name. I've seen Java projects that have literally hundreds of JARs on their classpath, where even the developers don't know exactly what every JAR does... but they can't remove any JAR because they don't know what might depend on it!

    Solving these problems is not rocket science: you simply need modules that are explicit about their version, what they depend on, and what they provide to other modules. The details can be tricky but OSGi has been around for 10 years now refining those details.

  6. Re: Leonardoavs

    by Brian Edwards

    Maven2 to the rescue!

  7. Re: Leonardoavs

    by andrew mcveigh

    > I'm not familiar with COM+, but the original DLL Hell stems from a single flat location (C:\Windows\System32) in which "modules" were stored with no version information, no dependency information, and often no meaningful name.

    yes, that's the first one. There are COM related variants, specifically related to the use of a registry to find which DLLs to use for the subcomponents that make up a larger COM component. In this case, the use of a central registry caused a new form of hell ;-)

    > It's my understanding that Microsoft largely solved these problems with .NET Assemblies. But, I'm not a .NET developer so I don't want to comment on that.

    I can't really comment on it too much either (I use mainly Java, but work in .Net also). I think the versioning is a big step fwd, but I think the GAC still causes problems. The problem is unavoidable to an extent -- providing a feature to centrally upgrade a particular component/DLL will naturally cause some conflicts/hell in some apps that don't quite conform.

    > In Java today we have JAR Hell, which is comparable to the original DLL Hell, though it tends to affect individual applications rather than the whole O/S.

    yes, it's bad.

    > Solving these problems is not rocket science: you simply need modules that are explicit about their version, what they depend on, and what they provide to other modules. The details can be tricky but OSGi has been around for 10 years now refining those details.

    Actually, modules and versioning is really, really hard. Even OSGi's approach has its own set of problems. The problems result from the natural tension between wanting every component within an app (or os?) to use a single current version of another component (for security and state reasons), and conversely insulating existing components from any changes due to the new version.

    Consider for instance how Eclipse (built around OSGi) relies on a central registry for plugins. When declaring dependencies, each plugin uses a convention of a version range (i.e. i accept anything less than v4.0.0 for this particular dependency) and relies on non-breaking API changes for minor updates (i.e. 3.5.4 etc). This is so that if you introduce a new version of a plugin, it will be used all over, rather than causing more than 1 version to be run concurrently. In the case of a plugin holding state or contributing to extension points (i.e. something non-trivial), the concurrent version situation is not allowed anyway.

    Andrew
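
The version-range convention described in the comment above can be sketched in a few lines. This is an illustrative example only, not Eclipse's or OSGi's actual implementation: the class `VersionRange` and its methods are hypothetical, versions are assumed to be plain numeric `major.minor.micro` strings, and the range is treated as half-open (lower bound included, upper bound excluded), which OSGi would write as `[3.0.0,4.0.0)`.

```java
final class VersionRange {
    // Parses "major.minor.micro" into three integers; missing parts count as 0.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3; i++) {
            out[i] = i < parts.length ? Integer.parseInt(parts[i]) : 0;
        }
        return out;
    }

    // Compares two versions segment by segment, most significant first.
    static int compare(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        for (int i = 0; i < 3; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return 0;
    }

    // True when low <= candidate < high.
    static boolean accepts(String low, String high, String candidate) {
        return compare(low, candidate) <= 0 && compare(candidate, high) < 0;
    }

    public static void main(String[] args) {
        System.out.println(accepts("3.0.0", "4.0.0", "3.5.4")); // true
        System.out.println(accepts("3.0.0", "4.0.0", "4.0.0")); // false
    }
}
```

A dependency declared this way picks up any non-breaking 3.x update automatically, while a 4.0.0 release, which may break the API, is rejected until the dependent plugin widens its range.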

  8. Re: Leonardoavs

    by Natan Cox

    Well. Maven does not save you at all.

    The typical Maven repository is configured in such a way that if you download one XML library (XOM for example, love it btw) you'll get dom4j, jaxen, jdom, xalan, xerces and xml-apis.

  9. Re: Leonardoavs

    by Alex Panzin

    HA! Downloading the internet is the answer to over-bloated stuff?

  10. Re: Leonardoavs

    by Pavel Rodinov

    > Maven2 to the rescue!

    Maven cannot help you handle different versions of the same library side by side; it can only help you choose an appropriate version of each library, which in some cases is not enough. Such problems should be solved by OSGi or Java modules, but because that is not standard yet, it's still problematic.
