The Massive, Monolithic JDK should become Modular
Mark Reinhold, Principal Engineer at Sun Microsystems, has been advocating how “cool” it would be for the Sun JDK to be modular. He makes a good argument that complexity is hurting the platform and that the Java Kernel and Quickstarter features in the JDK 6u10 release only address the symptoms of the JDK’s long-term interconnected growth.
Mark starts off by explaining how we got to the current state of having such a huge JDK:
The JDK is big, too, though not (yet) as big as space.
It’s big because over the last thirteen years the Java SE platform has grown from a small system originally intended for embedded devices into a rich collection of libraries serving a wide variety of needs across a broad range of environments.
It’s incredibly handy to have such a large and capable Swiss-Army knife at one’s disposal, but size is not without its costs.
He continues by explaining the disadvantages that come from this:
The JDK is big—and it’s also deeply interconnected. It has been built, on the whole, as a monolithic software system. In this mode of development it’s completely natural to take advantage of other parts of the platform when writing new code or even just improving old code, relying upon the flexible linking mechanism of the Java virtual machine to make it all work at runtime.
Over the years, however, this style of development can lead to unexpected connections between APIs—and between their implementations—leading in turn to increased startup time and memory footprint. A trivial command-line “Hello, world!” program, e.g., now loads and initializes over 300 separate classes, taking around 100ms on a recent desktop machine despite yet more heroic engineering efforts such as class-data sharing. The situation is even worse, of course, for larger applications.
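The class-loading figure quoted above can be checked on any given JVM. The following is a minimal sketch, not taken from the original post, that uses the standard ClassLoadingMXBean to report how many classes are loaded by the time a trivial program runs. The class name LoadedClassCount is made up for illustration, and the number will vary with the JDK version and will be somewhat inflated because the management API itself loads classes.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

/** Illustrative only: reports how many classes the running JVM has loaded so far. */
final class LoadedClassCount {

    /** Number of classes currently loaded, as reported by the platform MXBean. */
    static int loadedClassCount() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Hello, world!");
        // The count includes classes pulled in by the management API itself.
        System.out.println("Classes loaded so far: " + loadedClassCount());
    }
}
```

Running any program with the launcher's -verbose:class option lists each class as it is loaded, which gives a view without the measurement overhead.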
Mark doesn’t seem to think that the Java Kernel and Quickstarter features in the JDK 6u10 release are enough:
The Java Kernel and Quickstarter features in the JDK 6u10 release do improve download time and (cold) startup time, at least for Windows users. These techniques really just address the symptoms of long-term interconnected growth, however, rather than the underlying cause.
The modular JDK
The most promising way to improve the key metrics of download time, startup time, and memory footprint is to attack the root problem head-on: Divide the JDK into a set of well-specified and separate, yet interdependent, modules.
He goes on to explain how this modularization would benefit the platform:
The process of restructuring the JDK into modules would force all of the unexpected interconnections out into the open where they can be analyzed and, in many cases, either hidden or eliminated. This would, in turn, reduce the total number of classes loaded and thereby improve both startup time and memory footprint.
If we had a modular JDK then at download time we could deliver just those modules required to start a particular application, rather than the entire JRE. The Java Kernel is a first step toward this kind of solution; a further advantage of having well-specified modules is that the download stream could be customized, in advance, to the particular needs of the application at hand.
Weijun comments on the original post that the monolithic nature of the JDK is the result of Java not having a proper way of managing dependencies:
The JDK is big because Java never specified any industrial way of managing software dependencies.
Thus the only way to deploy reliably the Java stack was to bundle it in a huge monolithic monster.
Incidentally, this does not work for third-parties, only for Sun and the JDK. The worst consequence of this lack of dependency management is not the bloated JDK but all the unmanageable apps which have been produced over time, with hardcoded classpath strings and massive forking (because if you can't manage deps and update them independently you may as well fork the private copy of the jars you're forced to bundle with your apps).
You don't have to look much farther to understand why Java only ever thrived in J2EE servers (that provided a bit of the management lacking in the base Java platform).
GeekyCoder argues that a modular JDK is probably not a top priority for most developers:
While it may be "cool", I doubt that this is a top priority for most Java developers.
I have this bad feeling that you might be swayed by a handful of positive responses to your blog, without any idea of whether they're representative of the Java developer community as a whole.
Even just fixing the bugs that have the most votes would be a better strategy than just working on what's "cool".
I suppose "listening to your customers" is just one of those old-fashioned, closed-source ideas that has been thrown overboard. But look at the bright side...you now have two random developers who say they're willing to help. Good luck with that.
Similarly, Michael B seems to think that enterprise users don’t care about a modular JDK:
A modular JDK (or rather JRE) is completely irrelevant to enterprise users. I think enterprise users prefer the "blob" as it is now, because modules mean dependencies and that smells like "DLL hell". The blob is easy to distribute and patch, that's what counts. Also, Java has good upward compatibility: Not only "write once, run everywhere", but also "write once, run forever", which means great return on investment (ROI). This is one of the reasons people prefer Java over .NET. MS strategy has always been: "Here's the new version of insert-favorite-hype-here, please retrain and port all your apps." MS technologies are way too short lived. The Java platform is already modular: The modules are third-party libraries that my app needs. I like the way Sun only makes libraries part of the platform when they have become mature enough. Robustness and reliability are key to Java's success. So please go back now and fix those remaining bugs. I really liked that aspect of the 6u10 release!
You can find more information on the Java Platform or specifically on the Java SE right here on InfoQ.
Good Writing.
by
Ashwanth Fernando
Modular JDK really really makes more sense for the Java ME apps where every millisecond of performance on resource-constrained hardware really matters.
Harmony has already done this.
by
Neil Bartlett
That means if you don't need one or more of those modules (e.g. you're building a server app and you don't need Swing/AWT) you can simply remove them. There are runtime benefits but also distribution benefits, i.e. you can distribute applications with a slimmed-down copy of the JRE that contains only the modules your application needs. In contrast, the "consumer JRE" from Sun simply gives us a bootstrapping device that downloads the same old cruft in the background.
Now, the modularisation is not perfect because the JRE APIs are somewhat intertwined. For example java.lang depends on java.net and vice versa, so they cannot be separated. Unfortunately fixing this problem would require rewriting a lot of those APIs which would break compatibility for almost every existing Java application. Still, Harmony has done an amazing job of modularising the JRE while remaining compatible with standard Java.
Leonardoavs
by
Leonardo Vargas
Re: Leonardoavs
by
andrew mcveigh
DLL Hell comes about because of a single global registry. Lots of different apps want a particular version of a DLL registered with the registry, and this is how the problem occurs. Hence local app-specific registries recently.
COM+ is actually very modular. It's a component model with explicit separation between interfaces and components.
Re: Leonardoavs
by
Neil Bartlett
It's my understanding that Microsoft largely solved these problems with .NET Assemblies. But, I'm not a .NET developer so I don't want to comment on that.
In Java today we have JAR Hell, which is comparable to the original DLL Hell, though it tends to affect individual applications rather than the whole O/S. First, we have a single flat location, known as the "classpath". Second we have deployment artefacts (JARs) that contain no dependency information, no versioning and often no meaningful name. I've seen Java projects that have literally hundreds of JARs on their classpath, where even the developers don't know exactly what every JAR does... but they can't remove any JAR because they don't know what might depend on it!
Solving these problems is not rocket science: you simply need modules that are explicit about their version, what they depend on, and what they provide to other modules. The details can be tricky but OSGi has been around for 10 years now refining those details.
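To make the idea of explicit module metadata concrete, here is a minimal sketch, assuming a hypothetical Descriptor type that records a module's name, version, the packages it provides and the packages it requires. It is not OSGi's actual API or manifest format, and it matches requirements by package name only, ignoring versions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: checks that every declared requirement has a provider. */
final class ModuleCheck {

    /** Hypothetical descriptor: what a module is, what it offers, what it needs. */
    record Descriptor(String name, String version, Set<String> provides, Set<String> requires) {}

    /** Returns "module -> package" entries for each requirement that no module provides. */
    static List<String> unresolved(List<Descriptor> modules) {
        Set<String> provided = new HashSet<>();
        for (Descriptor module : modules) {
            provided.addAll(module.provides());
        }
        List<String> missing = new ArrayList<>();
        for (Descriptor module : modules) {
            for (String pkg : module.requires()) {
                if (!provided.contains(pkg)) {
                    missing.add(module.name() + " -> " + pkg);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Package and module names below are made up for the example.
        List<Descriptor> modules = List.of(
                new Descriptor("app", "1.0.0", Set.of(), Set.of("org.example.xml", "org.example.log")),
                new Descriptor("xml-lib", "2.1.0", Set.of("org.example.xml"), Set.of()));
        System.out.println(unresolved(modules)); // prints [app -> org.example.log]
    }
}
```

With metadata like this, a missing or unused JAR can be detected mechanically instead of by trial and error on a flat classpath.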
Re: Leonardoavs
by
andrew mcveigh
I'm not familiar with COM+, but the original DLL Hell stems from a single flat location (C:\Windows\System32) in which "modules" were stored with no version information, no dependency information, and often no meaningful name.
yes, that's the first one. There are COM related variants, specifically related to the use of a registry to find which DLLs to use for the subcomponents that make up a larger COM component. In this case, the use of a central registry caused a new form of hell ;-)
It's my understanding that Microsoft largely solved these problems with .NET Assemblies. But, I'm not a .NET developer so I don't want to comment on that.
I can't really comment on it too much either (I use mainly Java, but work in .Net also). I think the versioning is a big step fwd, but I think the GAC still causes problems. The problem is unavoidable to an extent -- providing a feature to centrally upgrade a particular component/DLL will naturally cause some conflicts/hell in some apps that don't quite conform.
In Java today we have JAR Hell, which is comparable to the original DLL Hell, though it tends to affect individual applications rather than the whole O/S.
yes, it's bad.
Solving these problems is not rocket science: you simply need modules that are explicit about their version, what they depend on, and what they provide to other modules. The details can be tricky but OSGi has been around for 10 years now refining those details.
Actually, modules and versioning is really, really hard. Even OSGi's approach has its own set of problems. The problems result from the natural tension between wanting every component within an app (or os?) to use a single current version of another component (for security and state reasons), and conversely insulating existing components from any changes due to the new version.
Consider for instance how Eclipse (built around OSGi) relies on a central registry for plugins. When declaring dependencies, each plugin uses a convention of a version range (i.e. i accept anything less than v4.0.0 for this particular dependency) and relies on non-breaking API changes for minor updates (i.e. 3.5.4 etc). This is so that if you introduce a new version of a plugin, it will be used all over, rather than causing more than 1 version to be run concurrently. In the case of a plugin holding state or contributing to extension points (i.e. something non-trivial), the concurrent version situation is not allowed anyway.
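The version-range convention described above can be sketched as follows. This is a simplified illustration, assuming a three-part numeric version and a half-open range (lower bound accepted, upper bound rejected). The Version and Range types are hypothetical and are not the OSGi framework classes.

```java
/** Illustrative only: a half-open version range such as "3.0.0 up to, but excluding, 4.0.0". */
final class VersionRanges {

    /** Hypothetical three-part version, ordered by major, then minor, then micro. */
    record Version(int major, int minor, int micro) implements Comparable<Version> {

        /** Parses text of the form "major.minor.micro", e.g. "3.5.4". */
        static Version parse(String text) {
            String[] parts = text.split("\\.");
            return new Version(
                    Integer.parseInt(parts[0]),
                    Integer.parseInt(parts[1]),
                    Integer.parseInt(parts[2]));
        }

        @Override
        public int compareTo(Version other) {
            int c = Integer.compare(major, other.major);
            if (c == 0) c = Integer.compare(minor, other.minor);
            if (c == 0) c = Integer.compare(micro, other.micro);
            return c;
        }
    }

    /** Accepts versions v with floor <= v < ceiling. */
    record Range(Version floor, Version ceiling) {
        boolean includes(Version v) {
            return v.compareTo(floor) >= 0 && v.compareTo(ceiling) < 0;
        }
    }

    public static void main(String[] args) {
        Range range = new Range(Version.parse("3.0.0"), Version.parse("4.0.0"));
        System.out.println(range.includes(Version.parse("3.5.4"))); // prints true
        System.out.println(range.includes(Version.parse("4.0.0"))); // prints false
    }
}
```

A range like this only works if minor releases really are non-breaking, which is the convention the comment says plugins rely on.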
Andrew
Re: Leonardoavs
by
Natan Cox
The typical Maven repository is configured in such a way that if you download one XML library (XOM for example, love it btw) you'll get dom4j, jaxen, jdom, xalan, xerces and xml-apis.
Re: Leonardoavs
by
Pavel Rodinov
Maven2 to the rescue!
Maven cannot help you handle different versions of libraries; it can only help you choose an appropriate version of some libraries, which in some cases is not enough. Such problems should be solved by OSGi or Java Modular, but because it's not standard yet, it's still problematic.