Mark Reinhold, Principal Engineer at Sun Microsystems, has been arguing that it would be “cool” for the Sun JDK to be modular. He makes a good case that complexity is hurting the platform and that the Java Kernel and Quickstarter features in the JDK 6u10 release address only the symptoms of the JDK’s long-term interconnected growth.
Mark starts off by explaining how the JDK came to be so large:
The JDK is big, too, though not (yet) as big as space.
It’s big because over the last thirteen years the Java SE platform has grown from a small system originally intended for embedded devices into a rich collection of libraries serving a wide variety of needs across a broad range of environments.
It’s incredibly handy to have such a large and capable Swiss-Army knife at one’s disposal, but size is not without its costs.
He continues by explaining the disadvantages that come from this:
The JDK is big—and it’s also deeply interconnected. It has been built, on the whole, as a monolithic software system. In this mode of development it’s completely natural to take advantage of other parts of the platform when writing new code or even just improving old code, relying upon the flexible linking mechanism of the Java virtual machine to make it all work at runtime.
Over the years, however, this style of development can lead to unexpected connections between APIs—and between their implementations—leading in turn to increased startup time and memory footprint. A trivial command-line “Hello, world!” program, e.g., now loads and initializes over 300 separate classes, taking around 100ms on a recent desktop machine despite yet more heroic engineering efforts such as class-data sharing. The situation is even worse, of course, for larger applications.
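The class-loading cost described in the quote above can be observed on any given JVM. The following is a minimal sketch, not taken from Mark's post, that uses the standard `java.lang.management` API to report how many classes have been loaded by the time a trivial program runs. The class name `LoadedClassCount` is made up for illustration, and the number reported will vary with the JDK version and settings.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not from the original post): reports how many
// classes the JVM has loaded by the time a trivial program is running.
final class LoadedClassCount {

    // Number of classes currently loaded in this JVM.
    static int currentlyLoaded() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Hello, world!");
        System.out.println("Classes loaded so far: " + currentlyLoaded());
    }
}
```

Running any program with `java -verbose:class` gives a complementary view, listing each class as it is loaded.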
Mark doesn’t seem to think that the Java Kernel and Quickstarter features in the JDK 6u10 release are enough:
The Java Kernel and Quickstarter features in the JDK 6u10 release do improve download time and (cold) startup time, at least for Windows users. These techniques really just address the symptoms of long-term interconnected growth, however, rather than the underlying cause.
The modular JDK
The most promising way to improve the key metrics of download time, startup time, and memory footprint is to attack the root problem head-on: Divide the JDK into a set of well-specified and separate, yet interdependent, modules.
He goes on to describe how this modularization would benefit the platform:
The process of restructuring the JDK into modules would force all of the unexpected interconnections out into the open where they can be analyzed and, in many cases, either hidden or eliminated. This would, in turn, reduce the total number of classes loaded and thereby improve both startup time and memory footprint.
If we had a modular JDK then at download time we could deliver just those modules required to start a particular application, rather than the entire JRE. The Java Kernel is a first step toward this kind of solution; a further advantage of having well-specified modules is that the download stream could be customized, in advance, to the particular needs of the application at hand.
Weijun comments on the original post that the monolithic nature of the JDK is the result of Java not having a proper way of managing dependencies:
The JDK is big because Java never specified any industrial way of managing software dependencies.
Thus the only way to deploy reliably the Java stack was to bundle it in a huge monolithic monster.
Incidentally, this does not work for third-parties, only for SUN and the JDK. The worst consequence of this lack of dependency management is not the bloated JDK but all the unmanageable apps which have been produced over time, with hardcoded classpath strings and massive forking (because if you can't manage deps and update them independently you may as well fork the private copy of the jars you're forced to bundle with your apps).
You don't have to look much farther to understand why Java only ever thrived in J2EE servers (that provided a bit of the management lacking in the base Java platform).
GeekyCoder argues that a modular JDK is probably not a top priority for most developers:
While it may be "cool", I doubt that this is a top priority for most Java developers.
I have this bad feeling that you might be swayed by a handful of positive responses to your blog, without any idea of whether they're representative of the Java developer community as a whole.
Even just fixing the bugs that have the most votes would be a better strategy than just working on what's "cool".
I suppose "listening to your customers" is just one of those old-fashioned, closed-source ideas that has been thrown overboard. But look at the bright side...you now have two random developers who say they're willing to help. Good luck with that.
Similarly, Michael B seems to think that enterprise users don’t care about a modular JDK:
A modular JDK (or rather JRE) is completely irrelevant to enterprise users. I think enterprise users prefer the "blob" as it is now, because modules mean dependencies and that smells like "DLL hell". The blob is easy to distribute and patch, that's what counts. Also, Java has good upward compatibility: Not only "write once, run everywhere", but also "write once, run forever", which means great return on investment (ROI). This is one of the reasons people prefer Java over .NET. MS strategy has always been: "Here's the new version of insert-favorite-hype-here, please retrain and port all your apps." MS technologies are way too short lived. The Java platform is already modular: The modules are third-party libraries that my app needs. I like the way Sun only makes libraries part of the platform when they have become mature enough. Robustness and reliability are key to Java's success. So please go back now and fix those remaining bugs. I really liked that aspect of the 6u10 release!
You can find more information on the Java Platform or specifically on the Java SE right here on InfoQ.
Community comments
Good Writing.
by Sara Jay,
That was excellent writing, especially the part about a modular JDK not appealing to JEE developers because they simply do not care about initialization time. Developers run a hideous bulk of heavy-lifting init code in a context listener instance that executes for 2 to 3 minutes in some cases. The truth really is that the customer does not care about the initialization time. For JEE apps, the time taken to service each client request (whether from the consumer or an admin) matters more than the initialization time.
A modular JDK really makes more sense for Java ME apps, where every millisecond of performance on resource-constrained hardware matters.
Harmony has already done this.
by Neil Bartlett,
Yes the JDK can and should be modularised, and I'm very glad that somebody like Reinhold is talking about it. But I'm curious why there was no mention of Apache Harmony which has already done exactly this: it has separated the monolithic JRE libraries into modules using OSGi to describe the dependencies.
That means if you don't need one or more of those modules (e.g. you're building a server app and you don't need Swing/AWT) you can simply remove them. There are runtime benefits but also distribution benefits, i.e. you can distribute applications with a slimmed-down copy of the JRE that contains only the modules your application needs. In contrast, the "consumer JRE" from Sun simply gives us a bootstrapping device that downloads the same old cruft in the background.
Now, the modularisation is not perfect because the JRE APIs are somewhat intertwined. For example java.lang depends on java.net and vice versa, so they cannot be separated. Unfortunately fixing this problem would require rewriting a lot of those APIs which would break compatibility for almost every existing Java application. Still, Harmony has done an amazing job of modularising the JRE while remaining compatible with standard Java.
Leonardoavs
by Leonardo Vargas,
DLL Hell is a product of the lack of modularity in COM+. Modularity is not only grouping classes into one package; modularity is versioning, and the coexistence of different versions of the same library in the same virtual machine. I am a developer, and I hate Java when I find that a library needs references but I don't know where those references are, or when different versions of a library cannot coexist. In many cases I have applications that need the same library in different versions, and on the JVM that is not possible, so I need modularity.
Re: Leonardoavs
by andrew mcveigh,
> DLL Hell is a product of the lack of modularity in COM+
DLL Hell comes about because of a single global registry. Lots of different apps want a particular version of a DLL registered with the registry, and this is how the problem occurs. Hence local app-specific registries recently.
COM+ is actually very modular. It's a component model with explicit separation between interfaces and components.
Re: Leonardoavs
by Neil Bartlett,
I'm not familiar with COM+, but the original DLL Hell stems from a single flat location (C:\Windows\System32) in which "modules" were stored with no version information, no dependency information, and often no meaningful name. It was common for an application installer to drop a DLL in there that had the same name as an existing DLL but was actually a different version, thereby breaking some other application.
It's my understanding that Microsoft largely solved these problems with .NET Assemblies. But, I'm not a .NET developer so I don't want to comment on that.
In Java today we have JAR Hell, which is comparable to the original DLL Hell, though it tends to affect individual applications rather than the whole O/S. First, we have a single flat location, known as the "classpath". Second we have deployment artefacts (JARs) that contain no dependency information, no versioning and often no meaningful name. I've seen Java projects that have literally hundreds of JARs on their classpath, where even the developers don't know exactly what every JAR does... but they can't remove any JAR because they don't know what might depend on it!
Solving these problems is not rocket science: you simply need modules that are explicit about their version, what they depend on, and what they provide to other modules. The details can be tricky but OSGi has been around for 10 years now refining those details.
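As a rough illustration of the idea in the comment above, here is a minimal sketch in which each module declares its own version and the exact modules it needs, so that unsatisfied dependencies can be detected before anything runs. It is not OSGi's API or metadata format; the names `ModuleCheck`, `ModuleSpec` and `missingDependencies`, and the `name@version` notation, are assumptions made up for this example, and real systems match version ranges rather than exact versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of explicit module metadata; this is not OSGi's API.
final class ModuleCheck {

    // A module states its own name and version plus the modules it needs.
    static final class ModuleSpec {
        final String name;
        final String version;
        final List<String> requires; // entries look like "name@version"

        ModuleSpec(String name, String version, String... requires) {
            this.name = name;
            this.version = version;
            this.requires = Arrays.asList(requires);
        }

        String id() {
            return name + "@" + version;
        }
    }

    // Returns every "module -> dependency" pair that no installed module satisfies.
    static List<String> missingDependencies(List<ModuleSpec> installed) {
        Map<String, ModuleSpec> byId = new HashMap<>();
        for (ModuleSpec m : installed) {
            byId.put(m.id(), m);
        }
        List<String> missing = new ArrayList<>();
        for (ModuleSpec m : installed) {
            for (String dep : m.requires) {
                if (!byId.containsKey(dep)) {
                    missing.add(m.id() + " -> " + dep);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<ModuleSpec> installed = Arrays.asList(
            new ModuleSpec("app", "1.0", "xml-lib@2.1", "logging@1.4"),
            new ModuleSpec("xml-lib", "2.1"),
            new ModuleSpec("logging", "1.3"));
        // Only logging@1.4 is unsatisfied: version 1.3 is installed instead.
        System.out.println(missingDependencies(installed));
    }
}
```

With a flat classpath of anonymous JARs, this kind of check is impossible because the information it relies on is simply not recorded anywhere.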
Re: Leonardoavs
by Brian Edwards,
Maven2 to the rescue!
Re: Leonardoavs
by andrew mcveigh,
yes, that's the first one. There are COM related variants, specifically related to the use of a registry to find which DLLs to use for the subcomponents that make up a larger COM component. In this case, the use of a central registry caused a new form of hell ;-)
I can't really comment on it too much either (I use mainly Java, but work in .Net also). I think the versioning is a big step fwd, but I think the GAC still causes problems. The problem is unavoidable to an extent -- providing a feature to centrally upgrade a particular component/DLL will naturally cause some conflicts/hell in some apps that don't quite conform.
yes, it's bad.
Actually, modules and versioning is really, really hard. Even OSGi's approach has its own set of problems. The problems result from the natural tension between wanting every component within an app (or os?) to use a single current version of another component (for security and state reasons), and conversely insulating existing components from any changes due to the new version.
Consider for instance how Eclipse (built around OSGi) relies on a central registry for plugins. When declaring dependencies, each plugin uses a convention of a version range (i.e. i accept anything less than v4.0.0 for this particular dependency) and relies on non-breaking API changes for minor updates (i.e. 3.5.4 etc). This is so that if you introduce a new version of a plugin, it will be used all over, rather than causing more than 1 version to be run concurrently. In the case of a plugin holding state or contributing to extension points (i.e. something non-trivial), the concurrent version situation is not allowed anyway.
Andrew
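The version-range convention described in the comment above can be sketched as follows. This is an illustrative assumption, not the OSGi or Eclipse API: the class `VersionRange` and its methods are invented for the example, and it handles only numeric `major.minor.micro` versions with an inclusive lower bound and an exclusive upper bound.

```java
// Hypothetical sketch of the version-range convention described above;
// it does not use the OSGi or Eclipse APIs.
final class VersionRange {

    private final int[] minInclusive;
    private final int[] maxExclusive;

    VersionRange(String minInclusive, String maxExclusive) {
        this.minInclusive = parse(minInclusive);
        this.maxExclusive = parse(maxExclusive);
    }

    // Parses "major.minor.micro" into three integers; missing parts count as 0.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] numbers = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    // Compares two parsed versions field by field, most significant first.
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    // True when min <= version < max.
    boolean accepts(String version) {
        int[] v = parse(version);
        return compare(v, minInclusive) >= 0 && compare(v, maxExclusive) < 0;
    }

    public static void main(String[] args) {
        VersionRange range = new VersionRange("3.0.0", "4.0.0");
        System.out.println(range.accepts("3.5.4")); // true: a minor update inside the range
        System.out.println(range.accepts("4.0.0")); // false: the upper bound is excluded
    }
}
```

The exclusive upper bound encodes the assumption that minor updates do not break the API, while a new major version may.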
Re: Leonardoavs
by Natan Cox,
Well. Maven does not save you at all.
The typical Maven repository is configured in such a way that if you download one XML library (XOM for example, love it btw) you'll get dom4j, jaxen, jdom, xalan, xerces and xml-apis.
Re: Leonardoavs
by Alex Panzin,
HA! Downloading the internet is the answer to over-bloated stuff?
Re: Leonardoavs
by Pavel Rodinov,
Maven cannot help you handle different versions of libraries; it can only help you choose an appropriate version of some libraries, which in some cases is not enough. Such problems should be solved by OSGi or Java modules, but because neither is standard yet, it is still problematic.