
Posted by Tom Tromey on Jun 26, 2006
Fedora Core 4 is the first Fedora release to include a substantial amount of code written in the Java programming language. These additions were made possible by improvements in GNU Classpath and GNU gcj.
First, GNU gcj is not Java.
Nevertheless, gcj aims to implement a complete system, compatible with Java, centered around an ahead-of-time compiler. It has a cleanroom class library based on GNU Classpath, and a built-in interpreter. The compiler can compile Java source files, class files, or even entire jar files to object code.
Historically gcj has taken a "radically traditional" approach of treating Java as if it were a somewhat unusual dialect of C++. This yields many nice results, but unfortunately the runtime linking models are too different -- and thus this approach breaks when faced with large, complex Java applications, particularly those that come with sophisticated class loading infrastructures.
For the GCC 4.0 release we implemented a new compilation mode for gcj, called the Binary Compatibility ABI. This compilation approach lets us defer all linking to runtime and fully implement Java's binary compatibility specification -- exactly what is needed to let precompiled code interact properly with class loading.
We also added a class mapping database. At runtime, whenever a class is defined, the gcj runtime (called "libgcj") will look for this class in this database. If the class is found (note that the *contents* of the class are used -- not merely the name), then libgcj will map in a shared library containing the compiled version of the class.
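The mapping described above can be sketched with gcj-dbtool, the database utility that ships with libgcj. This is a minimal sketch, assuming a jar has already been compiled to a shared library; register_jar, the GCJ_DBTOOL variable, and all file names are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sketch: registering a precompiled jar in a class mapping database.
# register_jar and GCJ_DBTOOL are hypothetical names used only in this example.
GCJ_DBTOOL=${GCJ_DBTOOL:-gcj-dbtool}

# register_jar DB JAR LIB
#   DB   class mapping database file (created if it does not exist yet)
#   JAR  jar file whose classes should be recorded
#   LIB  shared library that holds the compiled form of JAR
register_jar() {
    db=$1
    jar=$2
    lib=$3
    if [ ! -f "$db" ]; then
        "$GCJ_DBTOOL" -n "$db" || return 1   # -n: create a new, empty database
    fi
    "$GCJ_DBTOOL" -a "$db" "$jar" "$lib"     # -a: map every class in JAR to LIB
}
```

A call such as `register_jar foo.db foo.jar /usr/lib/gcj/foo/foo.jar.so` would record the classes of foo.jar, and `gcj-dbtool -l foo.db` should then list the entries.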
These two changes, taken together, let us do something quite powerful: we can compile Java programs ahead-of-time without requiring any application-level changes. Furthermore, due to a novel approach to bytecode verification, we're also able to ensure runtime type safety of the compiled code.
Building your Java program on Fedora Core is straightforward -- your existing build infrastructure should work as-is. Fedora Core ships 'ant' and uses the Java compiler from Eclipse to convert your Java programs to bytecode.
Describing how to write a source RPM is outside the scope of this article, but the Fedora RPM Guide has useful general information and JPackage has some Java-specific guidelines.
Once your program is compiled to bytecode, you can compile it to native code. Because gcj does not yet include a JIT, this is the way to get reasonable performance. In some cases the result may outperform existing JITs, because natively compiled code lives in shared libraries that the operating system can share between processes. Expect the biggest difference when running many instances of your application simultaneously.
Fedora provides two programs which make it very simple to natively compile your package. These are used when building an RPM.
The first program is 'aot-compile-rpm'. This searches for jar files and compiles each to a shared library, using gcj. aot-compile-rpm knows a few useful gcj-specific tricks, such as splitting large jar files into pieces before compilation (compiling a large jar in one go can use an enormous amount of memory), and using -Bsymbolic when linking the resulting shared library (this results in a runtime performance improvement).
If you are not building an RPM, you can instead compile each jar file in your program to a shared library yourself. Here we show the simplest approach (as mentioned earlier, for a large jar this might be quite slow and memory-hungry):
gcj -fjni -findirect-dispatch -fPIC -shared \
    -Wl,-Bsymbolic -o foo.jar.so \
    foo.jar
Breaking it down:
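One way to read the command is flag by flag. The sketch below rebuilds the same command line in a small shell helper, with a comment on each option. The flag descriptions follow the GCC and gcj documentation; compile_jar and the GCJ variable are hypothetical names introduced here for illustration, not part of gcj or Fedora.

```shell
#!/bin/sh
# Sketch: the gcj command from the text, assembled one flag at a time.
# compile_jar and GCJ are hypothetical names used only in this example.
GCJ=${GCJ:-gcj}

GCJ_FLAGS="-fjni"                           # native methods in the jar use JNI rather than gcj's own CNI
GCJ_FLAGS="$GCJ_FLAGS -findirect-dispatch"  # select the Binary Compatibility ABI (linking deferred to runtime)
GCJ_FLAGS="$GCJ_FLAGS -fPIC"                # emit position-independent code, as a shared library needs
GCJ_FLAGS="$GCJ_FLAGS -shared"              # produce a shared library instead of an executable
GCJ_FLAGS="$GCJ_FLAGS -Wl,-Bsymbolic"       # ask the linker to bind global references inside the library itself

# compile_jar JAR: compile JAR to a shared library named JAR.so
compile_jar() {
    # GCJ_FLAGS is deliberately unquoted so that it splits into separate arguments.
    $GCJ $GCJ_FLAGS -o "$1.so" "$1"
}
```

Calling `compile_jar foo.jar` runs the command shown above. For a program with several jars, a loop such as `for jar in lib/*.jar; do compile_jar "$jar"; done` would cover them all (the lib/ directory is an assumption).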
The second useful program supplied with Fedora Core is rebuild-gcj-db. This program is used when installing or uninstalling an RPM to update the global class mapping database, and should be run in your RPM's %post and %postun sections.
rebuild-gcj-db operates by convention -- it assumes that each package installs its own class mapping database somewhere beneath /usr/lib/gcj (/usr/lib64/gcj for a 64 bit package on a multi-arch OS -- there are RPM macros to abstract out this oddity). Then it simply loops over all these individual databases and merges them into the global database which is used at runtime.
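The merge step described above can be approximated with gcj-dbtool's -m option, which merges several databases into a new one. This is only a sketch of the idea, not the actual rebuild-gcj-db script: merge_package_dbs and GCJ_DBTOOL are hypothetical names, and the layout of one directory per package directly beneath the base directory is an assumption.

```shell
#!/bin/sh
# Sketch: merging per-package class mapping databases into one global database.
# merge_package_dbs and GCJ_DBTOOL are hypothetical names used only in this example.
GCJ_DBTOOL=${GCJ_DBTOOL:-gcj-dbtool}

# merge_package_dbs DIR GLOBAL
#   DIR     base directory, e.g. /usr/lib/gcj, holding one subdirectory per package
#   GLOBAL  path of the merged database to (re)create
merge_package_dbs() {
    dir=$1
    global=$2
    set --                              # reuse "$@" as the list of databases found
    for db in "$dir"/*/*.db; do
        if [ -f "$db" ]; then           # skips the unexpanded pattern when nothing matches
            set -- "$@" "$db"
        fi
    done
    [ "$#" -gt 0 ] || return 0          # nothing installed, nothing to merge
    # Merge into a temporary file first, then move it into place,
    # so a failed merge leaves the existing global database untouched.
    "$GCJ_DBTOOL" -m "$global.tmp" "$@" && mv "$global.tmp" "$global"
}
```

The default location of the global database can be queried with `gcj-dbtool -p`, so a call might look like `merge_package_dbs /usr/lib/gcj "$(gcj-dbtool -p)"`.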
Note that it isn't always possible to compile a Java program using gcj. Gcj's class library is not quite complete, and on occasion a program will use some APIs which have not yet been implemented. For instance, Swing is currently under active development. Also, despite Sun's warning, some packages use private com.sun.* or sun.* APIs -- and generally speaking gcj does not implement these.
Fedora Core 4 uses gcj compilation for a number of programs.
The JOnAS application server, a J2EE implementation, is also working but has not yet gone through the Fedora Extras review process.
This past year the GNU Classpath community has made great strides toward completion, and we expect to continue this through 2006. We maintain a regularly updated API comparison page where you can track our API status.
We're also working on some core improvements to gcj: updating the compiler, runtime, and class libraries to Java 5.
Finally, we're implementing the Java security infrastructure in libgcj. This will let us ship a Mozilla plugin and netx, a Java Web Start implementation.
Tom Tromey graduated from Caltech in 1990. He works on the GNU Java compiler and runtime for Red Hat. He wrote GNU Automake.