Serialization Optimization Pitfalls

In a response to a recent JavaLobby thread, Tom Hawtin looks at optimization of serialization and decides that you shouldn't do it.

The JavaLobby post describes a few different options for optimization and finds that using the Externalizable interface instead of Serializable yields significant performance gains (up to 55% in one version of the JDK). Hawtin noted some flaws in the microbenchmark, most significantly:

Serializable objects that implement Externalizable do not have fields included in their class descriptors. Not a lot of people know that. The descriptions are used if there is no writeObject/readObject method and for defaultReadObject and readFields. So, for fairness, the fields for the writeObject/readObject version should be marked as transient.
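
As a minimal sketch of what that fairness fix could look like (the class and method names below are hypothetical and are not taken from Hawtin's benchmark), a hand-serialized class can mark its fields transient so they stay out of the class descriptor, then write and read them explicitly:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical example class, used only for illustration.
final class ManualPoint implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient keeps these fields out of the class descriptor,
    // which matches what happens for an Externalizable class.
    private transient int x;
    private transient int y;

    ManualPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        // Called first as the serialization spec asks; with every field
        // transient there are no default field values to write.
        out.defaultWriteObject();
        out.writeInt(x);
        out.writeInt(y);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Read back in exactly the order used by writeObject.
        x = in.readInt();
        y = in.readInt();
    }
}

final class TransientFieldsDemo {
    // Serializes to a byte array and deserializes a fresh copy.
    static ManualPoint roundTrip(ManualPoint p)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ManualPoint) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ManualPoint copy = roundTrip(new ManualPoint(3, 4));
        System.out.println(copy.x() + "," + copy.y()); // prints 3,4
        // Number of serializable fields recorded in the descriptor: prints 0
        System.out.println(
                ObjectStreamClass.lookup(ManualPoint.class).getFields().length);
    }
}
```

Removing the transient modifiers would put both fields back into the descriptor, which is the extra cost Hawtin says the original comparison charged to the writeObject/readObject version.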

After running his updated microbenchmark, he draws four conclusions:

  • Don't use Externalizable
  • Do reuse streams
  • Implementing writeObject and readObject by hand can improve performance
  • JVMs get better at serialization with each release
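
One reading of "reuse streams" is to push many objects through a single ObjectOutputStream instead of creating a new stream per object. The sketch below is an assumption about what that means in practice, not Hawtin's benchmark code, and all names in it are hypothetical. It compares the bytes produced by each approach rather than timing them, since byte counts are deterministic:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class StreamReuseDemo {
    // A new ObjectOutputStream per object: each one rewrites the stream
    // header and the full class descriptor.
    static int bytesWithFreshStreams(List<? extends Serializable> items)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (Serializable item : items) {
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(item);
            out.flush();
        }
        return bytes.size();
    }

    // One ObjectOutputStream for all objects: the header is written once and
    // a class descriptor already sent is replaced by a short back-reference.
    static int bytesWithOneStream(List<? extends Serializable> items)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Serializable item : items) {
                out.writeObject(item);
            }
        }
        return bytes.size();
    }

    public static void main(String[] args) throws IOException {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            items.add(i);
        }
        System.out.println("fresh streams: " + bytesWithFreshStreams(items));
        System.out.println("one stream:    " + bytesWithOneStream(items));
    }
}
```

A long-lived stream keeps a reference to every object it has written until reset() is called, so reuse trades repeated descriptor overhead for memory held by the stream.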

In a follow-up to his benchmark, Hawtin finds a small but valuable optimization. He notes that calling ObjectStreamClass.lookup on every class you are going to serialize, and assigning the results to static fields, provides a boost. The boost comes from the fact that the soft reference cache used by serialization will never miss if you hold on to a reference. He has not yet updated his benchmarks to show the impact of this change.
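
A minimal sketch of that idea follows, assuming a hypothetical application class PinnedOrder and a hypothetical holder class DescriptorPins; neither name comes from Hawtin's post. Whether this still pays off on a given JVM is something to measure, since the caching inside java.io has changed between releases:

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical application class that is serialized frequently.
final class PinnedOrder implements Serializable {
    private static final long serialVersionUID = 1L;
    final long id;

    PinnedOrder(long id) {
        this.id = id;
    }
}

// Hypothetical holder: the static fields keep strong references to the
// descriptors, so a softly referenced cache entry cannot be cleared.
final class DescriptorPins {
    static final ObjectStreamClass ORDER =
            ObjectStreamClass.lookup(PinnedOrder.class);
    static final ObjectStreamClass STRING =
            ObjectStreamClass.lookup(String.class);

    private DescriptorPins() {}

    public static void main(String[] args) {
        System.out.println(ORDER.getName()
                + " serialVersionUID=" + ORDER.getSerialVersionUID());
    }
}
```

The lookups run only when DescriptorPins is initialized, so application code has to touch the class at startup for the references to exist before the first serialization call. ObjectStreamClass.lookup returns null for a class that is not serializable, so each pinned class needs to implement Serializable.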

Community comments

  • JBoss Serialization

    by Sacha Labourey

    You might want to check "JBoss Serialization" as well (led by Clebert Suconic):
    labs.jboss.com/portal/serialization/

    From the project page:


    We (java developers) have accepted over the years java.io.ObjectInputStream and java.io.ObjectOutputStream being slow when dealing with writeObject operations.
    We then started using Externalizable objects as a faster approach for serialization, but even that way was slow when using writeObject operations inside externalizable classes.
    Recently we discovered that most of the problems in JavaSerialization are related to static synchronized caching, which causes CPU spikes and also diminishes scaling capabilities.
    With JBossSerialization we have run internal benchmarks and measured serialization at least 2 times faster with this library. These benchmarks are committed to our CVS repository (as testcases) and are publicly available.

    The main feature in JBossSerialization besides performance, is Smart Cloning
    Smart cloning is the capability of reusing final fields among different class loaders, doing exactly what serialization does, without the need to convert every field into a byte array.
    This approach is at least 10 times faster than using serialization over a byte array.

    Using JBossSerialization is very simple:
    Instead of instantiating ObjectOutputStream and ObjectInputStream, you would instantiate JBossObjectInputStream and JBossObjectOutputStream (org.jboss.serial.io). You will also have to add jboss-serialization.jar to your classpath.
    Everything from the specification is then respected. The only exceptions are:
    I - We use a different protocol because we focus on performance (e.g. Smart Cloning), so objects must be both serialized and deserialized with the JBossSerialization classes.
    II - You can serialize classes that do not implement Serializable; this is a feature, and you can enable that check if you want it.

    It works on Java 1.4.2 or JVM 1.5

  • you need multi-threaded microbench

    by Bill Burke

    If you run a multi-threaded microbench you'll find that the SoftReferenceCache mentioned above becomes a point of contention because it is accessed in a synchronized block. Performance starts to degrade beyond roughly 50-100 threads because of the contention. Replacing this cache with a ConcurrentHashMap (I used the oswego one when I did my benchmarks) removes this contention and gives you better scalability.

    Bill

  • Re: you need multi-threaded microbench

    by Thomas Hawtin

    Recent 1.5.0 updates (and 1.6.0) have ditched ye olde SoftCache for a new concurrent-friendly cache. I'm not going to suggest that java.io has respectable performance, but it's an improvement.
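
The data-structure choice discussed in this thread can be sketched as below. This is an illustration only: it is neither the JDK's internal cache nor Burke's patch, application code cannot swap the cache that java.io uses, and the class name is hypothetical. It uses java.util.concurrent.ConcurrentHashMap, the standard-library successor to the oswego class Burke mentions:

```java
import java.io.ObjectStreamClass;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical descriptor cache keyed by class, with no global lock.
final class ConcurrentDescriptorCache {
    private final ConcurrentMap<Class<?>, ObjectStreamClass> cache =
            new ConcurrentHashMap<>();

    ObjectStreamClass get(Class<?> type) {
        // Fast path: ConcurrentHashMap.get does not block.
        ObjectStreamClass cached = cache.get(type);
        if (cached != null) {
            return cached;
        }
        // Slow path: the lookup runs at most once per key, and only threads
        // competing for the same part of the map wait on each other.
        return cache.computeIfAbsent(type, ObjectStreamClass::lookupAny);
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentDescriptorCache descriptors = new ConcurrentDescriptorCache();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                descriptors.get(String.class);
                descriptors.get(Integer.class);
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        System.out.println(descriptors.size()); // prints 2
    }
}
```

Unlike a soft-reference cache, this map holds its Class keys strongly, so entries are never reclaimed under memory pressure and the cached classes cannot be unloaded. That is one trade-off a real replacement would have to address.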
