Under The Hood with the JVM's Automatic Resource Management

Key Takeaways

  • Learn the difference between C++'s RAII pattern and Java finalization
  • Dive into Hotspot's source code and see how finalizers are registered
  • Compare finalize() with Java 7 try-with-resources
  • See how TWR is implemented in bytecode
  • Understand why TWR is superior to finalize()

Material in this article has been adapted with permission from the forthcoming book "Optimizing Java" by Ben Evans and James Gough. The book is published by O’Reilly and is available now in Early Release from O'Reilly and from Amazon.

InfoQ recently reported on the proposed deprecation of the method finalize() on the Object type. This method has been present since Java 1.0, but is widely regarded as a misfeature and a significant piece of legacy cruft in the platform. Nevertheless, deprecating a method on Java’s Object type would be a highly unusual step.

Background

The finalize() mechanism is an attempt to provide automatic resource management, in a similar way to the RAII (Resource Acquisition Is Initialisation) pattern from C++ and similar languages. In that pattern, a destructor method (whose closest Java counterpart is finalize()) is provided to enable automatic cleanup and release of resources when the object is destroyed.

The basic use case for this is fairly simple - when an object is created, it takes ownership of some resource, and the object’s ownership of that resource persists for the lifetime of the object. Then, when the object dies, the ownership of the resource is automatically relinquished.

Let’s look at a quick simple C++ example that shows how to put an RAII wrapper around C-style file I/O. The core of this technique is that the object destructor method (denoted with a ~ at the start of a method named the same as the class) is used for cleanup:

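#include <cstddef>  // std::size_t
#include <cstdio>   // std::FILE, std::fopen, std::fclose, std::fputs, std::fwrite

class file_error {};

class file {
  public:
     // Acquire the resource in the constructor
     file(const char* filename) : _h_file(std::fopen(filename, "w+")) {
         if (_h_file == NULL) {
             throw file_error();
         }
     }

     // Copying would lead to the same handle being closed twice (C++11 syntax)
     file(const file&) = delete;
     file& operator=(const file&) = delete;

     // Destructor
     ~file() { std::fclose(_h_file); }

     void write(const char* str) {
         if (std::fputs(str, _h_file) == EOF) {
             throw file_error();
         }
     }

     void write(const char* buffer, std::size_t numc) {
         if (numc != 0 && std::fwrite(buffer, numc, 1, _h_file) == 0) {
             throw file_error();
         }
     }

  private:
     std::FILE* _h_file;
};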

The standard rationale for this approach is the observation that when the programmer opens a file handle it is all too easy to forget to call the close() function when it is no longer required, and so tying the resource ownership to the object lifetime makes sense. Getting rid of the object’s resources automatically then becomes the responsibility of the platform, not the programmer.

This promotes good design, especially when the only reason for a type to exist is to act as a "holder" of a resource such as a file or network socket.

In the Java world, the way that this is implemented is to use the JVM’s garbage collector as the subsystem that can definitively say that the object has died. If a type overrides finalize(), then all objects of that type receive special treatment from the garbage collector.
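To make this concrete, here is a minimal sketch of the kind of type being described: a resource holder that overrides finalize(). The class name LegacyResourceHolder and the use of a Runnable as the release action are assumptions made for illustration, and finalize() is deprecated in current JDKs, so this shows the legacy pattern rather than something to copy.

```java
// Hypothetical legacy-style holder; the class and method names are illustrative only.
final class LegacyResourceHolder {
    private final Runnable releaseAction;   // stands in for closing a file handle or socket
    private boolean released;

    LegacyResourceHolder(Runnable releaseAction) {
        this.releaseAction = releaseAction;
    }

    // Explicit, deterministic release. Safe to call more than once.
    synchronized void release() {
        if (!released) {
            released = true;
            releaseAction.run();
        }
    }

    synchronized boolean isReleased() {
        return released;
    }

    // Overriding finalize() is what causes instances to be registered for
    // finalization. Nothing guarantees when, or whether, the JVM calls it.
    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            release();
        } finally {
            super.finalize();
        }
    }

    public static void main(String[] args) {
        LegacyResourceHolder holder =
                new LegacyResourceHolder(() -> System.out.println("resource released"));
        holder.release();   // the only release the caller can rely on
    }
}
```

The explicit release() call is the only cleanup path with a predictable time of execution; the finalize() override is at best a safety net.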

NOTE

The JVM registers finalizable objects by running a special handler on them just after a successful return from Object.<init> (the ultimate superclass constructor for the specified type).

One detail of Hotspot that we need to be aware of is that the VM has some special, implementation-specific bytecodes in addition to the standard Java instructions. These specialist bytecodes are used to rewrite the standard ones in order to cope with certain special circumstances.

A complete list of the bytecode definitions, both standard Java and Hotspot special-case can be found here.

For our purposes, we care about one special-case instruction: return_register_finalizer. This is needed because it is possible for JVMTI to rewrite the bytecode for Object.<init>. To precisely obey the standard, and register the finalizer at the correct time, it is necessary to identify the point at which Object.<init> completes without the rewriting, and the special-case bytecode is used to mark this point.

The code for actually registering the object as needing finalization can be seen in the Hotspot sources. The file hotspot/src/cpu/x86/vm/c1_Runtime1_x86.cpp contains x86-specific runtime stubs (the c1_ prefix marks it as part of the C1 compiler's runtime support rather than the interpreter proper). This has to be processor-specific because Hotspot makes heavy use of low-level assembly / machine code. The case register_finalizer_id contains the registration code.

Once the object has been registered as needing finalization, then instead of being immediately reclaimed during the garbage collection (GC) cycle, the object undergoes the following extended lifecycle:

  1. Finalizable objects are recognised due to their prior registration. They are placed onto a special finalization queue.
  2. After application threads restart following GC, separate finalization threads drain the queue.
  3. Each object is removed from the queue and a secondary finalization thread is started to run the finalize() method for this instance.
  4. Once the finalize() method terminates, the object will be ready for actual collection in the next cycle.

Overall, this means that every object to be finalized must first be recognized as unreachable via a GC mark, then finalized, and then GC must run again before its memory can be reclaimed. Finalizable objects therefore persist for at least one extra GC cycle. In the case of objects that have become tenured, this can be a significant amount of time.
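The extra cycle can be illustrated with a deliberately simplified toy model. Nothing here corresponds to real Hotspot data structures: ToyHeap, its fields and its methods are all invented for this sketch, and objects are represented by plain names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Toy model only: none of these names correspond to real Hotspot structures.
final class ToyHeap {
    private final Set<String> live = new LinkedHashSet<>();        // objects still occupying memory
    private final Set<String> reachable = new LinkedHashSet<>();   // still referenced by the application
    private final Set<String> registered = new LinkedHashSet<>();  // registered as finalizable
    private final Queue<String> finalizationQueue = new ArrayDeque<>();
    final List<String> log = new ArrayList<>();

    void allocate(String name, boolean hasFinalizer) {
        live.add(name);
        reachable.add(name);
        if (hasFinalizer) {
            registered.add(name);
        }
    }

    void dropReferences(String name) {
        reachable.remove(name);
    }

    // One GC cycle: unreachable objects are reclaimed, except that a registered
    // object is queued for finalization instead and survives this cycle.
    void gcCycle() {
        for (String name : new ArrayList<>(live)) {
            if (reachable.contains(name)) {
                continue;
            }
            if (registered.remove(name)) {
                finalizationQueue.add(name);
                log.add("queued " + name);
            } else if (!finalizationQueue.contains(name)) {
                live.remove(name);
                log.add("reclaimed " + name);
            }
        }
    }

    // Stands in for the finalization threads draining the queue.
    void drainFinalizationQueue() {
        String name;
        while ((name = finalizationQueue.poll()) != null) {
            log.add("finalized " + name);
        }
    }

    boolean isLive(String name) {
        return live.contains(name);
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        heap.allocate("plain", false);
        heap.allocate("holder", true);
        heap.dropReferences("plain");
        heap.dropReferences("holder");
        heap.gcCycle();                  // "plain" is reclaimed, "holder" is only queued
        heap.drainFinalizationQueue();   // "holder" is finalized but still occupies memory
        heap.gcCycle();                  // only now is "holder" reclaimed
        heap.log.forEach(System.out::println);
    }
}
```

In this model the ordinary object is gone after the first cycle, while the finalizable one needs a queue drain and a second cycle, which mirrors the sequence described above.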

The mechanism has some extra complexity - more than we would like - as the queue-draining threads have to start secondary finalization threads that actually run the finalize() method. This is necessary to guard against the possibility that finalize() will block.

If finalize() were run on the queue-draining threads then a badly written finalize() could prevent the entire mechanism from working. To prevent this, we are forced to create a brand new thread for each object instance that requires finalization.

Not only that, but finalization threads must also ignore any exceptions that are thrown. This seems strange at first, but the finalization thread has no real way to handle the exception, and the original context that created the finalizable object is long gone. There is no meaningful way for any user code to be provided that could be aware of, or recover from the exception.

To clarify this, recall that an exception in Java provides a way to unwind the stack to find a method within the current execution thread that can recover from a non-fatal error. Seen in this light the restriction that finalization ignores exceptions is more understandable - the finalize() call happens on a totally different thread than the one that created or executed the object.
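The point about threads can be seen without involving finalization at all. The following sketch, with the hypothetical class name CrossThreadExceptionDemo, runs a failing task on a separate thread: the caller's catch block never sees the exception, which can only be observed through the worker thread's uncaught exception handler.

```java
import java.util.concurrent.atomic.AtomicReference;

final class CrossThreadExceptionDemo {

    // Runs the task on a separate thread and reports where its exception ended up.
    static String runOnOtherThread(Runnable task) {
        AtomicReference<Throwable> seenByHandler = new AtomicReference<>();
        Thread worker = new Thread(task, "worker");
        worker.setUncaughtExceptionHandler((thread, error) -> seenByHandler.set(error));
        try {
            worker.start();
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (RuntimeException e) {
            // Not reached for exceptions thrown inside the task:
            // they never travel to this thread's stack.
            return "caught by caller";
        }
        Throwable escaped = seenByHandler.get();
        return escaped == null
                ? "no exception"
                : "escaped on worker thread: " + escaped.getMessage();
    }

    public static void main(String[] args) {
        System.out.println(runOnOtherThread(() -> {
            throw new IllegalStateException("cleanup failed");
        }));
    }
}
```

A finalize() call is in the same position as the task here: whatever it throws unwinds the stack of the thread running it, and no application code is on that stack to handle it.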

The majority of the finalization implementation is actually written in Java. The JVM has separate threads to perform finalization, which run at the same time as application threads for the majority of the required work. The core functionality is contained in the class java.lang.ref.Finalizer, a package-private class that is fairly simple to read.

The Finalizer class also provides some insight into how classes that are granted additional privilege by the runtime are granted that privilege. For example, it contains code like this:

/* Invoked by VM */
static void register(Object finalizee) {
    new Finalizer(finalizee);
}

Of course, in regular application code, this code would be nonsensical, as it creates an unused object. Unless the constructor has side-effects (usually considered a bad design decision in Java), this would do nothing. In this case, the intent is to "hook" a new finalizable object.
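The "hook" idea can be sketched in ordinary Java. The class below is a hypothetical stand-in, not the JDK's Finalizer: its constructor has the side effect of adding the new instance to a static list, so the apparently discarded object stays reachable and the target is recorded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the pattern; this is not the JDK's Finalizer class.
final class RegistrationHook {
    private static final List<RegistrationHook> REGISTERED = new ArrayList<>();
    private final Object target;

    private RegistrationHook(Object target) {
        this.target = target;
        synchronized (REGISTERED) {
            REGISTERED.add(this);   // side effect: the new hook stays reachable
        }
    }

    // Mirrors the shape of register(): the result of "new" is deliberately discarded.
    static void register(Object target) {
        new RegistrationHook(target);
    }

    static int registeredCount() {
        synchronized (REGISTERED) {
            return REGISTERED.size();
        }
    }

    static boolean isRegistered(Object target) {
        synchronized (REGISTERED) {
            for (RegistrationHook hook : REGISTERED) {
                if (hook.target == target) {
                    return true;
                }
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Object resource = new Object();
        register(resource);
        System.out.println("registered: " + isRegistered(resource));
    }
}
```

The design choice is the same one the article calls out as unusual: all of the useful work happens as a side effect of construction.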

The implementation of finalization also relies heavily on the FinalReference class. This is a subclass of java.lang.ref.Reference, a class that the runtime and VM handle specially. Like the more well-known soft and weak references, FinalReference objects get special treatment by the GC subsystem, comprising a mechanism that provides an interesting interaction between the VM and Java code (both platform and user).

The Downside

For all its technical interest, the Java finalization implementation is fatally flawed, due to a mismatch with the memory management scheme of the platform.

In the C++ case, memory is handled manually, with explicit lifetime management of objects under the explicit control of the programmer. This means that destruction can happen as the object is deleted, and so the acquisition and release of resources is directly tied to the lifetime of the object.

Java’s memory management subsystem is a garbage collector that runs as-needed, in response to running out of available memory to allocate. It therefore runs at non-deterministic intervals (if at all) and so the finalize() method is run only when the object is collected, at some unknown time.

If the finalize() mechanism is used to automatically release resources (such as file handles), then there is no guarantee as to when (if ever) those resources will actually become available. This makes the finalize() mechanism fundamentally unsuitable for its stated purpose - automatic resource management.

Try-with-resources

To safely handle resource-owning objects, Java 7 introduced try-with-resources - a new syntax feature especially designed for handling resources automatically. This language-level construct allows a resource that is to be managed to be specified in parentheses following the try keyword.

In Java 7, this must be a resource declaration, that is, a local variable declaration with an initializer - arbitrary Java statements are not permitted. The Java compiler will also check that the declared type implements the AutoCloseable interface (introduced in Java 7 specifically for this purpose, as a superinterface of Closeable).

The resource objects are thus in scope for the body of the try block, and at the end of the scope of the try block the close() method is called automatically, rather than making the developer remember to call the function. The invocation of the close() method behaves as if it were in a finally, and so is run even when an exception is thrown in the business logic.
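A small sketch shows this behaviour. The names TwrCloseDemo and Tracked are invented for the example; Tracked simply records what happens to it, so that the order of events is visible both when the body completes normally and when it throws.

```java
import java.util.ArrayList;
import java.util.List;

final class TwrCloseDemo {

    // Hypothetical resource that records its own lifecycle events.
    static final class Tracked implements AutoCloseable {
        private final List<String> events;

        Tracked(List<String> events) {
            this.events = events;
            events.add("open");
        }

        @Override
        public void close() {
            events.add("close");
        }
    }

    // Returns the sequence of events, with or without a failure in the body.
    static List<String> run(boolean fail) {
        List<String> events = new ArrayList<>();
        try (Tracked resource = new Tracked(events)) {
            events.add("body");
            if (fail) {
                throw new IllegalStateException("business logic failed");
            }
        } catch (IllegalStateException e) {
            events.add("caught");   // runs after the resource has been closed
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

In both cases close() is called without the body mentioning it, and in the failing case it runs before the catch clause sees the exception.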

NOTE

The automatic cleanup is often more reliable than hand-written code, because javac closes resources in the reverse of the order in which they were declared, which is the right order when there are dependencies between them (such as JDBC Connection and related types). This makes try-with-resources preferable to the old manual close.
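The closing order can be checked with a short sketch. The resources here are hypothetical Named objects standing in for a connection and a statement that depends on it; no real JDBC types are involved.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderDemo {

    // Hypothetical resource that logs when it is opened and closed.
    static final class Named implements AutoCloseable {
        private final String name;
        private final List<String> events;

        Named(String name, List<String> events) {
            this.name = name;
            this.events = events;
            events.add("open " + name);
        }

        @Override
        public void close() {
            events.add("close " + name);
        }
    }

    static List<String> run() {
        List<String> events = new ArrayList<>();
        // Two resources in one statement: the second depends on the first.
        try (Named connection = new Named("connection", events);
             Named statement = new Named("statement", events)) {
            events.add("use statement");
        }
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The dependent resource, declared last, is closed first, so the statement never outlives the connection it relies on.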

The key point is that the lifetime of the local variable is now constrained to a single scope, so the automatic cleanup becomes tied to a scope and not to object lifetime. For example:

public void readFirstLine(File file) throws IOException {
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
        String firstLine = reader.readLine();
        System.out.println(firstLine);
    }
}

This innocent try-with-resources code is compiled into a fairly large amount of bytecode, which we can see by using the -c switch to javap to dump a disassembled form (adding -p also includes private members):

public void readFirstLine(java.io.File) throws java.io.IOException;
    Code:
       0: new           #2  // class java/io/BufferedReader
       3: dup
       4: new           #3  // class java/io/FileReader
       7: dup
       8: aload_1
       9: invokespecial #4  // Method java/io/FileReader."<init>":(Ljava/io/File;)V
      12: invokespecial #5  // Method java/io/BufferedReader."<init>":(Ljava/io/Reader;)V
      15: astore_2
      16: aconst_null
      17: astore_3
      18: aload_2
      19: invokevirtual #6  // Method java/io/BufferedReader.readLine:()Ljava/lang/String;
      22: astore        4
      24: getstatic     #7  // Field java/lang/System.out:Ljava/io/PrintStream;
      27: aload         4
      29: invokevirtual #8  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      32: aload_2
      33: ifnull        108
      36: aload_3
      37: ifnull        58
      40: aload_2
      41: invokevirtual #9  // Method java/io/BufferedReader.close:()V
      44: goto          108
      47: astore        4
      49: aload_3
      50: aload         4
      52: invokevirtual #11 // Method java/lang/Throwable.addSuppressed:(Ljava/lang/Throwable;)V
      55: goto          108
      58: aload_2
      59: invokevirtual #9  // Method java/io/BufferedReader.close:()V
      62: goto          108
      65: astore        4
      67: aload         4
      69: astore_3
      70: aload         4
      72: athrow
      73: astore        5
      75: aload_2
      76: ifnull        105
      79: aload_3
      80: ifnull        101
      83: aload_2
      84: invokevirtual #9  // Method java/io/BufferedReader.close:()V
      87: goto          105
      90: astore        6
      92: aload_3
      93: aload         6
      95: invokevirtual #11 // Method java/lang/Throwable.addSuppressed:(Ljava/lang/Throwable;)V
      98: goto          105
     101: aload_2
     102: invokevirtual #9  // Method java/io/BufferedReader.close:()V
     105: aload         5
     107: athrow
     108: return
    Exception table:
       from    to  target type
          40    44    47   Class java/lang/Throwable
          18    32    65   Class java/lang/Throwable
          18    32    73   any
          83    87    90   Class java/lang/Throwable
          65    75    73   any
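Read back into Java, the listing above corresponds roughly to the following hand-written expansion. This is a sketch based on the translation described in the Java Language Specification, not the output of any tool: the class ExpandedTwr, the FailingReader test double and the describe helper are all invented here, and the method returns the line instead of printing it so that the result is easy to inspect.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ExpandedTwr {

    // Hand-written equivalent of:
    //   try (BufferedReader reader = new BufferedReader(source)) { return reader.readLine(); }
    static String readFirstLine(Reader source) throws IOException {
        final BufferedReader reader = new BufferedReader(source);  // local variable 2 in the listing
        Throwable primary = null;                                  // local variable 3 in the listing
        try {
            return reader.readLine();
        } catch (Throwable t) {
            primary = t;
            throw t;   // precise rethrow: still only IOException or an unchecked type
        } finally {
            if (reader != null) {
                if (primary != null) {
                    try {
                        reader.close();
                    } catch (Throwable suppressed) {
                        primary.addSuppressed(suppressed);   // keep the original failure
                    }
                } else {
                    reader.close();
                }
            }
        }
    }

    // Test double (hypothetical): fails on read and again on close.
    static final class FailingReader extends Reader {
        @Override
        public int read(char[] buffer, int offset, int length) throws IOException {
            throw new IOException("read failed");
        }

        @Override
        public void close() throws IOException {
            throw new IOException("close failed");
        }
    }

    // Returns the first line, or a description of the failure and its suppressed exception.
    static String describe(Reader source) {
        try {
            return readFirstLine(source);
        } catch (IOException e) {
            Throwable[] suppressed = e.getSuppressed();
            return e.getMessage() + ", suppressed: "
                    + (suppressed.length == 0 ? "none" : suppressed[0].getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new StringReader("first\nsecond")));
        System.out.println(describe(new FailingReader()));
    }
}
```

The two near-identical close sequences in the bytecode are the normal and exceptional exits of this finally block, and the addSuppressed calls are what preserve the original exception when close() fails as well.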

Despite having the same design intent, finalization and try-with-resources are radically different from each other. Finalization relies on assembly code deep in the VM to register objects for finalization, and uses the garbage collector to kick off the cleanup using a reference queue and separate dedicated finalization threads. In particular, there is little if any trace of the mechanism in the bytecode, and the capability is provided by special mechanisms within the VM.

By contrast, try-with-resources is a purely compile-time mechanism that can be seen as syntactic sugar that simply produces regular bytecode and has no other special runtime behavior. The only possible visible effect of try-with-resources is that as it emits a large amount of automatic bytecode, it may impact the ability of the JIT compiler to effectively inline or compile methods that use it. This is not a reason to avoid it, however.

In summary, for resource management and in almost all other cases, finalization is not fit for purpose. Finalization depends on GC, which is itself a non-deterministic process, so anything relying on finalization has no guarantee as to when the resource will be released.

Whether or not the deprecation of finalization eventually leads to its removal, the advice remains the same: never write classes that override finalize(), and always refactor any classes you find in your own code that do.

The try-with-resources mechanism is the recommended best practice for implementing something similar to the C++ RAII pattern. It does limit the use of the pattern to block-scoped code, but this is due to the Java platform’s lack of low-level visibility into object lifetime. The Java developer must simply exercise discipline when dealing with resource objects and scope them as tightly as possible - which is in itself a good design practice.

About the Author

Ben Evans is co-founder of jClarity, a startup which delivers performance tools & services to help development & ops teams. He is an organizer for the LJC (London's JUG) and a member of the JCP Executive Committee, helping define standards for the Java ecosystem. He is a Java Champion; 3-time JavaOne Rockstar Speaker; co-author of "The Well-Grounded Java Developer" & the new edition of "Java in a Nutshell" and a regular speaker on the Java platform, performance, concurrency, and related topics. Ben is available for speaking, teaching, writing and consultancy engagements - please contact for details.

Community comments

  • Cleaner to the rescue?

    by Kirk Pepperdine,

    To state that finalize() is unsuitable for its intended purpose is a bit of a misdirection IMHO. finalize() is often maligned because developers don't understand the intent and they see finalize() as an unnecessary cost. Quite often that is the case but there are cases where it is perfect for what you need. For example, if you don't completely understand the lifecycle of the object you are dealing with. Given that GC has a global perspective, it is the perfect mechanism to solve this visibility issue.

    All of the issues you've brought up w.r.t. FinalReference apply to *all* Reference types as they all require some sort of special treatment by the run time and the garbage collector. Thus if you need to process Phantom references you need to use guard threads or you risk deadlocks and thread death due to uncaught exceptions. If you supply a ReferenceQueue to soft or weak references, same issues. The advantage of FinalReference and Final is that all of these things are baked in so you don't have to worry about them.

    The key issue is when FinalReference is used when you know the life-cycle and you don't need finalize() as you can call close yourself and have all of the JNI resources returned to the system. At that point finalize() *still* needs to be run in order to clear the object out of memory (extra state bits in the header). As you've stated, this requires two GC cycles in the memory pool that the finalizable object resides in. Cleaner is supposed to replace finalize(). It is an extension of PhantomReference and will come with the property that it can be canceled. Thus it won't need to go through the finalize() workflow. Will this be a win? IMO, this sucks more oxygen out of the room than the problems it solves, but so be it.

  • Re: Cleaner to the rescue?

    by Ben Evans,

    Thanks for the GC expert's perspective, Kirk. What you're describing is true, but only from a certain perspective, and doesn't apply to the classic case of resource management - that RAII is trying to address.

    There's also the question of what a Java developer should regard as an object's lifetime. This is often a secondary consideration for many programmers, partly because automatic GC allows developers to ignore it to a greater extent than is possible in C++.

    In fact, you could argue that one of the major points of GC is to reduce the need for Java programmers to be concerned with the specifics of object lifetime, and to regard an object's lifetime as over at the point where it is no longer referenced.

  • Re: Cleaner to the rescue?

    by Kirk Pepperdine,

    I think it does apply to the case of resource management in that if you understand the life-cycle of the resource and you properly recover it, there should be no need to finalize and hence finalization should be canceled. If you don't understand the life-cycle or you forget to recover the resource, then finalization can be used to negate or, at the very least, minimize the effects of leaking the resource. In particular, java.io tries to minimize the leaking of file handles (and associated native memory) with the use of finalize(). The unfortunate thing is, even if you call close() yourself, you still take on the expense of finalize(). Cleaner is designed to "fix" this.
