
Easily Create Java Agents with Byte Buddy


A Java agent is a Java program that executes just prior to the start of another Java application (the “target” application), affording that agent the opportunity to modify the target application, or the environment in which it runs. In this article we will start with the basics, and crescendo to an advanced agent implementation using the bytecode manipulation tool Byte Buddy.

In the most basic use case, a Java agent sets application properties or configures a certain environment state, enabling the agent to serve as a reusable and pluggable component. The following example describes such an agent, which sets a system property that becomes available to the actual program:

public class Agent {
  public static void premain(String arg) {
    System.setProperty("my-property", "foo");
  }
}

As demonstrated by the above code, a Java agent is defined like any other Java program, except that premain replaces the main method as the entry point. As the name suggests, this method is executed before the main method of the target application. There are no rules specific to writing an agent beyond the standard rules that apply to any other Java program. One small difference is that a Java agent receives a single, optional argument instead of an array of zero or more arguments.

To launch your agent, you must bundle the agent classes and resources in a jar, and in the jar manifest set the Premain-Class attribute to the name of your agent class containing the premain method. (An agent must always be bundled as a jar file; it cannot be specified in an exploded format.) Next, you must launch the application by referencing the jar file's location via the javaagent parameter on the command line:

java -javaagent:myAgent.jar -jar myProgram.jar

You can also append optional agent arguments to this location path, separated by an equals sign. The following command starts a Java program and attaches the given agent, providing the value myOptions as the argument to the premain method:

java -javaagent:myAgent.jar=myOptions -jar myProgram.jar
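Because the agent receives everything after the equals sign as one opaque string, an agent that needs several settings has to parse that string itself. The following is a minimal sketch of such a parser. The class name AgentOptions and the comma-separated key=value format are assumptions made for illustration; the JVM does not prescribe any format for the argument.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: splits an agent argument such as "user=ADMIN,verbose"
// into key/value pairs. The format is an assumption of this sketch.
final class AgentOptions {

  private AgentOptions() { }

  static Map<String, String> parse(String arg) {
    Map<String, String> options = new LinkedHashMap<>();
    if (arg == null || arg.trim().isEmpty()) {
      return options; // no argument was given on the command line
    }
    for (String pair : arg.split(",")) {
      if (pair.trim().isEmpty()) {
        continue; // tolerate stray commas
      }
      int separator = pair.indexOf('=');
      if (separator < 0) {
        options.put(pair.trim(), "true"); // bare flag, for example "verbose"
      } else {
        options.put(pair.substring(0, separator).trim(),
                    pair.substring(separator + 1).trim());
      }
    }
    return options;
  }
}
```

A premain method could then call AgentOptions.parse(arg) as its first statement and read individual settings from the returned map.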

It is possible to attach multiple agents by repeating the -javaagent option.

A Java agent is capable of much more than just altering the state of an application's environment, however; a Java agent can be granted access to the Java instrumentation API, allowing the agent to modify the code of the target application. This little known feature of the Java virtual machine offers a powerful tool that facilitates the implementation of aspect-oriented programming.

Such modifications of a Java program are applied by adding a second parameter of type Instrumentation to the agent's premain method. The Instrumentation parameter can be used to perform a range of tasks, from estimating an object's size in bytes to actually modifying class implementations by registering ClassFileTransformers. After it is registered, a ClassFileTransformer is invoked by any class loader upon loading a class. When invoked, a class file transformer has the opportunity to transform or even fully replace any class file before the represented class is loaded. In this way, it is possible to enhance or modify a class's behavior before it is put to use, as the following example shows:

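import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class Agent {
  public static void premain(String argument, Instrumentation inst) {
    inst.addTransformer(new ClassFileTransformer() {
      @Override
      public byte[] transform(
          ClassLoader loader,
          String className,
          Class<?> classBeingRedefined, // null if class was not previously loaded
          ProtectionDomain protectionDomain,
          byte[] classFileBuffer) {
        // Return the transformed class file here.
        // Returning null leaves the class unchanged.
        return null;
      }
    });
  }
}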

After registering the above ClassFileTransformer with an Instrumentation instance, the transformer is invoked every time a class is loaded. For this purpose, the transformer receives a binary representation of the class file and a reference to the class loader that is attempting to load this class.
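To make the callback more concrete, here is a small sketch of a transformer that only observes class loading and never changes any bytecode. The class name LoadRecordingTransformer and its seenClassNames method are hypothetical; the transform signature itself is the one defined by java.lang.instrument.ClassFileTransformer.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical transformer that records class names without modifying classes.
final class LoadRecordingTransformer implements ClassFileTransformer {

  // Class loading can happen on several threads, hence a thread-safe list.
  private final List<String> seen = new CopyOnWriteArrayList<>();

  @Override
  public byte[] transform(ClassLoader loader,
                          String className,
                          Class<?> classBeingRedefined,
                          ProtectionDomain protectionDomain,
                          byte[] classFileBuffer) {
    if (className != null) {
      seen.add(className); // internal form, for example "java/util/ArrayList"
    }
    return null; // null tells the JVM to keep the class file unchanged
  }

  List<String> seenClassNames() {
    return seen;
  }
}
```

Inside premain, such a transformer would be registered with inst.addTransformer(new LoadRecordingTransformer()). Note that class names arrive in their internal form, using slashes rather than dots.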

A Java agent can also be registered during the runtime of a Java application. In this case, the instrumentation API allows for the redefinition of already loaded classes, a feature that is known as “HotSwap”. Unfortunately, redefining loaded classes is limited to replacing method bodies. No members may be added or removed, and no types or signatures may change when redefining a class. This limitation does not apply when a class is loaded for the first time, and in those cases the classBeingRedefined parameter is set to null.

Java bytecode and the class file format

A class file represents a Java class in its compiled state. A class file contains the bytecode representation of program instructions originally coded as Java source code. Java bytecode can be considered to be the language of the Java virtual machine. In fact, the JVM does not have a notion of Java as a programming language, but exclusively processes bytecode. As a result of the binary representation, bytecode consumes less space than a program’s source code. Also, representing a program as bytecode allows for easier compilation of languages other than Java, for example Scala or Clojure, to run on the JVM. Without bytecode as an intermediate language, it would have been necessary to translate any program into Java source code before running it.
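As a small illustration of the binary format, every class file begins with the magic number 0xCAFEBABE, followed by the minor and major version of the class file format. The sketch below reads just this header from a byte array such as the classFileBuffer handed to a ClassFileTransformer. The class name ClassFileHeader is hypothetical, and everything after the header (constant pool, members, attributes) is ignored.

```java
import java.nio.ByteBuffer;

// Hypothetical reader for the first eight bytes of a class file.
final class ClassFileHeader {

  static final int MAGIC = 0xCAFEBABE;

  final int minorVersion;
  final int majorVersion;

  private ClassFileHeader(int minorVersion, int majorVersion) {
    this.minorVersion = minorVersion;
    this.majorVersion = majorVersion;
  }

  static ClassFileHeader parse(byte[] classFile) {
    // ByteBuffer is big-endian by default, matching the class file format.
    ByteBuffer buffer = ByteBuffer.wrap(classFile);
    if (buffer.getInt() != MAGIC) {
      throw new IllegalArgumentException("Not a class file");
    }
    int minor = buffer.getShort() & 0xFFFF; // unsigned 16-bit value
    int major = buffer.getShort() & 0xFFFF; // for example 52 for Java 8
    return new ClassFileHeader(minor, major);
  }
}
```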

In the context of code manipulation, this abstraction does however come at a price. At the time of applying a ClassFileTransformer to a Java class, this class can no longer be processed as Java source code, even assuming that the transformed code had been written in Java in the first place. To make things worse, at the time of transforming the class, the reflection API for introspecting the class's members or annotations is also off limits as the access to the API would require the class to already be loaded, which would not happen before the transformation process completed.

Fortunately, Java bytecode is a comparatively simple abstraction with a relatively small number of operations, and can generally be learned with little effort. The Java virtual machine executes a program by processing values as a stack machine. A bytecode instruction typically indicates to the virtual machine that it should pop values from an operand stack, perform some operations, and push a result back on the stack.

Let’s consider a simple example: adding the numbers one and two. Both numbers are first pushed onto the operand stack by the JVM, by executing the bytecode instructions iconst_1 and iconst_2. iconst_1 is a one byte convenience operator that pushes the number one onto the stack. Similarly iconst_2 pushes the number two onto the stack. Subsequently, executing the iadd instruction pops the two latest values off the stack and pushes back the sum of those numbers. Within a class file, each instruction is not stored by its mnemonic name but rather as a single byte that uniquely identifies a specific instruction, thus the term bytecode. The above bytecode instructions and their impact on the operand stack are visualized in the picture below.
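The same evaluation can be mimicked in a few lines of plain Java. The following sketch is a toy interpreter that understands only the three instructions from the example. The class name TinyStackMachine is hypothetical, the opcode values are the ones assigned by the JVM specification, and a real virtual machine is of course far more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreter for three bytecode instructions; an illustration only.
final class TinyStackMachine {

  // Opcode values as defined by the JVM specification.
  static final int ICONST_1 = 0x04;
  static final int ICONST_2 = 0x05;
  static final int IADD = 0x60;

  private TinyStackMachine() { }

  static int run(int... code) {
    Deque<Integer> operandStack = new ArrayDeque<>();
    for (int opcode : code) {
      switch (opcode) {
        case ICONST_1:
          operandStack.push(1);
          break;
        case ICONST_2:
          operandStack.push(2);
          break;
        case IADD: {
          int right = operandStack.pop();
          int left = operandStack.pop();
          operandStack.push(left + right); // replace both operands by their sum
          break;
        }
        default:
          throw new IllegalArgumentException("Unsupported opcode: " + opcode);
      }
    }
    return operandStack.pop(); // the value left on top of the stack
  }
}
```

Calling TinyStackMachine.run(ICONST_1, ICONST_2, IADD) pushes one and two, pops both for the addition, and leaves three on the stack.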

Fortunately for humans, who do better with source code than with bytecode, the Java community has created several libraries that parse class files and expose the compact bytecode as a stream of named instructions. For example, the popular ASM library offers a simple visitor API that dissects a class file into members and method instructions, operating in a similar fashion to a SAX parser for reading XML files. Using ASM, the bytecode for the above example can be implemented as demonstrated by the following code (where the visitInsn calls are ASM’s way of providing the revised method implementation):

MethodVisitor methodVisitor = ...
methodVisitor.visitInsn(Opcodes.ICONST_1);
methodVisitor.visitInsn(Opcodes.ICONST_2);
methodVisitor.visitInsn(Opcodes.IADD);

It should be noted that the bytecode specification merely serves as a metaphor, and a Java virtual machine is allowed to translate a program into optimized machine code as long as the program's outcome remains correct. Thanks to the simplicity of bytecode, it is straightforward to replace or to modify the instructions within an existing class. Therefore using ASM and understanding the fundamentals of Java bytecode already suffices for implementing a class-transforming Java agent by registering a ClassFileTransformer that processes its arguments using this library.

Overcoming the bytecode metaphor

For a practical application, parsing a raw class file still implies a lot of manual work. Java programmers are often interested in a class in the context of its type hierarchy. For example, a Java agent might be required to modify any class that implements a given interface. To determine information about a class’s super types, it no longer suffices to parse the class file that is provided by a ClassFileTransformer, which only contains the names of the direct super type and interfaces. A programmer would still be required to locate the class files for these types in order to resolve a potential super type relationship.

Another difficulty is that making direct use of ASM in a project requires any developer on a team to learn about the fundamentals of Java bytecode. In practice, this often leads to the exclusion of many developers from changing any code that is concerned with bytecode manipulation. In such a case, implementing a Java agent imposes a threat to a project’s long-term maintainability.

To overcome these problems, it is desirable to implement a Java agent using a higher-level abstraction than direct manipulation of Java bytecode. Byte Buddy is an open-source, Apache 2.0-licensed library that addresses the complexity of bytecode manipulation and the instrumentation API. Byte Buddy’s declared goal is to hide explicit bytecode generation behind a type-safe domain-specific language. Using Byte Buddy, bytecode manipulation hopefully becomes intuitive to anybody who is familiar with the Java programming language.

Introducing Byte Buddy

Byte Buddy is not exclusively dedicated to the generation of Java agents.  It offers an API for the generation of arbitrary Java classes, and on top of this class generation API, Byte Buddy offers an additional API for generating Java agents.

For an easy introduction to Byte Buddy, the following example demonstrates the generation of a simple class that subclasses Object and overrides the toString method to return “Hello World!”. The call to “intercept” instructs Byte Buddy to implement every matched method using the supplied implementation:

Class<?> dynamicType = new ByteBuddy()
  .subclass(Object.class)
  .method(ElementMatchers.named("toString"))
  .intercept(FixedValue.value("Hello World!"))
  .make()
  .load(getClass().getClassLoader(),          
        ClassLoadingStrategy.Default.WRAPPER)
  .getLoaded();

Looking at the code above, we see that Byte Buddy implements a method in two steps. First, a programmer needs to specify an ElementMatcher that is responsible for identifying one or several methods to implement. Byte Buddy offers a rich set of predefined matchers that are exposed in the ElementMatchers class. In the above case, the toString method is matched by its exact name, but we could also match on more complex code structures such as types or annotations. Second, the programmer passes an Implementation to the intercept method, which defines the code of every matched method.

Whenever Byte Buddy generates a class, it analyzes the class hierarchy of the generated type. In the above example, Byte Buddy determines that the generated class inherits a single method named toString from its super class Object, where the specified matcher instructs Byte Buddy to override that method by the subsequent Implementation instance, FixedValue in our example.

When creating a subclass, Byte Buddy always intercepts a matched method by overriding the method in the generated class. However, we will see later in this article that Byte Buddy is also capable of redefining existing classes without subclassing. In such cases, Byte Buddy replaces an existing method with generated code, while copying the original code into another, synthetic method.

In our example code above, the matched method is overridden with an implementation that returns the fixed value “Hello World!”. The intercept method accepts an argument of type Implementation and Byte Buddy ships with several predefined implementations such as the selected FixedValue class. If required however, it is possible to implement a method as custom bytecode using the ASM API discussed above, on top of which Byte Buddy is itself implemented.

After defining the properties of a class, it is generated by the make method. In the example application, the generated class is given a random name as no name was specified by the user. Finally, the generated class is loaded using a ClassLoadingStrategy. Using the above default WRAPPER strategy, a class is loaded by a new class loader which has the environment’s class loader as a parent.

After a class is loaded, it is accessible using the Java reflection API. If not specified differently, Byte Buddy generates constructors similar to those of the superclass such that a default constructor is available for the generated class. Consequently, it is possible to validate that the generated class has overridden the toString method as demonstrated by the following code:

assertThat(dynamicType.newInstance().toString(), 
           is("Hello World!"));

Of course, this generated class is not of much practical use. For a real-world application, the return value of most methods is computed at runtime and depends on method arguments and object state.

Instrumentation by delegation

A more flexible way of implementing a method is the use of Byte Buddy's MethodDelegation. Using method delegation, it is possible to generate an overridden implementation that will invoke another method of a given class or instance. This way, it is possible to rewrite the previous example using the following delegator:

class ToStringInterceptor {
  static String intercept() {
    return "Hello World!";
  }
}

With the above POJO interceptor, it becomes possible to replace the previous FixedValue implementation with MethodDelegation.to(ToStringInterceptor.class):

Class<?> dynamicType = new ByteBuddy()
  .subclass(Object.class)
  .method(ElementMatchers.named("toString"))
  .intercept(MethodDelegation.to(ToStringInterceptor.class))
  .make()
  .load(getClass().getClassLoader(),          
        ClassLoadingStrategy.Default.WRAPPER)
  .getLoaded();

Using this delegator, Byte Buddy determines a best invokable method from the interception target provided to the to method. In the case of ToStringInterceptor.class, the selection process trivially resolves to the only static method that is declared by this type. In this case, only static methods are considered because a class was specified as the target of the delegation. In contrast, it is possible to delegate to an instance of a class, in which case Byte Buddy considers all virtual methods. If several such methods are available on a class or instance, Byte Buddy first eliminates all methods that are not compatible for a specific instrumentation. Among the remaining methods, the library then chooses a best match, typically the method with the most parameters. It is also possible to choose a target method explicitly, by narrowing down the eligible methods by handing an ElementMatcher to the MethodDelegation by invoking the filter method. For example, by adding the following filter, Byte Buddy only considers methods named “intercept” as a delegation target:

MethodDelegation.to(ToStringInterceptor.class)
                .filter(ElementMatchers.named("intercept"))

After this change, the intercepted method still returns “Hello World!”, but this time the result is computed dynamically, so that, for example, it is possible to set a breakpoint in the interceptor method that is triggered every time toString is called on the generated class.

The full power of the MethodDelegation is unleashed when specifying parameters for the interceptor method. A parameter is typically annotated for instructing Byte Buddy to inject a value when calling the interceptor. For example, using the @Origin annotation, Byte Buddy provides an instance of the instrumented Method as an instance of the class provided by the Java reflection API:

class ContextualToStringInterceptor {
  static String intercept(@Origin Method m) {
    return "Hello World from " + m.getName() + "!";
  }
}

When intercepting the toString method, the invocation is now instrumented to return “Hello World from toString!”.

In addition to the @Origin annotation, Byte Buddy offers a rich set of annotations. For example, using the @SuperCall annotation on a parameter of type Callable, Byte Buddy creates and injects a proxy instance that allows invocation of the instrumented method’s original code. If the provided annotations are insufficient or impractical for a specific use case, it is even possible to register custom annotations that inject a user-specified value.

Implementing method-level security

As we saw, it is possible to use a MethodDelegation to dynamically override a method at runtime using plain Java. That was a simple example, but the technique can be used to implement more practical applications. In the remainder of this article we will develop an example that uses code generation to implement an annotation-driven library for enforcing method-level security. In our first iteration, the library will generate subclasses to enforce this security. Then we will use the same approach to implement a Java agent that does the same.

The example library uses the following annotation to allow a user to specify that a method is considered to be secured:

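import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
@Retention(RetentionPolicy.RUNTIME) // needed so the annotation is visible to reflection
@interface Secured {
  String user();
}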

For example, consider an application that uses the Service class below to perform a sensitive action that should only be performed if the user is authenticated as an administrator. This is specified by declaring the Secured annotation on the method for executing this action.

class Service {
  @Secured(user = "ADMIN")
  void doSensitiveAction() {
    // run sensitive code...
  }
}

It is of course possible to write the security check directly into the method. In practice, hard-coding cross-cutting concerns frequently results in copy-pasted logic that is hard to maintain. Furthermore, directly adding such code does not scale well once an application reveals additional requirements, such as logging, collecting invocation metrics, or result caching. By extracting such functionality into an agent, a method purely represents its business logic, making it easier to read, test and to maintain a code base.

To keep the proposed library simple, the contract of the annotation declares that an IllegalStateException should be thrown if the current user is not the one that is specified by the annotation’s user property. Using Byte Buddy, this behavior can be implemented by a simple interceptor such as the SecurityInterceptor in the following example, which also keeps track of the currently logged-in user in its static user field:

class SecurityInterceptor {

  static String user = "ANONYMOUS";

  static void intercept(@Origin Method method) {
    if (!method.getAnnotation(Secured.class).user().equals(user)) {
      throw new IllegalStateException("Wrong user");
    }
  }
}

As we can see in the above code, the interceptor does not invoke the original method, even if access is granted to a given user. To overcome this, many of the predefined method Implementations in Byte Buddy can be chained. Using the andThen method of the MethodDelegation class, the above security check can be prepended to a plain invocation of the original method, as we will see below. Because a failed security check throws an exception and prevents any further execution, the call to the original method is not performed if the user is not authenticated.

Putting these pieces together, it is now possible to generate an appropriate subclass of Service where all annotated methods are secured appropriately. As the generated class is a subclass of Service, the generated class can be used as a substitute for all variables of the Service type, without a type casting, and will throw an exception when invoking the doSensitiveAction method without proper authentication:

new ByteBuddy()
  .subclass(Service.class)
  .method(ElementMatchers.isAnnotatedWith(Secured.class))
  .intercept(MethodDelegation.to(SecurityInterceptor.class)
                             .andThen(SuperMethodCall.INSTANCE))
  .make()
  .load(getClass().getClassLoader(),
        ClassLoadingStrategy.Default.WRAPPER)
  .getLoaded()
  .newInstance()
  .doSensitiveAction();
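To build an intuition for what such a generated subclass does, it can help to write the equivalent class by hand. The following sketch uses no Byte Buddy at all: the SecuredService class is a hand-written stand-in for the generated subclass, the executed flag is a hypothetical addition that makes the effect observable, and all types are nested in a wrapper class named HandWrittenSecurity only to keep the example self-contained. The byte code that Byte Buddy actually generates will differ in its details.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Hand-written approximation of the generated subclass; not Byte Buddy output.
final class HandWrittenSecurity {

  @Retention(RetentionPolicy.RUNTIME) // required for reflective access
  @interface Secured {
    String user();
  }

  static class Service {
    boolean executed; // hypothetical flag, only here to observe the call

    @Secured(user = "ADMIN")
    void doSensitiveAction() {
      executed = true;
    }
  }

  static class SecurityInterceptor {
    static String user = "ANONYMOUS";

    static void intercept(Method method) {
      if (!method.getAnnotation(Secured.class).user().equals(user)) {
        throw new IllegalStateException("Wrong user");
      }
    }
  }

  static class SecuredService extends Service {
    @Override
    void doSensitiveAction() {
      Method origin;
      try {
        // Plays the role of the value injected for the @Origin parameter.
        origin = Service.class.getDeclaredMethod("doSensitiveAction");
      } catch (NoSuchMethodException e) {
        throw new AssertionError(e);
      }
      SecurityInterceptor.intercept(origin); // the delegation step
      super.doSensitiveAction();             // the SuperMethodCall step
    }
  }
}
```

If the check throws, the super call is never reached, which is the ordering that the andThen chaining expresses.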

Unfortunately, because the instrumented subclasses are only created at runtime, it is not possible to create such instances without using Java reflection. Therefore any instance of an instrumented class should be created by a factory that encapsulates the complexity of creating a subclass for the purpose of instrumentation. As a result, subclass instrumentation is commonly used by frameworks that already require the creation of instances by factories, for example frameworks for dependency-injection like Spring or object-relational mapping like Hibernate. For other types of applications, subclass instrumentation is often too complex to realize.

A Java agent for security

Using a Java agent, an alternative implementation for the above security framework would be to modify the original bytecode of a class such as the above Service rather than overriding it. By doing so it would no longer be necessary to create managed instances; simply calling

new Service().doSensitiveAction()

would already throw our exception when the appropriate user is not authenticated. To support this approach of modifying a class, Byte Buddy offers a concept that is called rebasing a class. When a class is rebased, no subclass is created but instead the instrumented code is merged into the instrumented class to change its behavior. With this approach the original code of any method of the instrumented class is still accessible after instrumenting it, so that instrumentations like SuperMethodCall work exactly the same way as when creating a subclass.

Thanks to the similar behavior when either subclassing or rebasing, the APIs for both operations are used in the same way, by describing a type using the same DynamicType.Builder interface. Both forms of instrumentation are accessible via the ByteBuddy class. To make the definition of a Java agent more convenient, Byte Buddy also offers the AgentBuilder class, which is dedicated to solving common use cases in a concise manner. In order to define a Java agent for method-level security, the definition of the following class as the agent’s entry point suffices:

class SecurityAgent {
  public static void premain(String arg, Instrumentation inst) {
    new AgentBuilder.Default()
      .type(ElementMatchers.any())
      .transform((builder, type) -> builder
        .method(ElementMatchers.isAnnotatedWith(Secured.class))
        .intercept(MethodDelegation.to(SecurityInterceptor.class)
                                   .andThen(SuperMethodCall.INSTANCE)))
      .installOn(inst);
  }
}

If this agent is bundled in a jar file and specified on the command line, every loaded type is “transformed”, or redefined, so that any method carrying the Secured annotation is secured. Without the Java agent, the application runs without the additional security checks. This also benefits unit tests: the code of an annotated method can be invoked without setting up a mocked security context. As the Java runtime ignores annotation types that cannot be found on the classpath, it is even possible to run the annotated methods after removing the security library from the application entirely.

As another advantage, Java agents are easily stackable. If several Java agents are specified on the command line, each agent is given the opportunity to modify a class, in the order in which the agents appear on the command line. For example, this would allow for the combination of frameworks for security, logging and monitoring without requiring any form of integration layer between these applications. Therefore, using Java agents to implement cross-cutting concerns offers an opportunity to write more modular code without integrating all code against a central framework for managing instances.
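The stacking behavior can be pictured as a simple pipeline in which each transformer receives the class file produced by the one before it. The sketch below imitates that pipeline in plain Java, outside of a real JVM class-loading path. The class name TransformerChain is hypothetical; the handling of null results and of exceptions follows the rules documented for ClassFileTransformer, under which both mean that the transformer left the class unchanged.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.util.List;

// Hypothetical simulation of how registered transformers are applied in order.
final class TransformerChain {

  private TransformerChain() { }

  static byte[] apply(List<ClassFileTransformer> transformers,
                      String className,
                      byte[] original) {
    byte[] current = original;
    for (ClassFileTransformer transformer : transformers) {
      try {
        // Loader, redefined class and protection domain are omitted in this sketch.
        byte[] result = transformer.transform(null, className, null, null, current);
        if (result != null) {
          current = result; // the next transformer sees the modified class file
        }
      } catch (IllegalClassFormatException | RuntimeException e) {
        // A failing transformer is treated as if it had returned null.
      }
    }
    return current;
  }
}
```

In a running JVM this loop is performed by the instrumentation machinery itself, so agents do not need to know about one another.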

The source code to Byte Buddy is freely available on GitHub. A tutorial can be found at http://bytebuddy.net. Byte Buddy is currently available in version 0.7.4 and all code examples are based upon this version. The library won a Duke's Choice award by Oracle in 2015 for its innovative approach and its contribution to the Java ecosystem.

About the Author

Rafael Winterhalter works as a software consultant in Oslo, Norway. He is a proponent of static typing and a JVM enthusiast with particular interests in code instrumentation, concurrency and functional programming. Rafael blogs about software development, regularly presents at conferences and was pronounced a JavaOne Rock Star. When coding outside of his work place, he contributes to a wide range of open source projects and often works on Byte Buddy, a library for simple runtime code generation for the Java virtual machine. For his work, Rafael received a Duke's Choice award.

 
