Ashley Puls on the How and Why of Java Bytecode Manipulation
Interview with Ashley Puls by Barry Burd on Sep 19, 2014

Bio: Ashley Puls is a senior software engineer at New Relic Inc., a software analytics company that makes sense of billions of metrics about millions of applications in real time. She works on the Java Agent team in Portland, which focuses on instrumenting Java applications.

1. I’m Barry Burd, professor of Computer Science and Mathematics at Drew University in Madison, New Jersey. I’m here at QCon New York, speaking with Ashley Puls. She is a senior software engineer at New Relic and she spoke at the conference on, I’m going to give you the exact title: “Living in the Matrix with Bytecode Manipulation”. How intriguing! In what way are software developers living in the Matrix?

Hi Barry, welcome! I think that a lot of application developers think about frameworks: they think about Spring, Hibernate, and their database, and I think they are relatively unaware of this other layer of bytecode manipulation. They look at their Java files but they have never really looked at .class files; a lot of Java programmers out there have never heard of javap. So in that sense they are living in a world where they haven’t even seen this other world of bytecode and the fact that you can manipulate it.
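To make the .class file a little less abstract, here is a minimal sketch that reads the first eight bytes of a compiled class: the magic number and the class file version. The class name `ClassFileHeader` and its `of` method are illustrative names, not part of any tool mentioned in the interview.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Illustrative helper: holds the fixed-size header found at the start of every .class file. */
class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    /** Locates the .class resource for the given class and reads its header. */
    static ClassFileHeader of(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("no class file found for " + type);
            }
            DataInputStream data = new DataInputStream(in);
            // Layout: u4 magic, u2 minor_version, u2 major_version (all big-endian)
            return new ClassFileHeader(data.readInt(), data.readUnsignedShort(), data.readUnsignedShort());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader header = of(Object.class);
        // The major version depends on the JDK in use (52 corresponds to Java 8).
        System.out.printf("magic=0x%08X minor=%d major=%d%n",
                header.magic, header.minorVersion, header.majorVersion);
    }
}
```

Running `javap -v` on a class prints the same version fields along with the constant pool and the instructions of each method.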

   

2. And do you want to manipulate them? What reasons do you have for wanting to view that world?

So there are a lot of reasons to manipulate bytecode. If you’ve ever used FindBugs, for example, it doesn’t necessarily manipulate the bytecode, but it uses a bytecode manipulation framework to analyze the actual bytes. If you want to do code complexity analysis, that’s another case where you might use a bytecode manipulation framework. Frameworks like Spring and Hibernate also use bytecode manipulation frameworks to generate proxies. At New Relic we use the bytecode manipulation framework ASM to add our instrumentation so that we can gather the metrics that help application developers find and fix the problems in their applications.
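As a rough illustration of runtime proxy generation, the sketch below uses the JDK's own `java.lang.reflect.Proxy`, which creates a proxy class at runtime. This is a stand-in chosen because it needs no extra libraries; it is not how Spring or Hibernate are implemented, and unlike CGLib it only works with interfaces. The `Greeter` interface and the `recording` helper are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class ProxyDemo {
    /** Hypothetical service interface used only for this illustration. */
    interface Greeter {
        String greet(String name);
    }

    /** Wraps the target in a runtime-generated proxy that records each method name before delegating. */
    static Greeter recording(Greeter target, List<String> calls) {
        InvocationHandler handler = (proxy, method, args) -> {
            calls.add(method.getName());        // the "extra behavior" added around the call
            return method.invoke(target, args); // then delegate to the real object
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        Greeter greeter = recording(name -> "Hello, " + name, calls);
        System.out.println(greeter.greet("Ashley"));
        System.out.println("recorded calls: " + calls);
        System.out.println("is proxy class: " + Proxy.isProxyClass(greeter.getClass()));
    }
}
```

The caller only ever sees a `Greeter`; the class behind it did not exist in any source file, which is the essence of generated proxies.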

   

3. Ok, so this is a lower level that most developers aren’t aware of and don’t deal with. Some of the software tools that you mentioned do deal with those lower levels. Why would a software developer want to know what is going on inside those tools, those toolsets?

So in many cases there is no need for application developers to actively manipulate their bytecode; if we look at the HotSpot JVM, it’s pretty good at optimizing our Java code today. However, 1) it’s important for them to understand it, and manipulating bytecode gives them a good understanding of how the actual JVM works. They can learn about things like local variables, the operand stack, and frames, concepts that they are probably not aware of. And 2) they can create tools. In my presentation, for example, I demoed a logging application where you could create an audit log using a bytecode manipulation framework, and if you didn’t want to perform the audit logging, you would just take out the Java agent that was performing the bytecode manipulation and you wouldn’t have the logging cost.
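The Java agent mechanism mentioned here is part of the JDK (`java.lang.instrument`). The following is a minimal sketch of an agent skeleton, not the demo from the talk: the class name `AuditAgent`, the `SEEN` list, and the prefix argument are assumptions made for illustration, and the transformer only records class names instead of rewriting bytes.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class AuditAgent {
    /** Internal names (for example "com/example/Foo") of the classes the transformer matched. */
    static final List<String> SEEN = new CopyOnWriteArrayList<>();

    static final class LoggingTransformer implements ClassFileTransformer {
        private final String prefix;

        LoggingTransformer(String prefix) {
            this.prefix = prefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (className != null && className.startsWith(prefix)) {
                SEEN.add(className);
                // A real agent would pass classfileBuffer to a bytecode framework here
                // and return the rewritten bytes.
            }
            return null; // null tells the JVM to keep the original class bytes
        }
    }

    /** Invoked by the JVM before main() when started with -javaagent:agent.jar=some/prefix/ */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer(agentArgs == null ? "" : agentArgs));
    }

    public static void main(String[] args) {
        // Calls the transformer directly, only to show its behavior without packaging an agent jar.
        LoggingTransformer transformer = new LoggingTransformer("com/example/");
        transformer.transform(null, "com/example/Foo", null, null, new byte[0]);
        System.out.println(SEEN);
    }
}
```

To be used as a real agent, the class has to be packaged in a jar whose manifest names it in `Premain-Class`. Starting the JVM without the `-javaagent` option leaves the application classes untouched, which is why removing the agent also removes the cost.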

   

4. You mentioned a bytecode manipulation framework, I think what I’m hearing from that is that there is more than one? Can you tell me something about those?

Yes, there are several frameworks out there that help you manipulate your bytes. One of the most common ones is ASM. It was open sourced in 2002 and it’s very low level: you are essentially adding the actual opcodes to your methods. Then there are several others that work at a higher level. Javassist is one, for example, where you don’t actually have to understand the opcodes or the local variables; you get to write pure Java and say: “Insert this at the beginning of my method, or insert this at the end of my method”. Another one is CGLib, which is used by Spring, for example, for proxy generation. So there are several out there that you can use to help you manipulate your bytecode, and they’re probably a lot easier than actually trying to do it yourself.
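To give a feel for what "trying to do it yourself" involves, here is a sketch that assembles a tiny class byte by byte with only the JDK, loads it, and calls it. It uses none of the frameworks above. The class name `HandMade` and the method `answer` are made up for this example; the generated class is equivalent to `public class HandMade { public static int answer() { return 42; } }`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class HandAssembled {
    /** Writes the class file for HandMade, including the two instructions of answer(). */
    static byte[] buildClass() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (52 = Java 8)
            out.writeShort(8);                                   // constant pool count = entries + 1
            out.writeByte(1); out.writeUTF("HandMade");          // #1 Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeByte(1); out.writeUTF("answer");            // #5 Utf8 method name
            out.writeByte(1); out.writeUTF("()I");               // #6 Utf8 descriptor: no args, returns int
            out.writeByte(1); out.writeUTF("Code");              // #7 Utf8 attribute name
            out.writeShort(0x0021);                              // class flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                                   // this_class -> #2
            out.writeShort(4);                                   // super_class -> #4
            out.writeShort(0);                                   // interfaces count
            out.writeShort(0);                                   // fields count
            out.writeShort(1);                                   // methods count
            out.writeShort(0x0009);                              // method flags: ACC_PUBLIC | ACC_STATIC
            out.writeShort(5);                                   // method name -> #5
            out.writeShort(6);                                   // method descriptor -> #6
            out.writeShort(1);                                   // method attributes count
            out.writeShort(7);                                   // attribute name -> #7 ("Code")
            out.writeInt(15);                                    // Code attribute length in bytes
            out.writeShort(1);                                   // max_stack
            out.writeShort(0);                                   // max_locals
            out.writeInt(3);                                     // code length
            out.writeByte(0x10); out.writeByte(42);              // bipush 42
            out.writeByte(0xAC);                                 // ireturn
            out.writeShort(0);                                   // exception table length
            out.writeShort(0);                                   // Code sub-attributes count
            out.writeShort(0);                                   // class attributes count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Defines the class through a throwaway class loader and invokes answer() reflectively. */
    static int loadAndRun() {
        byte[] classBytes = buildClass();
        ClassLoader loader = new ClassLoader(HandAssembled.class.getClassLoader()) {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                if (!name.equals("HandMade")) {
                    throw new ClassNotFoundException(name);
                }
                return defineClass(name, classBytes, 0, classBytes.length);
            }
        };
        try {
            Class<?> handMade = loader.loadClass("HandMade");
            return (Integer) handMade.getMethod("answer").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndRun());
    }
}
```

Every index, length, and stack size above has to be kept consistent by hand, and this method has no branches, locals, or calls. That bookkeeping is what the frameworks take over.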

   

5. I’m sure there are, because one of the things that comes to mind when I think about manipulating bytecode is how difficult it was to manipulate assembly language instructions, and then we got away from that. If I challenge you by saying: “Wow, manipulating bytecode seems like working on a problem where you won’t be able to have a chance of understanding what it is that the overall program is doing”, what would you say to that?

So at the bytecode level you do have to understand exactly what your local variables are, exactly what’s on your stack, and how big your stack is going to grow. That is why we have languages like Java, so you don’t have to understand that, and that’s also why not every Java developer needs to understand bytecode. But at the end of the day I always think it’s good to understand the building blocks that you are working with, even at a general concept level.
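The local variables and the operand stack can be pictured with a toy interpreter. The sketch below is a deliberately simplified model, not the JVM: it understands five instruction names written as strings, and it tracks how deep the stack grows, which corresponds to the maximum stack size a real method has to declare. All names in it are illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    final int[] locals;                              // local variable slots
    final Deque<Integer> stack = new ArrayDeque<>(); // operand stack
    int maxStackDepth = 0;                           // deepest the stack has been so far

    OperandStackDemo(int... locals) {
        this.locals = locals;
    }

    /** Executes instructions such as "iload 0" or "iadd" in order. */
    void run(String... program) {
        for (String instruction : program) {
            String[] parts = instruction.split(" ");
            switch (parts[0]) {
                case "iload":  stack.push(locals[Integer.parseInt(parts[1])]); break;
                case "istore": locals[Integer.parseInt(parts[1])] = stack.pop(); break;
                case "bipush": stack.push(Integer.parseInt(parts[1])); break;
                case "iadd":   stack.push(stack.pop() + stack.pop()); break;
                case "imul":   stack.push(stack.pop() * stack.pop()); break;
                default: throw new IllegalArgumentException("unsupported: " + instruction);
            }
            maxStackDepth = Math.max(maxStackDepth, stack.size());
        }
    }

    public static void main(String[] args) {
        // Models c = (a + b) * 2 with a, b, c held in local slots 0, 1, 2.
        OperandStackDemo machine = new OperandStackDemo(3, 4, 0);
        machine.run("iload 0", "iload 1", "iadd", "bipush 2", "imul", "istore 2");
        System.out.println("c = " + machine.locals[2]);                 // c = 14
        System.out.println("max stack = " + machine.maxStackDepth);     // max stack = 2
    }
}
```

For the same expression javac would emit the compact forms of these instructions (for example `iload_0` and `iconst_2`), but the flow of values through the stack is the same.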

   

6. How does bytecode manipulation apply to the kinds of problems that New Relic software tackles?

Yes, New Relic is a software analytics company and our core product is Application Performance Monitoring. In particular I work on our Java Agent, which does the instrumentation to grab the metrics in Java applications, so that when you log in to our UI you can find and fix the problems in your application. To do the instrumentation we don’t have access to your Java source files, but we need to do things like time your methods, and therefore we use bytecode manipulation to insert some code so that we can collect that timing information.
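In source form, the effect of inserting timing code can be pictured as follows. This is only a source-level rendering of the idea, under the assumption that timings are accumulated in a simple map; it is not New Relic's instrumentation, and an agent would make the equivalent change on the bytecode without touching any source file. The names `TimingSketch`, `TOTAL_NANOS`, and `sumBelow` are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TimingSketch {
    /** Hypothetical metric store: total nanoseconds spent per method name. */
    static final Map<String, Long> TOTAL_NANOS = new ConcurrentHashMap<>();

    /** The method as the application developer wrote it. */
    static int sumBelow(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** How the same method behaves once timing code has been woven around its body. */
    static int sumBelowTimed(int n) {
        long start = System.nanoTime();              // inserted at method entry
        try {
            int sum = 0;
            for (int i = 0; i < n; i++) {
                sum += i;
            }
            return sum;
        } finally {
            // inserted on every exit path, including exceptions
            TOTAL_NANOS.merge("sumBelow", System.nanoTime() - start, Long::sum);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumBelowTimed(10));
        System.out.println(TOTAL_NANOS.containsKey("sumBelow"));
    }
}
```

The result of the method is unchanged; only the bookkeeping around it is new.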

   

7. Is bytecode manipulation useful for most developers or is it applicable only in specific scenarios?

I think there is a set of developers that use bytecode manipulation. All of the profilers you use, for example, are going to be doing bytecode manipulation, and for any static code analysis, those developers are going to use bytecode manipulation. But in the general sense, for most applications, you are not going to need to manipulate your bytecode.

   

8. What about optimization, can you get some?

You can optimize at the bytecode level, however I’d say I wouldn’t go to that first. There are a lot of other things you can look at and examine before you go to optimizing your bytecode; you really have to be looking for extreme, extreme optimization before going there.

   

9. Are you able to hire new developers with skills in the use of bytecode or do you have to bring new hires up to speed?

You know, there is a small subset of people out there who do know about bytecode manipulation and do it on a regular basis, however the majority of people don’t have a lot of experience with it. And so in the majority of cases we do need to bring people up to speed.

   

10. And can people shoot themselves in the foot? It seems like an easy thing to do when you are working at such a low level.

Yes, you can shoot yourself in the foot, but generally with bytecode you learn very quickly whether it works or it doesn’t work. There are also validators: the JVM is going to validate your bytecode, and when you are using ASM, it has a validator that you can run to verify that your bytecode is valid.

   

11. Let’s consider a developer who doesn’t know much about bytecode manipulation and wants to learn more. Of course the developer can start reading, but reading without practice is a shallow experience. What else should a developer do to begin to come up to speed?

So reading is a great start; the ASM documentation is very thorough, for example, and has a lot of examples in it. But like you suggested, you need to do more than reading, and the best way to learn is to try to put together some sample apps. Just like I demoed in my talk, you can do audit logging or something pretty simple like that to practice your bytecode and work on that skill.

Barry: Very good, thank you, Ashley. Thank you very much for coming!

Yes, thank you very much Barry, it was a pleasure!
