Performance Guru Kirk Pepperdine Reflects on Results of RebelLabs' Performance Survey

RebelLabs published their Developer Productivity Report, the result of a survey launched in March 2015 that polled the Java development community on Java performance and performance testing methods.

The report shows that when performance problems are found, it is the development teams that usually fix them. Profiling and testing for performance issues was found to be a reactive activity, with over 40% indicating that code is profiled only when issues arise.

Of the 1562 responses to the survey, 65% came from software developers, 27% from architects, team leads and project managers, and a small percentage (1.5%) from dedicated performance engineers. Most applications (70%) were web applications, with an average of 118 screens.

The survey found that most Java developers used VisualVM to do application profiling. VisualVM ships with the JDK and has lightweight profiling capabilities. Almost half of those that took the survey claimed to use more than one tool to do their profiling.

Tools for Application Profiling

The most common source of performance problem discovery was user feedback (31%), followed by performance monitoring tools (25%) and system faults and crashes (20%). This suggests that many teams are not testing their code or discovering performance issues before deploying.

The root cause of most performance issues was found to be slow database queries (55%), followed closely by inefficient application code (51%). HTTP session bloat was a problem for only a minority (8%) of respondents.

Typical Root Causes

The report concludes that projects with the most satisfied end users work in small teams, do performance testing earlier and are more efficient and proactive. They're faster at diagnosing issues and 40% more likely to profile on a daily or weekly basis.

To see how these numbers line up with real-world experience, InfoQ spoke with Kirk Pepperdine, CTO at JClarity and a well-known performance expert.

InfoQ: What are your thoughts on RebelLabs' latest Developer Productivity Report on Java Performance?

Overall I found the report to be an interesting read. The report does indeed point out some best practices. That said, some of the suggested best practices are what I'd call "tuning by folklore". That is, tuning practices based on what worked in the past rather than on what current data reveals about the situation. This is a bias that can lead one to false conclusions. This report reflects a number of biases that we have, in that it completely leaves out some of the more common problems facing applications today. This isn't the fault of the report authors. It's simply a case where developers, due to their bias, are not reporting on what isn't visible to them. This is exactly why this report is a great summary of the current "state of the art" in performance tuning. I'd like to explain this by stepping back and discussing the findings in a broader context.

In my experience, when a developer speaks of performance tuning, they often tangle the diagnostic aspect with the algorithmic aspect, or what I like to call coming up with clever ways to get more work done faster. I find it useful to separate the discussion of the diagnostic process from the creative aspect of tuning. Diagnosis is simply finding out what went wrong. Once you know what went wrong, you can move on to devising a clever way to solve it. In other words, before you make an adaptation to some aspect of your application so it is better suited to the runtime conditions it must cope with, you need to understand why it's unable to perform in the first place.

The thing I've observed is that while diagnostic work and clever algorithm development draw from approximately the same knowledge base, they rely on two completely different skill sets. In my experience, it is rare to find both sets of skills in the same person. I tell all of my workshop attendees this at the beginning and they all reply "not me", just before 95% of them fall over on the first couple of tries with relatively simple problems. Strangely enough, the better a developer you are, the more likely you are to have poor diagnostic skills. This is not to say that good developers cannot develop good diagnostic skills, it's only that they generally don't have them to begin with. More than 90% of that same group were able to diagnose very complex problems once they'd been given instruction.

One thing that stands out in the report is the tendency for developers to make an undirected or arbitrary choice of which profiler to use. Think of profilers as the blind men examining an elephant and you've got the idea: every profiler will show you something different.

The number one problem stated in the report is DB query performance. I rarely see a database that is performing poorly. More often than not, the problem is not with the DB but in how the application is interacting with the DB. In my most recent "the database is slow" engagement I found DB response times to be about 2ms. However, the application was making about 20,000 calls per client interaction. What I was asked to do was find out why one call in a million would take up to a few seconds to complete. If you do some math you'll see that the one-in-a-million event happened once every 50 client interactions, so it was a real problem. But the application only needed to make 3 calls to the database. A one-in-a-million event with three calls translates to one long query every 333,333 client interactions. The latency of 20,000 calls was 40 seconds, whereas the three calls could be completed in under 10ms. I've seen so many variations on this theme that my common response to "the database is slow" is: are we "tuning by folklore"?
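
To make the arithmetic behind that example concrete, here is a minimal sketch; the figures come straight from the anecdote above and the names are purely illustrative:

public class SlowDatabaseMath {
    public static void main(String[] args) {
        double callLatencyMs = 2.0;          // measured DB response time
        long callsPerInteraction = 20_000;   // what the application actually did
        long neededCalls = 3;                // what the application actually needed
        long oneInN = 1_000_000;             // frequency of the multi-second outlier

        // As built: one outlier every 50 interactions and 40 seconds of DB latency per interaction.
        System.out.printf("outlier every %.0f interactions, %.0f s of DB latency each%n",
                (double) oneInN / callsPerInteraction,
                callsPerInteraction * callLatencyMs / 1000);

        // With only the three necessary calls: one outlier every ~333,333 interactions, ~6 ms total.
        System.out.printf("outlier every %.0f interactions, %.0f ms of DB latency each%n",
                (double) oneInN / neededCalls,
                neededCalls * callLatencyMs);
    }
}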

Performance tuning is not a dark art. You can follow a process. I talk about this process in my talks "The (not so) Dark Art of Performance Tuning" and "Performance Tuning with Poor Tools and Cheap Drink". The basis of the methodology is to listen to your machine. It will tell you what is wrong. All that you need to do is map that information back to application behavior. If you follow these steps, you should be able to isolate performance bottlenecks very quickly.
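
As one way to illustrate "listening to your machine", the sketch below polls whole-machine versus process CPU load from inside the JVM; it assumes a HotSpot/OpenJDK runtime that exposes com.sun.management.OperatingSystemMXBean and is only an illustration of the idea, not the methodology from the talks:

import java.lang.management.ManagementFactory;

public class MachineListener {
    public static void main(String[] args) throws InterruptedException {
        com.sun.management.OperatingSystemMXBean os =
                (com.sun.management.OperatingSystemMXBean)
                        ManagementFactory.getOperatingSystemMXBean();
        while (true) {
            double systemCpu = os.getSystemCpuLoad();   // whole-machine CPU utilisation (0.0-1.0)
            double processCpu = os.getProcessCpuLoad(); // this JVM's share of the CPU
            // A large gap between the system and process figures points away from your code and
            // towards neighbours, the OS, or I/O wait -- the machine telling you where to look.
            System.out.printf("system=%.2f process=%.2f%n", systemCpu, processCpu);
            Thread.sleep(1000);
        }
    }
}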

InfoQ: Most respondents report they use VisualVM for profiling. Would you say that is the right way to start in general?

From my answer to the first question you probably realize that I'm fairly tool agnostic, and I think the numbers reflect this. Full disclosure, I happen to be a member of the NetBeans Dream Team, not because of NetBeans itself but more for my work evangelizing VisualVM. It's been around for quite some time, it has a very low barrier to entry, and it's very easily extensible, which translates to: it's a great container for home grown performance tooling, so I'm not so surprised at its popularity. Most of the people that attend my workshop have some experience using VisualVM. As an aside, the NetBeans profiler, jFluid, is also the VisualVM profiler, so that means that almost 55% use jFluid. I'd expect that JMC will continue to become more popular as more people find out about it, as it matures to support plugins, and as the licensing issues get sorted out.
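
One way to read the "container for home grown performance tooling" remark: anything exposed as a standard JMX MBean shows up in VisualVM's (or JMC's) MBeans view. The sketch below is an illustrative assumption of how that might look; the metric and names are invented, not taken from the report:

// File: RequestStatsMBean.java -- the management interface VisualVM/JMC will see
public interface RequestStatsMBean {
    long getSlowRequests();
}

// File: RequestStats.java -- implementation registered with the platform MBean server
import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

public class RequestStats implements RequestStatsMBean {
    private volatile long slowRequests;

    @Override
    public long getSlowRequests() { return slowRequests; }

    public void recordSlowRequest() { slowRequests++; } // called from application code

    public static void main(String[] args) throws Exception {
        RequestStats stats = new RequestStats();
        ManagementFactory.getPlatformMBeanServer()
                         .registerMBean(stats, new ObjectName("demo:type=RequestStats"));
        Thread.sleep(Long.MAX_VALUE); // keep the JVM alive so a profiler can attach and watch the value
    }
}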

InfoQ: How do you recommend teams do performance testing as part of the development/QA process?

Wow, this is a huge question. Performance testing can best be described as a buyer-beware activity. Just how deeply you want to dive into this activity really depends on how important performance is to your organization. Does it matter if that batch job runs in three hours instead of two, or maybe even one hour? That's a question only a business case can answer. If performance is important, then you have to question how important, to decide if testing is a regime that needs to be integrated into the entire development life-cycle. You have to recognize that in effect you are setting up a series of benchmarks that run at different scales. Most organizations simply do not understand how much it costs to run a benchmark that will produce meaningful results. I say this because I often tell people how much it's going to cost and they don't believe me until I start asking questions that expose all of the hidden costs.

With any benchmark, the most expensive activity in terms of time is validation. If you've not validated your benchmark, you don't really understand what it is you're measuring and hence it's hard to understand if the problems you are exposing are real or imagined. Again, you can see this in the data in this report in terms of gains. Most of the performance tuning gains reported are quite modest. Of course they may have been good enough in that the application now meets its performance requirements. Validation requires that you run and analyze the results from a number of experiments. You want to make sure that you are testing exactly what you think you are testing and if not, what adjustments need to be made to ensure that you are testing what you should be testing. These types of tests can often take weeks to set up and validate. But they are important.
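
As a minimal illustration of that validation point, the sketch below simply repeats a measurement and reports the spread before trusting a single number; in a real benchmark the workload would be a full load-test run against the system under test, and the names here are purely illustrative:

public class RepeatAndCheck {
    public static void main(String[] args) {
        int runs = 10;
        double[] seconds = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload();                               // stand-in for the experiment being validated
            seconds[i] = (System.nanoTime() - start) / 1_000_000_000.0;
        }
        double mean = 0;
        for (double s : seconds) mean += s;
        mean /= runs;
        double var = 0;
        for (double s : seconds) var += (s - mean) * (s - mean);
        double stdDev = Math.sqrt(var / runs);
        // A large relative spread is a hint that you are not yet measuring what you think you are
        // measuring (warm-up, environment, or load-injector effects).
        System.out.printf("mean=%.3fs stddev=%.3fs cv=%.1f%%%n", mean, stdDev, 100 * stdDev / mean);
    }

    private static void workload() {
        double acc = 0;
        for (int i = 0; i < 5_000_000; i++) acc += Math.sqrt(i);
        if (acc < 0) System.out.println(acc); // keep the JIT from eliminating the loop
    }
}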

I worked with one team that set up all the right tests but didn't properly validate. They delayed releasing their product for three months because it wouldn't meet their performance requirements. The environment didn't show anything obvious so they ended up fixing "phantom problems" that of course resulted in no change in the results. When we finally got to validate, we found the environment was at fault and once that was fixed, the application was plenty fast enough. This is a common theme that I run into time and time again.

InfoQ: The study suggests load testing is often an afterthought. What tools and practices do you recommend for planning and performing load testing?

Yes, unfortunately load testing is not part of the normal workflow. People are trying to piggyback load testing on the back of continuous deployment and I think that is the way to go. Take a build a day and load test it. Here are the gotchas. If your load test isn't set up right you'll end up either chasing phantom problems or not placing enough pressure on the system to expose any meaningful ones. Back to the question of tooling, there are a number of tools out there that can help you place load on your application. The commercial tools are a treat to use but they tend to cost a lot of money. If you are web facing there are services that you can use. They tend to work fairly well as long as you don't expect too much from the provider.

I tend to use Apache JMeter. It's a tool that is not without its faults. A lot of the ideas for coordinated omission (CO) that Gil Tene speaks about have come from his experiences load testing applications with JMeter. There are ways to work around some of the problems with JMeter, and I should add that JMeter isn't the only tool that suffers from CO. In fact, all tooling that I know of suffers from CO. That said, as long as you are aware of CO you can cope with it. My worst experiences come from teams that have tried to home-grow their own load testing tools. Apache JMeter, the Grinder, Gatling and other tools like these generally work on a fire-and-response type threading model, so unless your communications are truly async, I'd prefer not to roll my own load injector. Load injectors seem like simple things to write but the details will kill you. In fact, our examination of the fundamental problems with Apache JMeter concluded they could only be fixed with a complete rewrite, so we opted for workarounds instead.
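
For readers unfamiliar with coordinated omission, here is a rough sketch of the usual mitigation: pace requests against a fixed intended schedule and measure latency from the intended start time, not from whenever the injector actually managed to send. The sendRequest() call is a placeholder, not a real API:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class PacedLoadSketch {
    public static void main(String[] args) throws InterruptedException {
        long intervalNanos = TimeUnit.MILLISECONDS.toNanos(10); // target rate: 100 requests/second
        int requests = 1_000;
        List<Long> latenciesNanos = new ArrayList<>();

        long next = System.nanoTime();
        for (int i = 0; i < requests; i++) {
            // Wait until the intended send time; do NOT push 'next' back if we are running late,
            // otherwise a stalled system quietly lowers the offered load (coordinated omission).
            long now = System.nanoTime();
            if (next > now) {
                TimeUnit.NANOSECONDS.sleep(next - now);
            }
            sendRequest();                                   // placeholder for the real request/response
            latenciesNanos.add(System.nanoTime() - next);    // latency relative to the intended start
            next += intervalNanos;                           // fixed schedule, independent of responses
        }
        latenciesNanos.sort(null);
        System.out.printf("p99 = %.2f ms%n",
                latenciesNanos.get((int) (requests * 0.99) - 1) / 1_000_000.0);
    }

    private static void sendRequest() throws InterruptedException {
        Thread.sleep(1); // stand-in for a request/response round trip
    }
}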

InfoQ: How should development teams identify and remediate performance issues unique to cloud deployments?

Cloud is not much different from a bare-metal deployment. That said, there are differences. A VM is a VM, it's not real hardware. You cannot virtualize yourself into more hardware than you have. That means you sometimes have to understand what your neighbors are up to. While I don't want to downplay the importance of monitoring steal time, CPU availability is often not the problem. For example, I worked with one client that was in a cloud deployment. There was nothing obviously wrong with their application. However, their neighbor was an Oracle DB and there was only a single network card in the box. Wanna bet on what the problem was?

InfoQ: Can you give us any parting advice on performance?

If you are not meeting your performance goals, listen to the hardware, it will tell you why. All you need to do is map that "why" back into your application.
