
Iterative, Automated and Continuous Performance

Posted by Kirk Pepperdine on Nov 16, 2007
Our industry has learned that if we deliver intermediate results and revisit functional requirements, we can avoid delivering the wrong system late. We have learned that if we perform unit and functional tests on a regular basis, we will deliver systems with fewer bugs. And yet, though we are concerned with the performance of our applications, we rarely test for performance until the application is nearly complete. Can the lessons of being iterative, automated and continuous that we've applied to functional testing be applied to performance as well?

Today, we may argue about whether a build that completes with unit testing should be performed on an hourly, daily, or weekly basis. We may argue over 100% coverage vs. 50% coverage. We may argue and discuss and ponder the specific details of the process. But we all pretty much agree that performing automated builds, completed with unit testing, on a regularly scheduled basis is a best practice. Yet our arguments regarding performance testing tend to be limited to: don't.

Premature or Just in Time

There are several reasons why performance testing gets put off to the end. Many of these reasons are very similar to why we rarely, if ever, automate the testing of our applications. Setting up an automated build takes time, effort and commitment. Justifying to the business that it is in its best interest to make this commitment is simply difficult. After all, we are programmers and we are expected to crank out features, not spend our time testing. Testing is for the testers. Writing unit tests takes time, time which is better spent developing features, and so on.

However, we have been able to sneak this into our development process, as organizing test code into unit tests only formalized what we were already doing. Thus the incremental investment needed to support this formalization wasn't all that large. Once businesses started to see the benefits, things only got better. As much as one might believe that extending this to performance testing would be a natural progression, it simply hasn't happened. The investment needed to support performance testing is viewed as being much larger and the potential benefits are seen as being much smaller. After all, we can't do performance testing on a system that is under development because there is nothing to test; and anyway, isn't performance just a matter of more or better hardware?

There are a couple of reasons why the investment is viewed as being larger for performance testing. Unlike unit testing, performance testing isn't something that developers are already doing. This implies a new activity rather than the formalization of something that is already being done. Yet the unit testing that we do today is much more than the informal testing done prior to unit testing becoming a formal discipline. In this regard, there is a difference between the perceived and the actual investment needed to introduce formal performance testing into the development cycle.

There are other arguments against early performance testing: it is premature, there is nothing to test, very little can be gained by it, it is micro-performance tuning, it is too granular to be useful since we can only performance test complete systems, setting up a performance test is too complex and takes too much time, the process is fickle, and so on. These reasons are not without substance. If you talk to a manager from almost any performance testing group, you'll hear that the biggest consumer of time is just getting the application operational in a test environment. This task can be so arduous that it actually limits the number of applications they can test. Some have whispered to me that they can performance test less than 50% of all the applications they've deployed.

There is no question that one should almost always avoid premature optimizations. This is especially true if the optimization is complex, time consuming to implement, and the corresponding returns are unknown. For example, if we are sorting a list, quite often a simple bubble sort is all we'll really need. We only need more complex sorts if the sort time is critical and the quantity of data warrants it. If we don't have a good handle on either of these requirements, implementing a more complex sorting strategy would be a premature optimization.

Testing Components

With continuous performance testing we need to focus on the more granular aspects of our systems: components and frameworks. Just as is the case with unit testing, we can only expect to find certain classes of problems when we test these artifacts in isolation. A case in point is contention between components, or misuse of frameworks, resulting in response times higher than expected; these are things that will only come out in a full integration test. However, understanding how much CPU, memory, disk and network I/O each component needs can help us predict problems and take preventive action (rather than apply a premature optimization).
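
As a minimal sketch of what such a granular check might look like, consider the following stand-alone baseline. The component being measured here is only a stand-in (you would call the component actually under development), and it assumes the JVM supports per-thread CPU time measurement. Run as part of the regular build, it records wall-clock time, CPU time and heap growth per call so that later builds can be compared against an agreed budget instead of waiting for a full integration test to expose a regression.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ComponentResourceBaseline {

    // Stand-in for the component under test; replace with the real component call.
    static void componentUnderTest() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append(i);
        }
    }

    public static void main(String[] args) {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        Runtime runtime = Runtime.getRuntime();

        long heapBefore = runtime.totalMemory() - runtime.freeMemory();
        long cpuBefore = threadBean.getCurrentThreadCpuTime();
        long wallBefore = System.nanoTime();

        int iterations = 1000;
        for (int i = 0; i < iterations; i++) {
            componentUnderTest();
        }

        long wallNanos = System.nanoTime() - wallBefore;
        long cpuNanos = threadBean.getCurrentThreadCpuTime() - cpuBefore;
        long heapGrowth = (runtime.totalMemory() - runtime.freeMemory()) - heapBefore;

        // Report per-call averages; a build script could compare these against a budget.
        System.out.printf("avg wall: %.3f ms, avg cpu: %.3f ms, heap growth: %d bytes%n",
                wallNanos / 1000000.0 / iterations,
                cpuNanos / 1000000.0 / iterations,
                heapGrowth);
    }
}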

On the question of cost there is no doubt: performance testing will add to the cost of development. Unlike functional testing, performance testing is not something that developers already do regularly, so there isn't a clear path to formalization as there was with functional testing. However, there are two types of cost to consider: the direct cost of the effort, and the hidden cost of having to fix all of the performance problems as they randomly appear in the final build. The immediate economic reward (in terms of both money and time/schedule) comes from performance testing only at the end of the project's development cycle. But this is a false economic reward. It is said that with less testing you need fewer man hours to develop your application, yet that does nothing to account for risk. You may have more money in your pocket if you drive with no auto insurance, but if you ever get into an accident you've lost. Given the number of "car wrecks" we witness in this industry, not testing is like driving without insurance.

Mocks for Performance

But there are things we can do to help reduce costs. Developers create mocks and other things needed to unit test. While the mocks will most likely not include the things that are needed for a performance test, in most cases they can be easily modified to do so. Take the mock for a credit card authorization service found in Listing 1.

import java.util.Random;

public class MockCreditAuthorizationServiceProvider
        implements CreditAuthorizationServiceProvider {

    private double rejectPercentage;
    private Random random;

    public MockCreditAuthorizationServiceProvider() {
        // set the rejectPercentage from a property
        random = new Random();
    }

    public void authorize(AuthorizationRequest request) {
        if (random.nextDouble() > rejectPercentage)
            request.authorize();
        else
            request.deny();
    }
}

Listing 1. Mock credit card authorization with denial simulation

The mock is set up for functional testing. It adheres to the functional requirements: it approves or rejects a transaction according to some adjustable rate. This mock is good enough to test the functional requirements for the handling of both accepted and rejected credit card authorizations. However, to test for performance we also need to mock the service level agreements that we have with the authorization service. The mock must not only authorize; it must do so in the time it normally takes to perform an authorization. If the authorization service will only consider 5 authorization requests at a time, then this also needs to be encoded into the mock. These requirements have been added to our original mock as seen in Listing 2.

import java.util.Random;
import java.util.concurrent.Semaphore;

public class MockCreditAuthorizationServiceProvider
        implements CreditAuthorizationServiceProvider {

    private double rejectPercentage;
    private double meanServiceTime;
    private Random random;
    private Exponential serviceDistribution;
    // the SLA allows no more than 5 authorization requests at a time
    private Semaphore concurrentRequests = new Semaphore(5);

    public MockCreditAuthorizationServiceProvider() {
        // set the rejectPercentage and meanServiceTime from properties
        random = new Random();
        this.serviceDistribution = new Exponential(meanServiceTime);
    }

    public void authorize(AuthorizationRequest request) {
        concurrentRequests.acquireUninterruptibly();
        try {
            // simulate the time the real service takes to respond
            Thread.sleep(this.serviceDistribution.nextLong());
        } catch (InterruptedException e) {
        } finally {
            concurrentRequests.release();
        }

        if (random.nextDouble() > rejectPercentage)
            request.authorize();
        else
            request.deny();
    }
}

Listing 2. Mock credit card authorization with denial, service time, and concurrency simulation
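
Listing 2 assumes an Exponential class for drawing service times; the article does not show it, and the name is simply whatever the project's own helper happens to be called. A minimal sketch of such a helper, under the assumption that nextLong() should return an exponentially distributed delay in milliseconds, might look like this:

import java.util.Random;

public class Exponential {

    private final double mean;
    private final Random random = new Random();

    public Exponential(double mean) {
        this.mean = mean;
    }

    // Returns the next exponentially distributed delay, in milliseconds.
    public long nextLong() {
        // Inverse-transform sampling: -mean * ln(U), U uniform on (0, 1].
        double u = 1.0 - random.nextDouble(); // avoid log(0)
        return Math.round(-mean * Math.log(u));
    }
}

Using a distribution rather than a fixed sleep keeps the mock from producing unrealistically uniform response times, which matters once you start looking at latencies under concurrent load.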

Yet another not insignificant challenge is simply getting the application running in a suitable testing environment. But this has also been an issue for those doing functional testing, and they've worked out a solution: do it continuously. The obvious solution for those wanting to do performance testing is to piggyback on that effort.

Tooling

In the beginning we had JUnit, a neat little tool that helped us organize our tests, execute them, and show us the results. We had ANT, a tool written in anger at the complexities of Make. From these humble beginnings we are witnessing an explosion of tools to support continuous builds and unit testing. Yet there is seemingly little support for continuous performance testing. While it is true that none of the existing tools advertise support for performance testing, it does exist. As the lack of advertising may suggest, this support is limited.

The first limitation is in the type of testing supported. Currently ANT, Maven, and CruiseControl, by virtue of their integration with ANT, all have plug-ins to support the automated running of Apache JMeter. Apache JMeter came out of the need to performance test HTTP servers and applications. It supports other types of testing, but this is limited to a few well defined components that include JMS, WS, and JDBC. However, Apache JMeter is quite extensible, and if we are to test our own components this is exactly what we'd have to do: extend Apache JMeter. Not an ideal solution in many cases. The only other choice is to hand roll our own stress testing harness. Once again, a less than desirable solution. While tool support may be weak, we expect that it will improve over time just as tool support for continuous testing has improved.
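
For those who do choose to hand roll a harness, a minimal sketch follows. It reuses the mock from Listing 2 and assumes the AuthorizationRequest type can be constructed with a no-arg constructor (an assumption, since that class is not shown in the article). A continuous build could run something like this and flag the build when throughput or average latency drifts past an agreed threshold.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class AuthorizationStressHarness {

    public static void main(String[] args) throws InterruptedException {
        // the mock from Listing 2
        final MockCreditAuthorizationServiceProvider provider =
                new MockCreditAuthorizationServiceProvider();
        final AtomicLong totalNanos = new AtomicLong();
        final AtomicLong completed = new AtomicLong();

        final int threads = 5;          // matches the SLA's concurrency limit
        final int callsPerThread = 200; // arbitrary workload size for this sketch

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(new Runnable() {
                public void run() {
                    for (int i = 0; i < callsPerThread; i++) {
                        long start = System.nanoTime();
                        // AuthorizationRequest is assumed to be directly constructible here
                        provider.authorize(new AuthorizationRequest());
                        totalNanos.addAndGet(System.nanoTime() - start);
                        completed.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);

        double averageMillis = (totalNanos.get() / 1000000.0) / completed.get();
        System.out.printf("%d calls, average latency %.1f ms%n",
                completed.get(), averageMillis);
        // a build script could compare averageMillis against a budget and fail the build
    }
}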

Case in Point

Should the lack of tool support delay a push to continuous performance testing? The answer will depend a bit upon how adventurous your organization is willing to get. But before completely dismissing the idea of introducing it, consider this. A few organizations have instituted a continuous performance program, and though the evidence may be anecdotal, the results have been promising. In one case, the end product was composed of the efforts of 6 different development teams. The performance tuning team asked each of the teams to run performance tests during the development process. The component with the most performance difficulties was delivered by the one team that did not comply with the request.

Conclusion

Dave Thomas writes about broken windows. Just as continuous builds and unit testing fix "broken windows", continuous performance testing will also work to fix "broken performance windows".

Kent Beck has described continuous testing of automated builds using the analogy of driving a car. As you move down the road your eyes tell you what micro-adjustments need to be made in order to stay in the center of your lane. You wouldn't think of driving with your eyes closed, opening them only for a second to see where you are, for fear of missing a curve or drifting out of your lane. When you are first learning it is hard, but it becomes easier over time. What Beck and Thomas are saying is that by being iterative, automated, and continuous, you are developing with your eyes wide open.


Maybe if you stop calling it performance testing.... by William Louth

Hi Kirk,

The real problem is that people fail to see that performance is engineered and not tested. Most people write code and then test. Why would this be any different for performance?

If you have not designed and developed for performance then you are effectively making a judgement call on the risks associated with the eventual delivery of a poorly performing (and possibly poorly scaling) solution.

Performance testing in the early stages can be useful to validate performance models and to monitor the resource consumption behavioral changes of the software under construction across releases, even into production. But testing should be based on both a software and a system execution model; otherwise all you are proving is that you can bring down a system with a particular load. Hey, that is easy, but what does one really learn of the underlying behavior? Not much, especially as most tools, including those you referred to, do not offer any correlation between activities, paths, resource usage and service delays.

Most continuous performance initiatives fail because the incentives and controls (or lack thereof) invite developers to set parameters that pass the tests. What typically happens is that management focuses more on passing the tests, even if they bear little resemblance to reality, and less on extracting knowledge and engineering best practices. The kind of automation I see divorces the engineer from the engineering activity. It becomes a competition focused on beating the clock under the tested conditions and ignoring the implications for different workload patterns.

Performance testing works when people first start with performance engineering. Performance engineering != tuning or premature optimization.

regards,

William

Re: Maybe if you stop calling it performance testing.... by Kirk Pepperdine

Hi William,

You make some very good points and clarifications in your comments. I talked to a project manager just prior to authoring this article and I asked his opinion regarding performance testing during development. He thought it was a complete waste of time because of the bias that I know you face on a day to day basis. In reviewing the article, Heinz commented that this was something that we'd never see. I guess this means we need more case studies from people like yourself that can clearly demonstrate the benefits of testing for performance early on in the application's development lifecycle.

-- Kirk

Re: Maybe if you stop calling it performance testing.... by William Louth

Hi Kirk,

Yes indeed, performance testing during development can be a complete waste, especially when the application is largely incomplete, likely to change drastically, and the team has very little data whatsoever regarding the workload patterns, deployment topology and system hardware. We test to validate and verify the software, but without a model testing is meaningless. What is its purpose in terms of performance testing? This is where I am reminded of the line in Batman Begins where Bruce's father asks him why do we fall? So "we can LEARN" to get back up. Most performance testing activity I see conducted pays very little attention to the knowledge acquired, and this is probably because the tools and processes create a disconnect between the test cases and the underlying software execution model, and the results are simply boolean logic. Knowledge acquisition is largely absent or given lip service.

The level of performance engineering applied to a project should reflect the risks - risks viewed from a business perspective and not by the development team or project manager. If there are sufficiently high risks then one must construct models and ensure there is sufficient performance instrumentation to collect the required data to validate the model and monitor the software.

You can construct a large proportion of the software execution model without testing at all, by simply tracing and metering possible resource usage ranges at system and component boundaries. From this you can already make assessments of whether the eventual software will meet performance objectives. Too many roundtrips between client, server and database (or messaging backend) is not going to deliver sub-second response times no matter what you throw at it.

Performance testing, in the context of load generators, is important when constructing the system execution model, which focuses more on bottlenecks related to concurrency.

Performance testing is viewed negatively today because testing teams rarely extract knowledge of the execution model and make this available across the complete life cycle. Performance testers have the potential to provide enormous benefit in resolving issues in operations, where there is very little application knowledge, if only they focused less on the load and more on the flow. Understanding the execution flow and relating this to incidents in production allows them to better pinpoint possible faulty or overloaded components and systems not easily detected by existing low level health monitors.

regards,

William

Refactoring to improve performance.... by Amr Elssamadisy

What level of performance is a requirement for your project? Will your architecture/design support it? If not, then by using this approach you will have to refactor it in later. The type of refactoring needed to fix performance issues is usually large and extremely expensive.

So there is a cost trade-off: pay less now and, if the performance target is not met, refactor later at much greater cost in time/money; or build it in upfront and pay the price of carrying that design.

Testing is a great tool to let you know when things are broken, but many of the cross-cutting issues are not simple to refactor in later.

In an Agile setting the customer/product owner should make this decision after being fully informed of the technical costs of both solutions.

Testing isn't the only thing that's needed! by M. Edward (Ed) Borasky

Well, yes, you *do* need to have continuous performance testing, both single-user (performance "unit testing") and multi-user load and scalability testing (performance "integrated testing"). But there are lots of *other* things you need to be doing!

1. Up-front software performance engineering. Build performance into the application, build metrics into the application. Do a Google search for "SPE Software Performance Engineering".

2. Modeling!! Testing takes time -- lots of it. Modeling is fast and will get you a good idea of whether you're going to sink or float. Try a Google search for "Guerrilla Capacity Planning".

3. Post-deployment performance monitoring. You have to continuously monitor resource usage on your servers and the response time your customers are seeing.

Performance engineering is a full-time job -- there's no point in the application life cycle where you *don't* have to pay attention to performance. If you "just test", you aren't doing it all. And if you aren't even testing, well ....

Re: Testing isn't the only thing that's needed! by William Louth

Hi Edward,

It is very refreshing to hear someone else discuss performance engineering with an emphasis on modeling and data collection. I was starting to think I was the omega man of SPE, at least on Java related websites such as InfoQ and TSS.

---------------------------------------------------------------------------

[All]

I hope the following graphic posted on my blog shows the major activities of the discipline, which, as has been pointed out, do not necessarily take up as much time as the testing activity.

blog.jinspired.com/?p=38

It is important to note that SPE spans the complete life cycle and ensures that, during the construction of the software, the monitoring concerns of operations are already factored in. The benefits of SPE are not confined solely to the performance of the system once one starts to see that it is a knowledge acquisition exercise, one that provides a common and sufficiently high level view of the software and system behavior to be used in verifying system behavior, validating assumptions, and performing root cause analysis of issues arising during all phases, especially in PRODUCTION, where this knowledge becomes paramount for fast problem resolution.

Kind regards,

William Louth
JXInsight Product Architect
CTO, JINSPIRED

www.jinspired.com

Re: Testing isn't the only thing that's needed! by Kirk Pepperdine

Performance engineering is a full-time job -- there's no point in the application life cycle where you *don't* have to pay attention to performance. If you "just test", you aren't doing it all. And if you aren't even testing, well ....


Agreed and these are topics that I've covered in other publications. That said, they are more accepted than continuous performance testing.

And William, I don't think you'll find that they've been ignored on InfoQ or TSS, they just need someone to write about them. And if you're keen..... ;-)

Kirk

Re: Testing isn't the only thing that's needed! by William Louth

Hi Kirk,

I am always writing about software performance engineering (instrumentation, data collections, performance models, metric analysis,...). It just happens to be on my blog which for some strange reason has never been referenced (linked to) by a well known Java performance tuning site that publishes related articles each month. Maybe this is indicative of the "performance testing" and "adhoc tuning & troubleshooting" mentality prevalent in the industry.

regards,

William

Re: Testing isn't the only thing that's needed! by M. Edward (Ed) Borasky

It's refreshing to hear people actually talk about complete life-cycle performance engineering. For example, Neil Gunther is so enamored of the back-of-the-envelope modeling approach that he gives short shrift to the rest of the tools and techniques with comments like:

"Performance modeling is also about spreading the guilt around.

"You, as the performance analyst or planner, only have to shine the light in the right place and then stand back while others flock to fix it."

www.perfdynamics.com/Manifesto/gcaprules.html

Now I'm personally a big fan of queuing models, and Gunther appears to be the only one outside of a university that teaches that approach. But the problem with queuing models is that they're difficult to understand. Difficult to understand means difficult to validate and test.

But as far as I'm concerned, if you're talking about an application development framework, like, for example, Rails, it ought to come with a complete set of performance engineering tools built in. It ought to be able to measure the end-to-end response time users are seeing, it ought to be able to look at system and process resource usage on the servers and integrate it for capacity management purposes, etc. I shouldn't have to go buy a separate performance monitoring tool set.

But that's exactly what I have to do for many Java-based frameworks. I've lost count of how many of those tools are out there for Java applications. And they're expensive. You can spend hundreds of thousands of dollars on these tool sets.

Re: Testing isn't the only thing that's needed! by William Louth

Hi Edward,

Well I must be fortunate because Neil actually published a performance analysis report on virtualization / Hyperthreading referencing JXInsight as the tool that was used to collect profiled traces. I have Neil's book and it is very accessible, providing a good introduction to queueing systems for the purpose of performance modeling.

I do not think that a framework like Rails also has to be a performance monitoring tool. I think it should provide specific extension points into the framework and derived applications that enable various levels of performance monitoring (metrics, traces, diagnostics, and metering) to be introduced by other companies (open source or commercial). This ensures that innovation in performance management can happen while keeping the Rails team focused on other important aspects that cannot be readily designed and delivered by others.

Maybe in the Ruby/Rails world your suggestion might work because there is effectively only one dominant framework, but in the Java world this is not the case. Thus we need a set of APIs that work up and down the technology/framework stack and across platforms and middleware. It would be wasteful for each framework to offer its own custom diagnostics API set (but I am sure this will happen; such is life in the Java world).

Not all Java monitoring solutions are expensive. JXInsight is a far superior offering (IMHO) to most of the +100,000 USD Java performance monitoring & problem diagnostics solutions and yet it is very affordable.

regards,

William

Re: Testing isn't the only thing that's needed! by William Louth

I wanted to point out that a performance model need not be a very sophisticated queuing based model. A software execution model which focuses on detailing the (common) paths / flows of execution through a system can be invaluable during the early stages in identifying potential bottlenecks before even load/stress testing is performed.

A software execution model can be as simple as a catalog of use cases with a corresponding set of mapped process & component level execution flows that include typical resource usage/consumption ranges. From this it should be possible to identify the number of possible roundtrips between client->server->data(grid|base|bus) as well as the number of component boundaries crossed. This information not only helps with performance engineering during development, it can also help with defining deployment topologies as well as with identifying possible failure points in a production application for operations to investigate when an alert or incident is reported.

regards,

William

Re: Testing isn't the only thing that's needed! by Kirk Pepperdine

Hi Edward,
Not all Java monitoring solutions are expensive. JXInsight is a far superior offering (IMHO) to most of the +100,000 USD Java performance monitoring & problem diagnostics solutions and yet it is very affordable.


Agreed, you don't need to spend 10s of thousands of $$$ on this. My performance tuning course is 100% open source. This isn't to say that products like JXInsight are not worth the $$$. Once you get going you may find the investment in a commercial tool to be worthwhile. More to the point, you can get quite a bit done with the OSS tools that are available.

Kirk

Re: Testing isn't the only thing that's needed! by Kirk Pepperdine

My performance tuning course is 100% open source.


Should say, based 100% on open source tools

- Kirk

Re: Testing isn't the only thing that's needed! by William Louth

Just to be clear, none of the open source tools above would actually help you derive a performance model like I described above. They may facilitate testing (generating a load) during a tuning activity but they are not a performance engineering solution. To understand the concepts of SPE one does not even need to use a tool, though it does help in setting the context for various activities.

SPE is much MORE than performance testing or tuning.

JXInsight is not just a performance tuning tool; it is a comprehensive runtime analysis solution that offers problem diagnostics via object & request runtime state imaging (JXInsight Diagnostics), distributed profiling & tracing (JXInsight Trace), remote system & component state inspection (JXInsight JVMInsight), extensible resource metering (JXInsight Probes), service management monitoring (JXInsight Metrics), and true database transaction analysis (JXInsight Transact - JDBInsight).

Performance is admittedly an important aspect of execution flow but it is not the only one. Understanding the state changes caused by flows especially exception generating flows is important for reliability and availability as well as system test verification (delta debugging).

regards,

William

Read the conclusion carefully!!!!! by Luiz Otavio Ribeiro

fabriciodiogenes@ig.com.br

Re: Maybe if you stop calling it performance testing.... by Alois Reitbauer

Hello,

I agree with your point regarding performance engineering. That is why validating your architecture continuously during development is so important. Will brought up good points like massive DB calls. There would be much more to add. This is definitely not premature optimization. It definitely makes no sense to use micro benchmarks to test for production performance. Every phase in the lifecycle has to test for different types of problems.

As many performance problems, and I think almost all scalability problems, are architectural problems, solving them always means changes to the architecture. Doing this late in the development process increases risk and cost. My experience is that a lot of performance problems (about 50 percent) can already be found during development and CI testing. This is also what our customers are telling us.

On the other side, I fully agree that automation is the key here. If I have to do everything manually, performance testing gets too expensive. That is why automation and integration into an existing toolchain are so important. Delivering reports that automatically validate architectural rules based on testing transactions helps people identify potential problems fast.

While implementing performance management (I call it that way as it has a wider scope than just testing) imposes some costs, it saves you a lot of money later on. If testing cycles and production troubleshooting can be minimized you get your money back fast and even save a lot.

I know a number of companies which are doing it already and they say they profit from continuously managing performance across the application lifecycle.

- Alois
