Pragmatic Techniques for Maintaining a Legacy Application

A legacy application seldom looks like this: 

In this beautiful graph, layers and blocks are clearly separated and communicating through well-defined channels. Blocks can be moved, replaced, or added easily, supporting those important “ibilities”: extensibility, scalability, maintainability…

In real life, a legacy application probably looks more like this:

And if you are the unfortunate developer tasked with maintaining the legacy application, you can sometimes feel like a mouse in a maze: every time you turn, there are more unexpected nooks and corners, and even death traps.

I’ve been in charge of maintaining such a legacy application for over two years, and in this article I want to share my experience of how to maintain a big legacy application pragmatically.

I stress the word “pragmatic”: a legacy application can carry so much technical debt that paying all of it off is neither feasible nor economical, so one has to be strategic in choosing the right battles.

Spying on enemies

A brief description of the massive legacy application that I am maintaining:

The big house is JBoss 4.0. For some reason, this JBoss 4.0 installation was tweaked in mysterious ways that make it hard to upgrade, so the house was not built very strongly to begin with (hence the rain). The two small houses represent two web applications (two WARs) deployed onto JBoss. One house is much smaller and serves a smaller purpose. Every single page has to be served by both houses. There are three connection pools, two transaction mechanisms, one home-made cache, one home-made cluster mechanism, one home-made RMI mechanism, and so on.

The first step for maintaining a legacy application is to understand it. It is impractical for us to understand every detail of the application, but we need to understand the big picture:

  1. Why are there two WARs for serving one single page? How do the two WARs interact with each other? What is the overhead? How can we merge them?
  2. How are transactions handled? Is there a risk for transaction corruption with the interaction between two WARs, three connection pools and two transaction mechanisms?
  3. How do we know if the performance bottleneck is in the database or in the code? If it is in the code, how can we drill down?

Analyzing code statically is either inadequate or inaccurate. We developed several tools to spy on the application at runtime to answer these questions. We took care to implement these tools as add-ons: they are not entangled with the application code, so they are not extra code that we have to maintain. For more technical details of these tools, check out my blog.


SQLTracer can be turned on for any individual web page. It tracks which SQL queries are issued, how many times each one is issued, and the elapsed time. Just as importantly, it generates tkprof reports for these queries for performance troubleshooting.

Technical details

SQLTracer spies on JDBC operations using the following logic. When a page request arrives, a javax.servlet.Filter checks whether tracing is turned on for that page and stores the page information in a ThreadLocal. When connections are fetched from a connection pool, their operations are intercepted by AspectJ advice, which tracks the SQL statements, the number of invocations and the elapsed time. The connections are marked with the same identifier using Oracle dbms_session.set_identifier(), and their actions are traced using dbms_monitor.client_id_trace_enable(). Because the trace is keyed on the client identifier rather than on a single session, this technique can follow one page request across different connections.
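The ThreadLocal bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not SQLTracer's actual code: the class SqlTraceContext and its methods begin, record and end are hypothetical names, and the servlet filter and AspectJ advice are only indicated in comments so that the sketch runs with the JDK alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Per-request SQL statistics kept in a ThreadLocal (hypothetical names throughout). */
final class SqlTraceContext {

    /** Invocation count and total elapsed time for one SQL statement. */
    static final class Stat {
        int count;
        long totalNanos;
    }

    private static final ThreadLocal<SqlTraceContext> CURRENT = new ThreadLocal<>();

    final String page;
    final Map<String, Stat> stats = new LinkedHashMap<>();

    private SqlTraceContext(String page) {
        this.page = page;
    }

    /** Called where the servlet filter would decide that tracing is on for this page. */
    static void begin(String page) {
        CURRENT.set(new SqlTraceContext(page));
    }

    /** Called where the JDBC advice would run; does nothing when tracing is off. */
    static void record(String sql, long elapsedNanos) {
        SqlTraceContext ctx = CURRENT.get();
        if (ctx == null) {
            return;
        }
        Stat stat = ctx.stats.computeIfAbsent(sql, key -> new Stat());
        stat.count++;
        stat.totalNanos += elapsedNanos;
    }

    /** Returns the collected statistics and clears the thread, as a filter would in a finally block. */
    static SqlTraceContext end() {
        SqlTraceContext ctx = CURRENT.get();
        CURRENT.remove();
        return ctx;
    }
}
```

In this sketch a filter would call SqlTraceContext.begin(...) before passing the request down the chain and SqlTraceContext.end() afterwards, while the advice around each JDBC call would call record(...) with the statement text and the measured time. Clearing the ThreadLocal in end() matters on an application server, because worker threads are reused across requests.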


PerfSpy is a runtime logging and performance monitoring tool. We use it to spy on individual pages: it logs method invocations, elapsed time, method parameters and return values. In short, it does the work of step-debugging and stores everything for later inspection, without you actually having to step through the code in an IDE. It analyzes the code at runtime and identifies performance bottlenecks. It has a UI application that shows the method invocations as a tree and provides ways to manipulate the tree: hide nodes, search nodes, mark nodes and so on. Method parameters and return values are also presented as trees. Here is a screenshot:

With these tools, we are able to answer the big questions:

1) How do the two WARs work together? We can conclude that there is no need for two WARs; it complicates deployment and adds memory and performance overhead, and it is quite easy to merge the two WARs together.

2) How do the three connection pools and two transaction mechanisms work together? We originally believed that when some code initiated a transaction, it would pass the transaction downward, and hence all operations would be done in a single transaction. However, we discovered that in some scenarios transactions are not passed down and operations are not done in a single transaction, which explains many of the data corruption issues we’ve seen in users’ instances. PerfSpy is able to provide this insight because it logs detailed information about method parameters and return values, including their system hash codes. So in the method invocation tree we can see when and where a new connection, a new Hibernate session, or a new transaction is started.

3) What is the performance bottleneck for a certain web page? SQLTracer shows performance issues on the database side, while PerfSpy shows performance issues on the code side by revealing duplicated invocations and elapsed time.
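To illustrate the second answer, here is a minimal sketch of how logging an identity hash code next to each call can reveal whether two steps share the same connection or transaction object. It assumes that the “system hash codes” mentioned above correspond to System.identityHashCode; the class IdentityLog is a hypothetical name and not part of PerfSpy.

```java
import java.util.ArrayList;
import java.util.List;

/** Logs the identity of the object each step works with (hypothetical names). */
final class IdentityLog {
    private final List<String> lines = new ArrayList<>();

    /** Records which resource instance a method was handed. */
    void log(String method, Object resource) {
        // identityHashCode ignores equals/hashCode overrides. The same object always
        // yields the same value; distinct objects usually differ, but collisions are possible.
        lines.add(method + " -> " + resource.getClass().getSimpleName()
                + "@" + Integer.toHexString(System.identityHashCode(resource)));
    }

    List<String> lines() {
        return lines;
    }
}
```

If two lines in such a log end with different suffixes, the two methods definitely received different objects, which is the kind of evidence that shows a transaction was not passed down. An identical suffix strongly suggests, but does not strictly prove, that the same object was reused.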

Technical details

PerfSpy uses AspectJ to do runtime monitoring. PerfSpy has an abstract aspect that does logging, monitoring and tracking. Once this aspect kicks into action, it consults a configuration file to decide how much information it should gather and what code analysis it should perform. To use PerfSpy you write an aspect that extends from the abstract PerfSpy aspect, in which you specify which code flow and what methods in the code flow you’d like to capture.

Typically an application uses some framework to serve generic purposes, and the application extends that framework to perform specific actions. For example, Struts 1.2 has org.apache.struts.action.Action, which can be extended to serve web page actions. PerfSpy is designed to spy on such frameworks. For example, to use PerfSpy to spy on Struts actions, one can write an aspect which extends from the PerfSpy aspect:

import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;

@Aspect
public class PerfSpyStrutsAspect extends AbstractPerfSpyAspect {
        // specifies which code flow to spy on
        @Pointcut("cflow(execution(* org.apache.struts.action.Action.execute(..)))")
        public void cflowOps() {
        }

        // specifies which methods in the code flow to spy on
        @Pointcut("execution(* com.myCompony.myPk1.myPk2.*(..))")
        public void withinCflowOps() {
        }
}

The configuration file can specify which concrete Struts action class to spy on. In this way, one can turn spying on or off for individual pages.
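The method invocation tree that PerfSpy presents can be pictured with a small recorder like the one below. This is only a sketch under stated assumptions: CallTreeRecorder, enter and exit are hypothetical names, and the explicit enter/exit calls stand in for the points where the aspect's before and after advice would run.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Records nested method invocations as a tree (hypothetical names). */
final class CallTreeRecorder {

    static final class Node {
        final String method;
        final List<Node> children = new ArrayList<>();
        long elapsedNanos;
        private long startNanos;

        Node(String method) {
            this.method = method;
        }
    }

    private final Node root = new Node("<root>");
    private final Deque<Node> stack = new ArrayDeque<>();

    CallTreeRecorder() {
        stack.push(root);
    }

    /** Where "before" advice would run: open a child node under the current call. */
    void enter(String method) {
        Node node = new Node(method);
        node.startNanos = System.nanoTime();
        stack.peek().children.add(node);
        stack.push(node);
    }

    /** Where "after" advice would run: close the current call and store its duration. */
    void exit() {
        if (stack.size() == 1) {
            throw new IllegalStateException("exit() without matching enter()");
        }
        Node node = stack.pop();
        node.elapsedNanos = System.nanoTime() - node.startNanos;
    }

    Node root() {
        return root;
    }

    /** Renders method names only, indented by call depth. */
    static String render(Node node, int depth) {
        StringBuilder out = new StringBuilder();
        for (Node child : node.children) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(child.method).append('\n');
            out.append(render(child, depth + 1));
        }
        return out.toString();
    }
}
```

Duplicated invocations show up in such a tree as sibling nodes with the same method name, and each node's elapsedNanos includes the time spent in its children, so a slow subtree can be followed down from the page action to the method responsible.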


BLSpy stands for business logic spy. Often our users (and sometimes even we ourselves) become confused about the numbers produced by our legacy application and would like to know how the application calculates them. Users can turn on spying for individual types of business numbers.

With this tool, users are now able to resolve such confusion without contacting us. We have even detected some calculation flaws ourselves with this tool.

Technical details

BLSpy is developed on top of PerfSpy. PerfSpy allows us to capture the calculation code flow, but we want to present the calculation to users in business terms. For example, we would like to convey messages such as “to calculate the human resource costs for this week for department A, we fetched the time records of the human resources from department A, and found out Jason worked on database maintenance for 20 hours, and his hourly salary is $30...”. To do so, we annotate the methods involved in a calculation flow with their business meaning. At runtime we extract these meanings, combine them with the method invocations captured by PerfSpy, and present meaningful information to users.
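One way to picture the annotation approach is the sketch below. It is an illustration only: the annotation BusinessMeaning, the classes CostCalculator and BusinessExplainer, and the message template are all hypothetical, and the captured invocation is passed in by hand rather than coming from PerfSpy.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

/** Hypothetical annotation carrying the business meaning of a method. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface BusinessMeaning {
    String value();
}

/** Hypothetical calculation class standing in for legacy business code. */
final class CostCalculator {
    @BusinessMeaning("cost of %s hours at an hourly rate of %s")
    static double laborCost(double hours, double hourlyRate) {
        return hours * hourlyRate;
    }
}

final class BusinessExplainer {

    /** Combines a captured invocation with the annotated meaning of the method. */
    static String explain(Method method, Object[] args, Object result) {
        BusinessMeaning meaning = method.getAnnotation(BusinessMeaning.class);
        if (meaning == null) {
            return method.getName() + " (no business meaning annotated)";
        }
        return String.format(meaning.value(), args) + " = " + result;
    }

    /** Explains one call, using the 20 hours and $30 rate from the example in the text. */
    static String demo() {
        try {
            Method method = CostCalculator.class
                    .getDeclaredMethod("laborCost", double.class, double.class);
            double result = CostCalculator.laborCost(20, 30);
            return explain(method, new Object[] {20.0, 30.0}, result);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The annotation needs RetentionPolicy.RUNTIME, otherwise reflection cannot see it. Keeping the wording in an annotation next to the method means the explanation lives with the calculation it describes, while the calculation code itself stays unchanged.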

Picking battles

Legacy applications can be like a minefield. Wherever you care to dig, you might find a bigger problem than you expected. It is not feasible or economical to plug every hole. Management usually likes to keep shrinking the task force that maintains the legacy application, so every investment has to be calculated.

There is a natural metric that measures defects per functional area. The application is used by many users, and when they encounter issues they log tickets with us. In each ticket they select the functional area where the issue occurs. This metric is inadequate, however, because:

  1. This metric doesn’t provide a broad view of architecture-level flaws. Functional issues are easy to categorize. Non-functional issues such as performance, scalability, and stability problems may manifest themselves in a functional area, but their root cause may cross functional area boundaries. For example, slowness in creating an order in the order module and slowness in raising a complaint in the complaint module might be caused by the same underlying queuing mechanism.
  2. It creates an incentive to improve the numbers using “quick and dirty” tactics. For example, our transaction mechanisms result in a lot of dirty data issues. The usual “quick and dirty” tactic would be to delete the dirty data directly from the database or to work around the dirty data in code, which improves the metric quickly but doesn’t improve the system’s health at all.

We decided to create a metric on top of this natural metric, which we call the “high impact” metric. This metric essentially highlights the most impactful problems plaguing the legacy application and asks for upper management’s support for improving the legacy application’s architecture. We define “performance”, “stability”, “dirty data” and some other non-functional problems as “high impact”, because they either take a longer time to solve or often incur user escalation. We categorize every user reported problem into these categories. Below is a simplified illustration of this metric:

In this diagram, functional problems occur most often, but they are easier to solve and users don’t often escalate them. Stability problems, on the contrary, occur rarely but are often escalated and are hard to solve. Users tend to be more provoked by stability issues because they happen randomly, and even tolerant users might not know when and how to prepare for random events. Dirty data is one type of stability issue; we separate it out because dirty data issues used to happen a lot.

User wrath (escalation) is a good way to draw upper management’s attention to technical debt and to convince them to invest in improving the architecture of the legacy application.
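As a rough sketch of how such a metric might be computed from ticket data, the code below groups tickets by category and derives the ticket count, the escalation rate and the average time to resolve. The class HighImpactMetric, its fields and the choice of these three figures are assumptions made for illustration; the article does not specify how the metric is calculated.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Summarizes user tickets per category (hypothetical names and fields). */
final class HighImpactMetric {

    /** One user-reported ticket. */
    static final class Ticket {
        final String category;
        final boolean escalated;
        final int daysToResolve;

        Ticket(String category, boolean escalated, int daysToResolve) {
            this.category = category;
            this.escalated = escalated;
            this.daysToResolve = daysToResolve;
        }
    }

    /** Aggregated figures for one category. */
    static final class Summary {
        int count;
        int escalated;
        int totalDays;

        double escalationRate() {
            return count == 0 ? 0.0 : (double) escalated / count;
        }

        double averageDays() {
            return count == 0 ? 0.0 : (double) totalDays / count;
        }
    }

    /** Groups tickets by category, preserving first-seen order. */
    static Map<String, Summary> summarize(List<Ticket> tickets) {
        Map<String, Summary> byCategory = new LinkedHashMap<>();
        for (Ticket ticket : tickets) {
            Summary summary = byCategory.computeIfAbsent(ticket.category, key -> new Summary());
            summary.count++;
            if (ticket.escalated) {
                summary.escalated++;
            }
            summary.totalDays += ticket.daysToResolve;
        }
        return byCategory;
    }
}
```

With figures like these, a category with few tickets but a high escalation rate and a long average resolution time stands out, which is the pattern described for stability problems.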

Fighting battles

We know what we want to tackle from the “high impact” metric, and we know how to dig deeper into the dark secrets of the legacy application with the various spying tools. We are ready to engage in the fight.

Refactoring legacy applications can be scary, as legacy applications tend to have few unit tests. Our approach is described in this article.


Maintaining legacy applications is an ongoing battle. Using the “high impact” metric, we choose the next big battle to fight. Using the various spying tools, we can discover the ins and outs of the enemy. And using the testing framework that we are building and enriching along the way, we are able to conquer the enemy (refactor old code).

About the Author

Chen Ping lives in Shanghai, China and graduated with a Masters Degree in Computer Science in 2005. Since then she has worked for Lucent and Morgan Stanley. Currently she is working for HP as a Development manager. Outside of work, she likes to study Chinese medicine.
