
To Execution Profile or to Memory Profile? That is the question.

Posted by Kirk Pepperdine on Dec 04, 2013 |

I recently had a group of developers from my performance workshop troubleshooting a problem-riddled application. After dispensing with a couple of easy wins, the group was faced with a CPU that was running very hot. The group reacted exactly as I see most teams do when faced with a hot CPU: they fired up an execution profiler, hoping it would help them sort things out. In this particular case, the problem was related to how the application was burning through memory. While an execution profiler can find these problems, a memory profiler paints a much clearer picture. My group had somehow missed a key metric that was telling them they should have been using a memory profiler. Let's run through a similar exercise here so that we can see when and why it is better to use a memory profiler.

Profilers work by sampling the top of the stack, by instrumenting the code with probes, or by a combination of both. These techniques are very good at finding computations that happen frequently or take a long time. As my group experienced, the information gathered by execution profilers often correlates well with the source of the memory inefficiency; however, it points to an execution problem, which can sometimes be confusing.

The code in Listing 1 defines the method findCustomer(String,String). The problem here isn't so much the API itself as how the String parameters are treated by the method. The code concatenates the two strings to form a key that is used to look up the data in a map. This misuse of strings is a code smell in that it indicates a missing abstraction. As we will see, that missing abstraction is not only at the root of the performance problem; adding it also improves the readability of the code. In this case the missing abstraction is a CompositeKey<String,String>, a class that wraps the two strings and implements both the equals(Object) and hashCode() methods.

public class CustomerList {

  private final Map<String, Customer> customers = new ConcurrentHashMap<>();

  public Customer addCustomer(String firstName, String lastName) {
    Customer person = new Customer(firstName, lastName);
    customers.put(firstName + lastName, person);
    return person;
  }

  public Customer findCustomer(String firstName, String lastName) {
    return customers.get(firstName + lastName);
  }
}

Listing 1. Source for CustomerList

Another downside to the style of API used in this example is that it limits scalability because of the amount of data the CPU is required to write to memory. In addition to the extra work to create the data, the volume of data being written to memory by the CPU creates a back pressure that forces the CPU to slow down. Though this benchmark is artificial in how it presents the problem, the problem itself is not so uncommon in applications using the popular logging frameworks. That said, don't be fooled into thinking only String concatenation can be at fault. Memory pressure can be created by any application that is churning through memory, regardless of the underlying data structure.
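You can observe this churn directly by asking the JVM how many bytes a thread has allocated. The following is my own sketch, not part of the article's benchmark; it assumes a HotSpot-based JVM, where the platform ThreadMXBean also implements com.sun.management.ThreadMXBean:

```java
import java.lang.management.ManagementFactory;

// Sketch: observe the bytes allocated by a loop of String concatenations.
// Assumes HotSpot: the platform ThreadMXBean also implements
// com.sun.management.ThreadMXBean, which exposes per-thread allocation counts.
public class ConcatChurn {
    public static void main(String[] args) {
        com.sun.management.ThreadMXBean mx =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long tid = Thread.currentThread().getId();
        long before = mx.getThreadAllocatedBytes(tid);

        String key = "";
        for (int i = 0; i < 100_000; i++) {
            key = "John" + ("Smith" + i); // a fresh StringBuilder, char[] and String each pass
        }

        long allocated = mx.getThreadAllocatedBytes(tid) - before;
        // keep 'key' live so the loop cannot be optimized away entirely
        System.out.println("allocated " + (allocated >> 20) + " MB (key length " + key.length() + ")");
    }
}
```

Even this tiny loop allocates tens of megabytes; multiply by the call rate of a busy lookup method and the back pressure described above follows.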

The easiest way to determine whether our application is burning through memory is to examine the garbage collection logs. Garbage collection logs report heap occupancy before and after each collection. Subtracting the occupancy after the previous collection from the occupancy before the current collection yields the amount of memory allocated between the two collections. If we do this for many records we get a pretty clear picture of the application's memory needs. Moreover, collecting the needed GC log is cheap and, with the exception of a couple of edge cases, has no impact on the performance of your application. I used the flags -Xloggc:gc.log and -XX:+PrintGCDetails to create a GC log with a sufficient level of detail, and then loaded the GC log file into Censum, jClarity's GC log analysis tool.
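The arithmetic described above can be sketched in a few lines. The records below are hypothetical numbers standing in for parsed gc.log entries, not output from the actual benchmark:

```java
// Sketch: deriving allocation rates from GC log records.
// Each record is {timestamp (ms), heap occupancy before GC (KB), occupancy after GC (KB)};
// the values are made up for illustration, not taken from a real gc.log.
public class AllocationRate {
    public static void main(String[] args) {
        long[][] gcRecords = {
            {1000, 524288, 65536},
            {1400, 524288, 65536},
            {1800, 524288, 65536},
        };
        for (int i = 1; i < gcRecords.length; i++) {
            // allocated = occupancy before this GC - occupancy after the previous GC
            long allocatedKb = gcRecords[i][1] - gcRecords[i - 1][2];
            double seconds = (gcRecords[i][0] - gcRecords[i - 1][0]) / 1000.0;
            double mbPerSec = allocatedKb / 1024.0 / seconds;
            System.out.printf("interval %d: %.0f MB/s%n", i, mbPerSec);
        }
    }
}
```

Averaging this figure over many intervals, as Censum does, gives the allocation-rate picture discussed below.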


Table 1. Garbage Collection Activity Summary

Censum provides a whole host of statistics (see Table 1), of which we're interested in the "Collection Type Breakdown" (at the bottom). The "% Paused" column (the sixth column in Table 1) tells us that the total time paused for GC was 0.86%. In general we'd like GC pause time to be less than 5%, which it is, so this number suggests that the collectors are able to reclaim memory without too much effort. Keep in mind, however, that when it comes to performance a single measure rarely tells the whole story. In this case we also need to see the allocation rates, and Chart 1 shows just that.


Chart 1. Allocation Rates

If we look at the chart we can see that the allocation rate initially starts out at about 2.75 gigabytes per second. The laptop I used to run this benchmark can, under ideal conditions, sustain an allocation rate of about 4 gigabytes per second, so 2.75 gigabytes per second represents a significant portion of the total memory bandwidth. In fact, the machine is not able to sustain that rate, as evidenced by the drop in allocation rates over time. While your production servers may have a larger capacity to consume memory, in my experience any machine trying to maintain object-creation rates greater than 500 megabytes per second will spend a significant amount of time allocating memory and will have a very limited ability to scale. Since memory efficiency is the overriding bottleneck in our application, the biggest wins will come from making it more memory efficient.

Execution Profiling

It should go without saying that if we're looking to improve memory efficiency we should be using a memory profiler. However, when faced with a hot CPU, our group decided that they should use execution profiling, so let's start with that and see where it leads. I used the NetBeans profiler, running in VisualVM in its default configuration, to produce the profile in Chart 2.

Chart 2. Execution Profile

Looking at the chart we can see that, outside of the Worker.run() method, most of the time is spent in CustomerList.findCustomer(String,String). If the source code were a bit more complex, you could imagine it being difficult to understand why this code is a problem or what to do to improve performance. Let's contrast this view with the one presented by memory profiling.

Memory Profiling

Ideally I would like my memory profiler to show me how much memory is being consumed and how many objects are being created. I would also like to know the causal execution paths; that is, the paths through the source code that are responsible for churning through memory. I can get these statistics using the NetBeans profiler, once again running in VisualVM, but I will need to configure the profiler to collect allocation stack traces. This configuration can be seen in Figure 1.


Figure 1. Configuring NetBeans memory profiler

Note that the profiler will not record every allocation, only every 10th. Sampling in this manner should produce the same result as capturing every allocation, but with much less overhead. The resulting profile is shown in Chart 3.


Chart 3. Memory Profile

The chart identifies char[] as the most popular object. With this information in hand, the next step is to take a snapshot and then look at the allocation stack traces for char[]. The snapshot can be seen in Chart 4.


Chart 4. char[] allocation stack traces

The chart shows three major sources of char[] creation, one of which is opened up so that you can see the details. In all three cases the root can be traced back to the firstName + lastName operation. At this point the group came up with numerous alternatives; however, none of the proposed solutions were as efficient as the code produced by the compiler. It was clear that to make the application run faster we were going to have to eliminate the concatenation. The solution that eventually worked was to introduce a Pair class that took the first and last name as arguments. We called this class CompositeKey, as it supplied the missing abstraction. The improved code can be seen in Listing 2.

public class CustomerList {

  private final Map<CompositeKey, Customer> customers = new ConcurrentHashMap<>();

  public Customer addCustomer(String firstName, String lastName) {
    Customer person = new Customer(firstName, lastName);
    customers.put(new CompositeKey(firstName, lastName), person);
    return person;
  }

  public Customer findCustomer(String firstName, String lastName) {
    return customers.get(new CompositeKey(firstName, lastName));
  }
}

Listing 2. Improved Implementation using CompositeKey abstraction

CompositeKey implements both hashCode() and equals(), eliminating the need to concatenate the strings. While the first benchmark completed in ~63 seconds, the improved version ran in ~21 seconds, a 3x improvement. The garbage collector ran only four times, making it impossible to get an accurate picture of allocation rates, but what can be said is that the application allocated just under 3 GB of data in aggregate, as opposed to the more than 141 GB consumed by the first implementation.
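The article doesn't show the source of CompositeKey itself. A minimal version, with field names and implementation details that are my own sketch rather than the original code, might look like:

```java
import java.util.Objects;

// Minimal sketch of the CompositeKey abstraction described above.
// The essential point is that equals() and hashCode() compare the two
// strings directly, so no concatenated key String is ever allocated.
public final class CompositeKey {
    private final String firstName;
    private final String lastName;

    public CompositeKey(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CompositeKey)) return false;
        CompositeKey that = (CompositeKey) other;
        return firstName.equals(that.firstName) && lastName.equals(that.lastName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(firstName, lastName);
    }
}
```

Because equal keys produce equal hash codes, two CompositeKey instances built from the same names find each other in the ConcurrentHashMap exactly as the concatenated strings did, but without the per-lookup char[] churn.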

Two ways to fill a water tower

A colleague of mine once said that you can fill a water tower one teaspoon at a time. This example proves that you certainly can. However, it's not the only way to fill the tower; you could also open a large hose and fill it very quickly. In those cases it's unlikely that an execution profiler would pick up on the problem, but the garbage collector will see the allocation and the recovery, and a memory profiler will certainly see the allocation in sheer byte count. In one application where these large allocations were predominant, the development team had exhausted the vast majority of the gains they were going to get from an execution profiler, yet they still needed to squeeze more out of the app. At that point we turned on the memory profiler, and it exposed one allocation hotspot after another; with that information we were able to extract a number of significant performance gains. What that team learned is that memory profiling was not only giving them the right view, it was giving them the only view into the problem. This is not to say that execution profiling isn't productive; it is to say that sometimes it can't tell you where your application is spending all of its time, and in those cases getting a different perspective on the problem can make all the difference in the world.

About the Author

Kirk Pepperdine has worked in high performance and distributed computing for nearly 20 years. Since 1998 Kirk has worked on all aspects of performance and tuning in each phase of a project life cycle. In 2005 he helped author the foremost Java performance tuning workshop, which has been presented to hundreds of developers worldwide. Author, speaker, and consultant, Kirk was recognized in 2006 as a Java Champion for his contributions to the Java community. He was the first non-Sun employee to present a technical lab at JavaOne, an achievement that opened up the opportunity for others in the industry to do so. He was named a JavaOne Rockstar in 2011 and 2012 for his talks on garbage collection. You can reach him by email at kirk@kodewerk.com or on Twitter at @kcpeppe.

 


allocation is still execution by William Louth

Kirk makes a living looking at GC logs, writing tools that look at GC logs, and giving training to help people look at log files, so it is not surprising that he wants people to focus on memory profiling and discount execution profiling. But that seems to miss the point that allocation is still execution, a point I'd expect most who routinely code and deploy into production to recognize immediately.

The problem, I suspect, is that Kirk is still caught in the 90's profiler scene, thinking that execution measurement is only ever (wall) clock time. Modern performance measurement solutions such as JXInsight/OpenCore (www.jinspired.com/products/opencore) or Satoris (www.jinspired.org/satoris) support multiple measures (meters) at the entry and exit points of such execution constructs. You can have clock.time, gc.time, cpu.time, waiting.time, blocking.time, ... yes, you can sometimes have your cake and eat it.

Anyway, the code listed above looks like it was written by a complete amateur... it has more problems than just performance. Even those not skilled in writing high performance code would not need much in the IQ department to see and fix this code straight off, without ever starting up a tool.

Smart execution profilers help you find the best starting point in the execution flow to fix a problem which is not always the same place where the hotspot is.

www.jinspired.com/site/case-study-scala-compile...

For more information on the art in performance measurement and the multiple realities offered:

www.jinspired.com/site/reality-reactivity-relev...

Code executes. That execution has costs. That cost can be direct or indirect, constant or variable.

Re: allocation is still execution by Kirk Pepperdine

Yes William, did you happen to read this sentence: "This is not to say that execution profiling isn't productive." And I think the execution profile included in the article clearly shows that memory allocation *is* execution... well, most of the time. That said, execution profilers may not be the best tooling to provide clues as to why code may be running hot, and memory profilers are better at exposing the root cause of memory problems. I don't feel the need to counter the inaccuracies in your other claims, as it's obvious that you didn't carefully read the article.

Finally, how hypocritical it is for you to (incompletely) point out how I make a living and then go on to flog your products.

Re: allocation is still execution by William Louth

I don't think so. I showed your viewpoint is very much biased (and not just by experience). I've spent years designing and engineering products that are execution profilers that also measure (meter) time, allocation, and GC time. There's no underlying difference between the two from the definition of memory profiler presented here. I see memory profilers as heap dump analysis tools... period. I've never had to use a separate memory profiler (allocation tracking) in all my years, though I have viewed heap dumps, sent from customers, in tools such as Eclipse MAT.

From a performance-overhead point of view the difference can indeed be significant. Memory allocation profiling (and its even more expensive cousin, allocation tracking) creates so much overhead that in a multithreaded environment you are looking at a completely different system. For many developers this means resorting to crude forms of collection and analysis, such as GC logs, which are far removed from the underlying causes... but being productive is not always a primary concern for some.

If you ever bothered to read an article in some depth you would realize that the best an intelligent execution profiling solution can do is bring the developer to the appropriate point in the code from which to begin the change (something already understood from the analysis presented). In the general case of a database transaction this would be further up in the call stack than the driver or ORM library. A good profiler eliminates non-hotspots as it measures, reasons, and learns in the process. Eventually all that is left is a few key call sites that "root" calls and "cause" consumption. I wrote an article on this recently.

www.jinspired.com/site/instrumenting-every-line...

I don't know what inaccuracies you allude to, except the ones you made in the article itself.

And finally Satoris at its heart is an open source initiative to standardize on a common instrumentation API across real and simulated runtimes in whatever language.

www.jinspired.org/satoris/open-api

Personal attacks in comments will be deleted by Floyd Marinescu

William,

As Editorial steward of InfoQ, I am speaking on behalf of our editorial team to warn that any further comments from you that have any subtle or obvious personal attacks in them will be deleted.

InfoQ values technical debate and respects the opinions of all; our mission is to facilitate the spread of knowledge and innovation in software development, and the fact that this is a completely practitioner-driven site in all respects is what makes this possible. An environment where personal attacks are given in response to someone offering their expertise will make it hard for us to encourage more community participation, and will not be tolerated.

thank you,

InfoQ editorial team

allocation is still execution by Ian Ringrose

I have often wished that I could get an execution profiler to add a fake time to a method based on how much memory it allocates.

I expect that the example would run OK on .NET, as its GC is very fast at coping with short-lived objects, and these short-lived objects tend to never leave the CPU cache. However, I have seen lots of CPU issues in .NET that did come down to memory problems.

Re: allocation is still execution by Kirk Pepperdine

An interesting thought. What I've found is that it's easier to find a problem if you have the right perspective. I've not tried to say that execution profilers won't find the problem; it's more that a memory profiler offers a better perspective for these sorts of problems.

Re: allocation is still execution by William Louth

A meter in OpenCore, as well as in Satoris, can be mapped to any thread-specific counter, including some built-in counters within the JVM that track object allocation, though this does not come without some cost once enabled. We previously supported this in our own custom native agent, but recently offered support for a counter made available in the com.sun.* package.

www.jinspired.com/site/jxinsight-opencore-6-4-1...

Google engineers claimed at the recent JVM Language Summit to use this feature with little overhead, but from our benchmarking it can easily drop the throughput of a Java server, such as Apache Cassandra, by 10x. I'm going to assume their code never allocates Java objects ;-).
