Improving .NET Performance by Reducing Memory Usage

A commonly misunderstood concept in .NET performance tuning is the importance of avoiding memory allocations. It is often assumed that because memory allocations are fast, they rarely, if ever, have an impact on performance.

To understand the cause of this misunderstanding, one must go back to the era of COM programming as seen in C++ and Visual Basic 4 through 6. With COM, memory was managed using a reference counting style garbage collector. Each time an object was assigned to a reference variable, a hidden counter was incremented. When the variable was reassigned or fell out of scope, the counter was decremented. And when the counter reached zero, the object was deleted, freeing the memory for use elsewhere.

This system of memory management is “deterministic”. By careful analysis, you can determine exactly when an object will be deleted. This in turn means that you can automatically free resources such as database connections. Contrast this with .NET, where you need a separate mechanism (i.e. IDisposable/using) to ensure the non-memory resources are released in a timely manner.

There are three major downsides to reference counting garbage collectors. The first is that they are susceptible to “circular references”. If two objects reference each other, even indirectly, then it is impossible for the reference count to drop to zero and a memory leak occurs. Code has to be carefully written to either avoid circular references or to provide a deconstruct method of some sort that breaks the loop when the objects are no longer needed.

The second major drawback occurs when working in a multi-threaded environment. In order to avoid race conditions, some sort of locking mechanism (e.g. Interlocked.Increment, spinlock, etc.) is needed to ensure that the reference counts remain accurate. These operations are surprisingly expensive.

Finally, free memory can become fragmented, with many small, unusable gaps between live objects. Allocating memory then often involves walking a linked chain of free locations, looking for a spot large enough for the desired object. (Memory fragmentation can also occur in .NET on the “Large Object Heap” or LOH.)

By contrast, allocating memory in a mark-and-sweep style garbage collector such as .NET or Java is a simple matter of a pointer increment. And assignments are no more costly than assigning an integer. It’s only when the GC actually runs that the real cost is paid, and often that’s mitigated by using a generational collector.

Back when .NET was new, many people complained that the non-deterministic performance of .NET’s garbage collector would harm performance and be difficult to reason about. Microsoft’s counter-argument at the time was that, for most use cases, the mark-and-sweep garbage collector would actually be faster despite the intermittent GC pauses.

Unfortunately, over time the message has become somewhat garbled. Even if we accept the theory that mark-and-sweep garbage collection is faster than reference counting, that doesn’t mean it’s necessarily fast in an absolute sense. Memory allocations, and the associated memory pressure, are often a cause of hard to detect performance problems.

Furthermore, the more memory being actively used, the less efficient your CPU caches are. While main RAM is so large that using disk-based virtual memory is almost unheard of for most use cases, the cache in the CPU is tiny by comparison. And the time it takes to populate the CPU cache from RAM can take dozens or even hundreds of CPU cycles.

In a recent article, Frans Bouma identified several techniques for optimizing memory usage. While he was specifically looking at improving ORM performance, the suggestions are useful in a variety of situations. Some of his suggestions include:

Avoid params arrays

The params keyword is convenient, but it is expensive compared to a normal function call because every call allocates an array to hold the arguments. APIs should provide non-params overloads for commonly used parameter counts.

An IEnumerable<T> or IList<T> overload should also be provided so that collections don’t need to be unnecessarily copied into an array before calling the function.
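
As a minimal sketch of this advice (the Calculator class and Sum methods here are hypothetical, not from the cited article), an API might pair fixed-arity overloads with a params fallback and an IEnumerable<T> overload:

```csharp
using System.Collections.Generic;

static class Calculator
{
    // Non-params overloads for common argument counts avoid the
    // hidden array allocation that a params call incurs.
    public static int Sum(int a, int b) => a + b;
    public static int Sum(int a, int b, int c) => a + b + c;

    // params fallback for arbitrary argument counts; each call
    // site allocates an int[] behind the scenes.
    public static int Sum(params int[] values)
    {
        int total = 0;
        foreach (var v in values) total += v;
        return total;
    }

    // IEnumerable<T> overload so existing collections can be
    // passed without first being copied into an array.
    public static int Sum(IEnumerable<int> values)
    {
        int total = 0;
        foreach (var v in values) total += v;
        return total;
    }
}
```

With these overloads in place, Calculator.Sum(1, 2) binds to the two-argument version and allocates nothing, while a List<int> can be passed directly without a ToArray() copy.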

Pre-size data-structures if you add data immediately afterwards

A List<T> or other collection class can be resized multiple times while being populated. Each resize operation allocates a new internal array that then needs to be filled from the previous one. You can often avoid this cost by providing the collection’s constructor with a capacity parameter.
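
A minimal sketch of the difference (the item count is arbitrary, chosen for illustration):

```csharp
using System.Collections.Generic;

// Without a capacity hint, List<T> starts small and doubles its
// internal array as it grows; populating 100,000 items triggers
// roughly fifteen reallocations, each copying the old array.
var resized = new List<int>();
for (int i = 0; i < 100_000; i++) resized.Add(i);

// Pre-sizing allocates the backing array once, up front.
var presized = new List<int>(capacity: 100_000);
for (int i = 0; i < 100_000; i++) presized.Add(i);
```

The same capacity parameter exists on Dictionary<TKey, TValue>, HashSet<T>, and StringBuilder, and the same reasoning applies.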

Initialize members lazily

If you know that a given object won’t actually be needed most of the time, then you should use lazy-initialization to avoid allocating it prematurely. Usually this is done manually, as the Lazy<T> class itself requires allocations.
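A common hand-rolled pattern looks like this (the Order class and its members are hypothetical, for illustration only):

```csharp
#nullable enable
using System.Collections.Generic;

class Order
{
    // The list is allocated only on first use, avoiding the cost
    // in the common case where no notes are ever added.
    private List<string>? _notes;

    public void AddNote(string note)
    {
        _notes ??= new List<string>();
        _notes.Add(note);
    }

    // Readers must tolerate the field still being null.
    public int NoteCount => _notes?.Count ?? 0;
}
```

Note that Lazy&lt;T&gt; would allocate a Lazy wrapper object plus a delegate for every Order instance, which is why the manual null-check approach is preferred when the goal is specifically to reduce allocations.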

Back in 2011, we reported on Microsoft’s efforts to reduce the size of Task by using similar techniques. They reported that they saw a 49 to 55% reduction in the time it takes to create a Task<Int32> and a 52% reduction in size.

Community comments

  • I miss the elephant in the room maybe

    by Ciprian Khlud,

    Java has a reason to still be used for high performance development: .NET's CodeGen is typically a bit lightweight, to put it nicely, and the GC is a bit slower.

    Removal of many allocations (via escape analysis), aggressive devirtualization (through Class Hierarchy Analysis), and Java's tiered compiler make tight-loop code run around 2x faster in my experience (and the TechEmpower benchmarks seem to give similar numbers at the higher end).

    I find that Microsoft seems to split into "high performance C++" and "adequate performance .NET", which makes me a bit unsure about seeing .NET as a high performance platform.

  • Re: I miss the elephant in the room maybe

    by Jonathan Allen,

    We used to say that careful use of structs could achieve the same as escape analysis. But let's be honest, almost nobody has the time for that. And even if we did, the libraries we're using usually don't have both a class and a struct version of each type.

    So yes, I agree that .NET could really use escape analysis.

    ***

    As for devirtualization, I see that as more of a nice-to-have. .NET APIs don't have nearly as many virtual functions as Java APIs. So as long as you don't go crazy with the abstract interfaces, that's usually not a problem.

  • re "An IEnumerable or IList overload"

    by Bill Woodruff,

    "An IEnumerable<T> or IList<T> overload should also be provided so that collections don’t need to be unnecessarily copied into an array before calling the function."

    I've read the Bouma blog entry cited, and I can't connect this statement with the blog, or with the rest of your article.

    I'd appreciate your clarifying this.

    thanks, Bill

    p.s. I consider your articles, and reportage, on C# and .NET to be exemplars of deep analysis and thoughtful exposition.

  • Re: re "An IEnumerable or IList overload"

    by Jonathan Allen,

    It's a different example, but the same concept. Would you rather call this?

    DoSomething(myList);

    Or this?

    DoSomething(myList.ToArray());

    If I only offer a DoSomething(params int[] values) version of the function, you have to convert everything into arrays, which is inconvenient and potentially expensive.

    ****

    Futures note: Microsoft was considering allowing the params keyword to work with things other than arrays. Unfortunately I don't know the status of that proposal.

  • Re: I miss the elephant in the room maybe

    by Ciprian Khlud,

    Not sure if it was clear, but I appreciate your answer Jonathan!

    This is a fair and honest view. I was following .NET's performance tracking issue, and it was eventually closed, which means that many (enough) of the issues were solved:

    github.com/dotnet/roslyn/issues/10378
