Function Memoization in C#

| by Jonathan Allen 140 Followers on Jan 29, 2007. Estimated reading time: 2 minutes |

Pure functions, those that always return the same value for a given input, have several advantages over other functions. One of these is that their result can be saved or "memoized" so they do not need to be recalculated. Wes Dyer of the C# compiler team demonstrates a generic way to do this with C# 3.0 and closures.

A closure is essentially a function pointer packaged with a pointer to some data. Syntactically, a closure is usually created as an anonymous inner function that captures one or more of the outer function's local variables. Because of this capturing, the local variable becomes a heap-allocated variable that can be accessed from both the outer function and the delegate pointing to the closure. This allows the local variable to persist long after the function that contains it is gone.

Closures make it easy to do some otherwise tedious tasks quite easily. In Wes Dyer's example, he shows us how to use closures to memoize the results of a recursive function.

Func<int, int> fib = null;fib = n => n > 1 ? fib(n - 1) + fib(n - 2) : n;fib = fib.Memoize();

The first line of this example creates the variable to hold the function pointer and indicates that it should have one parameter of type integer and return an integer.

The next line creates the function itself. It is important to realize in this context a function is not just static code, it is an object that can be referenced as well. The syntax "n =>" is used to name the parameter, which is then used inside the function.

While the syntax is fairly clean, using recursion to calculate Fibonacci numbers is incredibly inefficient. Calling fib(5) will result in 14 functions calls, including two calls to fib(3), three to fib(2), and five calls to fib(1).

If the fib(n) function could somehow remember that it was already called with a given value of n, it wouldn't have to recalculate it each time. Traditionally this involves creating new class to hold the results of each call, but Wes shows an easier way. Just wrap the function with a call to the Memoize extension method.

The details of the Memoize method are shown below.

public static Func<A, R> Memoize<A, R>(this Func<A, R> f){  var map = new Dictionary<A, R>();  return a =>    {      R value;      if (map.TryGetValue(a, out value))        return value;      value = f(a);      map.Add(a, value);      return value;    };}

The memoize method creates a variable called map, and then it wraps both the variable and the original function in a new function. This new function is what gets returned from the memoize method.

From that point on, every time fib is called this new function is run instead of the original. If the result cannot be found in the map variable, then the original is used.

But wait you say, isn't the original the one with such poor runtime performance. Well yes, but the original calls the fib function pointer, and that now points to the new version. Essentially the original code has been redefined in addition to being wrapped.

For more background on closures, you can read Steve Schafer's post in comp.lang.functional.

Style

Hello stranger!

You need to Register an InfoQ account or or login to post comments. But there's so much more behind being registered.

Get the most out of the InfoQ experience.

Tell us what you think

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Lost content

The sample code in your article has lost some of its content during formatting. You must have supplied template arguments inside angle brackets, and these have been stripped off. As it stands, without the template arguments showing, your article does not make sense.

Also, it's "memoize" (look it up) not "memorize."

Re: Lost content

Ah the joys of beta software and new terminology. Thanks for the correction on the spelling, and I?ll have our IT staff look into why the formatting keeps disappearing.

Huh?

I'm not quite sure what you meant by this:

Well yes, but the original calls the fib function pointer, and that now points to the new version. Essentially the original code has been redefined in addition to being wrapped.

Maybe I'm just having a lapse, but the way I read this is wrong. The original function is passed into the Memoize extension-method and captured with the lambda, which is also a closure. The "reference" is never replaced. It doesn't matter what assignments occur outside of the Memoize method. So the old "pointer" always references the original function. The deal is that the result of that long-running operation is cached, so if it's called again, it can just be retrieved from the hashtable.

Memoization is a really powerful technique, but it's not really broadly applicable since the result of each call is stored for the life of the application unless you cache in an external store you can flush. It makes for a good example, but a Fibonacci function would actually probably be a pretty bad usage since the cache could become very large pretty quickly. A good use case might be during processing monthly files of credit-card transactions. You could memoize date-parsing knowing that the process will only ever have up to 31 possible dates, saving a lot of work.

It's also worth noting that Memoization can be whipped up in c#2 pretty quickly too. It's only a couple sugary syntax snippets here that make it c#3 specific. Get rid of the lambda and use an anonymous-method, get rid of the extension-method and use a static class method and you're done.
Close

by

on

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

Allowed html: a,b,br,blockquote,i,li,pre,u,ul,p

Email me replies to any of my messages in this thread

3 Discuss

Login to InfoQ to interact with what matters most to you.