High Performance Immutable Arrays in .NET
In the newest drop of Immutable Collections for .NET we get ImmutableArray<T>, a faster alternative to ImmutableList<T> in read-only, indexed scenarios. ImmutableList<T> takes a balanced approach to its design. Because of its complex internal structure, adding a new item to it is only an O(log n) operation. Likewise, reading an item by its index can take O(log n) time.
ImmutableArray<T> isn’t complex at all. It is literally just a struct that wraps an array. According to ildasm, there aren’t any other fields. This means that reading from the immutable array occurs in O(1) time. Conversely, adding to the immutable array requires a full copy of the underlying array, making it an O(n) operation.
Immo Landwerth makes these recommendations:
Reasons to use immutable array:
- Updating the data is rare or the number of elements is quite small (<16)
- you need to be able to iterate over the data in performance critical sections
- you have many instances of immutable collections and you can’t afford keeping the data in trees
Reasons to stick with immutable list:
- Updating the data is common or the number of elements isn’t expected to be small
- Updating the collection is more performance critical than iterating the contents
Being a value type, ImmutableArray<T> can be created without being explicitly initialized. When that happens the struct detects the internal null pointer and acts as if it were an array of zero length.
Immutable collections are a work in progress and breaking changes occur from time to time. This time around we see the Create<T>(IEnumerable<T> items) function being renamed to “From”.
We discovered that the overload that takes the IEnumerable<T> can have surprising results. You’d think you can use the overload that takes IEnumerable<T> by creating collections from other collections:
Instead of creating an ImmutableList<string> you end up creating an ImmutableList<List<string>> because overload resolution prefers the params overload over an implicit conversion from List<string> to IEnumerable<string>.
For that reason we’ve decided to remove the ambiguity by renaming all factory methods that operate over IEnumerable<T> to From.
Originally IImmutableList<T> included a ValueComparer property and matching WithComparer methods. In order to keep ImmutableArray<T> a simple wrapper, it was necessary to drop those from the IImmutableList<T> interface.
The extension GetValueOrDefault used to accept an IDictionary<TKey, TValue> or an IReadOnlyDictionary<TKey, TValue>. This was causing compiler errors when the underlying class implemented both interfaces, so it was replaced with one that only accepted an IImmutableDictionary<TKey, TValue>.
TryGetValue was added to IImmutableSet<T>. This is needed when working with comparer such as StringComparer.OrdinalIgnoreCase and you want to know the actual value in the set instead of just whether or not an equivalent one is available.
Immutable collections are still in preview and thus are not licensed for production use. They are currently offered for.NET 4.5, Windows Store, Windows Phone 8, and Portable Class Libraries.
Performance Notes on Arrays
Jon Skeet recently did some performance testing with arrays and discovered some interesting results. Putting a wrapper around an array can actually make it faster to write to it. Over 100 million reads of a 100 item array, the string array took 40.865 seconds while the wrapped string array only took 29.338 seconds. Reading from the two was comparable, with the string array taking 12 seconds and a wrapped array 11.843 seconds.
The reason for this dates back to Java. In Java arrays are covariant, meaning that you can pass a String to any parameter or variable that expects an Object. The CLR, .NET’s runtime, was designed to support Java so it too has covariant arrays. And because of that, the CLR needs to perform a type check each time an array is written to.
Jon Skeet’s array wrapper is not like the one used in the ImmutableArray we mentioned above. Internally it uses a struct based wrapper around each item in the array. Since this is a struct that just contains a pointer it is no larger than a normal reference stored in the array. But the design of it allows to the CLR’s JIT compiler to omit the type checks.
Jon Skeet's array performance
Ronny Kohavi Dec 12, 2013