Andrew Arnott

News from my corner of the Visual Studio Project & Build team, programming tips, and solutions to common programming issues.

Immutable collections with mutable performance

In my last post, I detailed the differences among read/write, read only, frozen and immutable collection types.  I described how immutable collections come with a hit to the garbage collector due to the garbage they generate during mutations.  I have a very positive update on that topic.

My previous implementation for the immutable collections used a truly immutable binary tree as its backing data structure.  As a result, every individual mutation required a spine rewrite where (log n) nodes were recreated.  For a batch of m changes, this meant m log n objects were created and immediately discarded.  That’s a lot of garbage.  And all those memory allocations (and GC interruptions) hurt performance to the tune of about being twice as slow as mutable collections.

By changing the collections to use a freezable binary tree data structure, spine rewrites retain new nodes that are mutable until the (bulk) operation has completed.  As a result, only log n nodes are created instead of m log n.  And those new nodes aren’t garbage: they’re the important “new” part of the updated collection.  By eliminating all garbage from intermediate steps of mutations, I’ve measured a 91% drop in overall memory allocation (the remaining 9% is not garbage) and a 50% improvement in performance.  This brings performance and memory load of immutable collections on par with the mutable collections, while retaining all their immutable properties that make them so attractive.

Is there anything worse about the immutable collections then?  Yes, if you have rapid changes to the collections that you make individually (you don’t batch them with AddRange or some similar method) then you still may produce a lot of garbage assuming you release each version as you update it, whereas a mutable collection may have vacant memory slots it can fill when you add elements, immutable collections do not, and thus allocate memory for every element addition.  But as discussed in my previous post, this “slack space” is also a point against mutable collections as it wastes memory that isn’t reclaimable until to release the entire collection.

We expect (not promise) to ship these highly tuned immutable collections as a public CTP release soon.  I’m stoked.