Caching Implies Policy

A colleague of mine once said those words in a meeting and they really struck a chord with me. I think there's a lot of meat in those three words.

We often reach for caches to improve performance. However, it is vitally important to make a deliberate, thoughtful, and justifiable-on-the-numbers choice about policy. It is often the case with framework components that you find yourself too “low” in the architecture stack to understand the usage patterns, or to be seeing uniform ones at all. As a result, you cannot make excellent choices about caching policy, which ultimately dooms your cache to mediocrity. Under those circumstances it's almost always a bad idea to do implicit caching.
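One way to keep policy out of low-level components is to let the caller supply the cache, and therefore the policy. This is a minimal sketch, using Java for illustration; the `Resolver` class and its names are hypothetical, not from any real framework.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: the low-level component does no caching of its
// own. Callers who actually know the usage pattern decide whether and
// how to cache by wrapping the lookup function.
final class Resolver {
    private final Function<String, String> lookup;

    Resolver(Function<String, String> lookup) {
        this.lookup = lookup;
    }

    // The component just resolves; no implicit caching policy baked in.
    String resolve(String key) {
        return lookup.apply(key);
    }

    // A caller-supplied cache (any Map) carries the policy: bounded,
    // unbounded, expiring -- the framework doesn't have to choose.
    static Function<String, String> memoized(Function<String, String> f,
                                             Map<String, String> cache) {
        return key -> cache.computeIfAbsent(key, f);
    }
}
```

The point of the design is that the policy decision moves up the stack to the code that has the context to make it well.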

In the managed world, caching has three hidden costs:

Cached Object Age

This one is very hard to avoid. In order for your cache to be useful it is highly likely that you are saving some object or objects under a cache key. Rather than allowing those objects to die when whatever code is using them no longer needs them, the cache keeps them around at least a bit longer in case those very same objects are used again. The danger of doing this is that by letting the objects age you may be letting them get into an older generation, increasing the cost of reclaiming that memory. Depending on the cache hit rate and the volume of objects going through the cache, that could turn out to be a very bad idea. Extra generation 2 collects could easily erase all your savings.

To do the best job, you need to have a good idea what the lifetime of the objects should be and choose your cache policy accordingly.
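One common policy that keeps object lifetimes in check is a size-bounded LRU cache: entries get evicted promptly, so with luck they die before being promoted to an older generation. A minimal sketch in Java (the .NET mechanics are analogous), using `LinkedHashMap`'s eviction hook; the class name and capacity are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: a size-bounded LRU cache built on LinkedHashMap.
// Evicting the least-recently-used entry once the cap is reached
// keeps cached objects from surviving long enough to be promoted
// into older GC generations.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // access-order iteration gives LRU behavior
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Returning true tells LinkedHashMap to drop the eldest entry.
        return size() > maxEntries;
    }
}
```

The cap is where the policy lives: tune it to the working set you actually expect, which again requires knowing your usage patterns.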

Cached Object Finalization

This is sort of the same as the previous one. Many caching schemes use weak references and finalizers to arrange for the recycling of objects. This isn't automatically a bad idea, but again the presence of finalizers causes objects to live longer (and creates work for the finalizer thread). Additionally, because at least one more thread is involved (the finalizer thread), it may be necessary to add synchronization to your class that could otherwise be avoided. See this posting for more thoughts on finalizers.

Transparency of Implementation

Once you decide to put implicit caching into your class, you may be stuck with it forever, or you may find your hands tied on the policy. The darned thing about class features is that customers tend to use them -- the nerve :) -- and if you've made a choice that turned out to be not everything you wanted, you might have to live with it, because changing it would affect some of your customers very negatively.

On the other hand, if your caching policy choice happens far enough up the architecture stack then there's a good chance you had it right in the first place, because you have more context about what your customer needs at that point, and also a good chance you can change it for the better later, for the same reason.

Cache Wisely.