Heap Sample: Comparing the new collection classes with the old

The first step in updating the Heap class to Heap<T> was to take a look at the new collections provided in Visual Studio 2005 Beta 1 to get an idea of what the generic interfaces for collections look like.  (The original Heap project set out to implement a different collection type in a way as similar to the built-in collections as possible.)

My assembly browser of choice is Lutz Roeder's excellent .NET Reflector, so I fired it up and dug into mscorlib.dll.  As a side note, this tool is a good example of the value and responsiveness of a smart client application that uses self-updating technology to cope with a relatively unstable environment, such as previews and betas of a new CLR version.

The original Heap implements the interfaces ICollection, IEnumerable, ICloneable and ISerializable.  The new List<T> class (which behaves rather like a strongly typed ArrayList) implements IList<T>, ICollection<T>, IEnumerable<T>, IList, ICollection and IEnumerable.

So there are three new generic interfaces to explore.  As with the original, IList<T> is not appropriate for our proposed new Heap<T>, as a heap doesn't exhibit the indexed behaviour that IList denotes.  We'll implement the other two.

You can see that List<T> still implements the same non-generic interface set as ArrayList.  This is handy in a couple of ways.  Firstly, it means that List<T> objects can often be plugged into existing code where the interface to the list, rather than the concrete type, was specified.  Secondly, it means that you can use generic types in your implementation even if you need to produce an assembly that exposes only CLS-compliant APIs.  Although generic types aren't categorized as CLS compliant in this release of Visual Studio, you can implement code that internally uses a List<T> instance and then expose it via its IList interface.
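Both points can be illustrated with a small sketch.  The NameRegistry and Legacy types below are hypothetical names invented for this example; the only thing relied on is that List<T> implements the non-generic IList and ICollection interfaces, as described above.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical type: uses a generic list internally, but its public
// surface mentions only non-generic types.
public class NameRegistry
{
    private readonly List<string> names = new List<string>();

    public void Register(string name)
    {
        names.Add(name);
    }

    // Callers see a plain IList.  Note that this hands out the live
    // list, so callers can modify it through the interface.
    public IList Names
    {
        get { return names; }
    }
}

// Hypothetical stand-in for existing code that was written against
// the non-generic ICollection interface.
public static class Legacy
{
    public static int CountItems(ICollection items)
    {
        return items.Count;
    }
}
```

A List<string> can be passed straight to Legacy.CountItems, and anything read back through the Names property comes out typed as object and has to be cast by the caller, just as with ArrayList.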

However, to keep my sample code brief, I chose to implement just ICollection<T> and IEnumerable<T>.  If this had been production code, I'd have chosen to support the non-generic interfaces as well.
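To make that concrete, here is a minimal sketch of a heap exposing just those two interfaces.  It is not the sample's actual source: the member list assumes the shape of ICollection<T> as it eventually shipped (the Beta 1 signatures may differ), and the List<T> backing store, the min-heap ordering, the Peek method and the rebuild-on-Remove strategy are all simplifying assumptions of mine.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch of a min-heap that offers collection and
// enumeration semantics but no indexer (so no IList<T>).
public class Heap<T> : ICollection<T>, IEnumerable<T>
{
    private readonly List<T> items = new List<T>();
    private readonly IComparer<T> comparer;

    public Heap() : this(Comparer<T>.Default) { }
    public Heap(IComparer<T> comparer) { this.comparer = comparer; }

    public int Count { get { return items.Count; } }
    public bool IsReadOnly { get { return false; } }

    public void Add(T item)
    {
        items.Add(item);
        // Sift the new item up until its parent is no larger than it.
        int child = items.Count - 1;
        while (child > 0)
        {
            int parent = (child - 1) / 2;
            if (comparer.Compare(items[parent], items[child]) <= 0) break;
            T tmp = items[parent];
            items[parent] = items[child];
            items[child] = tmp;
            child = parent;
        }
    }

    // Smallest item; throws if the heap is empty.
    public T Peek() { return items[0]; }

    public bool Remove(T item)
    {
        if (!items.Remove(item)) return false;
        // Simple but slow: rebuild the heap after an arbitrary removal.
        List<T> copy = new List<T>(items);
        items.Clear();
        foreach (T x in copy) Add(x);
        return true;
    }

    public void Clear() { items.Clear(); }
    public bool Contains(T item) { return items.Contains(item); }
    public void CopyTo(T[] array, int arrayIndex) { items.CopyTo(array, arrayIndex); }

    // Enumerates in internal array order, not in sorted order.
    public IEnumerator<T> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

Listing IEnumerable<T> in the declaration is redundant, since ICollection<T> already derives from it, but it mirrors the way the interfaces are discussed here.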

ICloneable is notably absent from the List<T> type.  This interface has been deprecated by the CLR team for Visual Studio 2005.  Unfortunately, when it was designed, the contract never specified whether the copy was deep or shallow and provided no way for callers to signal which they wanted.  This leaves the collection writer with a small dilemma: either follow the new advice and move the functionality to some new custom interface with a better defined contract (perhaps inventing IDeepCloneable and IShallowCloneable), or keep the old implementation to minimize code changes in Heap clients that might have coded to the semantics documented for my implementation.  I chose the latter - your mileage may vary.
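For comparison, the road not taken might look something like the sketch below.  The interface names come from the suggestion above, but their generic shape and the Item and Bag types are purely illustrative assumptions; nothing like this ships in the framework.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interfaces that state the copy semantics in the name.
public interface IShallowCloneable<T> { T ShallowClone(); }
public interface IDeepCloneable<T> { T DeepClone(); }

// Hypothetical element type.
public class Item : IDeepCloneable<Item>
{
    public string Name;
    public Item(string name) { Name = name; }
    public Item DeepClone() { return new Item(Name); }
}

// Hypothetical collection supporting both kinds of copy.
public class Bag : IShallowCloneable<Bag>, IDeepCloneable<Bag>
{
    public readonly List<Item> Items = new List<Item>();

    // New container, same Item instances.
    public Bag ShallowClone()
    {
        Bag copy = new Bag();
        copy.Items.AddRange(Items);
        return copy;
    }

    // New container holding copies of each Item.
    public Bag DeepClone()
    {
        Bag copy = new Bag();
        foreach (Item item in Items) copy.Items.Add(item.DeepClone());
        return copy;
    }
}
```

With this split the caller picks the semantics at the call site, which is exactly the choice ICloneable.Clone() never offered.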

Finally, the original heap implemented ISerializable and I chose to keep the implementation in Heap<T> as it's so handy when testing.  Serialization of collection classes is a thorny issue.  In many cases the serialization exposes the implementation details in a most undesirable way and leaves the application brittle to change, causing old persisted serializations to fail when later read in.  The serialization implementation in Heap is naïve in exactly this manner, as it exposes private members like "version".  This is nice for testing, as you can roundtrip the exact in-memory state of the object, but it's not the sort of data you want in some long-term file format when a bug fix could alter the way versioning is done entirely, removing or changing the type of the version member.  Of course, the fact that you're using a heap for storage is itself likely to be just an implementation detail of the next higher level of data structure in such an application, and even more vulnerable to change.  In production I'd likely only use this type of implementation for transient storage of a large collection that would be relatively expensive to re-heap.  A clipboard operation might be an example of such a case.

Next time we'll look at the differences between ICollection and ICollection<T>.