Should an interface declare invariants that it can't enforce?

05/16/2004

I was thinking about the Clear method on my ICollection interface, i.e.:

/// <summary>
/// Removes all the elements contained in this collection.
/// </summary>
void Clear();

Now, it's interesting that the method takes no arguments and returns no value. So, going by the signature alone, its effect is effectively immeasurable. It's a ()->() function. Whee! Of course, in an earlier post I talked about the relation between methods on an interface and certain invariants that you could declare. For example, I could add the following invariants to Clear:

For all collections S, S.Clear() implies S.Empty
For all collections S and any element X, S.Clear() implies !S.Contains(X). (This is somewhat redundant, since S.Empty is equivalent to the last part. However, since I didn't state it before, it's worth stating now.)
For all collections S and any element X, S.Clear() implies !S.Remove(X).
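
Even though the interface can't enforce these invariants, they can at least be written down as an executable check. What follows is only a minimal sketch: IMyCollection is a cut-down stand-in for my ICollection that assumes just the members the invariants mention, and ListBackedCollection and ClearInvariants are hypothetical names introduced for illustration. Note that "for any element X" can only be sampled with a handful of probe values, not proven.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the ICollection interface discussed here; only the members
// that the invariants mention are included (an assumption).
public interface IMyCollection<A>
{
    bool Empty { get; }
    void Add(A element);
    bool Contains(A element);
    bool Remove(A element);   // true if an element was actually removed
    void Clear();
}

// Hypothetical implementation, used only to exercise the check below.
public class ListBackedCollection<A> : IMyCollection<A>
{
    private readonly List<A> items = new List<A>();

    public bool Empty { get { return items.Count == 0; } }
    public void Add(A element) { items.Add(element); }
    public bool Contains(A element) { return items.Contains(element); }
    public bool Remove(A element) { return items.Remove(element); }
    public void Clear() { items.Clear(); }
}

public static class ClearInvariants
{
    // Calls Clear on s, then returns true if all three invariants hold,
    // probing Contains and Remove with the supplied sample elements.
    public static bool HoldAfterClear<A>(IMyCollection<A> s, IEnumerable<A> probes)
    {
        s.Clear();
        if (!s.Empty) return false;            // S.Clear() implies S.Empty
        foreach (A x in probes)
        {
            if (s.Contains(x)) return false;   // ... implies !S.Contains(X)
            if (s.Remove(x)) return false;     // ... implies !S.Remove(X)
        }
        return true;
    }
}
```

A check like this only sees what the interface exposes. It says nothing about what the implementation does with its internal storage.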

However, consider the implementation of ArrayCollection.Clear:

public void Clear()
{
    array = new A[0];
    count = 0;
}

Resetting array to an empty array is, strictly speaking, unnecessary: ArrayCollection would still work perfectly if Clear only set count to 0. However, that version would be a very badly behaved collection. Say you instantiate an ArrayCollection and add 1 million heap-allocated items to it. After calling Clear, one would expect the references to those items to be released so that any otherwise unused items could be reclaimed by the GC. With the count-only version, the old array would keep every one of them alive. However, nothing in the interface says that the references must be released.

When we look back at the Clear method we realize its signature doesn't really say much. All that we really have in a signature is the arguments, the return type, and the name. However, is the name really important? Could I have called it “void SquashedCockroach();” instead (credit to Max Mintz)? I think the answer to that is “no”. While I would certainly have been allowed to call it that, it would have lost the meaning that I intended for it and it would have made no sense in the context of ICollection. Unfortunately, once we enter this realm we realize that we're bringing English into the game. What does “clear” really mean? A quick hop over to dict.org shows only about 30 definitions for clear.

Let's go back to my implementation of Clear:

Benefits: the collection lets go of all references and returns to its initial state. The implementation is also very fast.

Drawbacks: Adding elements will initially incur high costs as the array grows to contain them all.

Imagine the following code pattern:

ArrayCollection<string> a = new ArrayCollection<string>();
for (int i = 0; i < 10000; i++)
{
    for (int j = 0; j < 10000; j++)
    {
        a.Add("" + j);
    }
    a.Clear();
}

This type of code will incur a high cost for growing the array, because each Clear call throws the grown array away and the next pass has to grow it all over again. If I changed Clear to just set count to 0, there would be no array growth costs after the first pass. However, the code pattern of:

for (int j = 0; j < 10000; j++)
{
    a.Add("" + j);
}
a.Clear();

would end up keeping those 10,000 elements alive for as long as the collection itself is reachable, which is effectively a leak. Consider another implementation of Clear written like this:

public void Clear()
{
    for (uint i = 0; i < count; i++)
    {
        array[i] = default(A);
    }
    count = 0;
}

Now we have an O(n) Clear where we used to have O(1), but regrowing the array isn't a problem and we do let go of references. So many tradeoffs to make :-(
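
For what it's worth, the same O(n) idea can be written with the framework's Array.Clear, which resets a range of array elements to their default value. The sketch below assumes an array-backed collection with the array and count fields from the snippets here. ArrayCollectionSketch, its Capacity property, and the doubling growth policy are hypothetical, added only so the effect is visible.

```csharp
using System;

// Hypothetical, cut-down array-backed collection; only what is needed to
// show the Clear variant. The field names follow the snippets in this post.
public class ArrayCollectionSketch<A>
{
    private A[] array = new A[0];
    private int count = 0;

    public int Count { get { return count; } }
    public int Capacity { get { return array.Length; } }

    public void Add(A element)
    {
        if (count == array.Length)
        {
            // Assumed growth policy: double the capacity, starting at 4.
            Array.Resize(ref array, array.Length == 0 ? 4 : array.Length * 2);
        }
        array[count++] = element;
    }

    public void Clear()
    {
        // Reset the used slots to default(A) so the references are dropped,
        // but keep the allocated array so later Adds need not regrow it.
        Array.Clear(array, 0, count);
        count = 0;
    }
}
```

With this variant the collection is empty after Clear and no longer references the old elements, but it also never gives its storage back, which is a different kind of surprise for a caller who added a million items once.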

The issue is that the semantics of Clear are anything but (har har). Does it mean “ReleaseAllReferences”, or does it mean “ReturnToSomeStartState”? Or does it mean both? If there are two very different concepts here, maybe we shouldn't be mixing them. Should the interface instead be:

bool RemoveAll();
void Trim();

With Clear being provided as a shortcut that calls both of those methods? Any thoughts?
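
To make the question concrete, here is one way the split could look. This is a sketch under assumptions, not a settled design: I'm assuming RemoveAll returns true when the collection actually changed, that Trim shrinks the backing array to fit the current contents, and that Clear simply calls both. IClearable and TrimmableCollection are hypothetical names.

```csharp
using System;

// Hypothetical split of the two concepts. The meaning of RemoveAll's return
// value ("the collection changed") is an assumption.
public interface IClearable
{
    bool RemoveAll();  // drop every element and every reference to it
    void Trim();       // shrink internal storage to fit the current contents
    void Clear();      // shortcut: RemoveAll followed by Trim
}

public class TrimmableCollection<A> : IClearable
{
    private A[] array = new A[0];
    private int count = 0;

    public int Count { get { return count; } }
    public int Capacity { get { return array.Length; } }

    public void Add(A element)
    {
        if (count == array.Length)
        {
            // Assumed growth policy: double the capacity, starting at 4.
            Array.Resize(ref array, array.Length == 0 ? 4 : array.Length * 2);
        }
        array[count++] = element;
    }

    public bool RemoveAll()
    {
        bool changed = count > 0;
        Array.Clear(array, 0, count);  // release references, keep capacity
        count = 0;
        return changed;
    }

    public void Trim()
    {
        if (array.Length != count)
        {
            Array.Resize(ref array, count);  // give back the unused slots
        }
    }

    public void Clear()
    {
        RemoveAll();
        Trim();
    }
}
```

Under this split, the nested-loop pattern would call RemoveAll between passes and keep its capacity, while the fill-once pattern would call Clear and hand the storage back.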
