Should an interface declare invariants that it can’t enforce


I was thinking about the clear method on my ICollection interface.  i.e:


        /// <summary>


        /// Removes all the elements contained in this collection.


        /// </summary>


        void Clear();


 


Now, it’s interesting that that method takes no arguments and returns no value.  So it is effectively immeasurable.  It’s a ()->() function.  Whee!  Of course, in an earlier post i talked about the relation between methods on an interface and certain invariants that you could declare.  For example, i could add the following invariants to Clear:




  1. For all collections S, S.Clear() implies S.Empty


  2. For all collections S and any element X, S.Clear() implies !S.Contains(X). (Kind of redundant since S.Empty is equivalent to the last part.  However, since i didn’t state it before, it’s worth stating now)


  3. For all collections S and any element X, S.Clear() implies !S.Remove(X).

However, consider the implementation of ArrayCollection.Clear:


 


        public void Clear()


        {


            array = new A[0];


            count = 0;


        }


 


Initializing array back to an empty array is unnecessary.  The implementation of ArrayCollection will still work perfectly when you set count to 0.  However, this will be a very badly behaving collection.  Say you instantiate an ArrayCollection and you add 1 million heap allocated items to it.  One would expect the reference to those items to be released so that any unused items could be reclaimed by the GC.  However, nothing in the interface said that that would be the case.


 


When we look back at the clear method we realize it doesn’t really say much.  All that we really have in a signature is the arugments, the return type, and the name.  However, is the name really important?  Could i have called it “void SquashedCockroach();” instead (credit to Max Mintz)?.  I think the answer to that is “no”. While i would have certainly been allowed to call it that, it would have lost the meaning that i intended for it and it would have made no sense in the context of ICollection.  Unfortunately, once we enter this realm we realize that we’re bringing english into the game.  What does “clear” really mean.  A quick hop over to dict.org shows only about 30 definitions for clear. 


 


Lets go back to my implementation of Clear:


 


Benefits: the collection lets go of all references and returns to its initial state.  The implementation is also very fast. 


Drawbacks: Adding elements will initially incur high costs as the array grows to contain them all.


 


Imagine the following code pattern:


 


            ArrayCollection<string> a = new ArrayCollection<string>();


            for (int i = 0; i < 10000; i++)


            {


                for (int j = 0; j < 10000; j++)


                {


                    a.Add(”” +j);


                }


 


                a.Clear();


            }



 


This type of code will incur a high cost for growing the array because of the clear call.  If i changed my code to just set the count to 0 then there would be no array growth costs.  However, the code pattern of:


 



                for (int j = 0; j < 10000; j++)


                {


                    a.Add(”” +j);


                }


 


                a.Clear();


 


Would end up leaking those 10,000 elements.  Consider another implementation of Clear written like this:


 


        public void Clear()


        {


            for (uint i = 0; i < count; i++)


            {


                array[i] = default(A);


            }


 


            count = 0;


        }


 


Now we have a O(n) clear where we used to have O(1), but regrowing the array isn’t a problem and we do let go of references.  So many tradeoffs to make 🙁


 


The issue is that the semantics of Clear are anything but (har har).  Does it mean “ReleaseAllReferences” or does it mean “ReturnToSomeStartState”.  Or does it mean both?  If there are two very different concepts here maybe we shouldn’t be mixing.  Should the interface instead be:


 


bool RemoveAll();


void Trim();


With Clear being provided as a shortcut to that method?  Any thoughts?


 


Comments (9)

  1. saifee says:

    default(A) What does this mean? I’ve never seen this usage before?

  2. Saifee:

    i cannot write:

    array[i] = null;

    Because it’s possible that the type A has no null values (i.e it’s a value type, not a reference type).

    The syntax default(A) (which used to be A.default) means:

    a) If A is instantiated as a reference type, then default(A) is null

    b) If A is instantiated as a value type, then default(A) is new A()

  3. AT says:

    What is a reason for "array = new A[0];" inside Clear() ?

    It’s yet another useless object for GC.

    for(int i=0;i<1000000;i++) a.Clear(); // <– not a regular use – but will fill heap

    Are your code rely so heavily on array.Length in addition to count ?

    Anyway you are recreating "array = new A[1]" for zero length array in Add().

    As for your second solution reset to default() – your array will constantly grow and will never return memory to system.

    There is two possible workarounds in a middle – reset to array = null (will cause several reallocations then adding values) or even better to initial capacity (as this is an hint for minimum size supplied by developer and expected to be used).

    If you are thinking about effective memory footprint/use – consider array shirk itself to minimum allowed size – not simply a 0.

  4. AT: your code: for(int i=0;i<1000000;i++) a.Clear();

    will not fill the heap. When the GC kicks it it will reclaim those unused references

  5. AT:

    I’m not sure what you mean by "Are your code rely so heavily on array.Length in addition to count ? "

  6. AT:

    "As for your second solution reset to default() – your array will constantly grow and will never return memory to system."

    I agree. The benefit was that if you happened to use that space again you would do so without memory copying.

    I was trying to determine what the benefits/tradeoffs were between the different implementations. Not releasing the space used by the array would be a downside. But if then provided a Trim method it would be alleviated.

  7. AT: What is a good "minimum allowed size" I couldn’t think of anythin non arbitrary so I stuck with 0. I could assign null into that position in that case and I might end up doing that.

    Note: I have written my Remove method to shrink the array if less than 1/4 elements are used in teh array. This goes with what CLR says for good amortized performance.

  8. AT:

    "Anyway you are recreating "array = new A[1]" for zero length array in Add()."

    You are correct here. A[0] is basically a sentinal value. Because there is no reason to use up heap space for it I could use null instead. A[0] simply made the code easier to understand.

  9. John Stewien says:

    Just a thought, but what if on removal all the elements need some operation to be called on them like Dispose() in the case of a control that contains a whole lot of child controls.

    Some custom collection classes I’ve written have some interesting functionality. One style is where I have a large collection of elements where the elements themselves are used all over the place in a large application I’ve written. When the user deletes one of these elements in a custom control, it’s removed from the whole application. These elements have a base class that has a Remove() function. This fires an event RequestRemoveFromCollection which generally holds a reference to the parent collection, which itself can be removed from the main datastructure. When an element is removed from the custom collection, the reference to the collection is removed from the elements events, and the collection also fires an ElementRemoved event which allows other collection event listeners in the application to remove the element from their controls. I guess in this case I would want to use a RemoveAll() function in Clear() but I would probably create a custom collection class that inherits from ICollection<myElementBaseType> and overrides the the Clear() method just to make sure I get what I want. I would need to create a custom type anyway as I would need to override Add(myElementBaseType element) to add an event handler to the RequestRemoveFromCollection event, and override OnRemove() to fire the collection event ElementRemoved.