How do you decide what goes in an interface?

I’ve been having an interesting talk with Doug McClean about appropriate locations of methods. We’re discussing whether the ICollection<A> interface should include the Transform method. Specifically, should the interface look like this:

public interface ICollection<A> {
    ICollection<B> Transform<B>(IBijection<A,B> bijection);
}


Doug thinks (feel free to correct me, Doug, and I’ll update this page) that Transform isn’t really appropriate on the main collection interface because it places a burden on consumers of the ICollection interface to understand it. Instead, it should be pushed down into a specialized subinterface (like ITransformableCollection), with a dummy implementation on a helper class, i.e.:

public static class CollectionHelpers {
    public static ICollection<B> Transform<A,B>(ICollection<A> collection, IBijection<A,B> bijection) {
        // Use the collection's own Transform when it provides one.
        ITransformableCollection<A> tc = collection as ITransformableCollection<A>;
        if (tc != null) {
            return tc.Transform<B>(bijection);
        }
        // Otherwise fall back to a generic wrapper.
        return new TransformedCollection<A,B>(collection, bijection);
    }
}

(where TransformedCollection is a specialized class that uses the bijection to pass values to and from the underlying collection).
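To make that fallback concrete, here is a minimal sketch of what such a wrapper might look like. Everything in it is an assumption for illustration: IMiniCollection stands in for the real collection interface, the Forward/Backward members on IBijection are names I made up, and ListCollection and IntToStringBijection are hypothetical helpers, not the actual library types.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical shape of a bijection: a conversion in each direction.
public interface IBijection<A, B>
{
    B Forward(A value);
    A Backward(B value);
}

// Hypothetical stand-in for the real collection interface.
public interface IMiniCollection<A> : IEnumerable<A>
{
    void Add(A item);
    bool Contains(A item);
}

// A live view over the inner collection: nothing is copied, every call
// is translated through the bijection.
public sealed class TransformedCollection<A, B> : IMiniCollection<B>
{
    private readonly IMiniCollection<A> inner;
    private readonly IBijection<A, B> bijection;

    public TransformedCollection(IMiniCollection<A> inner, IBijection<A, B> bijection)
    {
        this.inner = inner;
        this.bijection = bijection;
    }

    // Values going in are mapped back to the underlying element type.
    public void Add(B item) { inner.Add(bijection.Backward(item)); }
    public bool Contains(B item) { return inner.Contains(bijection.Backward(item)); }

    // Values coming out are mapped forward.
    public IEnumerator<B> GetEnumerator()
    {
        foreach (A item in inner) yield return bijection.Forward(item);
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Hypothetical list-backed collection, only here so the sketch runs.
public sealed class ListCollection<A> : IMiniCollection<A>
{
    private readonly List<A> items = new List<A>();
    public void Add(A item) { items.Add(item); }
    public bool Contains(A item) { return items.Contains(item); }
    public IEnumerator<A> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Hypothetical bijection between ints and their canonical decimal strings.
public sealed class IntToStringBijection : IBijection<int, string>
{
    public string Forward(int value) { return value.ToString(CultureInfo.InvariantCulture); }
    public int Backward(string value) { return int.Parse(value, CultureInfo.InvariantCulture); }
}
```

Because the wrapper is a view, adding "42" through it puts 42 into the underlying collection. Note that this particular bijection only holds for canonical strings; passing something like "abc" would throw from int.Parse.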

I was thinking about this and trying to determine how you decide what should go on an interface. One could make the argument that you should keep the interface as simple as possible and only provide the bare minimum of methods that give you full functionality. However, if I were to take that argument to its logical conclusion, then the entire ICollection interface would look like:

public interface ICollection<A> {
    B Fold<B>(Function<B,Function<A,B>> function, B accumulator);
}

Using that, one could get every other bit of functionality that is in the list interface. You could implement ‘ForAll’, ‘Exists’, ‘Find’, ‘Count’, ‘Iterate’, ‘Transform’, ‘FindAll’, ‘Filter’, and everything else in the current interface on top of Fold. However, this wouldn’t be a very convenient interface to use. All you’d ever see would be folds, and you’d always be asking yourself “what is this fold doing?” Chances are that much of the time you’d be doing something like ‘counting’ or ‘finding an element’. And so one could argue that “certain actions that people commonly do on a collection should be pushed into the interface”. So what happens when you run into a method like ‘Transform’? I’ve already run into cases where I’ve needed it. However, is it common enough that other people will run into it? Or is it something that will be used all of 0.01% of the time? Is that enough to keep it in the interface? Should that even be part of the criteria?
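To show what “everything on top of Fold” could look like, here is a rough sketch. It is not the actual library: IFoldable, ArrayFoldable and FoldHelpers are hypothetical names, and I have swapped the curried Function type for the two-argument System.Func delegate so that the example compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal interface: Fold is the only member.
public interface IFoldable<A>
{
    B Fold<B>(Func<B, A, B> step, B seed);
}

// Hypothetical array-backed implementation, only here so the sketch runs.
public sealed class ArrayFoldable<A> : IFoldable<A>
{
    private readonly A[] items;
    public ArrayFoldable(params A[] items) { this.items = items; }

    public B Fold<B>(Func<B, A, B> step, B seed)
    {
        B accumulator = seed;
        foreach (A item in items) accumulator = step(accumulator, item);
        return accumulator;
    }
}

// The familiar operations, each written as a single fold.
public static class FoldHelpers
{
    public static int Count<A>(IFoldable<A> c)
    {
        return c.Fold<int>((n, item) => n + 1, 0);
    }

    public static bool Exists<A>(IFoldable<A> c, Predicate<A> p)
    {
        return c.Fold<bool>((found, item) => found || p(item), false);
    }

    public static bool ForAll<A>(IFoldable<A> c, Predicate<A> p)
    {
        return c.Fold<bool>((ok, item) => ok && p(item), true);
    }

    public static List<A> Filter<A>(IFoldable<A> c, Predicate<A> p)
    {
        // The accumulator is a list that grows as matching items are seen.
        return c.Fold<List<A>>((list, item) =>
        {
            if (p(item)) list.Add(item);
            return list;
        }, new List<A>());
    }
}
```

One thing the sketch makes visible: Exists and ForAll stop calling the predicate once the answer is known, but the fold itself still visits every element, since a plain Fold has no way to stop early. That is one more practical argument for not making Fold the only member.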

I’d love thoughts on this problem, addressing this issue if possible, or other cases where you’ve run into this.

Comments (12)

  1. Doug McClean says:

    Good summary of my position, Cyrus, I have no objections to that characterization.

    It’s obviously a matter of opinion, but I think that the existing FCL design of ICollection is fine (count, synchronization, and enumerability). Copying to an array is questionable, because it’s easily defined on top of enumerability and Count, but I think that almost all or all likely collection designs can provide more performant implementations and the burden of understanding is low.

    Agreed that the fold-only definition is the logical conclusion, but even that can be defined on top of IEnumerable. Basically ICollection is just IEnumerable with a known Count. If you know more, such as that an ordering exists, IOrderedCollection could allow easily finding the Max and Min. ITransformableCollection could allow you to provide a more performant implementation of Transform than standard.

    Defining the Collection by the fold-only method basically inverts a principal design decision of .NET collections, which was to use external iterators through the IEnumerable interface. Fold is essentially the general purpose internal iterator method.

    My head is about to explode, I need to stop thinking about this for a while and watch some baseball. :)

  2. Doug: Actually, my collections don’t have a Count property. Otherwise they wouldn’t be able to represent infinitely large collections. I’ve left out count because of that.

  3. Doug McClean says:

    Yeah, I’ve had issues with infinitely large collections too. Maybe ICollection just doesn’t exist, and we should mix and match IEnumerable, IFiniteCollection, ITransformableCollection, …

    One abstraction at a time.

    Also, this isn’t a big deal, but my name is McClean not McLean (in the article).

  4. Doug McClean says:

    Something else to think about here (which could go on the language feature request thread too) is that at this level of granularity, which I think is conceptually useful, it is difficult to properly type your parameters if you require some features of more than one interface. This can sometimes be worked around with generic methods and constraints, but I’m wondering if there isn’t a more general solution.

    In general, a set of interface types A = {I1, I2, …, IN} defines another type TypeA that only allows values that are of types that implement every interface in A. Could we use types like that as parameter types?

  5. Doug: Could you explain the last bit about the set of interface types?

  6. Doug McClean says:

    Sure Cyrus. You were actually talking about it before in a really old post, about how having lots of small interfaces is unwieldy if you need access to features from some set of them in order to write a method.

    Postulate these types:

    interface IEnumerable<T> // as defined in mscorlib now

    interface IFiniteCollection<T> : IEnumerable<T> {
        int Count {get;}
    }

    interface IOrderedCollection<T> : IEnumerable<T> {
        IOptional<T> MaximumValue {get;}
        IOptional<T> MinimumValue {get;}
        IEnumerator<T> GetOrderedEnumerator();
    }

    Suppose you desire to write a method that will return the third quartile of a collection of some numeric type that can be subtracted and divided. Since we haven’t worked out the operators yet, let’s just suppose the type is double, so that we have this:

    SomeFiniteOrderedCollectionType<double> theList; // initialized somehow

    // note that SomeFiniteOrderedCollectionType<T> : IFiniteCollection<T>, IOrderedCollection<T>, …

    Our problem is, we need access to the count, from IFiniteCollection<T>, and the maximum and minimum values from IOrderedCollection<T>, but we don’t want to require the concrete type of the parameter be anything specific. Oops. Now we are up a creek without a paddle. We could type the parameter as one or the other, and throw an exception if it wasn’t both. Or type it as IFiniteCollection<T> and fall back to searching for the Min and Max if it happened not to be IOrderedCollection<T>, but suppose that we don’t want either of those solutions. (I am trying to make a simple example of a case where we need access to two distinct interfaces.)

    Suppose we had a syntax like this:

    double ComputeThirdQuartile({IFiniteCollection<double>,IOrderedCollection<double>} list) {
        // in here, the compiler knows that list implements both IFiniteCollection<double> and IOrderedCollection<double>, and requires callers to pass something that is both those things
        return (list.MaximumValue - list.MinimumValue) / list.Count;
        // TODO: handle the case of empty list and hence unwrapping the IOptional<double> instances. Consider this pseudocode to get my point across.
    }


    This obviously could be refined, or we could change the syntax, or whatever, but my point is that we should think about this because it removes the number one drawback to using very granular interfaces, namely that they can’t be freely mixed and matched.

  7. Doug: I see exactly what you mean now. Another alternative is this:

    double ComputeThirdQuartile<T>(T list) where T : IFiniteCollection<double>, IOrderedCollection<double> {
        // ...
    }

    But it’s certainly verbose.

    I think what I’d prefer is if C# moved to having a system where we combined type inference and structural subtyping so that you could just do:

    double ComputeThirdQuartile(list) {
        uint count = list.Count;
        IOptional<double> max = list.Max;
        // ...
    }

    and then we’d figure out all the types.
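For comparison, the constraint-based signature is the one alternative in this thread that already compiles, so it may help to see it run end to end. This is only a sketch under simplifying assumptions: the two interfaces are cut down from the ones postulated earlier (plain values instead of IOptional<T>, no GetOrderedEnumerator), SortedDoubles and Stats are hypothetical names, and the method is called RangePerElement because it reproduces the pseudocode formula rather than a real quartile computation.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Simplified, hypothetical versions of the interfaces discussed above.
public interface IFiniteCollection<T> : IEnumerable<T>
{
    int Count { get; }
}

public interface IOrderedCollection<T> : IEnumerable<T>
{
    T MaximumValue { get; }
    T MinimumValue { get; }
}

// Hypothetical concrete type that happens to implement both interfaces.
public sealed class SortedDoubles : IFiniteCollection<double>, IOrderedCollection<double>
{
    private readonly List<double> items;

    public SortedDoubles(params double[] values)
    {
        items = new List<double>(values);
        items.Sort();
    }

    public int Count { get { return items.Count; } }
    public double MaximumValue { get { return items[items.Count - 1]; } }
    public double MinimumValue { get { return items[0]; } }

    public IEnumerator<double> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class Stats
{
    // T must implement both interfaces; no concrete type is named.
    public static double RangePerElement<T>(T list)
        where T : IFiniteCollection<double>, IOrderedCollection<double>
    {
        if (list.Count == 0)
            throw new ArgumentException("The collection must not be empty.", "list");
        return (list.MaximumValue - list.MinimumValue) / list.Count;
    }
}
```

A caller writes Stats.RangePerElement(new SortedDoubles(1, 5, 3, 9)) and the compiler infers T, so the verbosity is paid only at the declaration. It does not help with fields, return types, or collections of "something that is both", which is where the {I1, I2} idea would still add something.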

  8. Doug McClean says:

    Yeah, that’s what I had in mind when I wrote "This can sometimes be worked around with generic methods and constraints, but I’m wondering if there isn’t a more general solution." Maybe it can always be worked around, but I have a feeling I had a counter-example once. I might find it in my notes somewhere. Declaring a return type like this would be one complication.

    I like the type inference, to a point. But there are serious versioning issues, I think, as well as issues with abstract and virtual declarations. It could be a good approach for some languages, but it doesn’t seem in keeping with the C# style in some ways.

    This area seems worth some more thought. I’m not sure any of our three approaches so far is the way to go entirely.

  9. Luke Stevens says:

    This is fascinating stuff, guys!

    Some thoughts—

    – The pattern of querying for optional interfaces and falling back to a general implementation in terms of other interfaces, as you demonstrate with Transform<A,B>, is *very* handy. Follow my link for more musings on this. Letting an implementer derive from IEnumerable<T> alone, while giving a caller a convenient way to call Count, is the best of all worlds, far superior to using abstract base classes as some suggest. The main downside is that you need a *static* function Count to encapsulate the query-call/fall, which is a little harder when you want to type l-i-s-t-dot and find the proper method in IntelliSense. But we already face that as, for example, with Array.Sort (thank goodness!). This was one of the great things about STL—with static functions doing all the algorithmic work, one person can add things like stable_partition while another adds things like hash_set and neither has to know about the other. And, returning to C#, if a caller really needs an IFiniteCollection<T> to pass to something else, one can always offer a method to build one off an arbitrary IEnumerable<T>, with the caveat that you can’t query it for other interfaces on the original object (I muse about this problem too).

    – So I say: at least offer an interface with the absolute bare minimum (Fold or whatever). Then offer other interfaces, possibly derivative, in case someone decides, for example, that there is a better way to count the elements in an array than by using enumeration & iteration (duh). But we must be clear that these other interfaces are for specialization (maybe hide them on the class as with IConvertible), otherwise users will demand to see them everywhere, and we’ll end up with interface bloat yet again.

    – It is possible to define an interface that combines two other interfaces and adds nothing (e.g. interface IBoth : ILeft, IRight {}), but unfortunately implementing this interface is not considered the same as just implementing the two base interfaces. Or, you often see interfaces that inherit from other interfaces for no sound reason besides convenience, like IDataReader, which is interface bloat again. If you start down this slippery slope then you end up with monstrosities like IFiniteOrderedTransformableCollectionThatCanMakeToastAndWalkYourDog, for any combination of any two or more interfaces anywhere in the framework.

    – Users get frustrated when they *know* a class implements each of two independent interfaces, but they can’t use a single reference of either interface type without letting go of the other one and having to cast later on; so, they end up abandoning interfaces altogether and using the class interface directly, which is no better than an anonymous instance of the aforementioned monstrosity (worse, IMHO).

    – In cases like ComputeThirdQuartile where you use two independent interfaces, can’t you just accept two parameters? You then lose the constraint that they be on the same object, but when do you ever really need that constraint? Saying ComputeThirdQuartile(list, list) does look kind of strange, though.

    – The idea of {I1, I2} as—how do you say?—an implicit aggregate interface is very interesting. Then, I suppose, any class that implements both I1 and I2 will automatically implement {I1, I2}, and you can have a List<{I1,I2}> if you like. Besides making certain things a lot simpler, I think this does solve some real, though perhaps obscure, problems not yet covered by generics; for example:

    interface I1 { int F1(); }
    interface I2 { int F2(); }

    class Cat : I1, I2 { void Meow() {…} int I1.F1() {…} int I2.F2() {…} }
    class Dog : I1, I2 { void Bark() {…} int I1.F1() {…} int I2.F2() {…} }

    interface IFoo<T> { int Bar(T x); }

    class Foo<T> : IFoo<T> where T : I1, I2 { public int Bar(T x) { return x.F1() - x.F2(); } }

    class Test { public int TryMe<T>(IFoo<T> i) where T : I1, I2 { return i.Bar(new Cat()) - i.Bar(new Dog()); } }

    Then you try to create a Foo<T> to pass in to TryMe and realize that no choice of T will work, because T must derive from both I1 & I2 and yet be a base of both Cat & Dog. The only way we solve this now is to slide down that slippery slope, to anticipate the problem when writing Cat & Dog and explicitly create & use a third interface serving no other purpose than to aggregate I1 & I2. What we really need is to make the aggregation of I1 & I2 a valid type in its own right. Or am I missing an obvious better way?
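The query-and-fall-back pattern from the first point above can be sketched for Count as follows. This is an illustration under assumptions, not code from any of the libraries discussed: IFiniteCollection is the hypothetical interface postulated earlier in the thread, and CollectionHelpers.Count and KnownSize are names invented for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical optional interface: implement it only if Count is cheap.
public interface IFiniteCollection<T> : IEnumerable<T>
{
    int Count { get; }
}

public static class CollectionHelpers
{
    // Static entry point: callers get Count for any IEnumerable<T>,
    // implementers are free to provide nothing beyond enumeration.
    public static int Count<T>(IEnumerable<T> source)
    {
        // Fast path: the collection already knows its size.
        IFiniteCollection<T> finite = source as IFiniteCollection<T>;
        if (finite != null) return finite.Count;

        // General path: count by walking the sequence.
        int n = 0;
        foreach (T item in source) n++;
        return n;
    }
}

// Hypothetical collection that knows its size and refuses to be enumerated,
// which makes it easy to see that the fast path was taken.
public sealed class KnownSize : IFiniteCollection<int>
{
    private readonly int count;
    public KnownSize(int count) { this.count = count; }
    public int Count { get { return count; } }

    public IEnumerator<int> GetEnumerator()
    {
        throw new InvalidOperationException("Enumeration was not expected.");
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

A List<int> does not implement the hypothetical interface, so it goes through the general path, while KnownSize is answered without ever being enumerated. As noted above, the cost is discoverability: the method lives on a static class rather than showing up after typing list-dot.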

  10. Doug McClean says:

    Another thing enabled by types like {I1, I2} is a language syntax like this:

    SomeType x; // init somehow



    when(x is I1) {
        // variable x has the type {SomeType, I1} in this scope
    }



    This would be much easier to use and more readable than the ubiquitous:

    I1 xAsI1 = x as I1; // it’s usually difficult to name this new variable
    if(xAsI1 != null) {
        // use xAsI1 here
    }



    Code of this type is also a common source of mistakes or inefficiencies in beginners’ C# programs, and syntax support for this scenario would go a long way toward changing that.