I Don't Like Arrays

06/26/2006

The number one reason that I dislike arrays in .NET is the fact that they implement IList<T> explicitly, thereby burying useful members like IndexOf behind a cast or equally ugly calls to static methods on System.Array, and needlessly renaming the Count property to Length. As a result, it's unduly difficult to change code that operates on an array to use a more versatile collection in its place.
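To make the complaint concrete, here is a minimal sketch of the same lookup against an array and against a List<string>. The class and method names (ArrayVersusList, FindB, SameSize) are invented for illustration; only the framework members they call are real.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayVersusList
{
    // Looks up "b" three ways and returns the three indices found.
    public static int[] FindB()
    {
        string[] array = { "a", "b", "c" };
        List<string> list = new List<string>(array);

        // string[] has no public instance IndexOf method of its own,
        // so the interface member is only reachable through a cast...
        int viaCast = ((IList<string>)array).IndexOf("b");

        // ...or through the static helper on System.Array.
        int viaStatic = Array.IndexOf(array, "b");

        // On List<string> the same member is simply there.
        int viaList = list.IndexOf("b");

        return new int[] { viaCast, viaStatic, viaList };
    }

    // Same concept, two names: Length on the array, Count on the list.
    public static bool SameSize(string[] array, List<string> list)
    {
        return array.Length == list.Count;
    }
}
```

Swapping the array for a list means touching every one of these call sites, which is exactly the friction described above.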

In fact, I don't like explicit interface implementation (EII) much at all. In my opinion, there are only two valid reasons to use EII: to hide members which unconditionally throw NotSupportedException, or to implement a poor man's return type covariance or parameter type contravariance. I consider EII harmful in all other cases. [Update: I overstated things; there are other cases where EII is acceptable. See the comments for details.]
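As a sketch of the "poor man's return type covariance" case, consider a type that implements ICloneable. The Point class below is hypothetical; ICloneable and its object-returning Clone method are the real framework interface.

```csharp
using System;

public sealed class Point : ICloneable
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // The public member callers normally see: strongly typed, no cast needed.
    public Point Clone()
    {
        return new Point(X, Y);
    }

    // The explicit implementation satisfies ICloneable, whose Clone returns
    // object, and simply forwards to the typed version above.
    object ICloneable.Clone()
    {
        return Clone();
    }
}
```

Here EII hides only the weaker signature, so nothing useful ends up buried behind a cast.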

The second reason that I dislike arrays is that they show up too often in public API, which further exacerbates their impedance mismatch with other collections. (To be fair, this isn't an issue with arrays themselves so much as how they are used in the wild.)

I pointed out to colleagues this morning that the same arguments that we provide for avoiding List<T> in public API, mainly that it's designed for performance rather than extensibility, can be applied to T[].

As a follow-up to that conversation, we decided to change one of our APIs, which returned string[], to return Collection<string> instead. This saved us a truly unnecessary call to List<string>.ToArray() and allows us to further customize the API in the future, perhaps by making it lazy and more efficient in the exceedingly common case where it is used only for enumeration.

I was happy with this change until it broke one of our test cases which called String.Join on the resulting array. There's no reason why String.Join couldn't take IEnumerable<string> or at least IList<string> in place of string[], but it doesn't, and we were forced to write our own Join method to work around the problem.
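A workaround along these lines might look like the following. This is a minimal sketch, not the method we actually wrote; the StringUtilities class name is an assumption, and the parameter order simply mirrors String.Join.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StringUtilities
{
    // Concatenates the values, placing the separator between consecutive
    // elements. Accepts any sequence, not just string[].
    public static string Join(string separator, IEnumerable<string> values)
    {
        if (values == null)
        {
            throw new ArgumentNullException("values");
        }

        StringBuilder builder = new StringBuilder();
        bool first = true;
        foreach (string value in values)
        {
            if (!first)
            {
                builder.Append(separator);
            }
            builder.Append(value);
            first = false;
        }
        return builder.ToString();
    }
}
```

Because the parameter is typed as IEnumerable<string>, the same method works for a Collection<string>, a List<string>, or a plain string[].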

One of the main arguments in favor of typing parameters as arrays is to enable the addition of the params keyword. I don't dispute the convenience of params, but there's a way to have your cake and eat it too:

    public void DoSomething(params string[] arguments) {
        DoSomething((IEnumerable<string>)arguments);
    }

    public void DoSomething(IEnumerable<string> collection) {
        ...
    }

In fact, it would be nice if C# supported params IEnumerable<T> and implemented it as above. Even better would be to save the unnecessary overloads by training our compilers to make sense of ParamArrayAttribute for all trailing parameter types to which there exists an implicit conversion from the strongest array type that can hold the trailing arguments.
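Until then, the two-overload pattern has to be written by hand. Here is a small, complete sketch of it; the Greeter class and CountNames method are hypothetical stand-ins for DoSomething, with a trivial body so the example has something to return.

```csharp
using System;
using System.Collections.Generic;

public static class Greeter
{
    // Convenience overload: lets callers write CountNames("a", "b", "c").
    public static int CountNames(params string[] names)
    {
        // The cast selects the general overload instead of recursing.
        return CountNames((IEnumerable<string>)names);
    }

    // General overload: accepts any sequence, such as a List<string>.
    public static int CountNames(IEnumerable<string> names)
    {
        int count = 0;
        foreach (string name in names)
        {
            count++;
        }
        return count;
    }
}
```

A call such as Greeter.CountNames("a", "b", "c") binds to the params overload, while passing a List<string> binds directly to the IEnumerable<string> overload.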

This actually brings me to another topic which gets a lot of press these days: dynamic vs. static typing. I've recently rekindled a fondness that I developed in university for the Scheme programming language and I've also been familiarizing myself with Ruby, which I like a lot. The most important thing that I've learned from Ruby (which might be obvious to the Smalltalkers of the world, but was news to a Java-educated punk like me), is that static typing can actually get in the way of polymorphism and object-orientation. For example, imagine if String.Join didn't declare the type of its array argument, but instead just used the features that it needed. I could then pass it a collection which happens to quack just like the array that the developer had in mind and everything would just work...

That's not to say that there's no value in static typing. For one thing, it can help produce faster code. For another, it helps drive statement completion features like IntelliSense. And finally, it makes my day job writing static code analysis much easier. :)

Let me summarize my thoughts as follows:

If we ever build a new framework, let's make sure that arrays and collections are syntax-compatible from day one.
Don't abuse explicit interface implementation.
Where possible, prefer abstractions like IEnumerable<T> and IList<T> over T[] for parameter types in public API.

I smell some new FxCop rules lurking. What do you think?
