Why Is The Return Type Parameter Last?


The generic delegate type Func<A, R> is defined as delegate R Func<A, R>(A arg). That is, the argument type is to the left of return type in the declaration of the generic type parameters, but to the right of the return type when they are used. What’s up with that? Wouldn’t it be a lot more natural to define it as delegate R Func<R, A>(A arg), so that the R’s and A’s go together?

Maybe in C# it would, but in this case, it’s C# that’s the crazy one.

When we say it in English, the argument type comes before the return type. We say “length is a function from string to int”, not “length is a function to int from string”.
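As a minimal sketch of that reading in C# (the names `FuncOrderDemo` and `Length` are illustrative assumptions, not framework members), the type arguments of `Func<string, int>` appear in the same order as the English phrase “from string to int”:

```csharp
using System;

public static class FuncOrderDemo
{
    // Reads naturally as "Length is a function from string to int".
    public static readonly Func<string, int> Length = s => s.Length;

    public static void Main()
    {
        Console.WriteLine(Length("hello")); // prints 5
    }
}
```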

When we write it in mathematical notation, we say that a function’s domain and range are defined as f:A→R – again, the “return type” comes last.

And in many languages, the return type of a function comes in the sensible position. In VB it’s Function F(arg As A) As R.  In JScript .NET it’s function F(arg : A) : R.

And finally, consider higher-order functions; say, a function from A to a function from B to C. You want to think of this as A→(B→C); do you really want to write that as Func<Func<C, B>, A> ? This is completely backwards. Surely you want A→(B→C) to be represented as Func<A, Func<B, C>>.
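To make the higher-order case concrete, here is a small sketch with A = int, B = int and C = string. The names `HigherOrderDemo` and `PadTo` are hypothetical and chosen only for illustration; the point is that the nesting of `Func<A, Func<B, C>>` mirrors A→(B→C) from left to right.

```csharp
using System;

public static class HigherOrderDemo
{
    // A -> (B -> C): takes a width and returns a function
    // that formats a number zero-padded to that width.
    public static readonly Func<int, Func<int, string>> PadTo =
        width => number => number.ToString().PadLeft(width, '0');

    public static void Main()
    {
        Func<int, string> padTo4 = PadTo(4); // the inner B -> C function
        Console.WriteLine(padTo4(42));       // prints 0042
    }
}
```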

C# gets it wrong because C# inherits the basic pieces of its syntax from C, and C gets it wrong. Well, no, rather, it would be more fair to say that C is a non-typesafe, non-memory-managed language where it is vitally important that the code maintainer understands the lifetime and type of the data in every variable. Given that unfortunate situation, it makes sense to emphasize the storage mechanism first, and then the semantics second. Therefore in C you put the storage metadata first (static int customerCount;) rather than the semantics first (it could have been var customerCount: static int;). Once you’re in the position where the type comes first on variable declarations, it makes sense to apply the same rule to all other kinds of declarations – methods, formal parameters, and so on.

It might have been nicer back in the early days of C# to say “you know, we have a type-safe, memory-managed language, let’s do what VB does, de-emphasize the type mechanism and put the type as an annotation on the end”. We could then make that consistent throughout the language so that Func<A, R> referred to delegate Func<A, R>(arg : A ) : R. But that ship has sailed and we’re stuck with the declaration syntax we’ve got.

Comments (26)

  1. jtenos says:

    I can’t say that I agree with you on what C# should have done – maybe it’s just because I’ve used the C#/Java syntax for so long that I find it awkward to think of variables and function declarations the opposite way…

    But my comment is specifically on Func<T, TResult> part of your post.

    Since C# declares things like:

    string ConvertToString(int i) {}

    and VB.NET declares it as:

    Function ConvertToString(i as Integer) as String

    It would make sense that C# people would want it as Func<TResult, T> and VB people would want it as Func<T, TResult>.

    I can see one of two things happening:

    1) The person in charge of deciding the order of the type parameters asked the C# dev lead and the VB.NET dev lead to help him decide – they couldn’t come up with a consensus, so they played rock-paper-scissors two out of three to decide which order they would be in.  VB guy won.

    I was not there for that decision, but knowing the principals who were there, this strikes me as highly unlikely. I think your premise is incorrect. I don’t think anyone on the C# team actually thinks that “return type first” would be better. I suspect that consensus was reached early. — Eric

    2) Microsoft felt sorry for the VB.NET team, since C# seems to be Microsoft’s favorite language (although I’m sure they’ll never admit it) – there just seems to be a little bit of favoritism whenever Microsoft is demonstrating or talking about .NET, it always seems to favor C#.  They figured they’d be nice to the VB.NET folks this one time and throw them a bone.

    I hear this supposition a lot. The idea that VB and C# are in a zero-sum game, and that any benefit to one is a loss of the other is not at all how things work around here. Nothing could be further from the truth. The VB and C# teams have of course had friendly rivalries over the years, but we are ultimately all the same big team and get a lot of support from each other. I suspect that any observed bias is either (1) actually the bias of an observer who is likely to interpret support of C# as lack of support for VB, as though it were a zero-sum game, or (2) a result of the fact that most people at Microsoft are simply more comfortable working in a C-like language than a BASIC-like language. That fact has nothing whatsoever to do with the level of investment that the corporation is making in the future of VB. — Eric

  2. Now this is a hint when I try to remember the signature of the Func delegates.

  3. pminaev says:

    >>> Well, no, rather, it would be more fair to say that C is a non-typesafe, non-memory-managed language where it is vitally important that the code maintainer understands the lifetime and type of the data in every variable. Given that unfortunate situation, it makes sense to emphasize the storage mechanism first, and then the semantics second. Therefore in C you put the storage metadata first

    Or maybe Kernighan & Ritchie were just Algol-60 fans (it had type-first variable declarations)? ;)

    On a more serious note. It’s interesting that many languages from the curly-braces family seem to be moving away from the traditional C-style "type first" declarations to Pascal-style "identifier first". The obvious example is those EcmaScript dialects which mimicked the stillborn ES4 – such as JScript.NET:

     function foo(x: Number) : Number { … }

    But more recently there’s also C++0x, at least for function return types:

     auto Foo(int x) -> int { … }

     [](int x) -> int { … }

    It seems that type-first declaration is inconvenient in many ways – not just for humans to read for the more complicated cases, but also for compilers to parse, and for cases such as generics. We still have this odd bit in C# where a generic type parameter has to be used before it is properly defined, in generic method declarations:

     T Get<T>(int i);

    I would imagine it isn’t exactly friendly to the parser, but also wrecks IntelliSense when writing out that method declaration.

    It’s also interesting to see how C# dodges the bullet for return type of anonymous delegates / lambdas by not giving any way to explicitly specify it at all. C++0x guys had to stick to the "(TArgs…) -> TResult" syntax there…

    Then, also, for languages with ability to declare structural types in-line, when you are defining a function that returns a list of tuples of function pointers, type-first declarations tend to be unreadable (you know, all those C/C++ interview questions such as "write the definition of a function returning an array of pointers to functions…").

    >>> But that ship has sailed and we’re stuck with the declaration syntax we’ve got.

    So, the question then is – is there any remote chance that a "new-style" declaration syntax would be introduced in some future version of C# (obviously, while keeping the old one)?

  4. Christian Kaiser says:

    One of the ‘good’ sides is that you can scan your code faster. Often I am looking for a specific function (for example the function that returns the "type" of a derived class, returned by an overwritten virtual function of derived instances), and I can scan for the return values (String, certain enum, …) much faster as they are all aligned at the left side.

    For "reading clarity", I prefer the "short" part (return value) to the left, so that I can align the name and arguments starting at the same row below each other.

    Maybe it’s just that I’m used to it, but I came from Pascal to C (C++) and I had no problem either way. A programming language is not a ‘natural’ language and need not behave that way.

    Christian

  5. I wonder whether borrowing syntax from a language as arcane and archaic as C for any new language, possibly with vastly differing concepts and paradigms is ever a good idea. The only benefit you have is that programmers familiar with C-like languages may have less trouble understanding or writing simple code, but as soon as you’re getting to more advanced things you’re lost in any case. I certainly didn’t know how to read and understand lambdas before I read about how they work and are defined, for example.

    Having some known syntax might be good but ultimately it probably restricts what you can do with the language.

  6. Weeble says:

    I personally prefer the Pascal style, but I would point out that I find C# declarations a significant incremental improvement on C++. I really hate the way type information is arranged in C++:

    template <class T> ReturnType ClassName<T>::methodName(int *pi, T other) const

    We’ve got type data scattered all over the place – the template parameters, return type and class on the left, the argument list and const attribute on the right. The method name is utterly buried in amongst all that other stuff. I’ve not even gotten into the way "*" and "[]" work in variable or function declarations. I am very glad that C# puts all the type information for a variable in one contiguous string of text. I would have preferred return types to go on the right in method declarations, but I’m already pretty glad for what we’ve got.

    Also, I don’t think it even *occurred* to me that it should be Func<R,A>.

  7. LA.NET [EN] says:

    Eric Lippert has another very interesting entry which explains why the team decided to use the last type

  8. ASPInsiders says:

    Eric Lippert has another very interesting entry which explains why the team decided to use the last type

  9. carlos says:

    There’s another reason.  C++ was already using this convention, e.g:

    std::binary_function<typename Arg1, typename Arg2, typename Result>

    Programming .Net generics in C++ using a different convention would have been super-confusing.

    Though this doesn’t explain why C++ did this.  For the answer, see above.

  10. Stefan Wenig says:

    Eric,

    since you just ruled out the free infoof pony, is this now the time to request a free unicorn? Yes? Well, then, I’d like to ask for

    1) A -> B -> C syntax in addition to Func<A, B, C>. e.g. Func<IEnumerable<TSource>, Func<TSource,TResult>, IEnumerable<TResult>> just makes my brain try to escape through my ears. how beautiful would that be: IEnumerable<TSource> -> (TSource -> TResult) -> IEnumerable<TResult> *)

    2) automatic currying: (A -> B) -> C should be equivalent to A -> (B -> C)

    3) some kind of variadic type argument capability (maybe building on currying? not thought through though)

    4) equivalence of A -> B -> C to any delegate type, not just Func<A,B,C>

    5) type inference: (int x) -> x.ToString() should be of type int -> string automatically, where we have to provide an explicit type today or resort to tricks like Func<…> MakeFunc (Func<….> f) { return f; }
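The `MakeFunc` trick mentioned in item 5 can be sketched as follows. This is an illustration under stated assumptions: the helper's exact shape is elided in the comment above, so the generic signature used here is a guess at what is meant, and `InferenceDemo` is a made-up name.

```csharp
using System;

public static class InferenceDemo
{
    // Identity helper (assumed shape): its only job is to give the
    // lambda a delegate type whose type arguments can be inferred.
    public static Func<A, R> MakeFunc<A, R>(Func<A, R> f) { return f; }

    public static void Main()
    {
        // The explicit parameter type fixes A as int;
        // R is then inferred as string from the lambda body.
        var toText = MakeFunc((int x) => x.ToString());
        Console.WriteLine(toText(42)); // prints 42
    }
}
```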

    Right now, C# is in an uncomfortable spot. Very complicated for beginners and people who don’t get FP, but too limiting for people who really want to embrace FP. In a way, C#3 just made us horny, but it often leaves us unsatisfied. A fine currying-like syntax and variadic type args would bring us a lot closer to FP heaven, don’t you think?

    I agree, those are all awesome features. I’ll see what I can do, but no promises; we have a LOT of good suggestions for future features and very constrained budgets these days. — Eric

    *) I remember trying to grok monads in C#: http://blogs.msdn.com/wesdyer/archive/2008/01/11/the-marvels-of-monads.aspx (and I failed: I was able to implement the IO monad in C#, but I did not understand the code I created on a global level, only line by line, but I digress).

    the only way for me to follow the explanations was to translate the Func<A,B> types to A -> B using a pencil on the edge of the paper. Really, use any generic type in a Func<…> type and you’re lost in angle brackets. I might as well code in XSLT. :-(

  11. Tim Long says:

    My personal bias, but I think C#’s strong typing is one of its most important attributes. Under those circumstances, it seems right to emphasize type in the declaration syntax. Storage mechanism attaches fluently to type, so it makes sense to put that first, too. I think C# is just fine the way it is. If we make C# too much like VB, then what point is there in having two different languages? Vive la différence!

    I agree with you that static typing is important. I note though that “strong typing” basically is meaningless — it pretty much means “a type system I like”. Remember, there’s a difference between “static types” — that is, every expression and variable’s type is known by the compiler, and compile-time checks are made on the basis of that knowledge — and “explicit types” — that is, the text of the program explicitly contains the type information for each variable. C# is statically typed but is no longer explicitly typed, thanks to “var”. F# is statically typed but almost never has explicit types. — Eric

     

  12. Rick Dailey says:

    One nitpick with moving the type to the end is the amount of needless typing / extra characters:

    static int customerCount;

    var customerCount: static int;

    There’s a "var " and ": " in there that I don’t really feel a really strong urge to type in every time I declare a variable.  Of course, it’s not like a little typing killed anyone, but I don’t know that this semantic difference is worth it.

  13. James Curran says:

    But, your argument about right & wrong syntax is merely looking at function *declarations*.  Why don’t we look at function *use*?

    int returnVal = Func(parm1, parm2);

    And that syntax is essentially the same in C# and VB.Net and pretty much every other major language, and goes back to high school math (i.e., we say “Let B equal the number of berries picked” rather than “let the number of berries picked be B”).

    Then, we can say that VB.NET does it “wrong” because its declaration does not match its use. (which can be explained as strict typing in VB is a hack added in well after the original syntax was set)

    I see your point but I do not believe that it generalizes to anything other than assignment. What about method calls? If you believe that the return value logically “goes to the left” then shouldn’t x = Foo().Bar() be written as x = Bar().Foo() ? Because the returned value of Foo() gets “pumped to the left” into Bar, and then the return value of Bar gets pumped to the left into x. You like this? With chained method calls, the result “goes to the right”. When you have a method Foo:A –> B, and a method Bar:B –> C, you chain them together as Foo().Bar() because (A –> B) –> C, to the right.

    What about multiplication? When you say x = Foo() * Bar(), again, the returned result of Foo gets “pumped towards the right”, where the multiplication is about to happen. It doesn’t get pumped to the left, into the assignment.

    The issue highlighted here is actually that assignment is a weird operation; it would be more consistent if we had defined an operator x –> y which means “stuff the value of x into y”. Then assignment would “go to the right” like it’s supposed to.

    – Eric

     

  14. James Curran says:

    The multiplication example is a false lead: one can equally argue that the return value of Bar() is “pumped to the left” into the multiply, which is then pumped into the assignment.  I know the C & C++ (and I assume C#) Standards allow compilers to evaluate either side of the multiplication sign first.

    C# strictly defines the order of evaluation. Subexpressions in C# are always evaluated left-to-right, period, regardless of associativity or precedence. – Eric
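The left-to-right rule can be observed with a small sketch; `EvalOrderDemo`, `Log`, and `Compute` are hypothetical names, while `Foo` and `Bar` follow the names used in the discussion above. Each operand records its name when it runs, so the log shows the order in which the operands of the multiplication were evaluated.

```csharp
using System;
using System.Collections.Generic;

public static class EvalOrderDemo
{
    // Records the order in which the operands are evaluated.
    public static readonly List<string> Log = new List<string>();

    public static int Foo() { Log.Add("Foo"); return 2; }
    public static int Bar() { Log.Add("Bar"); return 3; }

    public static int Compute()
    {
        Log.Clear();
        return Foo() * Bar(); // left operand first, then right operand
    }

    public static void Main()
    {
        int x = Compute();
        Console.WriteLine(string.Join(", ", Log) + " => " + x); // prints Foo, Bar => 6
    }
}
```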

    The chaining example is harder to fight, but it is based more on the semantic of the dot operator than on method return.

  15. Greg says:

    - Return type to the left of the function name makes it possible to text search for all functions that return type ABC as well as see the return type of function DEF.  This helps where opening a class browser or equivalent does not find all references to a particular type (a problem with legacy code or mixed language projects).   Large bodies of code ( more than 500,000 lines) benefit most from this when you also force each line of code onto a single line to facilitate text searching.

    - Return type on the left of the function name makes it easier to visually see functions that have the same signature or almost the same signature and are candidates for being combined.   Removing 10 and 20 lines of duplicate code at a time, when done consistently over a period of many months, will shrink the number of lines in an application by percentage points.  This helps reduce the code bloat we’ve seen in offshore written applications which may have had productivity measured in lines of code instead of code quality.

    - .NET development tools in the near future should have easy 1 click convert code from format X to format Y or convert from C# to VB given the large amounts of metadata output by the compiler.   This would let VS 2014 have developers using a generic metadata as a programming language with C# or VB.NET silently generated and compiled into the application.  This would be a bonus to reusing/reworking old applications in that one could get them to compile under the new VS, run ‘extract metadata code’ and then work on the extracted code instead of the older format legacy code.

    - C, maybe pre-Posix, let you return and dereference a pointer on the left hand side of an assignment statement.

  16. George Spofford says:

    I doubt there is a "right" here. There is simply a distinction: arguments and return. I suspect (no research here) that K&R found the grammar easier to simplify (or terser: otherwise you may be compelled to use the "AS" or -> token) by putting the return type first- no performance, resource management or type-safety gain by putting that information first or last, as there is no semantic significance to choosing one sequence over another.

    That conventional "math" notation tends towards args first isn’t semantically significant either, but your exposition is helpful for setting mental analogs. I liken this to driving on the left side vs right side: either is fine, but remember which country you’re in and respect the rules of the road when you’re there!

  17. Stu says:

    Bah!

    @Greg nobody designs a language for easy grep’ing, it’s just a side-effect.

    @George +1

    I would disagree that either C or C# gets it wrong. It’s just a convention, and it wouldn’t be called C# if it didn’t resemble C / C++. As @George says, there is no inherent type-safety or resource management gain.

    Mathematicians don’t always agree on their symbols – it’s not like Euclid set everything in stone eons ago and since then mathematical notation has remained unchanged.

    Things evolve. Decisions were made, and language designers attempt to add new features without breaking existing code. You know this well, and not all decisions are based on their pure mathematical counterparts. If mathematical purity was the top of the list, then Linq or other post-Unicode languages would use the Unicode mathematical symbols for operators Union (U+222A), Subset (U+2282), or Intersection (U+2229), etc – but to my knowledge nobody really cares that much.

    All sorts of programming constructs are done for a variety of reasons, and I would expect terse but expressive as well as ease of maintenance are never far from the minds of the language designer, not to mention the task of writing the compiler itself.

    For what it’s worth, I find the Func<T,TResult> to be annoying and counterintuitive even if allegedly mathematically superior.

    When you provide overrides to Func<T1,T2,TResult> it seems more natural (from a C# background) to want to do Func<TResult,T> and Func<TResult, T1, T2> based on the almost universal assignment

    x = Func(a);    // Func<x,a>, but C# does Func<a,x>

    x = Func(a,b); // Func<x,a,b>, but C# does Func<a,b,x>

  18. pminaev says:

    @Greg

    >>> Return type to the left of the function name makes it possible to text search for all functions that return type ABC as well as see the return type of function DEF.

    For C#, maybe (I’m not 100% sure), but for C/C++, try to write a regex that would handle e.g. functions returning function pointers…

    In practice, for tasks such as one you describe, you really need proper tooling. E.g. IntelliJ IDEA has a "Java code search" feature with a pattern matching language that lets you match code constructs (rather than text).

    >>>  C, maybe pre-Posix, let you return and dereference a pointer on the left hand side of an assignment statement.

    Not sure what you mean here – you can dereference a pointer on left side of assignment in virtually every language that has pointers and assignment; including e.g. C#.

    @George

    >>> I suspect (no research here) that K&R found the grammar easier to simplify (or terser: otherwise you may be compelled to use the "AS" or -> token) by putting the return type first

    More likely, they just followed the type-first scheme for variable declarations, and those they have sort of inherited from B (not as types, but as storage modifiers).

    Terser by design is also likely; after all, we’re talking about guys who have decided what = and == should mean (i.e. which one should be assignment, and which comparison) by calculating the frequency of each operation, and using the shorter token for the more frequent operation (which turned out to be assignment).

  19. Greg says:

    - Text searching application source code works well when you have a large code base of different applications.

    - VS’s tools do not let you find all references to a function both in the current solution and all of the solutions you have in all of the other applications making up your entire code base

    - Text searching finds references to your function in places that are not searched by VS find all references ( strings, scripts, dynamically compiled code, code invoked by finding a function name in metadata, etc.).  This helps to identify and fix areas where the developer overengineered an application (e.g., creating and using a web service that is only called from one application -> solution is to move the web service inside of the calling application)

    - Large scale code refactoring is eased with code formatted for easy text searching.  Brute force refactoring for duplicate/near duplicate function finding is easier via text (findstr | sort | findstr, then look at the sorted list in a text editor).  Reflection could help but requires compiled code, which may be too costly to build.

    If you come into a client’s office, develop 50% of an application and then leave before it is in production for 3 months, you will not need to do text searching.  Taking the same application through 4 or 5 major development cycles and supporting multiple different versions in production use by your customers requires text searching.  Text searching is most beneficial for code that was partially developed and never supported post production by consulting companies.

  20. I’d like to second Stefan Wenig’s syntax suggestions. And add T* as shorthand for IEnumerable<T>

    So we could declare the type “function that maps a list of customers to a list of addresses” like this:

       Customer* -> Address* getAddresses;

    Equivalent to:

       Func<IEnumerable<Customer>, IEnumerable<Address>> getAddresses;

    And then there’s lifting, which already happens for Nullable, but it would be cool for other things to have that expressiveness. How about:

       e..Foo()

    as shorthand for:

       e.Select(i => i.Foo()) assuming e is a T* (“lifting”)

    Allowing us to init the function we declared before:

       getAddresses = customers => customers..GetAddress();

    As (in line with the Linq query keywords) the .. lifting operator would expand to call Select, so you could define how lifting would work on other interfaces besides IEnumerable<T> by writing your own Select extension method.

    Which reminds me, how about also letting the operator overloading map to special method names just like the linq query keywords do? So, assuming nothing better is available (i.e. this would fail to compile in C# 4.0):

        var result = a + b;

    But in some future version the compiler would (as a last resort) try:

        var result = a.Plus(b);

    Obviously the current C# static operator overloading system is more ideal for the cases it already handles, because of the symmetry imposed by the lack of virtual dispatch on the LHS, as Eric has previously discussed.

    However, by allowing the operators to also map onto special method names, as a fallback mechanism, we’d get some nice advantages added to the language in a backward-compatible way. And it only seems reasonable given that the linq keywords work that way.

    e.g. define an extension method:

       public static T* Plus<T>(this T* source, T* other)
       {
           return source.Concat(other);
       }

    We’d now be able to concat two sequences with: a + b.

    Or with this:

       public static TOut Pipe<TIn, TOut>(this TIn source, TIn -> TOut maybe)
       {
           if (source == null)
               return null;
           return maybe(source);
       }

    We could say:

    return Music.GetCompany("4ad.com")
           | company => company.GetBand("Pixies")
           | band => band.GetMember("David")
           | member => member.Role;

    And if any of those steps produced null, the returned result would be null, i.e. maybe monad.

    I know it’s a long shot, but no harm in asking, eh? :)

    Indeed, no harm. These are good ideas.

    I think if we were designing C# from scratch knowing what we know now, we’d probably go with a pattern-based approach to operator overloading like we do with query comprehensions.

    “*” is nice, but unfortunately it is too easily confused with the pointer syntax. We considered making “T{}” a shorthand for IEnumerable<T>, which is nice because it has a good symmetry with T[]. Unfortunately it did not meet the bar for C# 4.0. Maybe in a hypothetical future version.

    And finally, it is a little-known fact that C# does have a “..” operator. It only works when you type an expression into the watch window in the debugger! (UPDATE: Whoops, I am mistaken. See below.)

    – Eric

  21. Stefan Wenig says:

    @Daniel T* is a pointer to T already, you’d have to find another symbol.

    besides that, monads already inspired LINQ, and I don’t see why they should not inspire other new C# features. just as long as C# does not try to embrace the generic concept of monads. if that’s what you want, you should definitely consider switching to another language.

  22. Joren says:

    @Greg

    Proper searching is of course very useful and important, but I think it’s a problem for tools to solve, and not something the language design should consider.

  23. Good to know that these things are being discussed.

    Personally I’d be happy to leave unsafe blocks with only the verbose syntax, as they’re (hopefully) rare beasts anyway.

    (I also thought about suggesting -> for the lifting member access operator instead of .. to correspond with C/C++, but I think that would have been maliciously confusing as I also used that symbol for Haskell-style func-type declarations in the same post!)

  24. Omer Mor says:

    Eric,

    Can you give us hints about the double-dot (..) operator you mentioned?

    I found no documentation about it, was unable to google it, and failed to use it in a watch expression.

    Help! :-)

    Omer.

    Whoops, turns out I was wrong. Apparently we cut that feature a long time ago; I must have missed the memo. We were planning on having a feature in the debugger where you could say “myArray,[x..y]” in the watch window and then the debugger would show you just the portion of the array that you’d specified. Very handy for displaying chunks of large arrays. Looks like it was cut for lack of testing resources. Sorry to get your hopes up! — Eric

  25. holatom says:

    I’m not a Haskell guy or anything other than a C# guy, so I don’t understand the syntax in Stefan Wenig’s new feature proposal. If I have a lambda expression, e.g.

    (source, selector) => source.Select(selector)

    of type

    Func<IEnumerable<TSource>, Func<TSource,TResult>, IEnumerable<TResult>>,

    I would have expected a simplified notation for that type name to be something like

    (IEnumerable<TSource>, (TSource -> TResult)) -> IEnumerable<TResult>.

    Why/what is IEnumerable<TSource> -> (TSource -> TResult) -> IEnumerable<TResult> i. e. “->” symbol between two input parameters?

    This is called “currying”. Any function of two parameters can be turned into two functions of one parameter. If you have f = (x,y)=>x+y, and you call it f(2, 3), then you can turn that into g = x=>y=>x+y, and call it g(2)(3).  That is, the first call returns “y=>y+2”. The technique is named after Haskell Curry, the same guy the language is named after. I don’t believe I haven’t blogged about this yet. I’ll get right on it. — Eric
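The f and g from the explanation above can be written out as runnable C#. This is only a sketch: `CurryDemo` and the generic `Curry` helper are hypothetical names added for illustration and are not part of the framework.

```csharp
using System;

public static class CurryDemo
{
    // Uncurried: one function of two parameters.
    public static readonly Func<int, int, int> F = (x, y) => x + y;

    // Curried: a function of one parameter that returns
    // another function of one parameter.
    public static readonly Func<int, Func<int, int>> G = x => y => x + y;

    // Hypothetical helper that curries any two-parameter function.
    public static Func<A, Func<B, C>> Curry<A, B, C>(Func<A, B, C> f)
    {
        return a => b => f(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(F(2, 3));        // prints 5
        Console.WriteLine(G(2)(3));        // prints 5
        Console.WriteLine(Curry(F)(2)(3)); // prints 5
    }
}
```

Note how the curried type `Func<int, Func<int, int>>` reads left to right as int -> (int -> int), which is the shape the “->” notation in the proposal expresses.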

  26. holatom says:

    Thanks for explanation Eric, I’m looking forward to your next great posts already.