Ref returns and ref locals


“Ref returns” are the subject of another great question from StackOverflow that I thought I might share with a larger audience.

Ever since C# 1.0 you’ve been able to create an “alias” to a variable by passing a “ref to a variable” to certain methods:

static void M(ref int x)
{
    x = 123;
}

int y = 456;
M(ref y);

Despite their different names, “x” and “y” are now aliases for each other; they both refer to the same storage location. When x is changed, y changes too because they are the same thing. Basically, “ref” parameters allow you to pass around variables as variables rather than as values. This is a sometimes-confusing feature (because it is easy to confuse “reference types” with “ref” aliases to variables), but it is generally a pretty well-understood and frequently-used feature.
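To make the difference between a reference type and a “ref” alias concrete, here is a minimal sketch. The class and method names are invented for illustration; nothing here depends on the hypothetical features discussed later in this article.

```csharp
using System;

public static class RefAliasDemo
{
    // x is an alias for the caller's variable: assigning to x reassigns that variable.
    public static void ReplaceByRef(ref int[] x)
    {
        x = new[] { 9, 9, 9 };
    }

    // x is a copy of the caller's reference: assigning to x changes only the copy.
    public static void ReplaceByValue(int[] x)
    {
        x = new[] { 9, 9, 9 };
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3 };

        ReplaceByValue(data);
        Console.WriteLine(data[0]); // 1: the caller's variable still holds the old array

        ReplaceByRef(ref data);
        Console.WriteLine(data[0]); // 9: the caller's variable now holds the new array
    }
}
```

In both methods the parameter has a reference type; only the “ref” version makes the parameter an alias for the caller's variable.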

However, it is a little-known fact that the CLR type system supports additional usages of “ref”, though C# does not. The CLR type system also allows methods to return refs to variables, and allows local variables to be aliases for other variables. The CLR type system, however, does not allow fields that are aliases to other variables. Similarly, arrays may not contain managed references to other variables. Fields and arrays containing refs are both illegal because making them legal would overly complicate the garbage collection story. (I also note that the “managed reference to variable” types are not convertible to object, and therefore may not be used as type arguments to generic types or methods. For details, see the CLI specification, Partition I, Section 8.2.1.1, “Managed pointers and related types”.)

As you might expect, it is entirely possible to create a version of C# which supports both these features. You could then do things like

static ref int Max(ref int x, ref int y)
{
  if (x > y)
    return ref x;
  else
    return ref y;
}

Why do this? It is quite different than a conventional “Max” which returns the larger of two values. This returns the larger variable itself, which can then be modified:

int a = 123;
int b = 456;
ref int c = ref Max(ref a, ref b);
c += 100;
Console.WriteLine(b); // 556!

Kinda neat! This would also mean that ref-returning methods could be the left-hand side of an assignment — we don’t need the local “c”:

int a = 123;
int b = 456;
Max(ref a, ref b) += 100;
Console.WriteLine(b); // 556!
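For readers who want to run the two snippets above, here is a sketch that assumes a compiler for C# 7.0 or later, where ref returns and ref locals with essentially this syntax are accepted. Only Max comes from the article; the wrapper class and the other method names are invented for illustration.

```csharp
using System;

public static class RefMaxDemo
{
    // Returns an alias to whichever variable holds the larger value.
    public static ref int Max(ref int x, ref int y)
    {
        if (x > y)
            return ref x;
        else
            return ref y;
    }

    // Uses a ref local as the alias.
    public static int ViaRefLocal()
    {
        int a = 123;
        int b = 456;
        ref int c = ref Max(ref a, ref b);
        c += 100;          // modifies b through the alias
        return b;
    }

    // Uses the ref-returning call directly as the left-hand side.
    public static int ViaDirectAssignment()
    {
        int a = 123;
        int b = 456;
        Max(ref a, ref b) += 100;
        return b;
    }

    public static void Main()
    {
        Console.WriteLine(ViaRefLocal());         // 556
        Console.WriteLine(ViaDirectAssignment()); // 556
    }
}
```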

Syntactically, “ref” is a strong marker that something weird is going on. Every time the word “ref” appears before a variable usage, it means “I am now making some other thing an alias for this variable”. Every time it appears before a declaration, it means “this thing must be initialized with a variable marked with ref”.

I know empirically that it is possible to build a version of C# that supports these features because I have done so in order to test-drive the possible feature. Advanced programmers (particularly people porting unmanaged C++ code) often ask us for more C++-like ability to do things with references without having to get out the big hammer of actually using pointers and pinning memory all over the place. By using managed references you get these benefits without paying the cost of screwing up your garbage collection performance.

We have considered this feature, and actually implemented enough of it to show to other internal teams to get their feedback. However, at this time, based on our research, we believe that the feature does not have broad enough appeal or compelling enough usage cases to make it into a real supported mainstream language feature. We have other higher priorities and a limited amount of time and effort available, so we’re not going to do this feature any time soon.

Also, doing it properly would require some changes to the CLR. Right now the CLR treats ref-returning methods as legal but unverifiable because we do not have a detector that detects and outlaws this situation:

static ref int M1(ref int x)
{
  return ref x;
}

static ref int M2()
{
  int y = 123;
  return ref M1(ref y); // Trouble!
}

static int M3()
{
    ref int z = ref M2();
    return z;
}

M3 returns the contents of M2’s local variable, but the lifetime of that variable has ended! It is possible to write a detector that determines uses of ref-returns that clearly do not violate stack safety. We could write such a detector, and if the detector could not prove that lifetime safety rules were met then we would not allow the usage of ref returns in that part of the program. It is not a huge amount of dev work to do so, but it is a lot of burden on the testing teams to make sure that we’ve really got all the cases. It’s just another thing that increases the cost of the feature to the point where right now the benefits do not outweigh the costs.
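As a rough illustration of what such a detector would accept and reject, the sketch below assumes a C# 7.0 or later compiler, which does perform a check of this kind. The variant of M2 that takes an array parameter is an assumption made for the example; the original M2 is kept only as a comment because the compiler rejects it.

```csharp
using System;

public static class LifetimeDemo
{
    public static ref int M1(ref int x)
    {
        return ref x;
    }

    // Accepted: the ref passed to M1 points into a heap-allocated array,
    // which outlives this method call.
    public static ref int M2(int[] heap)
    {
        return ref M1(ref heap[0]);
    }

    // Rejected at compile time: y is a local whose lifetime ends when the method returns.
    // public static ref int M2Unsafe()
    // {
    //     int y = 123;
    //     return ref M1(ref y);
    // }

    public static int M3()
    {
        int[] heap = { 123 };
        ref int z = ref M2(heap);
        z += 1;            // writes through the alias into heap[0]
        return heap[0];
    }

    public static void Main()
    {
        Console.WriteLine(M3()); // 124
    }
}
```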

If we implemented this feature some day, would you use it? For what? Do you have a really good usage case that could not easily be done some other way? If so, please leave a comment. The more information we have from real customers about why they want features like this, the more likely it will make it into the product someday. It’s a cute little feature and I’d like to be able to get it to customers somehow if there is sufficient interest. However, we also know that “ref” parameters are one of the most misunderstood and confusing features, particularly for novice programmers, so we don’t necessarily want to add more confusing features to the language unless they really pay their own way.

Comments (60)

  1. Travis says:

    I don't really see myself ever using something like this. I'm sure others would find it useful, but I imagine the majority would not.

  2. Random832 says:

    How would such a detector work?

    Specifically, how would it detect this as invalid but not _also_ detect the same M1 and M3 as invalid if  M2 were:

    class intbox { public int value; }
    ref int M2()
    {
      intbox y = new intbox { value = 123 };
      return ref M1(ref y.value);
    }

    Or if it's M2 that would be detected as invalid, what if M1 returned an (unrelated) intbox.value?

    If it's conservative enough that it will reject it even with either [or both] of these changes, what cases _will_ it accept?

    Good questions. When researching this prototype we did a sketch of how such a detector might work. In the scenario you describe there is no problem because no ref to y is ever passed to M1, and therefore M1 cannot possibly return a ref to y. y.value is a variable on the heap somewhere, so it is perfectly safe to pass around arbitrarily. The actually dangerous scenario is

    struct intbox { public int value; } // NOW A STRUCT
    ref int M2()
    {
      intbox y = new intbox { value = 123 };
      return ref M1(ref y);  // Now M1 takes a ref intbox.
    }

    M1 might be returning a ref to y.value, which is on the stack and about to die.

    The detector would have to keep track of what local variables of value type were being passed by ref, and if the returns coming back could possibly be interior to those locals. If you had:

    ref double M2() // now returns a ref double
    {
      intbox y = new intbox { value = 123 };
      return ref M1(ref y);  // Now M1 takes a ref intbox and returns a ref double
    }

    then no problem; there's no way M1 could be returning a ref double that came out of the storage of y, because intbox doesn't have any field of type double.

    Basically you just need to do a little local flow analysis on every ref local, and see if any ref return can possibly be returning the local or a portion of it. It's not that hard a problem. — Eric
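The class-versus-struct distinction in this reply can be tried out with the sketch below, which assumes a C# 7.0 or later compiler. The type and method names are invented for the example, and the struct case is left as a comment because such a compiler refuses it.

```csharp
using System;

public class IntBox { public int Value; }          // class: the field lives on the heap
public struct IntBoxStruct { public int Value; }   // struct: a local of this type lives in the method's frame

public static class FlowDemo
{
    public static ref int M1(ref int x) => ref x;

    // Accepted: box.Value is heap storage, so the alias may safely escape.
    public static ref int FromHeap(IntBox box) => ref M1(ref box.Value);

    // Rejected at compile time: y.Value is part of a local that is about to die.
    // public static ref int FromStack()
    // {
    //     IntBoxStruct y = new IntBoxStruct { Value = 123 };
    //     return ref M1(ref y.Value);
    // }

    public static void Main()
    {
        IntBox box = new IntBox { Value = 123 };
        FromHeap(box) += 1;
        Console.WriteLine(box.Value); // 124
    }
}
```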

  3. Mario Vernari says:

    A little off-topic…

    On a desktop system, rich in resources, I'd cut the "ref" accessor. I don't see any good reason to keep it, especially now that the world is moving toward async and immutability is becoming more important.

    Anyway, C# can also run on the compact and micro frameworks, where resources are like water in the desert.

    Now, consider an array of structs (e.g. Point) and a loop that translates all of them by an offset (also a Point):

    for (int i=0; i<N; i++)
    {
     pt[i].X += offset.X;
     pt[i].Y += offset.Y;
    }

    Well, in this trivial case the ref is important, and would be important even if I were able to use "inline". That is because the loop performs poorly: it has to access the indexer twice.

    If I add this helper:

    static void Adder(ref Point pt, ref Point offset)
    {
     pt.X += offset.X;
     pt.Y += offset.Y;
    }

    the performance improves a lot, because there is only one indexing operation and nothing is copied.

    My question is: would it be worthwhile to be able to take an inline "ref" to a struct in an array without having to write a separate function?

    Thanks a lot.
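What Mario asks for corresponds to a ref local that aliases an array element. The sketch below assumes a C# 7.0 or later compiler; Point and Translate are stand-ins invented for the example, and it makes no claim about how much faster this is than the helper method.

```csharp
using System;

public struct Point
{
    public int X;
    public int Y;
}

public static class TranslateDemo
{
    // Translates every point in place, indexing the array once per element.
    public static void Translate(Point[] pts, Point offset)
    {
        for (int i = 0; i < pts.Length; i++)
        {
            ref Point p = ref pts[i];   // alias to the element, not a copy
            p.X += offset.X;
            p.Y += offset.Y;
        }
    }

    public static void Main()
    {
        Point[] pts = { new Point { X = 1, Y = 2 } };
        Translate(pts, new Point { X = 10, Y = 20 });
        Console.WriteLine(pts[0].X + "," + pts[0].Y); // 11,22
    }
}
```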

  4. Jeff C says:

    Please don't add this to C#.  I don't want to maintain code that uses it.

    I hear you, but I should point out that we get that feedback from customers for pretty much every single feature we propose adding. People basically say "Well I will know how to use this feature correctly, but my idiot coworkers are going to mess it all up and then I'm going to have to clean up their godawful code, so please don't give those bozos any more power." We got that feedback for generics, LINQ, dynamic, async, you name it.

    We take very seriously the fact that features can be misused; we want C# to be a "pit of quality" language, where the language naturally leads you to write the high-quality solution and you really have to climb out of the pit to write something low-quality. But we also trust our customers to follow good practices and to learn how a powerful tool works before they start building with it.

    What scares me about this feature is that it makes it easier to write programs that have lots of variable aliasing in them. Aliasing is hard on the compiler because it greatly complicates analysis. And if the compiler is having a hard time, humans are going to have a hard time as well. But if hypothetically we did this — and like I said, we're probably not going to — it's not like we're going to go down the C/C++ road and allow you to pass back references to dead variables. The feature will still be memory-safe. — Eric

  5. Sam says:

    I'm with Jeff; I think maintaining this on methods might be hard.

    However, I wouldn't mind it on properties; it would be nice to be able to return a struct (such as Point) and you can change a property of that struct without having to copy the struct to a local, change the local and then set it back to the original property.
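A ref-returning property of the kind Sam describes might look like the sketch below, assuming a C# 7.0 or later compiler. Vec2 and Sprite are hypothetical types invented for the example.

```csharp
using System;

public struct Vec2
{
    public int X;
    public int Y;
}

public class Sprite
{
    private Vec2 position;

    // Returns an alias to the field instead of a copy of the struct.
    public ref Vec2 Position => ref position;
}

public static class RefPropertyDemo
{
    public static void Main()
    {
        Sprite sprite = new Sprite();
        sprite.Position.X = 5;   // writes straight into the private field
        sprite.Position.Y += 3;
        Console.WriteLine(sprite.Position.X + "," + sprite.Position.Y); // 5,3
    }
}
```

The trade-off is the one raised elsewhere in this thread: callers can now change the field without the class being notified.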

  6. David V. Corbin [MVP] says:

    A passionate plea to NOT add this.

    @Mario, there are a number of patterns/use cases where "ref" parameters really make good sense. For example, if you are implementing an immutable system and need to update the caller with multiple new instances. Also, it is a very handy paradigm when initializing read-only fields and you want to factor this out of the constructor body itself. [although I wish the definition of readonly was changed…but after 5 years, I have given up even asking]

  7. jader3rd says:

    I can see how it appeals to those who like tricky unreadable code, but we're not counting every byte in application as much as we did back when C was invented.

    I prefer having readable code and letting the compiler/Jitter work out the "tricks".

    I find that this feature would quickly result in bugs because I feel that it doesn't follow the principle of least surprise.

  8. mayank.kumar@live.in says:

    This sure would reduce the readability/maintainability of code. But that doesn't mean that it should not be implemented. Experts who need this kind of power may use it.

  9. Shuggy says:

    For the life of me I can't think of any uses for this in my day to day usage. I can't even think of uses for it in extreme performance scenarios when I would be willing to tolerate the conceptual complexity incurred.

    I'd therefore go with the 'no' option unless someone could point out some compelling use cases I think would benefit me :)

    The hit to the reflection/generic layer would also be quite unpleasant (especially since you don't even have the ultimate (if costly) fallback of treating it as a (possibly boxed) object).

  10. Jonas says:

    Please don't add this, whenever I see a ref I'm very suspicious of what is going on.

    If you need C++ features use managed C++

  11. Bill P. Godfrey says:

    I wonder what would be the practical differences between ref types as you discuss, and a generic class…

    public class ValueRef<T> { public T Value; } /* Untested. Ctor etc omitted. */

    The Max function mooted would instead accept two ValueRef<int> object references and return one of those. There's only one copy of the value inside, as long as all accesses are via x.Value.

    Granted, this isn't as tidy as putting the word 'ref' next to the type name, and I'm also side-stepping the question of what practical uses such an object has.

    (I've not tested any of this or really thought it through. Please be nice.)

    billpg

    So how do you make a ValueRef<int> to the tenth element of an integer array, say?

    Your idea is not so farfetched though. Something I deliberately did not mention in this article is that something like the ValueRef type you propose actually exists! It is called TypedReference and it is a Very Special Type. It is used only for obscure interop scenarios where you need to be able to pass around a reference to a variable of type where the type is not known at compile time. This is a subject for another day. — Eric
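The exchange above can be sketched as follows. The constructor and the demo methods are assumptions added to billpg's one-line class; the second method shows why the array question is awkward for a wrapper object.

```csharp
using System;

public class ValueRef<T>
{
    public T Value;
    public ValueRef(T value) { Value = value; }
}

public static class ValueRefDemo
{
    // Returns the box holding the larger value, not a copy of the value.
    public static ValueRef<int> Max(ValueRef<int> x, ValueRef<int> y)
    {
        return x.Value > y.Value ? x : y;
    }

    public static int BoxedMax()
    {
        ValueRef<int> a = new ValueRef<int>(123);
        ValueRef<int> b = new ValueRef<int>(456);
        Max(a, b).Value += 100;
        return b.Value;
    }

    // A box built from an array element holds a copy, so the array is untouched.
    public static int TenthElementAfterWrite()
    {
        int[] numbers = new int[10];
        ValueRef<int> tenth = new ValueRef<int>(numbers[9]);
        tenth.Value = 42;
        return numbers[9];
    }

    public static void Main()
    {
        Console.WriteLine(BoxedMax());               // 556
        Console.WriteLine(TenthElementAfterWrite()); // 0
    }
}
```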

  12. Brian says:

    I agree with Jeff.

    I can't think of any situations where I would actually want this.  In the strange event that I need something like this, I'll maybe use something like Eric's ref class ( stackoverflow.com/…/2982037 ).  If someone trying to port C++ code hits an issue that requires something like this, I'd prefer solutions that avoid introducing extra syntax; it only serves to add more ways for other people to make code less maintainable.  One of the things I like about all the Marshalling functionality is that it's mostly off to the side and ignorable until actually needed.

    I do see Sam's point and *have* encountered situations where it would have made my code slightly simpler, but every time I've hit such a situation I was able to work around it very easily.  I think supporting even that much would add more problems than it would solve.

  13. practicalvb says:

    Eric, I hope you'll blog about TypedReference. Not long ago, I had to write the equivalent of htmlTextWriter._attrList[index].value = someValue using Reflection. Because _attrList is an array of RenderAttribute structures, I had to get and set the entire array element just to set the value field. It seems like using FieldInfo.SetValueDirect could have made this a little more efficient.

  14. Stephen Cleary says:

    Meh.

    As a long (long, long) time C++ user, I tend to avoid aliases. They're very difficult for the compiler/optimizer to reason with, not to mention humans. If I had a dollar for every bug… (including compiler bugs; I found and developed minimal repros for a couple dozen from Borland, a handful from GCC, and countless ones from Microsoft – no offense).

    Your Max() example triggers neurons in my brain associated with C preprocessor macros – that's what it mentally feels like. Maybe a better example would show how this would be useful, but I can't think of any case where I'd use this (however, I stayed up all night with my sick 23-month-old, so my brain isn't exactly 100% at the moment).

    I'd rather pass everything by value, even going so far as suggesting a Python-esque multiple-return-value syntax:

     (resultA, resultB) = Func();

    so that the "out" keyword is no longer necessary. Though I guess "ref" could still be used if somebody *really* had to pass a large mutable value type (not something I've ever seen recommended. Or something I've ever done after my first week in C#.). You could even treat this as a syntax-only change, converting additional return values to reference parameters under the covers.

    This would be nudging the language in the opposite direction of the "ref return" idea, but to my mind it would result in more clear code.

  15. pminaev says:

    > doing it properly would require some changes to the CLR. Right now the CLR treats ref-returning methods as legal but unverifiable

    In fact, the story is more subtle than that. While Ecma-335 does say that any return by reference is unverifiable, the .NET implementation of the spec does a more stringent analysis. In particular, it is verifiable to return a managed pointer to a field of a reference type (i.e. ldflda immediately followed by ret).

    VC++ actually implements such checks during compilation if compiling with /clr:safe. So:

    ref class Foo
    {
    public:
       int x;
       int% GetX() { return x; }
       int% GetY(int% y) { return y; }
    };

    GetX() will compile successfully and produce verifiable code, but the compiler will bark on GetY().

  16. Steven says:

    I agree with most of the other comments here – I don't think I'd use this if it were implemented, and I don't think it's really necessary.

  17. Josh Smeaton says:

    I was under the impression that the general trend was moving away from mutable types and operations. I've seen ref abused quite thoroughly; mainly by those that don't (and still don't) understand the difference between reference and value types. But that's still not a good reason for rejecting a feature I'll admit.

    I'm more interested in the 'message' such a feature would send. "We're encouraging side-effects".

  18. Simon Buchan says:

    I have wanted a subset of this occasionally – ref returns of array elements or fields transitively of a ref type (eg "ref Ref.Struct.Struct" would be ok) would be quite useful, and (even unreturnable) local ref variables would be nice, but I wouldn't know how useful until I used this. Remember, for better or worse, in a lot of cases the alternative is public mutable fields.

  19. nonoitall says:

    This would definitely open up the way to some nice performance improvements in collection indexers.  For example, right now if you want to increment a value in a dictionary, you effectively have to look up the same key twice.  This can be extremely expensive  in performance-critical areas.  With ref variables though, you could just return a ref on the first lookup and increment the variable that it refers to.  Yes, refs are a little trickier to use than normal variables, but it certainly wouldn't be the most complex feature to graze the C# language.  (I mean, it already has *unmanaged* references and pointers.  It seems a little backwards that *managed* references would be omitted.)
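The single-lookup increment nonoitall describes can be sketched with CollectionsMarshal.GetValueRefOrAddDefault, assuming .NET 6 or later, where that method is available. The class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class CounterDemo
{
    // Increments the counter for key with one hash lookup.
    public static void Increment(Dictionary<string, int> counts, string key)
    {
        // Returns an alias to the stored value, adding a default (0) entry if the key is missing.
        ref int slot = ref CollectionsMarshal.GetValueRefOrAddDefault(counts, key, out _);
        slot++;
    }

    public static void Main()
    {
        Dictionary<string, int> counts = new Dictionary<string, int>();
        Increment(counts, "apple");
        Increment(counts, "apple");
        Console.WriteLine(counts["apple"]); // 2
    }
}
```

The alias is only valid while the dictionary is not modified, so the ref local should stay short-lived.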

  20. Héctor says:

    Although I see its uses, I don't know if this is something I'd use because of people having troubles maintaining the code, however, if it were up to me, I'd add this feature anyway.

  21. Fabien Barbier says:

    I wrote something like this recently:

    public class SomeClass
    {
        bool processed1;
        bool processed5;
        bool processed8;

        AdditionalData additionalData1/* = … */;
        AdditionalData additionalData5/* = … */;
        AdditionalData additionalData8/* = … */;

        ProcessedData processedData1;
        ProcessedData processedData5;
        ProcessedData processedData8;

        public void ProcessSomething(ThingKind thingKind, ThingData data)
        {
            AdditionalData additionalData;
            TypedReference processedDataField;
            TypedReference processedField;

            // Initialization: pick the fields that belong to this kind of thing
            switch (thingKind)
            {
                case ThingKind.Thing1:
                    additionalData = additionalData1;
                    processedDataField = __makeref(processedData1);
                    processedField = __makeref(processed1);
                    break;
                case ThingKind.Thing5:
                    additionalData = additionalData5;
                    processedDataField = __makeref(processedData5);
                    processedField = __makeref(processed5);
                    break;
                case ThingKind.Thing8:
                    additionalData = additionalData8;
                    processedDataField = __makeref(processedData8);
                    processedField = __makeref(processed8);
                    break;
                default:
                    throw new NotSupportedException();
            }

            // Actual processing…
            // Throw exceptions if there are problems
            // Then finish by writing through the typed references
            __refvalue(processedDataField, ProcessedData) = /* something */;
            __refvalue(processedField, bool) = true;
        }
    }

    This code would benefit from official C# "ref locals"…

    (Of course, this specific method could be implemented in other ways not requiring the use of references, but most of them would end up requiring a lot more code (arrays, reflection, delegates…).

    Calling an external (private) method in each case label would however, perfectly work in this simple case, but would not work for more complex ones.)

    I felt a little shameful using those "famous" undocumented keywords, but to me the code feels much cleaner than any other way I could have ended up with.

    Other than this, "ref properties", as mentioned by Sam, would be quite a useful feature.

    The point of regular properties usually is to encapsulate a field, and do additional processing (e.g range checking, …) before setting the value.

    However, one may sometimes not need to verify what is written to the encapsulated field, but return an *efficient* reference to said field instead.

    Being able to write something like this might be interesting:

    public ref byte this[bool b, int i] { ref get { return (b ? array1 : array2)[i - 50]; } }

    (I chose to write "ref get", as it emphasizes the fact that it is part of a "ref property")

    Additionally, considering you propose to support expressions such as:

    Max(ref a, ref b) += 100;

    Am I right assuming that this means expressions such as this one:

    (boolean ? ref int1 : ref int2) = 12; // would also work ? (Because this would be great to have sometimes…)

    Anyway, I would really like to have such a feature in the language, but I admit I wouldn't use it everyday.
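The conditional form Fabien asks about can be tried with the sketch below, which assumes a C# 7.2 or later compiler, where a conditional ref expression is accepted as an assignment target. The method name and return shape are invented for the example.

```csharp
using System;

public static class ConditionalRefDemo
{
    // Assigns 12 to one of two locals, chosen by the flag, and returns both values.
    public static int[] Assign(bool pickFirst)
    {
        int int1 = 1;
        int int2 = 2;
        (pickFirst ? ref int1 : ref int2) = 12;
        return new[] { int1, int2 };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Assign(true)));  // 12,2
        Console.WriteLine(string.Join(",", Assign(false))); // 1,12
    }
}
```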

  22. lidudu says:

    It is common in C++ classes to have methods which return CValueType& (for read-write) or const CValueType& (for read-only). And I did want that when I was learning C#. But now I am a C# guy and I changed my mind.

    Firstly, returning CValueType& somewhat breaks encapsulation. It makes the field modifiable to any value by other code at any time without notifying about it. It does not matter for simple case like List<>, but matters in case of Collection<T> derivatives which may need to do some action on change.

    Secondly, as long as CValueType& (ref return) is supported, we'd want const CValueType& (const ref return) to avoid arbitrary modification, which means introducing the tedious const-correctness thing of C++ to C#.

    Thirdly, it adds another way to implement read-only and read-write properties.

    So, I second the opinion that we should let the compiler/jitter optimize away the value type overhead instead.

  23. Alex Davies says:

    I don't mind the feature a great deal. I can't see myself using it much if at all, but it's plenty clear what's going on; what with all the refs around the place, few people would ever be caught out surprised.

    And I have to say.. I do actually like Fabien's example above:

    (boolean ? ref int1 : ref int2) = 12;

    Despite that for performance reasons I imagine the compiler would ideally turn that into the equivalent if/else without references.

  24. Judah Himango says:

    Please don't add this. As someone who's worked on commercial .Net systems since .Net v1, I cannot imagine a case where I'd need this; neither would I wish to maintain such a system.

  25. pete.d says:

    Well, as long as you're doing the survey… :)

    Honestly, this is a feature I can do without. However, if it existed I would probably use it occasionally. Oddly enough, not the ref return type so much as the ref locals.

    In fact, just today I was cleaning up some argument parsing code where I was thinking it would be nice to be able to do something like:

     string hasoperandvalue = null, otherwithopvalue = null;
     ref string operand = null;
     foreach (string arg in args)
     {
       if (operand != null)
       {
         operand = arg;
         continue;
       }
       switch (arg)
       {
         case "someflag":
           someflag = true;
           break;
         case "hasoperand":
           operand = ref hasoperandvalue;
           break;
         case "otherwithop":
           operand = ref otherwithopvalue;
           break;
       }
     }

    That sort of thing. The general idea being that I've got code that will want to modify some variable, based on some prior condition, and I want to leave the modifying code general-purpose.

    The above isn't the only example of this coming up, but I have to admit, it doesn't come up very often. The other thing is that while the above is perceived by me as more elegant, I suspect that for at least some others it would just make it harder to understand the code. Indirection has always been a bear for many programmers.

    Workarounds include:

     • Using an interface with a property to describe the variable that needs updating. This is overkill unless you're already dealing with somewhat complex types, but in that case it can work fine.

     • Using anonymous methods to capture and set the variables you want to refer to. Code-wise, this actually doesn't look too bad, but it still has the same sort of indirection-caused code-confusion potential as other ways of aliasing.

     • Just do the damn assignment at the point where you can tell which variable to assign. In the above example that means that you need to iterate over the collection by index, so that within the loop you can retrieve more than one value (i.e. the current value tells you the switch, but then you need to advance to the next one before continuing in the loop), with of course range-checking to make sure you don't go out of the indexed collection index range if the arguments provided were incorrect (i.e. missing argument at the end).

    (You can also use a "foreach" loop in that last scenario if you maintain the "prev" value, and do the switch on that one instead of the current one, but that's arguably at least as confusing a way to write the loop as using indirection of some sort).

    Anyway, the bottom line is that while I occasionally do find myself a bit wistful about not having ref locals, fact is the code is never really all that much worse without it, and frankly in terms of readability and maintainability, I'd say it's arguable that the code is _better_ without (C# already offers me plenty of opportunities to be too clever for my own good :) ).

    Interesting example; thanks for posting it. In my prototype this would not be allowed because we ensured that ref locals are always initialized to a valid reference to a variable. There was no "null ref". — Eric
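The anonymous-method workaround from pete.d's list can be sketched as follows. A nullable Action<string> stands in for the hypothetical nullable "ref string operand"; the method name, the returned array, and the reset of the delegate after use (the fix pete.d mentions in his follow-up) are assumptions made for the example.

```csharp
using System;

public static class ArgParserDemo
{
    // Returns { value for "hasoperand", value for "otherwithop" }.
    public static string[] Parse(string[] args)
    {
        string hasOperandValue = null, otherWithOpValue = null;
        Action<string> setOperand = null;   // non-null while a switch is waiting for its operand

        foreach (string arg in args)
        {
            if (setOperand != null)
            {
                setOperand(arg);            // assign to whichever variable was captured
                setOperand = null;
                continue;
            }
            switch (arg)
            {
                case "hasoperand":
                    setOperand = value => hasOperandValue = value;
                    break;
                case "otherwithop":
                    setOperand = value => otherWithOpValue = value;
                    break;
            }
        }
        return new[] { hasOperandValue, otherWithOpValue };
    }

    public static void Main()
    {
        string[] result = Parse(new[] { "hasoperand", "x", "otherwithop", "y" });
        Console.WriteLine(result[0] + "," + result[1]); // x,y
    }
}
```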

  26. pete.d says:

    (and yes, I forgot to set "operand" to null before continuing in the code above…I hope that does not detract from the comprehensibility of the example :) )

  27. Shuggy says:

    Pete.d's example actually shows a lovely boundary case. By working on the foreach variable you have either:

    Broken the (user) immutability of the variable.

    Added yet another confusing bug trap a la trapping them with closures

    Made the compiler have to spot this case and warn/refuse.

    Admittedly they seem to be trying to change the semantics of the variable in hypothetical vNext, but none the less as it stands I see no way this could end well.

  28. vemv says:

    I'd love this feature. Reference handling is very useful for multimedia apps.

  29. Grant Husbands says:

    I have wanted something similar in the past, when trying to modify fields inside structs inside property-reads (such as the value of a given entry in a dictionary). Obviously, there are other ways of getting the same effect, but they aren't always as clean.

  30. Apollonius says:

    Although I usually like every low-level feature, I would not like to see ref locals in C#. Looking at pete.d's example, one can easily spot a lot of issues with this feature: Observe that the example relies – as most usages would – on the ability to change the variable that the reference is an alias to. But the aliasing ref local operand is declared outside the loop, which opens the question about what it refers to after the loop has terminated, given that it pointed to a loop-local variable. From the CLR point of view, this is safe, but from the C# point of view it isn't. It also concerns details like whether a loop-local variable lives in the same location in different loop iterations – this is normally true, but a simple lambda referring to the variable can change that.

    I think this shows that introducing ref locals in this way opens a can of worms which should remain closed.

    Init-once ref locals are another issue though – these would resemble C++ references rather than C++ pointers (% rather than interior_ptr if you speak C++/CLI, & rather than * if not). However, this would require a special treatment of variable initialization which C# doesn't have right now ("type local = expr;" is currently equivalent to "type local; local = expr;" – implicitly typed locals aside); I'd prefer not to change the current intuitive semantics.

     

    Indeed, you make an excellent point which I did not call out in my sketch of the feature. In the prototype I wrote up I ensured that ref locals were "init only". There are certainly pros and cons of both ways. — Eric

     

    Also note that such ref local variables are of rather limited use as long as ref returns are not introduced as well, which have their own issues. Nevertheless, I somewhat agree with nonoitall regarding his comment on collection indexers; I have always found the STL indexers with their ref-returning behaviour a bit superior to the .NET approach with completely distinct getters and setters. However, I doubt that this small use case warrants such a major language change.

    I am glad that C++/CLI offers full support for managed pointers via % and interior_ptr; I have found this absolutely useful at times, especially when doing pointer arithmetic in arrays without needing to pin them. However, this is not really the C# way of doing things, so all in all I'd prefer not seeing managed pointer support in C# be extended beyond the current state of affairs (ref and out arguments).

    The only thing that slightly worries me is that this means that methods may be uncallable for C#; if I want to call a managed-pointer-returning method right now in C# 4.0, I simply get "'Class.Method()' not supported by the language". This is a bit sad, given that C# has always strived to expose every CLR feature, but I do not see a way around this.

  31. Apollonius says:

    It seems I have misread pete.d's example; operand never aliases a loop-local variable. This doesn't change my argument, though.

  32. Ferdinand swaters says:

    I would use this for a custom implementation of an array of structs, when the normal implementation is troubled. An implementation of Binary Decision Diagrams on a 32 bit architecture would be a real world case. As far as BDDs can be seen as real world, that is. Besides that, I have never even felt the desire for ref locals or ref returns, and overall life is better without them. It would be just one more dangerous pit that Jr Programmer could fall into.

  33. pete.d says:

    "But the aliasing ref local operand is declared outside the loop, which opens the question about what it refers to after the loop has terminated, given that it pointed to a loop-local variable."

    I'm concerned that two different readers do not seem to have understood the code example I posted.  The ref local does _not_ refer to the foreach variable (i.e. the "loop-local variable").  I agree that would be a problem, and it's the same problem as if a ref return value aliased a local variable from a called method that has returned.

    In my example, there are two ways "operand" is used: "operand = ref <foo>" and "operand = <foo>". Only the former case assigns the alias. The latter case (which is what's used with the loop-local variable) would dereference the alias and the assignment is made to the variable the ref local is aliasing, not the ref local itself (due to the lack of "ref" on the RHS of the assignment). This is consistent with the proposed syntax in Eric's article (perhaps that highlights yet another problematic aspect: making it clear in code whether one is creating a new alias, or using the existing one…with real pointers, the "*" accomplishes that, but in Eric's examples it's implicit according to usage, which can lead to misunderstandings).

  34. Greg says:

    Please do not put this in C# or VB.  It's applicable in only a small number of cases.  It's rarely used in C++.

  35. pete.d says:

    Another thought: if we can have ref return values, can we also have "ref ref" method parameters? What about "ref ref locals and return values"?

    In C/C++, you can add as many levels of indirection as you like. Indeed, due to the explicitness of reference types in C/C++, it's quite common to have two levels of indirection, and three levels isn't exactly uncommon (e.g. pointer to a pointer to an array of pointers).

    It seems to me that C# currently strikes an effective balance between usefulness and simplicity. The "ref returns/locals" feature is way down at the bottom of the list of things that I as a programmer would appreciate seeing added to the language. It seems like it introduces a whole host of problems (including offering new, exciting ways for a programmer to make their code much more confusing), while addressing real-world, important utility issues in the language in only a tiny percentage of scenarios.

    We came to the same conclusion — there are some narrow scenarios in which these techniques are extremely helpful, but they are sufficiently uncommon that we didn't want to take on the cost. If hypothetically we were to do something like this feature then we would probably not support multiple levels of refness. — Eric

  36. Shuggy says:

    Ah, sorry pete. Chalk that one up to far too confusing a syntax for me then, clearly I didn't read the method properly.

  37. pete.d says:

    Yeah, when I saw two different people misread the code, it was apparent that there was yet another issue with ref locals: the syntax (at least that presented here) is confusing!

    Presumably in the hypothetical case where this feature was implemented, a less-confusing syntax could be contrived (maybe requiring a keyword for dereferencing the alias, a la "*" but C#-ish). Somehow, I suspect it's something the C# language team doesn't have to worry about for quite a while. :)

  38. pminaev says:

    >> Another thought: if we can have ref return values, can we also have "ref ref" method parameters? What about "ref ref locals and return values"?

    I don't think it's reasonable to treat "ref" as analogous to pointers in C++ – they're much more like references (&) in that they are bind-once with no ability to rebind, and implicitly dereferenced. And C++ doesn't have references-to-references either.

    Also, there's no good way to do so while keeping managed pointers (which is what "ref" is) verifiable. The moment you make a "ref ref", you create a possibility for a ref to local to escape the scope of that local. And if refs are no longer verifiable, then how are they different from unmanaged pointers?

    On the other hand, you can have as many levels of indirection as you want in C# already – int*** is a perfectly legal C# type.

  39. stikves@hotmail.com says:

    Of course the most obvious use is foreach:

    foreach(ref var i in list)
        i++;

    This is already possible in C++/CLI. It would be nice to have it in C# as well.
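
    For readers on later compilers: C# 7.3 and up accept a ref iteration variable when the enumerator's Current property returns by reference, which Span<T> does. A minimal sketch, assuming an array as the backing store; the class and method names are hypothetical:

```csharp
using System;

public static class RefForeachDemo
{
    // Increments every element in place. The iteration variable is an
    // alias for the array slot, not a copy of its value.
    public static int[] IncrementAll(int[] values)
    {
        foreach (ref int i in values.AsSpan())
        {
            i++;
        }
        return values;
    }
}
```

    Calling RefForeachDemo.IncrementAll(new[] { 1, 2, 3 }) returns the same array, now holding 2, 3, 4. This does not work over List<T> directly, because its enumerator returns values rather than refs.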

  40. Pent says:

    If it were only allowed in an "unsafe" context then it would be less likely to be abused and the verifier wouldn't need to be improved. Using "unsafe" for other possibly confusing/complex operations supported by the CLR, yet not supported by C#, could also lower their cost of implementation and support. Things like uninitialized locals, (ref int)obj unboxing, etc.

  41. runefs says:

    I would probably "remove" this feature with static code review tools from the teams I am going to lead. I've been on several C++ projects where the lack of readability caused by this feature cost more than we could ever have gained from it.

    In my opinion, languages should aim at gathering related information so that for each algorithm you have one locus of knowledge. This feature would distribute the knowledge of what happens to a variable, making it very hard to reason about the code.

    Entire paradigms, like DCI (by the grandfather of MVC, Trygve Reenskaug, supported by other noteworthies), are built on the idea of keeping knowledge of a functionality located in one place in the code.

  42. Peter J Fraser says:

    Assuming the implementation would extend to lambda functions, the feature would allow an implementation of Algol 60 call by name.

  43. Dmitry Zaslavsky says:

    Not sure if I missed it in other comments.

    But the very common usage I would have for that feature is the following code

    double value;
    if (!map.TryGet(someKey, out value))
    {
        value = map[someKey] = computeSemiExpensiveValue(…);
    }

    Without ref return values, the hash lookup is done twice.
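
    For comparison, later versions of .NET (6 and up) expose a ref-returning lookup for exactly this pattern: CollectionsMarshal.GetValueRefOrAddDefault. A minimal sketch of the single-lookup version; ComputeSemiExpensiveValue is a hypothetical stand-in for the function named above, and the ComputeCalls counter exists only to make the behaviour observable:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class SingleLookupCache
{
    // Counts how often the (hypothetical) expensive computation runs.
    public static int ComputeCalls;

    // Hypothetical stand-in for computeSemiExpensiveValue(...).
    private static double ComputeSemiExpensiveValue(string key)
    {
        ComputeCalls++;
        return key.Length * 1.5;
    }

    public static double GetOrCompute(Dictionary<string, double> map, string key)
    {
        // One hash lookup: we get a ref to the slot for 'key'. If the key
        // was absent, a default-valued slot is created and 'exists' is false.
        ref double slot = ref CollectionsMarshal.GetValueRefOrAddDefault(map, key, out bool exists);
        if (!exists)
        {
            slot = ComputeSemiExpensiveValue(key); // writes straight into the dictionary
        }
        return slot;
    }
}
```

    The ref must not be used after the dictionary is modified again, which is part of why the method lives in an interop-flavoured helper class rather than on Dictionary itself.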

  44. dmihailescu says:

    Don't do it! Please don't screw our brains!

    your method:

    static ref int Max(ref int x, ref int y)
    {
     if (x > y)
       return ref x;
     else
       return ref y;
    }

    can be emulated by :

    static void Max(ref int x, ref int y, out int z)
    {
     if (x > y)
       z = x;
     else
       z = y;
    }

    creating new rules for ref read only properties, or ref covariance/contravariance will make my head explode! ;)

    I don't understand how you believe that your "Max" and my "Max" are anything the same. Your Max doesn't even need the arguments to be ref; you've just written the standard implementation of "Max" with an out parameter. — Eric

  45. nonoitall says:

    @dmihailescu:  Your function does not emulate the original function at all.  The original function returns a reference to the larger variable that can be used to modify that variable.  Your function just places the value of the highest variable into a third output variable, which is really no better than just returning the value.

    I don't really understand why people are so opposed to this feature being included in the language.  If you can't wrap your head around it, you don't have to use it!  Just like pointers, sockets, strings, arrays or any other programming feature that some arbitrary programmers can't wrap their heads around.  But that doesn't mean that the feature wouldn't benefit programmers who *can* wrap their heads around it.

  46. dmihailescu says:

    @nonoitall

    I beg to disagree. The out z in my function is the equivalent of the return ref in the original function.

    As one who has done C++ and C++/CLI, I'll take the simplicity and cohesion of C# any day over the C++ syntax and concepts. The only place where C++ is a must is native code or asm blocks; the rest can be emulated by .NET constructs.

  47. Alan says:

    @dmihailescu: Actually, nonoitall has a point here. Look at the lines "Max(ref a, ref b) += 100;" and especially "ref int c = ref Max(ref a, ref b);" in the original post. In both cases, you get a *variable* that will be aliased to another by the Max method (its "identity" is "switched" for you). This is not the same as putting a *value* in z, and then working with z. In Eric's code, you will continue to work with either a or b and you don't know which until the Max() method completes.

  48. nonoitall says:

    @dmihailescu:  Yeah, the original function *returns* a reference to A or B that can be used by the caller to modify said variable.  Your function accepts a third reference as *input*, but simply places a value into the variable that it points to.  It does not return a reference that the caller can use to modify one of the two original variables.  There's really no reason for the first two parameters in your function to be refs because the function only reads the values that they point to, and does nothing with the references themselves.

  49. dmihailescu says:

    @Alan

    You are right. My z will be an alias to another variable on the stack, not a reference to an existing variable.

    I was thinking about reference objects not value types. My bad!
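
    The difference argued over in this exchange can be made concrete in C# 7.0 and later, which did eventually add ref returns. A minimal sketch; MaxRef and MaxOut are hypothetical names for the two versions of "Max" discussed above:

```csharp
using System;

public static class MaxDemo
{
    // Returns the larger *variable*: the caller receives an alias to x or y.
    public static ref int MaxRef(ref int x, ref int y)
    {
        if (x > y)
            return ref x;
        return ref y;
    }

    // Copies the larger *value* into a third variable; x and y are only read.
    public static void MaxOut(ref int x, ref int y, out int z)
    {
        z = x > y ? x : y;
    }

    public static (int a, int b, int z) Run()
    {
        int a = 1, b = 2;

        MaxRef(ref a, ref b) += 100;     // mutates b itself: b is now 102

        MaxOut(ref a, ref b, out int z); // z receives a copy of b's value (102)
        z += 100;                        // only z changes: z is 202, b stays 102

        return (a, b, z);
    }
}
```

    MaxDemo.Run() returns (1, 102, 202): writing through the ref return changed b, while writing to z left both original variables alone.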

  50. Jason Lind says:

    How about for databinding w/in C#:

    public class BoundMember<T>
    {
     private ref T _GetValueRef = ref default(T);
     private ref T _SetValueRef = ref default(T);
     private T _GetValue;
     private T _SetValue;

     public BoundMember(ref T value, BindingMode mode = BindingMode.TwoWay)
     {
       Mode = mode;
       _GetValue = value;
       _SetValue = value;
       if(mode != BindingMode.OneWayToSource)
          _GetValueRef = ref _GetValue;
       if(mode == BindingMode.TwoWay || mode == BindingMode.OneWayToSource)
         _SetValueRef = ref _SetValue;
     }

     public T Value
     {
       get{return _GetValue;}
       set{_SetValue = value;}
     }

     public BindingMode Mode {get;set;}
    }

    public enum BindingMode
    {
     TwoWay,
     OneWay,
     OneWayToSource
    }

    Maybe even add syntactic sugar:

    int x = 0;
    var boundX :-: x; <=> new BoundMember(ref x);
    var boundX1W :- x; <=> new BoundMember(ref x, BindingMode.OneWay);
    var boundX1W2S -: x; <=> new BoundMember(ref x, BindingMode.OneWayToSource);

    What do you guys think?

  51. Sander says:

    I would not use it since I have never felt the need for this. At the moment, I can't think of a case where I would want it – out and ref work perfectly for me when I need to do something fancy.

    Indeed, I would be glad to not see this feature, as I fear it would invite people to invent new "clever" (= bad and unintelligible) coding patterns.

  52. fuzzyeric@bigfoot.com says:

    I believe you summed up the answer to your question in "Compound Assignment, Part One" ( Tue, Mar 29 2011 2:24 PM ):

    "we are now mutating the variable containing the copy but we need to be mutating the original."

    I tend to use return by reference when I want:

    * to return an lvalue.  a[5] = 0;  // a.operator[](5) = 0

    * to return this; i.e., to construct a filter chain.  See "method chaining".

    * to act on a singleton.

    * to act on someone else's storage, without having to know that their storage is stack, heap, or TEXT, or that their storage is actually *someone else's* storage, or that the storage is in some complicated container (so I have to be able to consume any random iterator or handle type), or that the storage is in some packed container (like an array).

    * to mutate someone else's consts.  "Yeah…  *You're* not allowed to change it…"

    * to act on a lazily constructed value type without forcing its entire construction at copy to stack.

    * to act on just-in-time data without forcing redundant just-in-time loads to recur at every call in a complex expression tree.

    * to get as close to an lvalue reference optimization as I can.  (I.e.  re-use the shallowest caller's return slot in their stack frame as the return slot for each callee all the way down to the deepest callee.)

    * recently, to avoid repeatedly copying multi-MB frames of video to the stack in recursive de-noising operations.  (Which, thanks to a 3rd party library writer are all value types.)  (This is more important on phones and tablets…)

    Admittedly, some of these overlap.  And, yes, anything can be worked around.  (Your Turing complete language is no more powerful than *my* Turing complete language…)  But, once one has understood the value of writing functions as filters on someone else's data, rather agnostically on how that data's stored, where that data's stored, or what operations the caller is allowed to perform on the data, it's rather cruel to take that away…  So, yeah, I think references would be a good idea for C#.

    Of course, sometimes, I'd really, really, really, like to know if I've been called with an lvalue or rvalue reference.  See thbecker.net/…/section_01.html for a reasonable explanation.

  53. Daniel Earwicker says:

    @Jason Lind – your example has ref fields in a class, which isn't allowed in the CLR. But ultimately a reference to a "variable" of any kind can be treated as two operations, a getter and a setter. And thanks to lambdas you can express operations very neatly, and they can refer to local variables, so you can already simulate that kind of binding very easily, see smellegantcode.wordpress.com/…/pointers-to-value-types-in-c
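
    The getter/setter simulation described here can be sketched in a few lines. This is an illustration under assumptions, not the code from the linked post; VarRef<T> and VarRefDemo are hypothetical names:

```csharp
using System;

// A "reference" to any variable, expressed as a pair of closures.
public sealed class VarRef<T>
{
    private readonly Func<T> _get;
    private readonly Action<T> _set;

    public VarRef(Func<T> get, Action<T> set)
    {
        _get = get;
        _set = set;
    }

    public T Value
    {
        get { return _get(); }
        set { _set(value); }
    }
}

public static class VarRefDemo
{
    public static int Run()
    {
        int x = 0;

        // The lambdas capture the local x, so reads and writes go to x itself.
        var r = new VarRef<int>(() => x, v => x = v);

        r.Value = 42; // x is now 42
        r.Value += 1; // read via the getter, write via the setter: x is now 43

        return x;
    }
}
```

    Unlike a real ref, such an object can be stored in a field, because the captured local is hoisted to the heap by the compiler. The price is an allocation and two delegate calls per access.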

  54. Daniel Earwicker says:

    @Eric Towers – Based on the examples you're giving (e.g. realtime video processing), you're not really describing C#'s sweet spot. If safe-mode C# absorbed all the features it needed to replace C as a "portable assembly language", it would be no different from unsafe-mode C#.

    Although I confess that as a recovering long-term C++ user, I've found it shamefully entertaining to learn about rvalue references. But I'm going to keep it as pure entertainment. I think it's going to work best that way for me!

  55. Sergey T. says:

    It seems a little confusing to me that we already have ref returns in array indexers:

    Consider the following example:

    struct S
    {
      public int i;
      public void Increment() {i++;}
    }

    S[] s = new S[1];
    s[0].i++; // s[0].i is now 1, because s[0] yields a managed pointer to the interior array element
    IList<S> si = s;
    si[0].i++; // compile-time error, because we can't modify a temporary value
    si[0].Increment(); // compiles, but increments a local copy

    It's clear why we have ref returns on arrays (it improves performance), but why do arrays have them while the BCL collections do not? It seems like we could add this behavior to existing collections without adding any new features to the C# compiler at all.

    P.S. Actually, I can't find the specification section that describes returning a "managed reference" from the array indexer. I know that the array indexer is implemented with its own IL instruction, ldelema, but I can't find an official rationale for it.
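
    The behaviour described in this comment can be checked with a small runnable sketch. The type and member names (Counter, ArrayVsList) are hypothetical, and the field is public so that it can be read back:

```csharp
using System;
using System.Collections.Generic;

public struct Counter
{
    public int I;
    public void Increment() { I++; }
}

public static class ArrayVsList
{
    public static (int viaArray, int viaList) Run()
    {
        Counter[] arr = new Counter[1];
        arr[0].I++;         // array element access is a variable: modified in place
        arr[0].Increment(); // likewise in place, so arr[0].I is now 2

        IList<Counter> list = new Counter[1];
        list[0].Increment(); // the indexer returns a copy; the increment is lost

        return (arr[0].I, list[0].I);
    }
}
```

    ArrayVsList.Run() returns (2, 0): both mutations through the array stick, while the one made through the IList<T> indexer acts on a temporary copy.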

  56. John Payson says:

    Actually, upon further consideration I find myself liking the variadic-delegate-callthrough approach more, and the idea of "heap ref" parameters less.  I would suggest something like the following syntax:

    int foo {ref {
     do_some_stuff();
     ref return x;
     do_more_stuff();
    }}

    The "ref return" statement could use any legitimate lvalue (including automatic variables of the property, if desired) but the routine would have to be written in such a way that execution could not escape the function except via an Exception, without performing exactly one "ref return".  What would be necessary to make this work would be a means of declaring a function which could accept a delegate consisting of a fixed-type ref parameter and an arbitrary number of additional ref parameters, and be callable using its fixed parameters and the appropriate number of additional ref parameters to satisfy the delegate.  For example:

    void foo(long thing1, String thing2, ActionByRef<ref int, …> theAction, …)
    {
     do_some_stuff();
     theAction(x, …);
     do_more_stuff();
    }

    might expand as:

    void foo(long thing1, String thing2, ActionByRef<ref int, ref T1, ref T2, ref T3, ref T4, ref T5> theAction,
       ref T1 r1, ref T2 r2, ref T3 r3, ref T4 r4, ref T5 r5)
    {
     do_some_stuff();
     theAction(x, r1, r2, r3, r4, r5);
     do_more_stuff();
    }

    if there needed to be five reference parameters.  If one could have a function that could expand itself as needed with the appropriate number of reference parameters, one could achieve the benefits of being able to return references, and some more besides.  Among other things, one could wrap the "ref return" in a Try/Catch/Finally block, allowing check-out/check-in semantics to be enforced even in the presence of exceptions.

    Incidentally, while it might in some cases be nice to allow for arbitrary-expanded value parameters, I don't think restricting the expanded area to reference parameters would be a problem.  Even if the function one wanted to call would expect reference parameters, the compiler which was generating the code to use a reference property could generate an appropriate static wrapper function to do the conversion.  The stack required to call a function of N parameters of which M were reference properties would be O(N*M).  If all the stack items in question are variable references (8 bytes for x64), the limit even for M=N=30 would be about 4K of stack space.  If the items were 200-byte value types, though, things could get ugly.

  57. Ryan says:

    I would use it! I need it right now

    Can you give us more details? — Eric

  58. merc1031@sevensinsystems.com says:

    I feel like it would be useful to have ref returns if you could combine this with ref parameters for operators as well. It would allow doing math operations for example on complex types (Matrices) without having to worry about it copying the thing so many times.

    This can of course be inlined by hand, but inlining so many adds and multiplies begs for a better way to do it.

    Working with XNA in C# 4, it can be a burden to do Vector / Matrix math when worrying about performance as well, especially with very complex matrix hierarchies and blended animations using 100's or 1000's of matrix multiplies per model (with many, many value copies as well).

  59. qwertman@hotmail.com says:

    Since I see a lot of people saying "Don't add this!" I feel compelled to add my vote to the "yes" side.

    I think of C# as being a better C++. I would love nothing better than to stop using C++ and use C# exclusively, because the safety guarantees, rich BCL, improved intellisense, shorter code, faster compile times (etc.) of C# are invaluable. Unfortunately, I still have to use C++ extensively because C# is not fast enough, especially on the .NET Compact Framework, which is sometimes more than 10 times slower than equivalent C++ code (see http://www.codeproject.com/…/BenchmarkCppVsDotNet.aspx ).

    As mentioned earlier, ref-types could let you get a reference to a value in a Dictionary and modify that value with a single lookup, and ref-types would allow you to write more concise code in some cases (and I agree with Yaron Minsky about the value of concision, see queue.acm.org/detail.cfm ), such as the stated example that you want to modify the result of Max(ref x, ref y).

    Even if this is never allowed in C#, surely it should be officially supported by the CLR standards (not just MS' implementation) so that compilers for 3rd-party languages can "safely" include the feature. (Same argument goes for covariant return types.)

  60. pgeerkens@hotmail.com says:

    If implemented I would use it today, and would have used it last night.

    I desire to access the STL/CLR but cannot do so in C# because all the interfaces have prototypes with TValue% return types.  Yet, given the knowledge that all C# reference types and boxed value types are on the managed heap and that the STL/CLR is using 'by-value' semantics, it is safe to convert these to TValue%. The only missing piece is for the C# compiler to recognize this and do its magic, recognizing that the C# implementation

    public  TValue  get_ref() { return _container[_bias]; }

    satisfies the interface prototype

    public  TValue%  get_ref();

    In fact the MSDN documentation here

    msdn.microsoft.com/…/bb302608.aspx

    states that this will happen, but it doesn't.

    Pieter