Fixed statement and null input…


I’d like some input on a change we’re considering for Whidbey.


Consider the following wrapper class:

unsafe class Wrapper
{
    public void ManagedFunc(byte[] data)
    {
        fixed (byte* pData = data)
        {
            UnmanagedFunc(pData);
        }
    }
    void UnmanagedFunc(byte* pData)
    {
    }
}

In this class, I’ve fixed a byte[] array so I can pass it to an unmanaged method. I’ve included “UnmanagedFunc()” in my program to illustrate what it looks like, but I would normally be calling a C function through P/Invoke in this scenario.
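
As a rough sketch of the P/Invoke case, the wrapper might look like the following. The library name "native.dll" and the entry point "ProcessData" are placeholders invented for illustration, not a real API, so this compiles (with unsafe code enabled) but will not run without a matching native library.

```csharp
using System.Runtime.InteropServices;

unsafe class PInvokeWrapper
{
    // Hypothetical import: "native.dll" and "ProcessData" are placeholder names,
    // not a real library or entry point.
    [DllImport("native.dll")]
    static extern void ProcessData(byte* pData);

    public void ManagedFunc(byte[] data)
    {
        // Pin the array so the garbage collector cannot move it during the call.
        fixed (byte* pData = data)
        {
            ProcessData(pData);
        }
    }
}
```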


The problem with this code is that some C APIs accept null values as arguments. It would be nice to be able to pass a null byte[] and have that translate to a null pointer, but the fixed statement throws if it gets a null array.


There’s an obvious workaround:

public void ManagedFuncOption1(byte[] data)
{
    if (data == null)
    {
        UnmanagedFunc(null);
    }
    else
    {
        fixed (byte* pData = data)
        {
            UnmanagedFunc(pData);
        }
    }
}

and a less obvious one.

public void ManagedFuncOption2(byte[] data)
{
    bool nullData = data == null;
    fixed (byte* pTemp = nullData ? new byte[1] : data)
    {
        byte* pData = nullData ? null : pTemp;
        UnmanagedFunc(pData);
    }
}

The problem with the workarounds is that they are ugly, and they get fairly complicated if there is more than one parameter involved.
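
To give a feel for how this scales, here is a sketch of the second workaround extended to two nullable array parameters. The names Wrapper2, ManagedFunc2 and UnmanagedFunc2 are made up for this illustration, and UnmanagedFunc2 is a managed stand-in for a native function. The first workaround would need four separate branches to cover the same cases.

```csharp
using System;

unsafe class Wrapper2
{
    // Hypothetical stand-in for a native function taking two optional buffers.
    // Returns a bit mask: 1 if pIn is non-null, 2 if pOut is non-null.
    static int UnmanagedFunc2(byte* pIn, byte* pOut)
    {
        return (pIn != null ? 1 : 0) | (pOut != null ? 2 : 0);
    }

    public static int ManagedFunc2(byte[] input, byte[] output)
    {
        bool nullIn = input == null;
        bool nullOut = output == null;

        // Pin a dummy one-element array in place of each null argument...
        fixed (byte* pTempIn = nullIn ? new byte[1] : input,
                     pTempOut = nullOut ? new byte[1] : output)
        {
            // ...then substitute null pointers before making the call.
            byte* pIn = nullIn ? null : pTempIn;
            byte* pOut = nullOut ? null : pTempOut;
            return UnmanagedFunc2(pIn, pOut);
        }
    }

    static void Main()
    {
        Console.WriteLine(ManagedFunc2(null, new byte[4]));  // prints 2
        Console.WriteLine(ManagedFunc2(new byte[4], null));  // prints 1
    }
}
```

Each extra parameter adds a flag, a dummy allocation, and a second pointer variable, which is where the clutter comes from.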


The language spec (section 18.6) says that the behavior of fixed is implementation defined if the array expression is null, so we could change our behavior so that fixing a null array would result in a null pointer rather than an exception. 
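
A small sketch of what the proposed behavior would mean for calling code. NullFixedDemo and ReceivedNull are hypothetical names, and ReceivedNull is a managed stand-in for a native function. The result described for the null case assumes a compiler that implements the proposed semantics; under the behavior described above, the same call throws instead.

```csharp
using System;

unsafe class NullFixedDemo
{
    // Hypothetical stand-in for a native function; reports whether
    // it was handed a null pointer.
    static bool ReceivedNull(byte* pData)
    {
        return pData == null;
    }

    public static bool ManagedFunc(byte[] data)
    {
        // Under the proposed semantics, a null array produces a null
        // pointer here rather than an exception, so no workaround is needed.
        fixed (byte* pData = data)
        {
            return ReceivedNull(pData);
        }
    }

    static void Main()
    {
        // Expected to report True only if null arrays fix to null pointers.
        Console.WriteLine(ManagedFunc(null));
        // A non-empty array always yields a non-null pointer.
        Console.WriteLine(ManagedFunc(new byte[] { 1, 2, 3 }));
    }
}
```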


Questions:



  1. Have you written code where this would be useful?

  2. Are there situations where the change of behavior would cause you problems?

  3. Are the workarounds simple enough that this isn’t an issue?

 

Comments (17)

  1. Mark Mullin says:

    Go with implementation defined and change the behavior so a null array nets a null pointer

    Rationale – In C++/C, there are enough cases where you need to distinguish between three states –

    null – I mean nothing, nada, rien

    not null but empty – I specifically do _not_ mean nothing, I specifically mean empty

    something – OK, I do have some data

    Here’s a case in point – I have a CShape class which can be constructed from an array of 3D numbers – I distinguish 3 states

    null – this is a phantom object, used to establish hierarchy relationships – it never has a physical appearance

    empty array – this is a physical object, I just don’t happen to know what it looks like right now (shape computed during run)

    filled array – here is the exact definition

    Most of the calls I make are from C#, not managed C++, but the situation is the same – I’d go looney if I had to do that workaround for each of these cases – and I don’t think I thought this pattern up myself, I believe I’ve seen other examples where code makes specific distinctions between null and empty

  2. Eric,

    1,2: no.

    3: yes… 🙂

    WM_MY0.02$

    thomas woelfer

  3. Juan Felipe Machado says:

    I think the only way you could break something with that kind of changes is when somebody is using exceptions to control the flow of the program, and that was wrong anyway, so I think you should go ahead with the change!! (BTW, I think this will be a GREAT enhancement, I work all the day with interop so this would really help me. I don’t think the workarounds are that problematic at all, but the readability of the code will be improved and that’s a great win for everybody)

  4. Juan Felipe Machado says:

    I think the only way you could break something with that kind of changes is when somebody is using exceptions to control the flow of the program, and that was wrong anyway, so I think you should go ahead with the change!! (BTW, I think this will be a GREAT enhancement, I work all the day with interop so this would really help me. I don’t think the workarounds are that problematic at all, but the readability of the code will be improved and that’s a great win for everybody)

  5. Juan Felipe Machado says:

    OOPS!!! sorry about the double submit…

  6. Don says:

    The workarounds are technically simple, but they both turn trivial one-line function calls into 6-or-more-line blocks of code.

    Turning one simple line into 6+ lines of code is going to cause a huge decrease in code readability, a huge increase in possible error sites, and a great deal of annoyance on the part of every programmer who uses this construct over the next decade (or decades).

    Redefining a fixed pointer to a null array as a null pointer seems like an extremely small (but admittedly breaking) change to correct a poor design decision. If anyone does need to catch the formerly thrown exception (which is probably a vanishingly small number of developers and call sites at this time in history), they can always manually check for a null object and throw the exception themselves.

    -Don

  7. Mattias Sjögren says:

    Change it! The workarounds are ugly, and getting a null pointer is how I’d expect it to behave.

    If you decide to change it, please also consider doing the same for strings, i.e.

    string x = null;

    fixed ( char* p = x )

    // p should be null here

    The current behavior is that p gets the value OffsetToStringData (12), which is pretty useless.

  8. Jeremy says:

    I can’t say that I have had to use the fixed statement yet, but the change certainly seems to make sense to me.

    Mark Mullin made the point that it is a common C/C++ convention to distinguish between null and empty. If the purpose of unsafe and fixed is to interop cleanly with legacy C/C++ code, it makes sense to support the common conventions so long as those conventions are not "bad things" — security risks, poor practices, etc.

  9. Greg Ewing says:

    1. Yes

    2. No

    3. No

    – I think Don’s point about extra lines of code is particularly applicable.

    – The conditionals required for more than one parameter too are extremely ugly and make code very unreadable.

    – I agree with Mattias, it’s what I would have expected and strings should behave similarly.

  10. Adam Merz says:

    1. Yes

    2. No

    3. Yes, but Don’s point still stands… 🙂

  11. Carl Daniel [VC++ MVP] says:

    Change it! Mark & Don have expressed the rationale perfectly.

  12. I agree, go ahead and make the change. But here’s a question? What happens when you try to fix a nullable type?

    Also, 1 suggestion I would like to make, is to support multiple, multiple-Type constructs in the fixed statement. As it is, one may pin 1 or more variables within the same statement if all variables are of the same type. I constantly ( at least whenever I do unmanaged interop ) find myself having to pin many, many vars of differing types at the same time. This results in several fixed statements, and all the actual "meaty" code being indented to the next screen over ( exaggeration ).

  13. Justin Cummings says:

    While I can see the need, I would hope for a different approach altogether that allows for either:

    – this to be addressed at a framework level (potentially as an attribute class), instead of via a language construct.

    – if the importance of a specific language construct is imperative, implement it as basically an alias for a framework based solution

    Again, I support all efforts to make interop easier, but why solve it once when the other ‘core’ languages suffer the same/similar problem?

    Keep the basic language clutter-free.

  14. Kannan Goundan says:

    Justin:

    This change won’t add ‘clutter’ to the language. It’s already a special case to decide how a ‘null’ pointer should be handled. The only thing that’s happening here is that the handling of the special case is being changed.

    I haven’t had a need to do this, but on paper it seems like a good idea. Anybody who relies on an exception can easily check for a null pointer instead (since this is probably an uncommon occurrence). If there are people who already depend on the existing behavior, it might be best to leave it the way it is, but from the pure language design perspective, I think the proposed behavior is clearly better.

    I think that part of the problem is that when people use ‘fix’, they don’t always directly mean ‘fix-the-address-in-memory’ (in which case the original semantics make more sense). They often mean ‘i-want-to-use-this-in-unsafe-code’. From this perspective, the proposed behaviour is also more natural.

  15. Kannan Goundan says:

    Another reason this is a good idea is that the workaround is easier for the null-should-throw-an-exception people:

    <code>

    if (pData == null) throw new Exception();

    fixed (byte* data = pData) {

    UnmanagedFunc(data);

    }

    </code>

    So the new semantics could also be seen as more elegant because they cover the most common cases with the least amount of modification (was that wishy-washy enough for you?).

    [I hope the ‘code’ tags work]

  16. Brian Grunkemeyer says:

    We should simply fix this, and yes, I’ve run into this in one or two places in the past and was rather frustrated by getting some peculiar exceptions (I think I got a NullReferenceException for null and an IndexOutOfRangeException for an empty array, both of which took me a while to figure out and neither were mentioned in the C# language spec for the fixed keyword). fixed on a null array and an array of 0 bytes should both return null.

  17. Eric,

    Sorry for replying this late. I was out of town on business.

    I think that the change is a good one, and one that should be done. I think that many people have had issues with this, but don’t understand why it was done this way.

    What would be of great value is to know the reason why this was not done in the first place? Is there a constraint that if changed, would cause problems in other places?

    If you changed it, then I think that there would be some that would have to use workarounds, but at the same time, it would be more "correct" code IMO.