References and Pointers, Part Two


Here’s a handy type I whipped up when I was translating some complex pointer-manipulation code from C to C#. It lets you make a safe “managed pointer” to the interior of an array. You get all the operations you can do on an unmanaged pointer: you can dereference it as an offset into an array, do addition and subtraction, compare two pointers for equality or inequality, and represent a null pointer. But unlike the corresponding unsafe code, this code doesn’t mess up the garbage collector and will assert if you do something foolish, like try to compare two pointers that are interior to different arrays. (*) Enjoy!

internal struct ArrayPtr<T>
{
  public static ArrayPtr<T> Null { get { return default(ArrayPtr<T>); } }
  private readonly T[] source;
  private readonly int index;

  private ArrayPtr(ArrayPtr<T> old, int delta)
  {
    this.source = old.source;
    this.index = old.index + delta;
    Debug.Assert(index >= 0);
    Debug.Assert(index == 0 || this.source != null && index < this.source.Length);
  }

  public ArrayPtr(T[] source)
  {
    this.source = source;
    index = 0;
  }

  public bool IsNull()
  {
    return this.source == null;
  }

  public static bool operator <(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index < b.index;
  }
       
  public static bool operator >(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index > b.index;
  }
       
  public static bool operator <=(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index <= b.index;
  }
       
  public static bool operator >=(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index >= b.index;
  }
      
  public static int operator -(ArrayPtr<T> a, ArrayPtr<T> b)
  {
    Debug.Assert(Object.ReferenceEquals(a.source, b.source));
    return a.index – b.index;
  }
       
  public static ArrayPtr<T> operator +(ArrayPtr<T> a, int count)
  {
    return new ArrayPtr<T>(a, +count);
  }
       
  public static ArrayPtr<T> operator -(ArrayPtr<T> a, int count)
  {
    return new ArrayPtr<T>(a, -count);
  }
       
  public static ArrayPtr<T> operator ++(ArrayPtr<T> a)
  {
    return a + 1;
  }
     
  public static ArrayPtr<T> operator –(ArrayPtr<T> a)
  {
    return a – 1;
  }

  public static implicit operator ArrayPtr<T>(T[] x)
  {
    return new ArrayPtr<T>(x);
  }

  public static bool operator ==(ArrayPtr<T> x, ArrayPtr<T> y)
  {
    return x.source == y.source && x.index == y.index;
  }

  public static bool operator !=(ArrayPtr<T> x, ArrayPtr<T> y)
  {
    return !(x == y);
  }

  public override bool Equals(object x)
  {
    if (x == null) return this.source == null;
    var ptr = x as ArrayPtr<T>?;
    if (!ptr.HasValue) return false;
    return this == ptr.Value;
  }

  public override int GetHashCode()
  {
    unchecked
    {
      int hash = this.source == null ? 0 : this.source.GetHashCode();
      return hash + this.index;
    }
  }

  public T this[int index]
  {
    get { return source[index + this.index]; }
    set { source[index + this.index] = value; }
  }
}

Now we can do stuff like:

double[] arr = new double[10];
var p0 = (ArrayPtr<double>)arr;
var p5 = p0 + 5;
p5[0] = 123.4; // sets arr[5] to 123.4
var p7 = p0 + 7;
int diff = p7 – p5; // 2

Pretty neat, eh?

UPDATE:

A number of people in the comments have asked why the code disallows a pointer “past the end” of the array. In fact the original C code that I was porting did use an invalid “marker” value as the “end of array” marker, and the code did thereby manipulate pointers “past the end” of the array. The original version of the ArrayPtr class that I actually used in the port to C# supported having a pointer one past the end of the array, and threw an exception if you ever tried to dereference it. I thought this detail was distracting from the point of the article so I eliminated the feature from the C# code before I posted it. Perhaps that was a premature optimization.

I have a similar C# wrapper type for strings, where again, I permit a pointer “past the end” of the string where the null-terminating character would be in a C program. That class also supports common C idioms like “strlen” and whatnot. Such types are very handy when porting C code to C#; ultimately of course it is better to use C# idioms in the long run, but in the short run it is very useful to be able to get things working quickly.

————

(*) Were this to be a public type then I’d make the assertions into exceptions because there is no telling what crazy thing the public is going to do; since this is an internal type I can guarantee that I’m using it correctly, so I’ll use an assertion instead.

Comments (31)

  1. Ihar Bury says:

     public static bool operator ==(ArrayPtr<T> x, ArrayPtr<T> y)

     {

       return x.source == y.source || x.index == y.index;

     }

    Looks like there's a bug here 😉

  2. AS says:

    why did you make your == operator

    x.source == y.source || x.index == y.index?

    I would think it should be

    x.source == y.source && x.index == y.index?

  3. Bill P. Godfrey says:

    I've often wanted to be able to write…

    foreach (MyClass x in myarray)    x = DoSomething(x);

    … but this doesn't work because x is a value extracted, no longer linked to the context of the array from whence it came. (So I end up doing a old-school for loop instead.) I wonder if this could be customised to provide a general purpose array itterator that can modify the current item.

    Thankies, billpg.

  4. Brian says:

    @Bill: No it could not.  No matter what trick you do to wrap your foreach, x will be a reference.  What you *could* do is use: foreach (MyClassWrapper x in myarray) x.Value = DoSomething(x.value)

    or even

    foreach (MyClassWrapper x in myarray) x.Value = DoSomething(x)

    The 2nd version would require an implicit cast.

  5. Gabe says:

    Rather than (or in addition to?) implementing this[int index], I'd like to see it implementing IList<T> so that you can enumerate it, sort it, and so on.

  6. configurator says:

    @Gabe: And what does enumerating, sorting, etc. work on? The entire array? The array starting at that point? When you sort it, what happens to the elements before the current index?

  7. Banjobeni says:

    var ptr = x as ArrayPtr<T>?;

    I did not know 'as' can be used with value types when using Nullable<T>. This is amazing!

  8. DRBlaise says:

    I found this equally amazing:  the implicit cast allows equallity to null … didn't know you could do that with a structure.

    var a = ArrayPtr<int>.Null;

    Console.Writeline(a == null); // prints True

  9. Ben says:

    I'm assuming that the reason you don't do bounds checking is so that you can safely do an addition, and then worry later about whether it went off the end.

    If that's the case, a method to get a pointer to the end of the array(or 1 past) might be nice, something like:

    static ArrayPtr<T> EndPtr<T>( this T[] array )

    {

       return new ArrayPtr<T>( array, array.size );

    }

  10. Ben says:

    Now I feel stupic because I realise that the reason you don't do bounds checking is that you do it in the constructor.

    Still, since your dealing with "pointers" going off the end with an addition is not automatically a bad thing. I would leave out that assert.

  11. snarfblam says:

    Shouldn't there be more distribution in the hash codes? A set of pointers to sequential elements will have sequential hash codes.

  12. Shuggy says:

    How is havin a pointer outside the bounds not a bad thing. We're in a managed language here…

    It's better to know about an invalid state where you create it rather than when you use it, the former is vastly easier to debug and fix

  13. Shuggy says:

    @Gabe. IList is perhaps a tad confusing, but IEnumerable<T> would be reasonable.

  14. configurator says:

    @Shuggy: I must reiterate my question… When would enumeration actually return in this case?

  15. jader3rd says:

    In the equals method your chaning the value of x to ArrayPtr<T>?. But if x isn't a ArrayPtr<T>? won't ptr be null and you have a null reference exception in the line?

  16. configurator says:

    @jader: A null Nullable<T> is converted to a Nullable<T> where HasValue = false, so the next line will simply return false in that situation.

  17. John Gietzen says:

    I think that you should assert that the array length is not zero.

    Otherwise, such a condition just leads to all kinds of crazyness.

  18. Ben says:

    @Shuggy: It's not a bad thing because of the way the addition method is written. As written, this class would assert if you go off the end of an array, which would disallow looping through using the pointer. (Ie, it disallows some of the useful kinds of things it was written to allow)

    EG:

    int[] array = {1,2,3,4,5};

    ArrayPtr<int> ptr = array;

    while ( ptr < (some way of getting a pointer to the end) )

    {

     ptr[0] = 0;  

     ptr = ptr + 3; //Assert fires here on the second time through, even though we'll never use the invalid value

    }

    The assert should be on the indexer, not the constructor: even as written, it misses the case where you ask for ptr[1] where it already points to the end of the array, for example. (though this will already throw an exception, so it's possibly by design)

  19. Aaron says:

    Why did you use Object.ReferenceEquals to compare array references? Doesn't the == operator compare arrays as references anyway?

  20. Monsignor says:

    Why do you use assertions instead of exceptions?

  21. Shuggy says:

    @configurator

    An IEnumerator<T> which starts at the position the pointer was at and goes all the way to the end of the array.

  22. Shuggy says:

    @Ben

    Personally I believe modifying the guard (in this case to be <= end – 2) is appropriate.

    If the guard would become very complex the working out what you were going to increment/ decrement and doing a break if it would go put of bounds is also fine.

    I admit it might make the code slightly less easy to read but it also ensures that nobody ever has access to an invalid pointer they might use

    If you cared you could easily write a SafeIncrement method which returned null (or some other sentinel) once you fell off the edge and use the sentinel in your guard instead.

  23. Graham Clark says:

    In your example use code, shouldn't

      var p5 = arr + 5;

    be

      var p5 = p0 + 5

    ?

    Otherwise it doesn't compile.

  24. Shuggy says:

    @Monsignor

    did you miss the footnote?

    "(*) Were this to be a public type then I'd make the assertions into exceptions because there is no telling what crazy thing the public is going to do; since this is an internal type I can guarantee that I'm using it correctly, so I'll use an assertion instead."

  25. Ben says:

    @Shuggy

    I agree that a guard is probably a better idea, but the beauty of allowing pointers to outside the bounds is it retains the "object safety" of the class as written. (ie, you get the assert when you try to compare an invalid value of one array to a valid value of another). I'd possibly go for a hybrid, now that you mention it: have all invalid values compare equal, and disallow any operation other than comparison on them. This would allow you to use invalid values as guards, keeping the safety check of "comparing from the same array", while minimising the possiblity that you'll miss seeing an invalid value misused.

  26. Navaneeth says:

    Nice code. A question on your programming style. Do you use assertions frequently when writing internal classes? I have heard lot of guys talking about avoid to use assertions because they clutter in the actual code. While I think assertions should be used proactively, I would like to get your view also on this.

    Thanks

  27. FedeAzzato says:

    I believe your this[] operator should check for bound issues. Maybe something like this would be enough.

     public T this[int index]

     {

       get { return index == 0 ? source[this.index] : (this + index)[0]; }

       set { if (index == 0) source[this.index] = value; else (this + index)[0] = value; }

     }

  28. Mike Strobel says:

    Your decision to safely cast 'x' as a Nullble<ArrayPtr<T>> in Equals() threw me for a moment, as 'x' (typed as 'object') could not possibly be a Nullable<> (nullables lose their identity as nullables when boxed).  But this technique lets you achieve with a single 'as' cast what would otherwise require both an 'is' check and an unsafe cast.  Very clever.

  29. Ben Voigt [Visual C++ MVP] says:

    In C and C++, taking the difference between two pointers into the same object is well-defined, whether or not that object is an array.  Is it safe to assume that there is no library implementation which would provide equivalent functionality in C#?

  30. Mike Strobel says:

    @Ben: Not safely or with particularly good performance, but you could do something like this: http://pastie.org/1706405

    Please do not use that technique for anything other than academic purposes.  It really is not a good idea.

  31. Alex Davies says:

    There's a minor bug present in that a pointer to a zero array is permitted by the public constructor.. although this is more so yet another argument in allowing pointers to go 1 past the end ;).

Skip to main content