Ptrdiff_t is evil


Well, not really, but here’s a code problem that confounded some really smart devs – and it looks so simple!


void IncPtr( unsigned int cElements )


{


    if( m_pMax – m_pCurrent > cElements )


        m_pCurrent += cElements;


    else


        throw;


}


OK, so here’s the question – if an error has happened, and m_pCurrent is > m_pMax, which implies the difference in the pointers is negative, which code branch do we execute? Assume cElements is a reasonably small number.


Hmmm – the immediate answer would be that the difference gets cast to an unsigned int to be compared with cElements, if it is negative, then it becomes large, and it is not less than cElements, so we throw, so this code is safe, right?


The answer, unfortunately, is a solid maybe. Back in engineering school, I got acquainted with something called dimensional analysis where you worked something out based on dimensions. For example, if you want to know how to get gallons from some number of miles and miles/gallon, figuring (miles / (miles/gallon)) shows the answer is in gallons. A similar approach for integers is often helpful. Let’s look at the types involved in the if statement. First, what is the type of a pointer difference? That’s a ptrdiff_t – which is a signed number that is the same number of bytes as a pointer, which means that on a 64-bit build, it is an __int64, and on a 32-bit build, it is a 32-bit int.


What we now have is:


If( ptrdiff_t < unsigned int)


If you have a 32-bit build, this works out to:


If( int < unsigned int)


Which then implies:


If((unsigned int)int < unsigned int)


The cast gives you an implied check for the lhs being less than zero (assuming reasonably small values for the unsigned number), and negative numbers will now fail, and since this is an error, this is what we want, and life is good. Do note that if you’re compiling with all the warnings on, this will cause a warning, which you’d probably ignore, or cast away, being oblivious to the impending doom that is approaching.


Now consider 64-bit:


If(__int64 < unsigned int)


This gets cast very differently…


If( __int64 < (__int64)unsigned int)


You won’t get a warning on your 64-bit build because the cast from unsigned int (like the cast from unsigned short to int) preserves value, and the assumption is that the comparison will always be correct. Under the error condition outlined here, the problem is that we’re now not catching the error, we’ll add to a pointer that’s already probably bogus, and things will get worse from here.


As you can see, which branch gets executed depends on whether you’re building 32 or 64-bit!


The solutions and lessons are –



  • Be wary of pointer math in porting code to 64-bit – ptrdiff_t is negative, and changes size.

  • Using the bit flipping of negative to very large positive effect is really programming with side effects, and being clever, neither of which are good practices.

  • Explicitly determine which path to take when dealing with negative numbers in a comparison

BTW, SafeInt will be available very shortly on CodePlex – I have just one more internal hoop to jump before I can post it.

Comments (5)

  1. Shane Hatagawa says:

    Is taking _either_ branch of this function a good thing?  You imply that having the function throw is ‘safe’, but the class is not in a valid state, right?  Any attempt to use the class is already going to access an out of bounds element, changing m_pCurrent from one invalid value to another isn’t going to help, but throwing from this function isn’t going to help either.  Ideally this class would be checking invariants, and would have caught this when the error occurred.  As for the secure choice in this circumstance?  I don’t see how you could do anything safe here except terminate (if possible saving as much user state for a better restart/recover).  None of this takes away from your discussion here, but sometimes in the face of broken program state, there is no safe alternative except to just stop the train.

    [dcl] good points, but it is just an example, and what’s the right error path is very dependent on a lot of context we don’t have. The main point is that give the same input, we go down different branches on 32 and 64-bit, which is a really nasty trick.

  2. david_leblanc says:

    Someone wrote me, and pointed out:

    You give the following example:

    void IncPtr( unsigned int cElements )

    {

       if( m_pMax – m_pCurrent < cElements )

           m_pCurrent += cElements;

       else

           throw;

    }

    The logic of this conditional statement seems messed up.  Suppose the variables have been initialized like this:

    static int array[10];

    m_pCurrent = array + 5;

    m_pMax = array + 10;

    i.e. m_pMax points to the end of the array and m_pCurrent points somewhere in the middle.  From the names, I expect this is how the variables would be used.

    Then, something calls IncPtr(2).  Now, m_pMax – m_pCurrent is 5, and 5 < 2 is false, so the function throws (assuming it’s called from a catch clause) as if there had been an error.  However, it would have been perfectly safe to increment m_pCurrent by 2.

    On the other hand, if something calls IncPtr(99), then 5 < 99 is true, and the function happily does m_pCurrent += cElements, incrementing the pointer far past the end of the array.

    Thus, the test should be inverted.  And if you do that, then the behaviour in 64-bit builds, where the unsigned int gets converted to ptrdiff_t without any compiler warnings, will actually become correct; and the 32-bit builds that trigger a warning will exhibit the bug.

    [dcl] This has now been fixed. The overall point that the branch is different depending on build is the important bit.

  3. aaron says:

    Which is why ptrdiff_t and size_t should always go hand in hand.  The error is that cElements should be of type size_t.

    [dcl] That and overloading the bounds check for the pointer difference to rule out negative value by using a 'clever' side-effect.

    Also, I thought that, technically, it's possible for sizeof(ptrdiff_t) != sizeof(void*), but rather you're only guaranteed that sizeof(intptr_t) == sizeof(void*).  In practice, though, you'll probably get by with either.

    [dcl] I think you're right, but as you point out, in practice they're the same.

  4. gvisser says:

    A better comparison is to test the sum vs the max:

     if ( (m_pCurrent + cElements) > m_pMax ) throw;

    [dcl] Except that any pointer addition outside of the buffer isn't defined by the standard, and could be optimized out, which is exactly what gcc did a while back. Regardless of what the compiler does, we shouldn't write non-standard code.

     

    Your issue with

     If ( ptrdiff_t < unsigned int )

    becoming

     If (__int64 < unsigned int )

    is the result of Microsoft's decision to use the LLP64 Data Model, not because ptrdiff_t is evil.

    [dcl] That's incorrect. It's the result of the C standard and how operator casts work.