If you have to cast, you can’t afford it


A customer reported a crash inside a function we'll call XyzConnect:

DWORD XyzConnect(
    __in DWORD ConnectionType,
    __in PCWSTR Server,
    __in PCWSTR Target,
    __out void **Handle);
...
// HACK - Create a dummy structure to pass to the XyzConnect
// function to avoid AV within the function.
int dummy = 0;    
if ( NO_ERROR != ( XyzConnect( 0, L"", L"", (PVOID*)&dummy ) )
{
    TRACE( L"XyzConnect failed." );
    return FALSE;
}
...

The title of today's entry gives the answer away. (The title is also an exaggeration, but it's a pun on the saying If you have to ask, you can't afford it.)

The last parameter to the XyzConnect function is declared as a void**: A pointer to a generic pointer. Note that it is not itself a generic pointer, however. A generic pointer can point to anything, possibly unaligned. But this is an aligned pointer to a generic pointer. Therefore, the memory for the generic pointer must be aligned in a manner appropriate to its type.

But this caller didn't pass a pointer to a pointer; the caller passed a pointer to an int, and an int has different alignment requirements from a pointer on 64-bit systems. (You might conclude that this decision was the stupidest decision on the face of the planet, but that's a different argument for a different time. For example, I can think of decisions far stupider.)

When the XyzConnect function tries to dereference this purported void ** pointer, it encounters an alignment fault, because it does not in fact point to a void * as the type claims, but rather points to a DWORD. A DWORD requires only 32-bit alignment, so you have a 50% chance that the DWORD* is not suitably aligned to be a void*.

Mind you, you also have a 100% chance of a buffer overflow, because a DWORD is only four bytes, whereas a void* is eight bytes. The function is going to write eight bytes into your four-byte buffer.

When this question was posed, one person suggested changing the DWORD to a __int64, since the __int64 is an 8-byte value, which is big enough to hold a pointer on both 32-bit and 64-bit Windows. Then again, it's overkill on 32-bit systems, since you allocated eight bytes when you only needed four. Another suggestion was to use DWORD_PTR, since that type changes in size to match the size of a void*.

Well, yeah, but here's another type that matches the size of a void*: It's called void*.

Just declare void *dummy and get rid of the cast. And get rid of the comment while you're at it. If you do it right, you don't need the cast or the hack.

void *handle = 0;    
if ( NO_ERROR != ( XyzConnect( 0, L"", L"", &handle ) )
{
    TRACE( L"XyzConnect failed." );
    return FALSE;
}

A large number of porting problems can be traced to incorrect casts. The original author probably inserted the cast to "shut up the compiler" but the compiler was trying to tell you something.

Any time you see a function cast or see a cast to/from something other than void* or BYTE*, then you should be suspicious, because there's a chance somebody is simply trying to shut up the compiler.

Comments (41)
  1. Karellen says:

    "an int has different alignment requirements from a pointer on 64-bit systems. (You might conclude that this decision was the stupidest decision on the face of the planet"

    Why might someone conclude that?

  2. dave says:

    A cast is supposed to mean "Yes, I know that’s my foot, but I’m aiming very carefully between my toes".  Unfortunately, most of the programmers who think they know what they’re doing, don’t.

  3. someone else says:

    From what I heard, “attacking Iraq” is currently the worst decision of the 21st century, closely following “shooting Archduke Franz Ferdinand” as stupidest decision ever.

  4. Evan says:

    Not that I would recommend this over a void*, but ‘intptr_t’ can be a good weapon to have in your arsenal too.

  5. Sohail says:

    Ah, so this is why Java was invented.

  6. Nathan_works says:

    The compiler is telling you: "Give Me What I Want and I will Go Away"

  7. Alexandre Grigoriev says:

    Some people are willing to go to great lengths to shoot themselves in the foot.

  8. This leaks the handle, too.

  9. Leo Davidson says:

    Amen to that, especially with function casts which can hide mistakes about calling conventions (that may only cause intermittent crashes on some architectures and thus be very hard to track down but very much a problem).

  10. Mark says:

    Maurits: the documentation for XyzConnect specifies that no handle is created for a NULL connection type.

  11. Anonymous says:

    Leave it to Old New Thing readers to complain about a leak in callers of made-up APIs concocted to illustrate a point.  Be sure to file a bug on the XyzConnect() team. :-)

  12. Maurits says:

    no handle is created for a NULL connection type

    In that case it should allow passing NULL for the output variable.

  13. Peeter Joot says:

    When we did our 64-bit port, the first addition to our coding standards was:

    64-bit rules:

    AVOID CASTING AT ANY COST.

    AVOID CASTING AT ANY COST.

    AVOID CASTING AT ANY COST.

    Oh, did I also mention:

    AVOID CASTING AT ANY COST.

    AVOID CASTING AT ANY COST.

    AVOID CASTING AT ANY COST.

    … I can’t count the number of times I saw this in the port (compile time and run time varients of this).  It was a lesson that should have already been learned, since our product was old enough that it had already been through a 16-bit to 32-bit port;)

  14. John says:

    Peeter: You can’t realistically avoid all casting.  I think a better target would be to avoid casting between non-pointer and pointer types (i.e. int to/from void*); this would probably catch 99% of the issues.  We have so much code that assumes a pointer is 4 bytes; it is going to be a pain if/when we try to build a 64-bit port.

  15. dave says:

    re: Peeter Joot

    Yes, I agree. I operate under "every cast indicates an error of design".

    I know it’s not literally true all the time, especially in C++, and especially when you’re writing low-level code, but nevertheless I find it a useful starting position.

    At least in C++ you get to say why you’re casting (C-style casts are forbidden in C++ code submitted to me review).

  16. Dave O says:

    To say ‘there’s no such thing as a good cast’ is all very well, but it’s a very simplistic view.

    If you’re casting to shut up errors, then it’s clearly bad form, but the quantity of casting I have to do to shut up warnings does disturb me a bit.

    I mean, clearly (to a human) there’s nothing wrong with:

    uint32 a = 0xff837245;

    uint16 b = a & 0xffff;

    (where uint32 == a 32-bit unsigned integer, etc.)

    but sadly the compiler warns on this case. What do I do? I feel uncomfortable inserting a cast, both for the reasons given above and the fear that in the future I might change the behaviour to something that’s actually wrong, but the cast would shut up the error.

    But I either have to cast or disable the warning – nightmare.

    Sure, I’m mostly a fairly low-level coder :)

    Andrew Koenig’s point about the ‘danger’ of warnings really resonated with me when I read it, and it’s worth a look: http://www.ddj.com/blog/cppblog/archives/2007/10/the_hazards_of.html

  17. Yuhong Bao says:

    “Any time you see a function cast or see a cast to/from something other than void* or BYTE*, then you should be suspicious, because there’s a chance somebody is simply trying to shut up the compiler.”

    Even with casts to void*, sometimes you have to be careful too.

    http://blogs.msdn.com/sdl/archive/2009/07/28/atl-ms09-035-and-the-sdl.aspx

    [True, but the false positive rate for void* casts is too high to make a blanket rule out of it. -Raymond]
  18. GrumpyYoungMan says:

    @Dave O

    If a development organization is so inflexible that there isn’t a way for a code review panel to determine it would be okay to turn off a warning for a section of code where it’s needed, then obviously it’s the organization that’s defective, not the compiler warning.

  19. Anonymous Coward says:

    Someone Else, I won’t comment on recent history since that would just invite people to draw unwarranted conclusions about my political views, but I can say that the assassination can’t possibly be classified as the worst decision ever.

    Princip knew what he was doing, he knew the likely consequence of what he was doing, and never regretted a thing. He wanted to rid the world of Franz Ferdinand whatever the cost and he succeeded. He made the right decision.

    Of course, most of Europe was pretty unhappy with the result, and in letting the situation come about errors were certainly made.

    This is a tech blog so I won’t spend too much time on where to hunt for those mistakes, but here are some places where you can get started on your trail, should you be so interested: Franz Ferdinand himself, widespread nationalism, Franz Conrad von Hötzendorf, and of course the various politicians who helped shape the military and socio-economic environment that in amplifying small incidents like the assassination made war as good as inevitable.

  20. Peeter Joot says:

    You can’t realistically avoid all casting.

    Okay, I admit it.  You caught me being overly dramatic;)  I’m actually a realist, and for all the thousands of problematic casts that were fixed in our port (DB2 UDB … a large product) there are probably 10’s of thousands more that weren’t.

      I think a better target would be to avoid casting between non-pointer and pointer types (i.e. int to/from void*)

    When/if you try a 64-bit port you’ll actually find that the compiler doesn’t allow this without first using an intptr_t cast or something else similar in between, so many of your casts will be flagged as wrong for you.  There are also helpful compiler options on many platforms to help identify 64-bit porting issues.

    A bigger problem for us wasn’t actually the fact that we had many places that used 4-byte values for pointer sized things, but that we had many places where long was used for 4-byte values.  Our porting to 64-bit was all initially on Unix where long increases size (unlike Windows where they kept long as 32-bit).  The pervasive long usage in our code ironically dated back to our 16-bit to 32-bit historical roots where "for 32-bit portability" direct use of ‘int’ was not supposed to be used, and nobody had the forsight at the time to use a typedefs for explicit 32-bit values;)

  21. Worf says:

    That reminds me of a list I saw linked from here that was describing horrible APIs, and one was where correct usage was difficult. Like all the socket APIs requiring you to cast sockaddr_in/sockaddr_in6 to sockaddr (and neverming sockaddr_storage).

    Thankfully IPv6 introduced an API that fixes this (getaddrinfo and friends) and that is compatible with other protocols as well, yes, IPv4. But it gets you a sockaddr.

  22. Semi Essessi says:

    I often find myself casting ints to floats or similar to get different behaviour than the automatic type conversions give. The most obvious example is wanting the ratio of two values which must be integral e.g. pixel width/height of a window, for the aspect ratio.

    There are some more other uses of cast I would consider valid too. Lets say you have some "bytes" you want to execute from memory: reinterpret_cast<void(__stdcall *)()>(bytes)(); The only alternative I can think of is some inline assembler.

  23. PVS-Studio (Viva64) diagnose this problem as: "Error V114: Dangerous explicit type pointer conversion. simple.cpp 15"

    Viva64 tool provides detection of errors typical of 64-bit Windows applications. Viva64 is a lint-like static analyzer of C/C++ code. Viva64 integrates into Visual Studio 2005/2008 environment.

    Link: http://www.viva64.com/pvs-studio/

  24. Csaboka says:

    Semi: I don’t think anyone considers numeric casts as dangerous. The worst they can do is chopping off some bits or lose precision, but they will never crash your app.

    Actually, speaking in C++ terms, static_cast and dynamic_cast should be harmless casts. Reinterpret_cast and const_cast, on the other hand, should be used very, very carefully. (The example in the original post would have required a reinterpret_cast if done with C++-style casts. The complier would have rejected a static_cast here since void* and int aren’t related types.)

  25. rs says:

    @Csaboka

    I don’t think anyone considers numeric casts as dangerous. The worst they can do is chopping off some bits or lose precision, but they will never crash your app.

    I thought so, too. Until I discovered this C program (try on a 32 bit system):

    int main(void)

    {

    volatile unsigned int a = 1 &lt;&lt; 8*sizeof(int)-1;
    
    volatile int b = -1;
    
    return (int) a / b;
    

    }

    The int cast causes the program to crash.

  26. Dave O says:

    @GrumpyYoungMan

    If a development organization is so inflexible that there isn’t a way for a code review panel to determine it would be okay to turn off a warning for a section of code where it’s needed, then obviously it’s the organization that’s defective, not the compiler warning.

    Agreed, but in the case in question, the only organisation is me :) . The problem isn’t that I’m not allowed to turn the warning off, it’s that – at least in the particular case I descibe – I don’t want to turn the warning off, because it’s a ‘false positive’; the non-false-positive case might tell me something useful; and disabling / enabling such warnings on a line-by-line case is unspeakably horrible.

    Sure, I want the moon on a stick, too :) .

  27. Dave O says:

    I don’t think anyone considers numeric casts as dangerous. The worst they can do is chopping off some bits or lose precision, but they will never crash your app.

    As above – for example, few people seem to be aware that the integer divide -MAX_INT / -1 will cause an unhandled Integer Overflow exception on x86, and that C / C++ code doesn’t do anything to insulate you from this.

    (   volatile int x = (int)0x80000000;

       volatile int y = -1;

       volatile int z = x / y;

    if you want to give it a try).

  28. Csaboka says:

    rs, Dave O:

    Granted, this is a failure mode I didn’t know about, but it has nothing to do with casting. I can reproduce the same issue without casting. Observe:

    int main() {

    volatile int x = -INT_MAX;
    
    x--;
    
    volatile int y = -1;
    
    x /= y;
    
    return 0;
    

    }

    I still maintain that static_cast and dynamic_cast shouldn’t cause crashes. Incorrect use of the returned values can cause crashes, of course, but that isn’t a problem with the cast per se.

  29. Not an expert says:

    I have cast a pointer offset into a character buffer holding raw file header bytes to extract a value.

    unsigned long byteRate = *(unsigned __int32*)(buffer+28);

    It seems to work but perhaps someone more knowledgeable than I will explain the flaws with this approach, excepting of course that I should have used a C++ style cast.

    [This falls under the “cast from void* or BYTE*” umbrella (BYTE* is unsigned char*). Though you’re in trouble if “buffer” is not on a 4-byte boundary. -Raymond]
  30. lh says:

    I would hear the same things when using things like boundchecker or lint.

    I would also then almost immediately hear from the other devs with ‘years of experience’ ‘Its too noisy and most of those errors are meaningless.’

    I would always say ‘prove it, then you can cast the errors away’.  Almost every single time it was a ‘oh yeah could be very bad’.  I would say 99.9% of the time those warnings are good to look at.

    In my ‘years of experience’ ignore those warnings and errors at your own peril.

  31. someone else says:

    @Anonymous Coward

    “Princip knew what he was doing, he knew the likely consequence of what he was doing, and never regretted a thing.”

    As did Charles Manson.

    I am well aware that the assassination was merely the spark that ignited a politically unstable situation. It’s also well known enough so I wouldn’t have to explain.

  32. Lucas says:

    I currently have an image processing framework being used by students. In it, I perform a reinterpret_cast on Qt’s QImage raw pixel data (and I’ve told the QImage to translate itself to 4-bytes per pixel before doing this) in order to typecast it to a Pixel object, which contains four bytes which have been #pragma pack’d. Red, green and blue channels are used for arbitrary ‘color data’, and the alpha channel is used to determine what sort of data is stored in the other three channels, whether that is r/g/b, y/i/q, h/s/i, or whatever.

    This way, code can operate directly on intensity values in an image (modifying the ‘y’ byte, with a couple of optional wrapper methods like SetIntensity() to check to see if the pixel is in yiq mode) without needing to allocate additional memory.

    It’s more of an odd design decision than a ‘shut up the compiler’ sort of quick-hack. However, I’m not entirely convinced it’s the right way to approach this.

  33. yme says:

    Dave O said: "disabling / enabling such warnings on a line-by-line case is unspeakably horrible"

    Why?

    I mean, the alternative is casting, and that’s also done line by line.

    The best option does seem to me to be disabling the warning for a particular section of code after convincing yourself that the code in question is truly ok.  The warning served its purpose — to get you to look carefully at the code — and now you don’t need it anymore.

  34. Medinoc says:

    The problem is that some functions somehow require you to cast, such as send() and recv(). For some reason, they demand a char* instead of a void* (this might or might not have been the case in the original BSD specification, but it’s definitely not the case in POSIX).

  35. SuperKoko says:

    @yme:

    pragmas to disable warnings are compiler-dependent which make the code hard to read for people who aren’t used to this compiler, and look like noise.

    Explicit type casts are another way to disable warnings and are ugly too.

    Just avoid needing typecasts. This makes the code prettier and more portable. That’s not too hard, except when dealing with badly-designed third party libraries.

  36. avek says:

    @yme:

    > Dave O said: "disabling / enabling such warnings on a line-by-line case is unspeakably horrible"

    > Why?

    Because disabling and reenabling of warnings require many lines each. To put all those push/pop or on/off pragmas there. And guard them by #ifdef _MSC_VER. Casting is much more compact and doesn’t have any side-effects-up-to-the-end-of-file by design.

    If the code in question is not yours, the best possible option to deal with warnings there is simple: you should ignore them.

    One only should fix something in imported code if it has been tested and the tests show that something’s wrong there. (I speak of tests and problems in the most general sense: if it doesn’t compile or link, it qualifies as test failure, too). Otherwise, it works; if it works, don’t touch it. That’s the foundation of modularity and code reuse in real life.

    If the code with warnings is yours, however, the right solution is to rewrite it so it compiles without warnings. Cast can be used in the process, as the last resort. And every use of casts must be marked by a comment explaining the cast, of course.

    @Medinoc:

    >The problem is that some functions somehow require you to cast, such as send() and recv().

    > For some reason, they demand a char* instead of a void*

    The only thing that you can actually transmit over the net is a sequence of bytes. And if you try to send or receive something else, like a C structure or, even worse, a C++ object, that doesn’t have to work in practice, because structure fields size, alignment, byte order, internal compiler-generated fields, etc. are never fully under your control in C. The proper way to send data is to platform-independently serialize it into char[] buffer, then send the buffer.

    So, if you use anything but char[] as buffer argument for send/recv, it’s actually a hack. And non-portable. And your code should at least be marked by something that smells; a cast will stink just strong enough.

    But there are other APIs, Windows full of them, that require dangerous casts. For instance, COMCTL32 controls often let you read and write some parameters through the same struct. For reading, pointers to strings in there are of LPTSTR type and must point to writeable memory buffers. And when you set that string instead of reading it, you must fit it into the same LPTSTR, even if the string’s not writeable (and, according to the documentation, needs not be).

    Enter the const_cast. And since most of the programmers (that I know of) don’t know the const_cast keyword, the dreaded (LPCTSTR)pszText comes in its stead.

  37. porter says:

    Nobody has mentioned

    union

    {

        void *pvIsWhatItExpects;

        DWORD dwIsWhatIWillGiveIt;

    } arg;

    arg.dwIsWhatIWillGiveIt=0;

    XyzConnect(NULL,L"",L"",&arg.pvIsWhatItExpects);

    No casts or warnings were harmed in the compilation of this code! Your milage may vary by design.

  38. Goran says:

    @porter: unfortunately, in the light of the discussion (64 bit issues) that should be:

    union

    {

       void *pvIsWhatItExpects;

      DWORD_PTR dwIsWhatIWillGiveIt;

    } arg;

    Idea is good, though.

  39. porter says:

    > @porter: unfortunately, in the light of the discussion…

    I was deliberately leaving the shotgun on the wall in Act 1.

  40. porter says:

    > @porter: that’s illegal code, you’re only allowed to read the union member you last wrote.

    So is NET_ADDRESS_INFO from IPHlpApi.h :)

    Agreed, however the point was to demonstrate avoiding a cast or compiler error.

  41. Neil says:

    @porter: that’s illegal code, you’re only allowed to read the union member you last wrote.

    You’re always going to need a cast somewhere to get QueryInterface to work, but I can’t help wondering whether a void* result would have made more sense (although it occurs to me that it may have led to confusion when using CComPtr).

    void** version:

    STDMETHODIMP Foo::QueryInterface(REFIID iid, void** ppv)

    {

     if (IID_IBar == iid) {

       IBar *bar = new Foo2Bar(this);

       if (!bar)

         return E_FAIL;

       bar->AddRef();

       *ppv = bar;

       return S_OK;

     }

     // etc.

    }

    STDMETHODIMP nsBaz::GetBar(IBar** pbar)

    {

     void *pv = NULL;

     mFoo->QueryInterface(IID_IBar, &pv);

     *pbar = (IBar*)pv;

     return S_OK;

    }

    void* version:

    STDMETHODIMP Foo::QueryInterface(REFIID iid, void* pv)

    {

     if (IID_IBar == iid) {

       IBar *bar = new Foo2Bar(this);

       if (!bar)

         return E_FAIL;

       bar->AddRef();

       *(IBar **)pv = bar;

       return S_OK;

     }

     // etc.

    }

    STDMETHODIMP nsBaz::GetBar(IBar** pbar)

    {

     return mFoo->QueryInterface(IID_IBar, pbar);

    }

Comments are closed.

Skip to main content