What's wrong with this code, part 4: the answers

As I mentioned yesterday, this is a subtle problem.  Apparently it wasn’t subtle enough for the people commenting on the API, without fail, everyone nailed it perfectly.

But this IS a problem that I run into at least once a month.  Someone comes to my office and says “I’m getting an undeclared external in my application and I don’t see why!”  95% of the time, the problem is exactly the one in this bug.

Here’s the offending line in the buggy header:

NIFTY_API_API int MyNiftyAPI(void);

Not really surprising, heck, it’s the ONLY line in the buggy header.  What’s wrong with the line is actually an error of omission.  The header doesn’t specify the calling convention used for the API.  As a result, in the absence of an explicit calling convention, the compiler assumes that the current calling convention is the calling convention for the API.

Unfortunately that’s not the case.  The calling convention for an API is set by the compiler when the code is built, if every part of the project uses the same calling convention, you’re fine, but if anyone compiles their code with a calling convention other than yours, you’re toast.  Raymond goes into some detail on how to diagnose these problems here, so…  As I mentioned yesterday, he’s written a number of posts on the subject.

The key indicator that there might be a problem was my statement “I’m writing a DLL”.  If this was just a routine in my application, it wouldn’t matter, since all the components in my application are usually compiled with the same set of compiler settings.

But when you’re dealing with DLL’s (or statically linked libraries), the consumer of your code typically isn’t in the same project as you are, so it’s absolutely critical that you specify the calling convention you used to prevent them from using the “wrong” calling convention in their code.

Kudos:

Grant pointed out the problem initially, followed quickly (and more authoritatively) by Borsis Yanushpolsky.  Everyone else posting comments also picked it up.

Grant also pointed out (in private email), and Mike Dimmick pointed out in the comments section that there’s another, equally glaring problem with the header.  It is missing an extern “C” to correctly inform the compiler that the API in question shouldn’t use C++ name decoration.  The code should have been wrapped with:

#ifdef __cplusplus
extern "C" { /* Assume C declarations for C++ */
#endif /* __cplusplus */ 

#ifdef __cplusplus
} /* End of extern "C" { */
#endif /* __cplusplus */ 

 So the full version of the header should be:

// The following ifdef block is the standard way of creating macros which make exporting
// from a DLL simpler. All files within this DLL are compiled with the NIFTY_API_EXPORTS
// symbol defined on the command line. this symbol should not be defined on any project
// that uses this DLL. This way any other project whose source files include this file see
// NIFTY_API_API functions as being imported from a DLL, whereas this DLL sees symbols
// defined with this macro as being exported.
#ifdef NIFTY_API_EXPORTS
#define NIFTY_API_API __declspec(dllexport)
#else
#define NIFTY_API_API __declspec(dllimport)
#endif

#if defined(_STDCALL_SUPPORTED)
#define STDCALL __stdcall // Declare our calling convention.
#else
#define STDCALL
#endif // STDCALL_SUPPORTED

#ifdef __cplusplus
extern "C" { // Don’t use C++ decorations for our API names
#endif

NIFTY_API_API int STDCALL MyNiftyAPI(void);

#ifdef __cplusplus
} // Close the extern C.
#endif

I have a huge preference for __stdcall APIs.  They have all of the benefits of the __cdecl calling convention (except for the ability to support variable numbers of arguments) and they result in significantly smaller code (since the routine cleans its stack, not the caller).   As I mentioned in a comment in the previous post, the savings that NT achieved when it switched its default calling convention from __cdecl to __stdcall was huge – far more than we had anticipated.

There’s still one more potential bug: The header file isn’t wrapped in either a #pragma once, or in #ifdef _NIFTY_API_INCLUDED_/#define _NIFTY_API_INCLUDED_/#endif // _NIFTY_API_INCLUDED_.  Given the current API header this isn’t a problem, but if the header file is included from multiple places, the possibility exists that definitions within the header could result in multiple definitions, which could later result in errors.

McGroarty brought up an interesting point:

I'm no Windows guy, but I'll put a cautious eye to the generic int being subject to the signedness and size of the day.

I hadn’t considered this originally, but he has a point.  Int is the only fundamental C/C++ type that has a variable size, it is guaranteed to be at least as large as a short (which is larger or equal to a char, but guaranteed to be able to hold values from -32767 to 32767).  So an int can be either a 16 bit, 32 bit or 64 bit quantity.