What’s the difference between int and INT, long and LONG, etc?


When you go through Windows header files, you'll see types with names INT, LONG, CHAR, and so on. What's the difference between these types and the uncapitalized ones?

Well, there isn't one any more.

What follows is an educated guess as to the story behind these types.

The application binary interface for an operating system needs to be unambiguous. Everybody has to agree on how parameters are passed, which registers are preserved, that sort of thing. A compiler need only enforce the calling convention rules at the boundary between the application and the operating system. When a program calls another function provided by that same program, it can use whatever calling convention it likes. (Not a true statement but the details aren't important here.) Therefore, a calling convention attribute on the declarations of each operating system function is sufficient to get everybody to agree on the interface.

However, another thing that everybody needs to agree on is the sizes of the types being passed to those functions or used in structures that cross the application/operating system boundary. The C language makes only very loose guarantees as to the sizes of each of the types, so language types like int and long would be ambiguous. One compiler might decide that a long is a 32-bit integer, and another might decide that it's a 64-bit integer. To make sure that everybody was on the same page, the Windows header files defined "platform types" like INT and LONG with prescribed semantics that everybody could agree on. Each compiler vendor could tweak the Windows header file to ensure that the type definition for these platform types resulted in the size that Windows expected. One compiler might use typedef long LONG; another might use typedef __int32 LONG.
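
In other words, something along these lines (an illustration only, not the actual Windows headers):

    /* Illustration only -- not the real Windows headers. */

    /* A vendor whose native long is 32 bits can simply write: */
    typedef int  INT;     /* always a 32-bit signed integer */
    typedef long LONG;    /* always a 32-bit signed integer */

    /* A vendor whose native long is 64 bits would instead pick its own
       fixed-width keyword, e.g. "typedef __int32 LONG;", so that code
       compiled with either compiler sees the same 32-bit LONG. */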

Okay, but this doesn't explain VOID. Maybe VOID was added for the benefit of compilers which didn't yet support the then-new ANSI C standard type void? Those older compilers could typedef int VOID; and functions that were declared as "returning VOID" would be treated as if they returned an integer that was always ignored. Or maybe it was just added to complete the set, who knows.
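
If that guess is right, the fallback might have looked something like this (a sketch only; the feature-test macro is hypothetical):

    #ifdef COMPILER_LACKS_VOID          /* hypothetical feature-test macro */
    typedef int  VOID;   /* "returns VOID" really returns an ignored int */
    #else
    typedef void VOID;   /* ANSI compilers: VOID is simply void */
    #endif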

In the intervening years, most if not all compilers which target Windows have aligned their native types with Windows' platform types. An int is always a 32-bit signed integer, as is a long. As a result, the distinction between language types and platform types is now pretty much academic, and the two can be used interchangeably. New Windows functions tend to be introduced with language types, leaving platform types behind only for compatibility.

Comments (40)
  1. waleri says:

    If only more suitable names were chosen… like BYTE/WORD/DWORD, although strictly speaking they are also incorrect (since word size is CPU dependent). Best choice would be INT32 or UINT16… And we wouldn’t end up with DWORD_PTR…

    Now where is that time machine?

  2. Mark Steward says:

    WORD, DWORD, etc are used for the unsigned variants.

  3. SvenGroot says:

    Yeah, it’s eternally confusing that a WORD is actually only half a CPU word on Win32, and DWORD is a full single word. And the whole _PTR range of types is very useful, but also completely non-semantic when it comes to type naming.

    But unfortunately science hasn’t made any progress on that time machine yet, so what can you do?

  4. Aaargh! says:

    "Now where is that time machine?"

    It’s in the dock, second icon from the right.

  5. JM says:

    It’s very common for people who don’t understand the issues at hand to blame this on the bone-headed C committee, or on the narrow-minded Microsoft developers, or on the stupid application programmers, or…

    The paradoxical situation is that C attempts to maximize portability by not tying the integral types down to specific sizes, and programmers are just as quick to abandon that completely and weave the assumption that an int is exactly X bits into the heart of their programs, since C makes it so easy to write code that assumes these things — even though it’s almost never harder to write equivalent code that doesn’t.

    If you think about it, though, the "maximally portable" approach breaks down rather easily, for exact reasons of ABI and API compatibility. What type should you use for file lengths? int? long? unsigned long? unsigned long long? Which one’s going to be big enough, you think? If your API uses a LARGE_INTEGER struct, how many people are going to want to write glue code for maintaining portability?

    All these API-specific typedefs do have the big drawback that these types seep into perfectly portable code that doesn’t need to be tied to the Win32 API. Windows programmers are often incapable of or unwilling to write a single line of code that doesn’t require "#include <windows.h>" at the top, which is just a shame.

  6. Daev says:

    @SvenGroot

    "Word" as a term for a two-byte quantity goes back a long way in the nomenclature of Intel processors, into the days of the 8080 and Z-80 (whose native data size was a byte).  I remember writing DEFW to store a 16-bit number in TRS-80 Assembler.

  8. Yuhong Bao says:

    "Windows programmers are often incapable or unwilling of writing a single line of code that doesn’t require "#include <windows.h>" at the top, which is just a shame."

    The developer of VirtualDub says that there were several bugs in MFC resulting from, say, #define GetCurrentTime() GetTickCount(). I mean, basically what is happening here is that IFoo::GetCurrentTime() becomes IFoo::GetTickCount(), which was hidden because MFC code usually #includes <windows.h>.
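
    Here is a standalone sketch of that hazard; GetTickCount is stubbed out so the effect is visible without pulling in <windows.h>:

    #include <stdio.h>

    #define GetCurrentTime() GetTickCount()   /* old 16-bit compatibility macro */

    unsigned long GetTickCount(void) { return 1000; }   /* stand-in for the API */

    /* The author thinks this declares their own function, but after
       preprocessing it has quietly become another declaration of GetTickCount. */
    unsigned long GetCurrentTime();

    int main(void)
    {
        printf("%lu\n", GetCurrentTime());   /* also rewritten: calls GetTickCount */
        return 0;
    }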

  9. Nathan says:

    In the intervening years, most if not all compilers which target Windows have aligned their native types with Windows’ platform types.

    Not to be a nitpicker, but assuming this has gotten more than one programmer in trouble on 64-bit platforms. For this very reason, at the company I work for we use the C99-standardized integer types int8_t, int16_t, int32_t, and int64_t (along with their unsigned brethren) (we have the advantage of not exposing a public C API, so we can make changes like these when necessary–I fully understand that the Windows API functions were written well before the stdint types came along and thus these types are not a universal panacea).
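
    As a sketch of that practice (the struct and field names here are made up), the fixed-width types pin the layout down on every compiler:

    #include <stdint.h>

    struct WireRecord {               /* hypothetical on-disk record */
        uint16_t version;             /* exactly 16 bits everywhere */
        uint16_t flags;
        int32_t  payload_length;      /* exactly 32 bits everywhere */
        int64_t  timestamp;           /* exactly 64 bits everywhere */
    };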

  10. Grant says:

    Now that PowerPC is dead this is also probably academic, but I always figured you could assume the endian-ness of a LONG correctly vs a long if you’re (for example) trying to parse a .bmp on a Mac.

  11. guess says:

    My guess would be that there was no technical reason and the uppercase typedefs were introduced in order to keep the style consistent (all type names are uppercase).

  12. Cooney says:

    The paradoxical situation is that C attempts to maximize portability by not tying the integral types down to specific sizes, and programmers are just as quick to abandon that completely and weave the assumption that an int is exactly X bits into the heart of their programs, since C makes it so easy to write code that assumes these things — even though it’s almost never harder to write equivalent code that doesn’t.

    C is portable assembler, so really, that sort of thing makes sense. It also makes guarantees on the minimum size of an int (or a long), so you can assume that a long has 32 bits you can use.

    In the intervening years, most if not all compilers which target Windows have aligned their native types with Windows’ platform types.

    Or rather, they’ve come to follow the unix way of doing things (more or less). Ints have been 32 bits in SPARC land for a while now.

    Now that PowerPC is dead this is also probably academic, but I always figured you could assume the endian-ness of a LONG correctly vs a long if you’re (for example) trying to parse a .bmp on a Mac.

    If you want to check endianness (or just see if it differs), cast the address of an int holding 1 to a char* and see if the first byte has something in it. There’s also the 0xFEFF thing that unicode uses.
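
    A sketch of that pointer-cast test:

    #include <stdio.h>

    int main(void)
    {
        int probe = 1;
        /* Little-endian machines store the low-order byte first, so it reads back as 1. */
        unsigned char first_byte = *(unsigned char *)&probe;
        printf("%s-endian\n", first_byte ? "little" : "big");
        return 0;
    }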

  13. me says:

    To waleri:

    But: A byte is not always 8 bits. That’s the reason that network standards (e.g. from IEEE, or the RFCs from the IETF) use the name "octet" instead.

  14. Ulric says:

    It has been a problem for us actually, in the places where there is inconsistent usage of LONG vs ‘long’.

    We use a product called MainWin, which is an officially licensed port of Win32/Win64 to Linux.  

    On 64-bit Windows with MS compilers: int and long are 32-bit.

    On 64-bit Linux with gcc: int is 32-bit, LONG is 32-bit, but the native compiler long is 64-bit.

  15. steveg says:

    I always thought Windows.h had the platform types because some genius in the Win16 days foresaw Win32 on the horizon?

    It was MUCH easier porting to Win32 if you’d used the platform types (not that they were called that then IIRC). The porting pain was also reduced if you’d used windowsx.h — that was also a piece of genius, IMO — not just because of porting concerns, it actively encouraged people to stop writing 9000 line switch statements, and break each message into a function (here I could wax lyrical about how well MFC and OWL did[n’t] do that, but I won’t).

    One of my favourite porting bugs I screwed up was moving an app to a new version of Irix: the OS changed PIDs from uint16 to uint32. No problems until the PID reached 2^16…

  16. edgar says:

    No one remembers the fight between Pascal and C?

    Better to write your own types, because you don’t know what will happen in the future. :)

  17. marius says:

    well….

    In VB6 byte is a byte (8 bits), boolean is actually an integer in memory so it’s better to use byte instead of boolean, integer is 2 bytes, long is 4 bytes… Obviously when trying to use API functions it’s a joy…

    If you don’t care about memory usage the application runs faster if most of the variables are of the long data type…

    VB.NET had to come along and say Integer is 4 bytes, Long is 8 bytes… For amateur programmers who want to convert their VB6 programs to .NET there’ll be lots of problems.

  18. Anon says:

    I always thought windowsx.h was the best Win16/Win32/Win64 ‘application framework’, despite or maybe because there isn’t really much to it.  

  19. Worf says:

    Do remember that C only guarantees one thing w.r.t. native types.

    sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

    64 bits is a whole new ball game. You can have int be 64-bit (forcing long to be 64-bit as well), int and long both be 32-bit (with long long as the generic 64-bit type), or int be 32-bit and long be 64-bit. Each is actually a different way of doing 64-bit computing (ILP64, LLP64, and LP64, respectively). At least, though, pointers are 64-bit.
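
    A quick sketch that makes the difference visible on whichever compiler you try it with:

    #include <stdio.h>

    int main(void)
    {
        /* Prints different numbers under ILP64, LP64, and LLP64 compilers
           (and under a plain 32-bit compiler, for comparison). */
        printf("int:       %zu bytes\n", sizeof(int));
        printf("long:      %zu bytes\n", sizeof(long));
        printf("long long: %zu bytes\n", sizeof(long long));
        printf("pointer:   %zu bytes\n", sizeof(void *));
        return 0;
    }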

    As for the word/dword… it’s architecture-specific. Motorola used word to be the native register size (32 bit), and halfword to be half that (16 bit). And things get fun with bytes, words, and longs…

  20. Neil says:

    I thought that (U)INT were, at least in 16/32 bit land, equivalent to (unsigned) int, and it was only WORD, DWORD and LONG that had a fixed length. (Strangely enough there was no signed short.) The one oddity was WPARAM which got changed from WORD to UINT.

    As for the windowsx.h GET_ macros, they show up the places that weren’t sufficiently forward compatible (which at least goes to show that the rest of Windows was, I guess).

  21. Ian says:

    To Grant,

    I think you made a couple of wrong assumptions. First, the platform types are nothing to do with endianness. LONG will be big-endian on a big-endian machine and little-endian on a little-endian machine. How can a simple typedef possibly reverse byte-ordering?

    Second, if you want to write portable code you should still take care not to assume endianness. Even if PowerPC were dead (which it isn’t) there are still plenty of other big-endian architectures.

  22. SuperKoko says:

    @waleri: That would be for a specific generation of the Windows API where the ABI is perfectly defined.

    Take a time machine, look at the 16-bit Windows -> Win32 transition, and understand why INT isn’t called INT32…

    Basically, INT is implicitly supposed to be an efficient platform-specific integer type.

    INT: 16 bits on 16-bit Windows, 32 bits on Win32, 32 bits on Win64.

    Yep, 32-bit integers tend to be more efficient, even on 64-bit computers, because they use far less cache and RAM.

    LONG being 32 bits on Win64 rather than 64 bits is only for backwards compatibility with badly programmed applications assuming LONG==INT==32 bits.

  23. Grant says:

    My point was, if you see something like this:

    struct IconDirectoryEntry {
        BYTE  bWidth;
        BYTE  bHeight;
        BYTE  bColorCount;
        BYTE  bReserved;
        WORD  wPlanes;
        WORD  wBitCount;
        DWORD dwBytesInRes;
        DWORD dwImageOffset;
    };

    You have a pretty good idea that things are little endian when you’re reading it from disk.  The same wouldn’t automatically hold true if it was all chars, ints, and longs.
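
    A sketch of reading those little-endian WORD/DWORD fields in a way that works on any host, big- or little-endian:

    #include <stdint.h>
    #include <stdio.h>

    static uint16_t read_le16(FILE *f)
    {
        unsigned char b[2];
        fread(b, 1, 2, f);                      /* error handling omitted */
        return (uint16_t)(b[0] | (b[1] << 8));
    }

    static uint32_t read_le32(FILE *f)
    {
        unsigned char b[4];
        fread(b, 1, 4, f);                      /* error handling omitted */
        return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
               ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    }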

  24. Khan says:

    @Worf: C guarantees more. It guarantees short and int are at least 16 bits and long is at least 32 bits.

  25. Cooney says:

     The same wouldn’t automatically hold true if it was all chars, ints, and longs.

    If you also write 0xFEFF to disk in a known location, you can assume that if it comes back as 0xFFFE, you need to byteswap. Goofy byte orders like 1324 seem to be a vax thing only.

  26. manyirons says:

    Way back I somehow learned that short and long were like a dozen.  A dozen is always 12, short is always 16 bits, and long is always 32 bits no matter what.  int and word are free to change, depending on the compiler.

    In my experience porting back and forth between various "16-bit" and "32-bit" platforms, the above was always correct.

    I’m surprised now to hear from Ulric that "long" has been defined as 64 bits on Linux 64 bit gcc.  Seems like a mistake; can’t we ever depend on anything in this business?

     – Owen –

  28. BryanK says:

    manyirons: You learned it wrong.  The only guarantee from the C language is that long is "at least 32 bits", and is also "at least as long as int".  (In turn, int is "at least as long as short", and short is "at least 16 bits".  Neither short nor long are "exactly N bits", for any N.)

    long can be any width that satisfies those two requirements; 16 bits for short, 32 bits for int, and 64 bits for long certainly do satisfy them.

  29. BryanK says:

    Actually, now that I think about it, sizes in C might be defined in terms of a multiplier on the size of a char, not a fixed number of bits.  A char is always one (machine) byte, and AFAIK C can compile for processors whose bytes are a size other than 8 bits.  So the "at least 16 bits" requirement on shorts might in reality be "at least double the number of bits in a char", and longs might be "at least four times the number of bits in a char".  Of course int is still defined as >= short, <= long, so that one doesn’t depend on the exact bit count.

    But I don’t know the C definitions for sure, though I don’t see any reason to exclude 9-bit-byte CPUs.  What I do know is that what I wrote above is valid on 8-bit-byte machines.

  30. SuperKoko says:

    "

    A char is always one (machine) byte, and AFAIK C can compile for processors whose bytes are a size other than 8 bits.

    "

    Yes. CHAR_BIT >= 8

    "

    So the "at least 16 bits" requirement on shorts might in reality be "at least double the number of bits in a char"

    "

    Wrong. The standard requirements are about the ranges [-32767,+32767] for short and int and [-2147483647,+2147483647] for long, which implies at least 16 bits for short and int, and at least 32 bits for long. Note that a binary representation, with or without padding bits, is required by the standard.

    However, a C compiler with 32-bit bytes may have sizeof(char)==sizeof(short)==sizeof(int)==sizeof(long)==1. This can be seen on pure 32-bit architectures where individual bytes cannot be addressed without heavy bit-mask operations.
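
    A sketch showing where those guarantees surface in the standard headers:

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("CHAR_BIT = %d (at least 8)\n", CHAR_BIT);
        printf("SHRT_MAX = %d (at least 32767)\n", SHRT_MAX);
        printf("INT_MAX  = %d (at least 32767)\n", INT_MAX);
        printf("LONG_MAX = %ld (at least 2147483647)\n", LONG_MAX);
        return 0;
    }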

  31. poochner says:

    Quite right, me (if that *is* your real name! :-)  A byte can be anything from 5 to 9 bits.  At least those are the ones I can think of.  This is one reason async chips can handle those sizes.  Though 9 wasn’t common on the chips for general 8-bit architectures (8080/Z80/6502, etc), since it was more of a mini-computer thing.

  32. Ulric says:

    manyirons: You learned it wrong.

    yep, ‘long’ was 64-bit on many 64-bit compilers.  

    To some of us, it’s Microsoft that made the mistake here, to accommodate some source compatibility that didn’t really save anyone work porting and verifying the code to Win64.

    At best, it’s a 50/50 decision.

  33. SuperKoko says:

    @Ulric: I admit that I, too, very much deprecate the LLP64 model.

    Portable C90 programs make effective use of the machine under the LP64 model, while the LLP64 model limits C90 programs to 4GiB files (assuming they use the standard fopen API) and 4GiB memory blocks (assuming the compiler conforms to the C90, C++98 and C++03 specifications, which state that size_t -> unsigned long conversions aren’t lossy).

    However, source-level portability of badly programmed Win32 applications (i.e. most of them) to Win64 is probably more important, at least from a marketing point of view, than standards conformance and effectiveness for portable programs.

    From what I see, Microsoft has always had the policy: make the transition smooth for most people, developers or users, rather than make the transition smooth & effective for people who got it right (which might be a minority).

    Making applications work, rather than making them efficient, is another Microsoft principle.

    Most people and developers don’t care too much if their application is a bit slow or if it cannot open files larger than 4GiB. However, they want their applications to work!

    Another Microsoft principle: keeping compatibility with itself gets a higher priority than getting compatibility with the rest of the world.

    You may disagree with these policies, but I think they’re the root of Microsoft’s success.

    Personally, I’m sad that Microsoft did/had-to use this ugly model.

    LONG didn’t change from 16-bit MS-DOS to 64-bit Windows! That’s incredible.

    BTW, don’t think that source-level compatibility isn’t important because there’s binary compatibility. Many libraries have to be rewritten and maintained for Win64. Many active projects try to move to Win64 too, because having a heterogeneous 32/64-bit development environment isn’t easy to manage.

  34. BOO says:

    BOOL (32 bits) differs from bool (8 bits).

    Therefore lowercase types cannot be used interchangeably with uppercase types.

  35. size matters says:

    It’s unfortunate that C (ANSI/ISO C and ANSI/ISO C++) doesn’t have explicitly sized integers.

  36. Cooney says:

    Why is that? It needs to deal with various diverse architectures fairly closely. As I said, it’s portable assembler.

  37. size matters says:

    Source code becomes more portable if it can assume a fixed size for integer types.

  38. BryanK says:

    "size matters": It does.  See C99’s <stdint.h> and int32_t / uint32_t types (and friends for other sizes).

    Oh, you’re not using a compiler that conforms to C99 (or at least this part of it)?  Sorry to hear that…

    :-P

    SuperKoko: Thanks!  The "must hold at least this range of integers, and must be a binary representation" definition for the various short/int/long types makes sense; I had no idea it was written that way, but that explains why people have always said "at least N bits".  :-)

  39. dave says:

    Source code becomes more portable if it can assume a fixed size for integer types.

    On the contrary – source code becomes less portable if it is unnecessarily tied to a particular size.

    Let us assume you want to operate on numbers that are in the range 0 to 9999.   You choose to type the data as "exactly 16 bit integer". Your hardware has 16-bit integer ops, so that’s ok.

    Now you want your program to run on hardware that doesn’t have 16-bit integer ops; if the data is forced to 16-bit size, then the code has to contort itself.  

    And you probably didn’t even care, really.

    Much better to use (in C) "int" or "unsigned int", which is a natural integer size for the machine.

    Apart from cases where you need to match externally-defined reality, you should use sizes that are large enough but not overly specified.

    These days, C says that an int is at least 16 bits and a long is at least 32 bits, but I prefer to code as if an int is at least 32 bits (thus avoiding needless use of long types, just in case long turns out to mean 128 bits). I am quite confident my code will never need to run on a 16-bit-int machine.

  40. SuperKoko says:

    @size matters:

    "

    It’s unfortunate that C (ANSI/ISO C and ANSI/ISO C++) doesn’t have explicitly sized integers.

    "

    Aren’t you aware of the 9-year-old C standard release?

    C99 DOES have explicitly sized integers.

    @dave:

    For good portability of programs *and* file formats, you need both machine-specific integers and specified integers.

    "

    Much better to use (in C) "int" or "unsigned int", which is a natural integer size for the machine.

    "

    Until you wish to write this into a file, and exchange the file.

    "I am quite confident my code will never need to run on a 16-bit-int machine."

    And you argue that your practice is portable!

    I’d rather use int_fast32_t.

Comments are closed.
