Death to “int” (and its variations)


I’ve been cleaning up a bunch of code lately (including warnings from our forthcoming static analysis tool).

I’m seeing a lot of old code that uses “int” when “unsigned int” was obviously the intended meaning. For instance:

AddEntriesToList( ENTRY * pEntry, int cEntries )
{

}

There’s realistically no way that you’d ever want to pass a negative number, so DON’T DECLARE IT AS AN INT!!! The same goes for LONGs vs ULONGs, INT_PTR vs UINT_PTR, and so on.

You’ll often know you’re in the presence of this code when you see “%d” used in a printf/sprintf format string.
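To make the “%d” point concrete, here is a minimal sketch, assuming a 32-bit unsigned int. FormatEntryCount is a hypothetical helper, not something from the code being cleaned up. It pairs an unsigned count with the matching “%u” specifier; with “%d”, a count above INT_MAX would typically be shown as a negative number.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: formats an entry count for logging.
// cEntries is unsigned, so the conversion specifier is %u, not %d.
std::string FormatEntryCount(unsigned int cEntries)
{
    char buf[32];
    int written = std::snprintf(buf, sizeof buf, "%u entries", cEntries);
    // snprintf returns the length it wanted to write; check that it fit.
    assert(written >= 0 && written < static_cast<int>(sizeof buf));
    return std::string(buf);
}
```

Calling FormatEntryCount(3000000000u) yields the text "3000000000 entries", which is the value the caller actually passed.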

I’ve written a lot of code to be cross-CPU compatible, so I’ve become especially attuned to declaring the right data type.

My assertion: all of your integer data types should explicitly be defined as unsigned by default, unless you can articulate why they should be signed.


Comments (12)

  1. Trey Nash says:

    I could not agree more! There are many more little gripes where this one came from. Don’t get me started! 😉

    -Trey

  2. Johan Ericsson says:

    Why declare them as unsigned?

    See "Large Scale C++ Software Design" by Lakos for a counter-argument.

    Search for "unsigned lakos c++" in google groups.

    "Since reading Large Scale C++ Software Design, I have changed my views on unsigned. I used to think they were a good idea. Now I don’t. Since I’ve stopped using them I find my code easier to read and to debug." John Scott

    http://groups.google.com/groups?q=unsigned+lakos+c%2B%2B&hl=en&lr=&selm=9u8rbq%24dml%241%40plutonium.btinternet.com&rnum=1

  3. John Burnett says:

    Much agreed! So, given that, time for a silly question – why is System.Collections.BitArray (or even System.Array) using an int as the length? What if I want 2147483649 flags? 🙂 Yes, sort of a kidding question, but really – shouldn’t it be unsigned "just for correctness sake"?

  4. Gia says:

    I haven’t read Lakos, but I came to the same conclusion when working on quite large-scale software and porting it to various platforms: the use of unsigned ints forced developers to frequently make judgements on type casts, and they frequently judged wrong. In the end, int was the MUCH lesser evil.

  5. Matt Pietrek says:

    I agree that porting to other non-Windows platforms can be difficult when everybody wants to define their own pet types, and you don’t want to modify the universe to be type-consistent.

    However, if you’re writing *new* code, there’s no reason why you can’t create it to be type-correct from the beginning. Think when you’re creating the class in the first place: Would a signed value *ever* make sense?

    I personally don’t find code using "unsigned" (or UINT) harder to read. If anything, the additional detail helps provide information about what the original author intended.

    Your code is a contract with the outside world. When specifying a contract, you want to be as precise as possible to avoid later confusion.

  6. Boomzilla says:

    In .NET, it’s too bad the unsigned int types are not CLS compliant. I prefer unsigned types as you do, but using them makes me nervous as a library author, as my libraries might not be consumable by all the .NET languages.

    As well, as the BCL seems to use Int32 pretty much exclusively, I end up doing a lot of casting to pass my unsigned types to the BCL.

    For me, at least using C#, I’ve decided it’s just not worth it…

  7. Adrian says:

    I’ve been a staunch believer in unsigned (and const) for many, many years, *but* I’ve recently wondered if (signed) int wasn’t so bad after all.

    I’m finding that when everything is "properly" declared with unsigneds, I have to do lots of casting to eliminate all warnings. Casting is usually a sign something is wrong.

    What if the caller of AddEntriesToList() has to compute the value to pass as cEntries? Have you ever tried to subtract two unsigned ints and convince a strict compiler that there shouldn’t be any warnings?

    Using unsigned means you can index twice as many times as you can with signed values, but that can be a hazard when you can no longer reliably represent differences between two indexes.

    Unsigned makes the interface clearer and it eliminates the need for a lower boundary test, but it doesn’t really prevent bugs. If you have to compute a value, and you end up with an int, you’ll end up casting it to an unsigned. If your value was unexpectedly negative, the upper-bound limit test may catch it, but it may not. Either way, a lower bound test on a signed value *would* catch it and help you identify the bug more quickly.

    Mind you, if you do use int, I’m still against overloading variables by having negative numbers mean special things.
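The subtraction hazard Adrian describes can be sketched as follows. The function names are hypothetical; the point is only that the difference of two unsigned values is computed modulo 2^N, so it can never be negative.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: number of entries between two indexes.
// Both operands are unsigned, so the subtraction is done modulo 2^N
// and wraps around instead of going negative.
unsigned int EntriesBetween(unsigned int first, unsigned int last)
{
    return last - first;
}

// A checked variant: the precondition is stated explicitly instead of
// relying on a cast to silence the compiler.
unsigned int CheckedEntriesBetween(unsigned int first, unsigned int last)
{
    assert(first <= last);
    return last - first;
}
```

EntriesBetween(5u, 2u) returns UINT_MAX - 2 rather than -3, which is the kind of value an upper-bound test may or may not catch.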

  8. Dmitriy Zaslavskiy says:

    Matt,

    1. (C/C++) I’ve seen a lot of average programmers declare unsigned ints, then mix them in an expression with signed ints and ….

    (90% of them don’t know that the result is unsigned!)

    2. (C#/.NET) Unsigned is not CLS compliant!

    So I would have to disagree with you on this one.
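A small sketch of the mixed-expression rule mentioned in point 1, with hypothetical function names. In C and C++, when an int and an unsigned int meet in an arithmetic or comparison expression, the int operand is converted to unsigned int first.

```cpp
#include <type_traits>

// Mixing int and unsigned int in arithmetic yields unsigned int.
static_assert(std::is_same<decltype(-1 + 1u), unsigned int>::value,
              "int + unsigned int has type unsigned int");

// Hypothetical helper showing what `value < limit` does implicitly:
// the conversion is spelled out here to avoid a sign-compare warning.
// A negative value becomes a very large unsigned one.
bool NaiveIsBelow(int value, unsigned int limit)
{
    return static_cast<unsigned int>(value) < limit;
}

// Handles the negative case before converting.
bool SafeIsBelow(int value, unsigned int limit)
{
    return value < 0 || static_cast<unsigned int>(value) < limit;
}
```

NaiveIsBelow(-1, 10u) is false because -1 converts to UINT_MAX, while SafeIsBelow(-1, 10u) is true.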

  9. Denis says:

    Just to elaborate on what Dmitriy said: Unsigned is only non-compliant in the public interfaces. Using it internally is fine, of course.

  10. Denis says:

    Matt,

    It’s true this helps define the expectations for an interface a little better. But I would argue that it’s not worth the effort. All you are doing is saying that negatives aren’t allowed. In many cases there is a maximum acceptable value as well, and unsigned does not express that. In cases where enums are not an option, you may very well expose values (properties, consts, etc.) that represent the MINIMUM and MAXIMUM. In this case unsigned only expresses half of the contract. In other cases, there may be more constraints than just a minimum or maximum. In that case, a value class may be better.

    Overall, the expressive power is limited and the pain of casting all the time isn’t worth it.
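One way the "value class" idea could look in C++ is sketched below. EntryCount and its bounds of 1 and 1000 are illustrative assumptions, not values from the post; the class simply states both halves of the contract in one place.

```cpp
#include <stdexcept>

// Hypothetical value class: an entry count constrained to [kMin, kMax].
// The bounds are illustrative assumptions.
class EntryCount
{
public:
    static constexpr unsigned int kMin = 1;
    static constexpr unsigned int kMax = 1000;

    // Rejects out-of-range values at construction time.
    explicit EntryCount(unsigned int value) : value_(value)
    {
        if (value < kMin || value > kMax)
            throw std::out_of_range("entry count outside [kMin, kMax]");
    }

    unsigned int get() const { return value_; }

private:
    unsigned int value_;
};
```

A function taking an EntryCount parameter then cannot receive 0 or 5000, which a bare unsigned int would allow.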

  11. Using unsigned for counts etc. is a problem when you want to use them in loops (using a Delphi Pascal sample here):

    procedure Foo(Count: Cardinal); // Cardinal is an unsigned int
    var
      i: Cardinal;
    begin
      for i := 0 to Count-1 do
        writeln(i);
    end;

    The problem is that this fails when Count is 0… You will either get an underflow exception, or it will loop for a long time ;).
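For comparison, the same inclusive-bound loop can be sketched in C++. The function names and the maxIterations safety cap are assumptions added so the demonstration terminates; the cap is not part of the original sample.

```cpp
// Counts iterations of a loop whose upper bound is count - 1, as in the
// Pascal sample. When count is 0, count - 1 wraps to UINT_MAX, so the
// condition i <= count - 1 is always true. maxIterations caps the loop.
unsigned int IterationsInclusiveBound(unsigned int count, unsigned int maxIterations)
{
    unsigned int iterations = 0;
    for (unsigned int i = 0; i <= count - 1 && iterations < maxIterations; ++i)
        ++iterations;
    return iterations;
}

// The same loop with an exclusive bound: no subtraction, no wraparound.
unsigned int IterationsExclusiveBound(unsigned int count)
{
    unsigned int iterations = 0;
    for (unsigned int i = 0; i < count; ++i)
        ++iterations;
    return iterations;
}
```

With count equal to 0, the inclusive version runs until the cap is hit, while the exclusive version runs zero times.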

  12. DrPizza says:

    Which is why you don’t write loops like that in C++. You write:

    for(unsigned int i(0); i < count; ++i)
    {
    }

    and that does the Right Thing. It seems more an indictment of Delphi/Pascal than of unsigned integers.

    unsigneds are surely safer as they preclude buffer underflow attacks. There are a number of attacks where the attacker forces an expression of the form:

    buffer[-1] = arbitraryValue;

    allowing them to execute arbitrary code. With unsigned offsets, that’d be:

    buffer[someReallyBigNumber] = arbitraryValue;

    and a *safe* runtime failure.

    And in any case, C++ forces the use of unsigned integers in many places because size_t is defined as being unsigned. Using signed integers results in far more warnings (and hence far more conversions) than using unsigned.
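The indexing argument can be sketched with a hypothetical bounds-checked store. StoreAt is an assumed name, not an existing API. One caveat worth stating: in C++ an unchecked out-of-range write is undefined behavior whether the index is signed or unsigned, so the safe failure only happens if something actually checks the index. What an unsigned index buys is that a single upper-bound comparison covers both directions.

```cpp
#include <cstddef>

// Hypothetical bounds-checked store: returns false instead of writing
// out of range. With an unsigned index, a "negative" offset shows up
// as a huge value, so one comparison rejects it.
bool StoreAt(unsigned char* buffer, std::size_t size, std::size_t index, unsigned char value)
{
    if (index >= size)
        return false;
    buffer[index] = value;
    return true;
}
```

Passing static_cast<std::size_t>(-1) as the index is rejected by the same test that rejects an index equal to size.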
