When adding security bugs to your code is not your fault!

David LeBlanc and I (and a bunch of others) just had a little email exchange about some fascinating integer overflow vulnerabilities in gcc.

Long story made short: the code you add to detect integer overflows might actually be removed by the compiler because of assumptions made by the optimizer. I was going to write a post on the subject, but David did it for me 🙂 And frankly, no one knows int-overflow science quite like LeBlanc.
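
To make the pattern concrete, here is a minimal sketch of the kind of check at issue: the `buf + len < buf` wraparound test that comes up in the comments, next to a length-only comparison that never forms an out-of-range pointer. The function names and the `buf_size` parameter are hypothetical, made up for illustration. Whether a given compiler actually drops the first check depends on its version and optimization settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The wraparound-style test. If buf + len points outside the object
 * (beyond one past its end), the addition itself is undefined behaviour,
 * so an optimizer is allowed to assume the comparison is always false
 * and remove the branch. Shown for illustration only. */
bool fragile_wraparound_check(const char *buf, size_t len)
{
    return buf + len < buf;
}

/* Hypothetical alternative: compare the length against the known buffer
 * size using unsigned arithmetic only. No pointer is ever computed, so
 * there is nothing for the optimizer to reason away. */
bool len_fits(size_t len, size_t buf_size)
{
    return len <= buf_size;
}

/* Usage sketch: only in-range values are passed to the fragile check. */
void check_examples(void)
{
    char buf[16];
    assert(!fragile_wraparound_check(buf, 8)); /* in range, well defined */
    assert(len_fits(16, sizeof buf));
    assert(!len_fits(17, sizeof buf));
    assert(!len_fits((size_t)-1, sizeof buf)); /* huge length is rejected */
}
```

The second form needs the buffer size to be known at the point of the check, which is the price of not relying on pointer wraparound.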

I can’t help but be reminded of another compiler optimization vulnerability we discovered a few years back. I wonder what else might be in store for us from the world of compiler optimizations?

Comments (19)

  1. Chris says:

    I would say there are quite a few more of these types of bugs on the way. Compilers are always pushing the optimization edge whenever possible, so this bug does not surprise me. It just helps to further make the point that binary analysis is extremely important and still relevant. Pure source code analysis will never be enough for compiled languages.

  2. jf says:

    talk to v-fefe about this stuff if you haven’t already.

  3. Kid Icarus says:

    I was under the impression that Felix von Leitner tried to get this fixed two years ago. http://gcc.gnu.org/ml/gcc-bugs/2006-04/msg01297.html

    So why are people only talking about it just now?

  4. Someone says:

    It’s an old story….look at http://blog.fefe.de/?ts=babc3ebb There it was published in April 2006!

  5. Andreas says:

    This particular gcc vulnerability has been known for years, and the gcc developers are not very forthcoming in fixing it, claiming that the current gcc behaviour is standards-compliant.

  6. markus says:

    Actually I think this has been known for some time by now. The GCC team said it’s fixed but it is apparently not fixed… I think they are too slow :/

  7. Kurisu says:

    There is nothing to fix in GCC. This isn’t a bug in GCC. C isn’t assembler and it’s not defined by common-sense but an international standard. If you read the C standard, you’ll realize that the so-called "security" overflow checks exhibit *undefined behaviour*. Also the use of "int" and mention of "integer overflow" is really a red herring in the US-CERT publication because that’s not the issue at all.

  8. HM says:

    Hi, back in 2003 we found a serious bug in JDK 1.4.2’s System.arraycopy() routine. Using this routine a few thousand times in one session would break its logic so that only part of the source array would get copied over to the destination array. Turned out that Hotspot’s optimized C++ array copy code was buggy. Once the Hotspot compiler identified this routine as a "hotspot" for optimization, it would replace the original code with the optimized, yet buggy, C++ code.

    Java folks will know that this is one of the core routines in the JDK. When we contacted Sun, they told us not to rely on any claims made by their APIs.

  9. Commenters above: where did you see Howard or David claiming to have discovered that?? Anyway, it is known and not fixed? Even worse.

  10. Kurisu, so how do you propose folks defend against int overflow vulns?

  11. Kurisu says:

    I suggest reading the C standard. Carefully.

  12. Kurisu – so do you know the C spec inside and out? I certainly do not, and I’d consider myself a pretty decent C and C++ coder. If I add defensive code I expect it to stay there!

  13. bcthanks says:

    I use something like this:

    if ((len < 0) || (len >= size_of_buffer_in_bytes)) { /* overflow */ }

    And often I spot-check the executable to make sure the check actually works, by intentionally causing an overflow. That verifies the ‘if’ as well as whatever action is supposed to happen inside the ‘if’.

    (That is a basic testing requirement. Never assume the compiler – or anything else – is not buggy!)
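
    A minimal sketch of that check wrapped in a function, together with the kind of spot-check described above. It assumes `len` arrives as a signed `long`; the names `len_in_bounds` and `spot_check` are hypothetical, chosen only for this example.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Returns true when len passes the check above, i.e. when
     * 0 <= len < size_of_buffer_in_bytes. The sign is tested first so the
     * later unsigned comparison never sees a negative value. */
    bool len_in_bounds(long len, size_t size_of_buffer_in_bytes)
    {
        if (len < 0)
            return false;
        return (unsigned long)len < size_of_buffer_in_bytes;
    }

    /* Spot-check: feed the check values on both sides of each boundary. */
    void spot_check(void)
    {
        assert(len_in_bounds(0, 16));
        assert(len_in_bounds(15, 16));
        assert(!len_in_bounds(16, 16)); /* one past the last valid value */
        assert(!len_in_bounds(-1, 16)); /* negative length */
    }
    ```

    Running the spot-check against the optimized build, not just a debug build, is what confirms the check survived compilation.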

    And yes, that code might be slower than just checking for wraparound. But IMO it has distinct advantages:

    1. it makes clear it is explicitly bounds-checking the len. Less experienced programmers may not understand what "len + buf < buf" means. Or they might cut-and-paste the C code into another language that has different semantics.

    Sure, it would be great if everyone is skilled. But reality is people are not and many people do not care enough to learn. Security related code by nature is not executed except under unusual circumstances (ie, an attack), so normal testing will not detect the problem.

    2. If the 32-bit code is ported to a world where ints are 64-bit, that wraparound check *may* not detect an overflow past the end of the code.

    Howard, you should know that software development is a learning process; now you’ve learned that compilers optimize away code in ways you didn’t expect. Unfortunately, that might mean you have to audit previously written code for that situation (I hope somebody does!). That is no different than what other people have done in response to buggy Windows code; or buggy tools from many vendors; or mitigation for specific attacks.

  14. Kurisu says:

    I don’t know every word of C standard by heart but I have the draft in plain text format around and I’m not too proud to (re-)verify assumptions when I come across something that I’m not sure about.

    I think C and common misconceptions of C are such a huge topic, you could easily write a book about it. This specific issue is just the tip of the iceberg. I’m not sure whether anyone would actually read such a book though. The audience which could actually benefit the most from it would probably consider it incredibly boring ("TLDR") but I find it difficult to compress this into a few lines or even a short essay. Everyone who likes to consider himself a serious C programmer should at the very least read comp.lang.c.

    Writing correct code in C is difficult and can be very hard. Then again, C is likely the most challenged and discussed language. Others have many pitfalls too but most languages are not wide-spread or mostly used by experts that are aware of them. Then there are languages which are a pure mess in and out, even worse than C but nobody really cares because experts avoid them like the plague whereas script kiddies and such do not care the least about the flaws.

    A central problem of C might be that it’s misunderstood as being a portable high-level assembler. In fact, people who are familiar with assembly programming will have little problems getting started with C because pointers are not hard to grasp when you have a good idea how they could be implemented in assembly language. The problems start when people generalize from examples. This isn’t C-specific at all, it happens in real-life all the time, invalid reverse deduction that is. Everyone with basic education in logic knows that "works here for me" does not prove anything. You can of course use examples in proofs but only in certain ways.

    Don’t misunderstand but it’s fairly irrelevant how competent people consider themselves to be:


    That’s why we have tests in school, driving licences, certificates and what not. Otherwise everyone could go around and simply claim being expert at some topic. Some people who are doing this are obviously lying, others are obviously incapable of recognizing their own incompetence and there is some middle ground, basically people overestimating their own competence.

    I don’t doubt that you’re a pretty decent coder but you should ask yourself "In which way?". Maybe you’re really good at turning ideas into algorithms and/or implementing them but are you as good when it comes to correctness of your code? Some people may not be very creative but they are good at writing correct code. Both kinds of people are certainly decent coders, except that they have different strengths and flaws. That’s also why developing software in a team with people of diverse skills is highly beneficial. Nobody is perfect in everything. If a team is too homogeneous with respect to their skills, that might reduce conflicts but it also means that mostly "throughput" will benefit and code quality won’t, it might actually degrade by emphasizing the common weaknesses.

    Mixing C and C++ isn’t a good idea. C isn’t a strict subset of C++ and it’s obviously difficult to remember which features are common and which aren’t. Whenever I read "C/C++", I’d like to barf. It’s a fair bet that most people lumping them together have little clue about either. I don’t code in C++. I could care less about C++.

    Regarding the specific issue again, GCC isn’t the only C compiler which behaves this way. So anything but adding diagnostic messages or warnings, would be absolutely counter-productive.

    The C standard tells you what you can legally expect and what you cannot. What you are calling "defensive code" is simply nonsense. Anyone who is crying for security should first make sure they are using the right tool and using it correctly. Someone who doesn’t master a tool, can hardly expect security, especially if we’re talking about a fairly low-level tool. I hope I don’t need to explain this with an automobile analogy. Let this be common-sense, please.

  15. Kurisu says:

    Let me state once more that the issue you’re talking about has nothing to do with integer overflows. It’s about pointer arithmetic. If you mean that the value of "len" is the result of an integer overflow, you’re checking in the wrong place anyway. You can’t detect an integer overflow after it happened because it causes undefined behaviour. You can at best prevent a buffer overflow (or rather an out-of-bounds read) here, but only if you do it correctly, something the example code clearly doesn’t do. The two problems are certainly similar but are technically very different because integer arithmetic and pointer arithmetic follow different rules.

    You can easily find discussions, tutorials and even C code online explaining how you avoid integer overflows, if you can’t or don’t want to deduce this yourself from the C standard.
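
    One common way to do that, sketched here under assumptions: test the operands against `INT_MAX` and `INT_MIN` before adding, so the overflowing addition never executes and there is no undefined behaviour to detect afterwards. The helper name `checked_add_int` is hypothetical and not taken from the standard or any library.

    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdbool.h>

    /* Stores a + b in *out and returns true if the sum fits in an int.
     * Returns false, leaving *out untouched, if the addition would
     * overflow. The range test itself cannot overflow: INT_MAX - b is only
     * evaluated for b > 0 and INT_MIN - b only for b < 0. */
    bool checked_add_int(int a, int b, int *out)
    {
        if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
            return false;
        *out = a + b;
        return true;
    }

    /* Usage sketch covering both overflow directions. */
    void checked_add_demo(void)
    {
        int sum = 0;
        assert(checked_add_int(2, 3, &sum) && sum == 5);
        assert(!checked_add_int(INT_MAX, 1, &sum));
        assert(!checked_add_int(INT_MIN, -1, &sum));
    }
    ```

    The same shape applies to lengths: validate the value before it is used in any arithmetic, rather than inspecting the result.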