You still need the "safe" functions even if you check string lengths ahead of time


Commenter POKE53280,0 claims, “If one validates parameters before using string functions (which quality programmers should do), the ‘safe’ functions have no reason to exist.”

Consider the following function:

int SomeFunction(const char *s)
{
  char buffer[256];
  if (strlen(s) ≥ 256) return ERR;
  strcpy(buffer, s);
  ...
}

What could possibly go wrong? You check the length of the string, and if it doesn’t fit in the buffer, then you reject it. Therefore, you claim, the strcpy is safe.

What could possibly go wrong is that the length of the string can change between the time you check it and the time you use it.

char attack[512] = "special string designed to trigger a "
                   "buffer overflow and attack your machine. [...]";

void Thread1()
{
 char c = attack[256];
 while (true) {
  attack[256] ^= c;
 }
}

void Thread2()
{
 while (true) {
  SomeFunction(attack);
 }
}

The first thread changes the length of the string rapidly between 255 and 511, between a string that passes validation and a string that doesn’t, and more specifically between a string that passes validation and a string that, if it snuck through validation, would pwn the machine.

The second thread keeps handing this string to Some­Function. Eventually, the following race condition will be hit:

  • Thread 1 changes the length to 255.
  • Thread 2 checks the length and when it reaches attack[256], it reads zero and concludes that the string length is less than 256.

  • Thread 1 changes the length to 511.
  • Thread 2 copies the string and when it reaches attack[256], it reads nonzero and keeps copying, thereby overflowing its buffer.

Oops, you just fell victim to the Time-of-check-to-time-of-use attack (commonly abbreviated TOCTTOU).

Now, the code above as-written is not technically a vulnerability because you haven’t crossed a security boundary. The attack code and the vulnerable code are running under the same security context. To make this a true vulnerability, you need to have the attack code running in a lower security context from the vulnerable code. For example, if the threads were running user-mode code and Some­Function is a kernel-mode function, then you have a real vulnerability. Of course, if Some­Function were at the boundary between user-mode and kernel-mode, then it has other things it needs to do, like verify that the memory is in fact readable by the process.

A more interesting case where you cross a security boundary is if the two threads are running code driven from an untrusted source; for example, they might be threads in a script interpreter, and the toggling of attack[256] is being done by a function on a Web page.

// this code is in some imaginary scripting language

var attack = new string("...");
procedure Thread1()
{
 var c = attack[256];
 while (true) attack[256] ^= c;
}

handler OnClick()
{
 new backgroundTask(Thread1);
 while (true) foo(attack);
}

When the user clicks on the button, the script interpret creates a background thread and starts toggling the length of the string under the instructions of the script. Meanwhile, the main thread calls foo repeatedly. And suppose the interpreter’s implementation of foo goes like this:

void interpret_foo(const function_args& args)
{
 if (args.GetLength() != 1) wna("foo");
 if (args.GetArgType(0) != V_STRING) wta("foo", 0, V_STRING);
 char *s = args.PinArgString(0);
 SomeFunction(s);
 args.ReleasePin(0);
}

The script interpreter has kindly converted the script code into the equivalent native code, and now you have a problem. Assuming the user doesn’t get impatient and click “Stop script”, the script will eventually hit the race condition and cause a buffer overflow in Some­Function.

And then you get to scramble a security hotfix.

Comments (69)
  1. Adam Rosenfield says:

    CPython's Global Interpreter Lock to the rescue!

  2. Skyborne says:

    My knowledge of threads in C is a bit fuzzy.  Can you mount the same attack against strlcpy by a second thread toggling the length value between allocation and strlcpy call, or is the length value guaranteed to stay in a register (assuming non-volatile) in the strlcpy()ing thread?

  3. Philipp says:

    @Adam Rosenfield No, the GIL wouldn't help at all. The GIL would only ensure that the call of strlen(s) would return the correct result at the time you call strlen (assuming strlen to be a native function). After the call to strlen, the GIL could still be acquired by the malicious thread before strcpy. What would help is a serialization of the program (i.e. disable multi-threading). That's obviously not going to happen.

  4. tobi says:

    Toggling the char using xor made you smile, didn't it? ;-)

  5. Adam Rosenfield says:

    @Philipp: If interpret_foo() holds the GIL, then SomeFunction() runs without interruption since the attacker's thread cannot run without the GIL.  If SomeFunction() were itself implemented in script instead of natively, then yes, it could lose the GIL in between strlen() and strcpy(), but why the heck would you be calling strcpy() from a scripting language?

  6. pete.d says:

    Huh?

    If your attacked has already injected a thread in your process that can modify memory during a call in another thread of your process, they already own you.

    There are plenty of reasons it's foolish to not take advantage of pre-existing secure APIs, but I just don't see how this is one of them.

    [The attacker didn't inject a thread. You invited them into your process! (Think multithreaded scripting language.) -Raymond]
  7. John says:

    But… If someone can execute code in your process already then how do they need a stack overflow to do anything useful. Surely they could just run the code directly. I understand that if the function is an operating system call or something that runs in kernel mode that it matters, but I can't see how this helps secure an ordinary process much. I'm not arguing that it doesn't I just still don't understand the additional risk here.

  8. Avi says:

    @John

    Most embedded scripting languages expose a very limited set of what the machine can do.  Think Lua scripting in WoW's client.  If the script itself can trigger an overflow and inject its own machine code on the stack, that script can now do a lot more than Blizzard, or the player, ever intended.

  9. Joshua says:

    Congratulations. You just justified a nonsense practice with a chunk of code that violates the ground rules of programming.

    Since there's obviously no sense in the matter (else you would have come up with a better example), I'm turning that warning off and leaving it off.

    [News flash: The bad guys don't care about the ground rules of programming. And even if the other code is not malicious, it may have violated the ground rules by mistake. You should take reasonable steps not to amplify a minor error into a security vulnerability. -Raymond]
  10. Joey says:

    Can't the TOCTTOU issue still occur after the "safe" function checks the string, but before it does whatever string operation the caller requested?

  11. Avi says:

    @Mmx:

    Even such apps draw the line somewhere.  You'd be hard-pressed to create a keylogger in VBA, for example.  Even a simple one using a hook.  Or make an arbitrary network connection or start listening on a port for command&control signals.

  12. SimonRev says:

    @Joey

    The issue isn't with the strlen.  The version of the function written with the secure functions would look like

    int SomeFunction(const char *s)

    {

     char buffer[256];

     if (strcpy_s(buffer, _countof(buffer), s) != 0) { return ERR; }

     …

    }

    (side note for nitpickers, deliberately calling the C version of the function instead of the C++ to illustrate the point).

    The secure version ensures that buffer has a valid, null-terminated string that never writes out of buffers bounds.  Raymond's sample version attempts to guarantee that but fails.  

    Now, given the docs for strcpy_s, I am honestly not sure if it really will behave correctly where the source string is being modified during the copy operation.  If the source string is too large it will return ERANGE, but the destination buffer is not modified.  This implies an initial size check, exactly like Raymond's original (vulnerable) example.  The MSDN docs say nothing about what happens if the source string gets to large during the copy itself.  I would hope that a sanity check in the copy would abort the copy if it is about to overflow the destination buffer, but I don't see that guarantee made in a multithread environment.

  13. Joey says:

    @SimonRev "This implies an initial size check, exactly like Raymond's original (vulnerable) example." Yes, this is what I was thinking — how does letting the "safe" function pre-check the size for you accomplish anything in a multithreaded environment where the string could be changing mid-copy?

    It may turn out that strcpy_s can handle this situation too, but as far as I can see, neither the docs nor Raymond have explicitly said this, which makes this multi-threaded TOCTTOU argument weak IMO (whatever other good reasons there are to use "safe" functions.)

  14. Matt Tait says:

    I disagree ur theory Raymond:

    void func(char* s)

    {

    #pragma pack(push, 1)

    struct 
    
    {
    
        char prepad;
    
        char buffer[256];
    
        char aftpad;
    
    } locals;
    

    #pragma pack(pop)

    memset(&locals, 0xcc, sizeof(locals));
    
    StringCchCopyA(locals.buffer, sizeof(locals.buffer), s);
    
    if(locals.prepad != 0xcc || locals.aftpad != 0xcc)
    
        MessageBoxA(NULL, "Oh noes!!", "Oh noes! Stackoverflowed!", MB_OK);
    

    }

    char attack[512];

    DWORD __stdcall ThreadStart(PVOID pv)

    {

    while(true) 
    
        attack[256] ^= 'A';
    

    }

    void main()

    {

    memset(&attack, 'A', sizeof(attack));
    
    CreateThread(NULL, 0, ThreadStart, NULL, 0, NULL);
    
    while(true)
    
        func(&attack[0]);
    

    }

  15. voo says:

    It seems to me we're basically arguing: "Hey instead of using a function that already does the checks I can just reimplement the functionality wherever I need it". Which I guess is true enough in most cases, but that applies to almost everything.

    char buffer[256];

    if (strlen(s) ≥ 256) return ERR;

    strcpy(buffer, s);

    vs.

    char buffer[256];

    if (strcpy_s(buffer, _countof(buffer), s) != 0) return ERR;

    Seems more error prone, because it's easy to forget, longer and maybe even less efficient.

  16. sean says:

    @matt

    all bets are off if you pass the wrong size to StringCchCopy.  

    "The size of the destination buffer, in characters. This value must equal the length of pszSrc plus 1 to account for the copied source string and the terminating null character."

  17. Mark says:

    If the string passed into SomeFunction is being used, or even merely has the possibility to be used in multiple threads, then a wait lock on it should be taken immediately before reading, and released as soon as the reading is finished, and other threads should likewise respect that policy as well.  Threads that take such locks have a dutiful responsibility to release them immediately.  If you are wanting to code particicularly defensively, you could try a timed wait lock instead, and if after a certain amount of time the lock is still not available, then the function should simply fail, and return a status that indicates it did not succeed.

  18. Matt Tait says:

    I disagree with the premise of the article. The reason you should be using the safe functions is because people who try and do the checks themselves tend to get it wrong:

    I've seen lots of programmers smugly try "if(strlen(X) > COUNT(Y)) return ERR; else strcpy(X, Y)", only to end up with a one-byte heap/stack overwrite.

    The idea that failing to use it is a TOCTOU bug isn't valid unless the function EXPECTS the parameter to be modified by another thread, for example when kernel-mode code is getting a pointer from user-mode. In this case, the bug isn't failing to call the safe function – it's failing to capture the buffer first.

    Functions are entitled to expect their parameters to be thread-local. If you're building a multithreaded scripting language then you need to either capture the parameter first or somehow lock the object against writes before doing the checks.

    The alternative is chaos, since if you have to expect other threads are attacking you then you suddenly need to mark all variables as volatile, to disable compiler pre-fetches, and do interlocked writes to make sure other threads don't get confused by your modifications. You COULD write your apps like that, but then you'd go insane, and you'd write pretty slow apps.

    As a programmer you should use strcpy_s (actually StringCchCopy) because that way you can afford to suck at math every now and then. If you choose strcpy then you have to be sure that you don't suck at math, will never suck at math, and have never sucked at math before. Otherwise you're going to be dragged into work at 4am on a Sunday to patch your way of a hole when all your customers' credit cards start disappearing.

  19. Matt Tait says:

    @voo.

    True. But replace StringCchCopy with StringCbCopy and it still stack-overflows.

  20. Matt says:

    Change two lines to

    StringCbCopyA(locals.buffer, 3, s);

    and

    attack[3] ^= 'A';

    and it still works (for the inevitable pedant who points out that I should have -1 in there somewhere).

  21. Matt Tait says:

    Ugh. Stupid chars being signed in C. OK, StringCbCopy doesn't overflow, strcpy_s aborts given a wrong size, so replacing strcpy with strcpy_s turns an EoP into a DoS.

  22. acq says:

    "I disagree ur theory" Matt Tait:

    Your program that tries to prove wrongness is wrong: comparing any (signed) char with 0xcc is always false. The char you filled with 0xcc is implicitely casted to int, giving 0xffffffcc on the left and 0xcc on the right to be compared to equality.

  23. bah says:

    Didn't look at the cchDest thingy :/

  24. Well, I've looked into the implementation of strcpy_s and it can't overflow from the timed attack. What it does is to keep the length you passed in and for every character you copy, it subtracts one from that. Now if this gets to 0 and you haven't read a null character, then it will blank the entire string and return an error.

    So strcpy_s isn't a moral equivalent of strlen followed by strcpy, it is more involved than that.

  25. mikeb says:

    Aside from security/correctness, there's another reason that someone might want to use the 'secure' version of strcpy() instead of checking the length then calling strcpy():  the safe version only makes a single pass over the original input (and may not even need to touch all of the input buffer).

    So even if you're a programmer who never makes a mistake and injects bugs, you still might like to have it or something similar as an option.

  26. Gabe says:

    To those who think they're safe because their script language only supports a single thread: you're wrong. The guy who adds threads to your scripting language isn't going to audit all of your code to make sure that introducing threads doesn't create any race conditions. And if you rely on a third-party script engine, you might not even know that it ever became multithreaded!

    Are you going to just do the right thing, or pretend you know better and allow every PDF to become a potential attack vector because somebody added threading to the JavaScript engine you embed in your PDF viewer?

    But really, how is it in any way better to litter your code with string validation followed by unsafe calls when it is easier to just make safe calls in the first place?

    Also, I'm surprised by the use of ≥ rather than >= for the >= operator.

  27. Joey says:

    @Crescens2k If this (and others' analyses) is true, I still agree with Joshua about relying on undocumented behavior.  If everyone has to look at the source of these functions (since docs don't seem to answer the question) to see if there's a TOCTTOU, we can't be sure these functions will always be thread-safe.  But from Raymond saying, "I'll see what I can do to make this clearer in the docs", I gather that the existing precautions we've discovered will become documented.

  28. Joshua says:

    @mikeb:

    strncpy(dst, src, buflen – 1)[buflen – 1] = 0;

    There. Safe as strcpy_s.

  29. voo says:

    @Joshua Great, so you only have to do 3 things including calling a function to get the same behavior instead of calling the function to begin with. I assume you also don't use printf in C, because hey it's only a simple write to a known file descriptor and a bit of parsing involved?

  30. Adam Rosenfield says:

    Too bad that portable code can't rely on strcat_s being available, at least not until C11 implementations become widely available (which has strncat_s).

  31. Adam Rosenfield says:

    Err I meant to write strcpy_s()/strncpy_s(), though the same argument applies for the cats.

  32. Joshua says:

    @voo: Maybe because I'm not willing to write an additional function to save 2 instructions:

     mov ecx, buflen

     mov [eax + ecx – 1], byte 0

  33. steven says:

    I'm suddenly left wondering whether strdup/_strdup is safe. That would seem to be implementation dependent too.

  34. voo says:

    @Joshua A premature optimization – why didn't you mention that at first? But you're still going to call strcpy anyhow, so you only safe the additional guarantees made by strcpy_s which are pretty negligible..

  35. Adam Rosenfield says:

    @Cesar: Good point about strcpy_s not guaranteed to be thread-safe.  Hopefully C11 adoption will be more swift than C99 adoption, since it made optional certain hard-to-implement features (such as VLAs) which were required in C99.

  36. Joe says:

    What about the case where the destination buffer is deleted by another thread? If we want to create ridiculous assumptions, you can pretty much make ANY code unsafe in some circumstance.

    [The destination buffer tends to be trusted. These functions are typically used for transferring data from an untrusted location to a trusted location. -Raymond]
  37. Henry Skoglund says:

    Sorry a bit off topic but POKE53280,0 brings back memories, it set the border background color to black on the C-64 if I remember correctly.

  38. mikeb says:

    @SimonRev:

    > If the source string is too large it will return ERANGE, but the destination buffer is not modified.

    I can't find where the docs say this.  In fact, they seem to say the opposite.  From MSDN docs:

     – strcpy_s(): "The debug versions of these functions first fill the buffer with 0xFD."

     – StringCchCopy(): "STRSAFE_E_INSUFFICIENT_BUFFER – The copy operation failed due to insufficient buffer space. The destination buffer contains a truncated, null-terminated version of the intended result."

  39. Mmx says:

    Scripting engines are hard to write.

    Multithreaded code is hard to write.

    Secure code is very hard to write.

    Somehow, a scripting engine (hard), multithreaded (hard) and with a security requirement (very hard) is a bit of a corner case to explain the existence of a function in a *standard* library – surely someone with those three skills can write their own safe strcpy.

    The real reason is that progammers often forget to apply good practices and if they already got into the habit of using secure functions the damage if they forget some sanity check would be limited. Because if programmers were following best practices 100% right in the first place, we wouldn't ahve the problem to start with: surely you have unit test for all boundary conditions, stress tests, security tests etc in place for every single line of code don't you ? :)

    ===

    @Avi : I would change most with "many". Many scripting scenarios are quite unrestricted and treat script basically as plugins. I don't know which way are most scripting engines, but I would not take security for granted. WoW of course is probably a secure case. Most LoB applications with scripting aren't secured on the scripting side instead and for a good reason: if a script can do operations on the LoB app, usually formatting the hard drive is the least damage it can do compared to intentionally corrupting business data, etc.

  40. Tony Cox [MSFT] says:

    The original commenter was claiming that if you do your own check you don't need to use the secure version of the function. Well, that *might* be true sometimes, or it might not. It depends on how your code is going to be used. Raymond's example shows that there are cases where you really might want to worry about it.

    So, at a minimum, you need to do some additional analysis to figure out if your check is sufficient. In some cases you won't know the answer, because you don't completely control the context in which your code is used (say because you are a library or other component which gets used in many different contexts). Plus, you might get the analysis wrong simply because you haven't considered certain lines of attack (would *you* have thought of Raymond's TOCTTOU example on your own?). Are you *sure* there are no other issues to worry about?

    Compare the cost of doing that analysis (if you can), and the risk that you might have accidentally got it wrong, with the cost of just having a cast-iron rule that says "always use the secure string functions". The cast-iron rule has very little downside, and prevents you from making a mistake.

  41. mikeb says:

    @Steven Don:

    As far as Microsoft's implementation of strdup()/_strdup() is concerned, it uses strcpy_s() to perform the copy into the allocated buffer (at least since VS2005 SP1 Update for Windows Vista, anyway) using the size value that was passed in to malloc().  However, that says nothing about older versions or non-Microsoft implementations.

  42. Joshua says:

    [You should take reasonable steps not to amplify a minor error into a security vulnerability. -Raymond]

    Don't you go over not depending on undocumented behavior frequently?

    There is no indication in the documentation that this is not a valid implementation of strcpy_s:

    errno_t strcpy_s(char *s, size_t nemb, char *t)

    {

      if (!s || !t) return EINVAL;

      size_t tl = strlen(t) + 1;

      if (nemb < tl + 1) return ERANGE;

      strcpy(s, t);

      memset(s + tl, 0, nemb – tl);

      return 0;

    }

    I would have actually implemented like so (which just has a different problem with that race):

    errno_t strcpy_s(char *s, size_t nemb, char *t)

    {

      if (!s || !t) return EINVAL;

      size_t tl = strlen(t) + 1;

      if (nemb < tl + 1) return ERANGE;

      memcpy(s, t, tl);

      memset(s + tl, 0, nemb – tl);

      return 0;

    }

    [I don't know what strcpy_s does, but StringCchCopy will not overflow the specified output buffer no matter how crazy the source string is. That's sort of its whole point. I'll see what I can do to make this clearer in the docs. -Raymond]
  43. Anonymous Coward says:

    ‘is not technically a vulnerability because you haven't crossed a security boundary.’ – I strongly disagree. The attack string will probably come from outside rather than be a string constant. Maybe it's pasted by the user who has no idea that there's a chance that the strange accented letters mean ‘root my box’ or maybe it's from the *gasp* internet.

    In practice though, I feel the example to be contrived, and buggy even if the safe string function is used. A better answer to our C64 fan would have been that the safe function forces you to do the check, whereas even the best programmers sometimes forget the check if they use the standard functions. Furthermore they sometimes get the check wrong in subtle ways (this article is an example). As someone who has seen various (non-Windows) kernels and rtls I feel this point cannot be stressed enough. I thus feel a certain amount of irritation that the safe functions still aren't available on most Posix systems. The developers of said systems are usually well aware of the safe string functions but tend to refuse to implement them (or deprecate the old ones) because of exactly that kind of arrogance.

    That said, I tend to use a C++ string class library in which the string class really is whatever is most convenient (on Windows it's a BSTR).

    @SimonRev: There is an assembler opcode which when correctly used in conjunction with other opcodes and initial conditions prevents the fault condition you describe. I haven't checked but it seems likely that the rtl would use that (or something that compiles to it) possibly with some optimisations thrown in (maybe copy full words first and then the left-over bytes).

  44. @acq says:

    From the StringCbCopy source which ships with Visual Studio (actually inside StringCopyWorkerA called from StringCbCopyA):

       while (cchDest && cchToCopy && (*pszSrc != ''))

       {

           *pszDest++ = *pszSrc++;

           cchDest–;

           cchToCopy–;

           cchNewDestLength++;

       }

    It's correct, but only because the compiler inlined the dereference of *pszSrc. If it didn't, there's still a TOCTOU in there.

    [It's correct even without the inlining, because the result will still be null-terminated and will still never overflow the destination. The result may be a mishmash of old and new strings (possibly even with an embedded null) but the output buffer will always be a valid null-terminated string. -Raymond]
  45. Anoymous Coward says:

    To quote John Mc Enroe: "YOU CAN NOT BE SERIOUS". If two threads are accessing the same buffer, all bets are off.

    [But you should still fail safe. Or are you saying that you shouldn't be allowed to write a multithreaded script interpreter? -Raymond]
  46. Joshua says:

    [But you should still fail safe. Or are you saying that you shouldn't be allowed to write a multithreaded script interpreter? -Raymond]

    Personally I'd not expect multithreaded script interpreters (or any interpreters for that matter) to have writable strings, but that's just me.

  47. Ken Hagan says:

    Copying a string that is simultaneously being modified by another thread is undefined behaviour no matter what function you use to perform the copy. As far as I can see, using a function that is less likely to crash and burn is probably a *bad* idea, since the programmer needs to know about this truly horrible bug ASAP.

  48. Joe says:

    Sorry, Raymond, but I don't buy it. You are engaging in a form of sophistry. The notion that the destination is trusted but the source not is completely artificial to simply make your argument. NOTHING in the function calls, let alone in programming implies the integrity of either side. Only the conditions of a highly artificial test create that relationship.

    Moreover, I'll simply change the condition and have the other thread delete the source and force the memory to be invalidated. Then what?

    While at it, a third thread writes a completely different string to the destination. A fourth thread finds the address of the first threads stack and overwrites it with zeroes.

    [Yes, you could also delete the memory while it is being copied. That shouldn't result in a buffer overflow either. You are inventing malicious third and fourth threads. Whereas the second thread may have modified the buffer or freed it prematurely by mistake – a very common error. On the other hand, attacking another thread's stack is harder to do by mistake. -Raymond]
  49. Joshua says:

    We might as well call a spade a spade. The whole point of this was really so Microsoft could get every potentially dangerous call (with respect to the class of bug that is string buffer overrun) in their entire codebase audited.

  50. Cesar says:

    @Adam Rosenfield: do the C11 specification of strcpy_s guarantee thread safety? I took a quick look at the latest draft, and saw no mention of it. If there is no mention of thread safety in the specification, you cannot depend on it, because your code might be ported to something which uses a naive implementation of these functions (strlen+copy).

    I believe the safety here is just a side effect of a performance optimization: it is faster to compute the length and copy at the same time (one pass) than to compute the length and copy separately (two passes).

    As to waiting for C11 (which will take ages, some compiler vendors have not updated yet to the previous 10-year-old revision of the C standard), according to Wikipedia these _s functions are *optional*. It is best to just write a portable implementation of these functions and use it when the native implementation is not available (autoconf makes that easy), and you can that today.

  51. LarryOsterman says:

    @Joe: I encounter situations where the destination is trusted and the source is untrusted quite literally every day.  Certainly any time you're dealing with content that could possibly have EVER come from the web, the source is 100% untrusted.  It's possible that you live in a world that never deals with untrusted data (although I cannot for the life of me think of such a world), but the rest of us live in that world every day.

    Don't forget: Whenever you read a file from the disk or accept an input from the user you're almost certainly dealing with untrusted data (there are some extremely limited scenarios where file input could be trusted but I can't think of any situation where user input is trusted).  

  52. @Larry Osterman says:

    It might be the case that your web-string is untrusted, but it's not the case that your web-string is being modified on another thread. String Variants are immutable in COM.

  53. MMx says:

    @Avi

    > You'd be hard-pressed to create a keylogger in VBA, for example.

    Quite easy really:

    1) get a binary on a local disk (easier done than said, you can download from HTTP, fake a PNG image with a payload at the end, whatever)

    2) shell execute it

  54. Remus Rusanu says:

    The implementation of the safe functions must itself be resilient, internally, to this attack (and other…).

  55. Neil says:

    Do we need a safe version of strlen, or does everyone just use memchr(s, 0, 256)?

  56. @Neil says:

    StringCchLength(A|W) and strlen_s do that.

    strlen is an SDL-banned function too :)

  57. Random832 says:

    @Larry Osterman "I can't think of any situation where user input is trusted"

    I think this is an artifact of the poorly defined category of "trusted". You can certainly do things like pass user input directly to ShellExecute, or use it as a filename without removing ".." from it.

  58. SimonRev says:

    @MikeB:

    There is a giant table in the section called RETURN VALUE which states very explicitly that if the source buffer is larger than the destination buffer that the destination will be unchanged.

  59. Louis Pace says:

    @Henry Skoglund: It was the border color. The background color was controlled by address 53281. Ah… fond memories…

  60. David Walker says:

    I was believing Joey, who opined that the safe functions could also have the rug pulled out from under them.  Someone pointed out that the implementation of the safe functions is more involved than that, so the safe functions don't have the problem.

    Next, decided that the reason to use the safe functions is that if and when their implementation is improved, to use new machine instructions for example, all code that calls them will be automatically improved.  

    Then I decided that the implementation of the unsafe functions should be changed to have the undsafe functions internally call the safe functions.  Is there any harm in that?  Programmers won't have to change their code, and everything will get better.  :-)  Is there any harm in that?

  61. @David Walker says:

    > Then I decided that the implementation of the unsafe functions should be changed to have the undsafe functions internally call the safe functions.  Is there any harm in that?  Programmers won't have to change their code, and everything will get better.  :-)  Is there any harm in that?

    It's not possible. The basic difference of the safe functions is that they take (at least) an additional parameter.

    Take the most insecure function ever: "gets".

    char s[100];

    gets(s);

    As is, "gets" always overruns if the user is patient enough to type 100+ keys. You can basically change it in two ways that I can think of:

    1) take the length of s and never overrun

    2) introduce an arbitary limit and document it

    1: is not compatible to be replaced, since you now have to call gets(s, 100); instead of gets(s);

    2: is plain stupid as it does not secure those calls which are on a buffer under the arbitrary limit, and introduces an incompatibility for the code which should allow to type over that limit

    Of course, with a friendlier string type (or language) it could derive from s its length; but for C strings you can't – the basic mistake was in the design of C strings which saved one byte in the sixties to bite us forever since then.

  62. @David Walker

    Umm, if you are talking about my comment earlier on, I never once even intended to imply that if you pull the rug out from under them, they would continue working. You see, if you end up decomitting a page of memory that it is reading from/writing to then obviously, even the safe functions will cause an access violation.

    What the safe functions do is allow block any attempt at changing the length of the source string from overflowing the buffer. If that happens to the safe function, then it will get to the end of the destination buffer, notice that no null has been copied in and then fail. My comment was only to point out that a call to strcpy_s is not the same as a call to strlen followed by strcpy in the VC CRT.

  63. mikeb says:

    @SimonRev:

    I did  overlook that table (I guess I focused too much on the comments section).  However, it's only for VS2005 that the table describes the contents of strDestination as being unmodified when ERANGE is returned.  For later versions of VS, it indicates that strDestination[0] is set to 0.  I'd assume that that's the case only if numberOfElements isn't 0 (another doc error).  

    I also wouldn't assume that other contents of the buffer are unmodified (in fact, in general they will be modified).

    It also appears that strcpy_s() returns EINVAL if numberOfElements is 0, while the table indicates it will return ERANGE (however, the remarks section correctly says that it will return EINVAL in that case).

    To summarize, testing on VS 2005 SP1 (with the Vista update) and VS 2010 SP1 shows that the buffer is modified when ERANGE is returned from strcpy_s(). There's also a minor doc error about when the dest buffer size is 0.  So I think you should consider the docs for these functions to be subtly incorrect in certain error cases.  I'll post soemthing to the MSDN "Community Comments" section about these small issues.

  64. Joe says:

    @Larry, just because you encounter a situation doesn't make it the defacto rule, nor does it imply that it is always the case. Like Raymond, you keep changing the conditions. Who say the source string was user input? That's ex post facto reasoning.

    A further point is that if a buffer is so untrusted that an initial check is unreliable, then you have serious problems with your algorithm and are simply using the safe function to fool yourself into thinking your algorithm is sound. More specifically, if a thread is modifying a buffer in an unprotected manner, you have lousy code and far bigger problems than a buffer overrun.

    I'd argue that the best practice would be to move user input to a trusted buffer and never worry about it again.

  65. @Joe

    The thing is, it doesn't really have to be lousy code to be put in this kind of situation. All you have to do is get into the situation where you are a host to code you didn't write and then you are open to this kind of problem. Especially if as a host you allow communication to the hosted code.

    So this isn't about user input, and you can't move anything since there is part of the code that you don't have control over. This is why they kept pointing out the scripting engine scenario. It isn't limited to that, applications that allow plugins that have string parameters are prone to this, just as an example. Oh, and in before anyone mentions putting a string length parameter in the plugin API. Unfortunately, I have come across some which are not that smart.

  66. Gabe says:

    Joe: You're absolutely right — you should move the user input to a trusted buffer. But what function do you call to move that input to a trusted buffer?

    Also, history has shown that trusted inputs have a way of becoming untrusted: a machine on a trusted LAN ends up getting put on the Internet; a trusted process ends up getting invoked by a web server. There's no harm in always using the safe functions, and there's clearly harm in not using them, so why not just always use them and then never have to worry about a certain class of bugs?

  67. 640k says:

    @Gabe: You only eliminate compiler warnings, not bugs or security holes.

  68. John Doe says:

    Basically, any function which writes a string, or traverses a string until , without boundaries is unsafe.

    The former is the most obvious case, but the latter surely triggers strlen fans to go wild and simply expose their inner denial with public rejection. OK, it might not be a security issue if and only if your program is both the source and consumer of such "safe" strings, but if a string comes from somewhere else, like a file, a DB or user input, you shouldn't trust it.

    One of the most subtle "string-security" problem is when a string gets copied or concatenated without the guarantee we'll have a trailing , such as in strncpy. So much so that if you keep using such functions, you'll most surely forget to null-terminate the result at some point, probably in the most obscure place.

    Raymond, the string modification race condition isn't so much the trouble from my point of view. You answered to a comment that changing the stack at the buffer length argument is not as common as multiple threads handling the string, but I say it's fairly easy to happen even without malice, for instance, if you have a pointer to a struct which actually points wrongly to the said stack position, or some position above where the buffer lenght happens to be at the position of an integer field. So, the safe string functions are obviously not safe facing someone who's beyond the airtight hatchway, they're just (much!) safer than relying on the null of null-terminated strings or on the unbounded storage of characters.

  69. John Doe says:

    Just to complement my previous post, you obviously need the safer functions to actually be able to avoid access violations.

Comments are closed.