It rather involved being on the other side of this airtight hatchway: Invalid parameters from one security level crashing code at the same security level (again)


A few years after I posted this story, the security team received something very similar.

If have found that if you call the XYZ function (whose last parameter is supposed to be a pointer to a DWORD) and instead of passing a value pointer to a DWORD, you pass NULL, then you can trigger an access violation in the XYZ function. The XYZ function does not check whether the input parameter is NULL. This is a denial of service attack against the system.

Okay, first of all, even if the XYZ function checked that the final parameter is non-NULL, that wouldn’t prevent a caller from passing an invalid non-NULL pointer, so adding a NULL check doesn’t accomplish much from a security-theoretical standpoint.

The problem with this vulnerability report is that there is no elevation. The attack code and the code that crashes are on the same side of the airtight hatchway. If your goal was to make the process crash, then instead of passing invalid parameters to the XYZ function, you can just trigger the crash yourself.

int __cdecl main(int, char**)
{
    return *(DWORD*)NULL = 0;
}

In other words, in order to trigger an access violation in the XYZ function, you must already have had enough privilege to run code, which means that you already have enough privilege to trigger an access violation without even needing the help of the XYZ function.

This dubious vulnerability falls into the category Code execution results in code execution.

Comments (37)
  1. Joshua says:

    I've seen these turn into exploitable DOS before so I'm not inclined to knock it without knowing the API call.

    Kind of like dividing the largest negative integer by -1. You don't think about it but the DOS guy did. Something like 50 DOS exploits that day.

  2. Joshua says:

    @Mark: There's something about NULL and marshaling that has odd behavior. I've seen cases where bad NULL could happen where bad non-NULL was excluded.

  3. Jason says:

    @Joshua: Have an example?  And what does it have to do with marshaling?  Does it only happen through COM or .net?

  4. Dominic says:

    This doesn't "turn into a DoS" no matter how many times the attacking code crashes. It's only a DoS if it corrupts system state such that the function call starts failing in well-behaved code.

    And it's the responsibility of the reporter to make that link.

  5. Joshua says:

    @Jason: It had something to do with SOAP marshaling. A bad SOAP request managed to pass NULL for a string where non-NULL was required; causing a NULL pointer deference when passed to the lower layer (the contents of the string had no interesting security consequences if you assume the right to edit data is the right to corrupt data). Had that not been wrapped in an exception handler, we would have lost the worker process along with all pending requests.

    [But that was a crash at a different security level. The attacker is on the network, and the crash is in the worker process. -Raymond]
  6. Mark says:

    Joshua – those sort of issues may be exploitable if the code being run is the target of the privilege elevation. However, if someone made these API calls without first sanitising the data to check for null parameters, that's *their* bug. There's no way the code that makes the API call is elevating its privilege, though.

  7. Fleet Command says:

    Try putting your hand on the last sentence (which only serves to make that person as overreacting) and read the whole thing again: With that sentence out of the way, my stance would be that am going to need more context to judge whether his concern was justified, or your dismissal.

    Although the posts in this blog do not disclose the identity of the people but they still read like personal attacks, not like argument on problems.

  8. Joshua says:

    [XYZ(blah blah, DWORD *p, blah blah)]

    I am actually claiming there is an inability to say DOS or no without knowing what XYZ is. I believe this relates to elevated COM servers.

    [XYZ runs in-context, as evidenced by the fact that the access violation occurs in the same process that issued the rogue call. Let's say it's Heap­Query­Information. -Raymond]
  9. Matt says:

    On the other hand, it IS a contractual violation if the parameter was listed as __in_opt rather than __in, because if you declare a parameter as __in_opt you are declaring to the world that callers of your function don't need to worry about passing you NULL; it has other valid meaning within the context of the function.

    It's not quite a security bug, but it can have security consequences because the violation here means that static analysis tools will tell you that everything's OK when it's not, because the annotation is a lie.

  10. Maurits says:

    @Matt: maybe. For example, you might have a Foo(DWORD flags, in_opt LPCWSTR sid) where the sid parameter is formally optional, but a close reading of the contract states that sid becomes mandatory if the FOO_CUSTOM_SID flag is set.

  11. Matt says:

    The semantic contract would hold then, sure, and the function is entirely at its liberty in such a case to fail to do the requested operation. But it shouldn't crash.

    Here's why:

    If you declare a parameter as __in_opt and the function is exposed directly or close to a security boundary, you MUST ensure that every path through the function can cope (i.e. avoid crashing) if a NULL is passed to that function in that parameter. The alternative is that an attacker may be able to coerce a caller into passing NULL instead of a valid DWORD* by controlling values sent over the security boundary. Despite Raymond's protestations, it's usually quite easy for an attacker to supply data-or-null to a function over a security boundary. In contrast, it's usually very hard to supply an invalid-non-null pointer over a security boundary.

    A case in point is DCOM. Attackers can pass in any value they wish on the low-side of the security boundary and this gets wrapped in an interface for the high-side so that calls through the interface get marshalled back to the low-side for the low-side's implementation of that function to run. Now, if the high-side of the security boundary declares the parameter to be __in, an attacker passing NULL will be rejected by the DCOM layer before it hits the high-side code. All non-null values (valid and invalid) are converted to DCOM interfaces that are valid on the high-side and translate to invokes on the low-side. An attacker who sends a bad (non-null) pointer therefore only gets to crash his own process when the invoke comes back.

    The salient point here is that if we had instead defined our high-side interface to be __in_OPT, the DCOM magic will translate NULLs on the low-side to NULLs on the high-side (and everything else is wrapped). If there is a path through the high-side that crashes if you pass it a NULL, even if that's semantically against the contract, that's a DOS across a security boundary.

    So the TL;DR here is that it's normally much easier for an attacker to supply NULLs to a function than to supply invalid-non-NULLs, and if your function declares its parameters to allow __in_opts you should always handle the possibility of a NULL because tools and security layers rely on annotations. Your documentation might say that's against the contract, but attackers don't care what your documentation says if they can hit your function over a security boundary. They care about how your function actually behaves in response to their input.

    So if Raymond is correct, and the function was HeapQueryInformation, and it crashes when you pass a NULL parameter for the first argument; that's a bug in HeapQueryInformation. It's either an annotation error or an invalidly validated parameter. But if they report was that the THIRD parameter was NULL, then it's a bug in whoever is calling HeapQueryInformation, since that parameter is marked __OUT and anyone who calls it with a NULL (or potentially-NULL) argument is in breach of contract.

    [You're confusing "shouldn't crash" with "is a security vulnerability." The crash (that arguably shouldn't have occurred) is at the same security level as the attacker. Therefore, the attacker gained nothing. -Raymond]
  12. Joshua says:

    [But that was a crash at a different security level. The attacker is on the network, and the crash is in the worker process. -Raymond]

    True. As you recall I started out with "not knowing the API call."

    I also agree with Dominic about the reporter needs to make the link.

    [It's like this: Given the function XYZ(blah blah, DWORD *p, blah blah) { blah blah; *p = value; blah blah; }, I call it as XYZ(nullptr) and it crashes! DoS! -Raymond]
  13. Maurits says:

    If the security boundary is an RPC interface, the attacker can invoke the RPC interface directly and pass whatever parameters they like. So marshalling in the client process can make life easy (or difficult) for well-meaning clients, but will not hamper a determined attacker.

  14. Goran says:

    Haha, I call those "my code crashed, it's YOUR problem". Happens all the time in collaboration (sic!) with other people.

  15. steveg says:

    I once reported a Sql injection vulnerability to a vendor and my example culminated in "DROP DATABASE FOOBAR". To which the vendor replied, "Because other people are logged on, the database won't be dropped, therefore it's not a security issue."

  16. Nick says:

    [You're confusing "shouldn't crash" with "is a security vulnerability." The crash (that arguably shouldn't have occurred) is at the same security level as the attacker. Therefore, the attacker gained nothing. -Raymond]

    About bloody time!

  17. dbacher says:

    Even if Microsoft prevents the crash, all they can do is return an error code.

    Microsoft can catch this error and set a return code.  That is of limited value, because the calling code still has to check that return value.  And so if you look at the cases that are likely to make a pointer null just some of the time, it's pretty clear its not checking return codes as it stands.  Therefore there's not a lot Microsoft can do here to make the call not fail.

  18. Joshua says:

    And indeed HeapQueryInformation is not such a function where this could be turned into a DOS.

  19. John says:

    ITT: "Security" minded folk.

    I've seen this from time to time at work, "No way it could have been my code, it must have been sabotaged from above! Micro$oft lololol"

  20. Goran says:

    @Matt (long comment): you made several logical mistakes. You said that in the low-end parameter is in-opt, but that it is in on the high side. That's a case of a broken interface, so it does not count, and is not the case here anyhow. You also tried to involve a different security context, which is not the case either. The situation is very simple, much simpler than your analysis: it's a C interface, parameter is "out", or "in-out", but non-optional, and the caller broke that. It deserves to be punished. As Raymond would say, "the alternative is worse": because C has no pass-by-ref, people use pointers all the time. Checking all those pointers and returning "invalid parameter" all the time has a maintenance and runtime cost we should rather avoid.

  21. Ben Voigt [Visual C++ MVP] says:

    I feel like several commenters are discussing the WriteFile API:

    lpNumberOfBytesWritten [out, optional]

       A pointer to the variable that receives the number of bytes written when using a synchronous hFile parameter. WriteFile sets this value to zero before doing any work or error checking. Use NULL for this parameter if this is an asynchronous operation to avoid potentially erroneous results.

       This parameter can be NULL only when the lpOverlapped parameter is not NULL.

  22. Myria says:

    There seems to be an inconsistency within Microsoft, because the security team seems to consider kernel code execution bugs that require Administrator access to be cases of being on the same side of the airtight hatchway, but the kernel and DRM teams do not, because they rely on driver signing and Secure Boot for "security".

    [They are different problems. -Raymond]
  23. Matt says:

    OK.

    @Maurits: No, if it's RPC, you can pass any parameter, but you can't pass invalid pointers (or rather, if you do, they cause the low side to crash, so don't buy the attacker anything).

    @Goran: A security boundary by definition has two sides; a low-side and a high-side (and a marshaling interface between them). Over RPC an __in_opt parameter in an interface will marshal NULLs on the low-side to NULLs on the high-side, but if the interface is __in, it will fail to marshal the parameter. That is a different issue to static analysis.

    In static analysis is a little different: passing a potentially null value to a definitely-not-null (__in) parameter is a potential-bug, but passing a potentially-null value to a potentially-null (__in_opt) parameter is not. Consequently if I run driver-verifier over my Windows 7 driver and it says "Aha! You are passing that potentially-null variable into a function that Windows has declared to be __in, you have a bug", I should fix it. But if Windows declared that function as __in_opt, it won't. And if Windows declared it to be __in_opt and then crashes when my potentially-null value that I got from the user or network turns out to be /actually/ NULL, well, then my Windows7 driver has just exposed a massive local kernel EoP or remote DoS to the caller.

    @Raymond: No, I'm saying "shouldn't crash OVER an interface you've asserted is safe for those parameters".

    Whilst NULL-dereferences are rarely major security bugs (except local kernel EoPs <= Windows7), I've seen critical security patches for bugs in IE where a Pinvoke in an Xbap looked safe in code, but under the Pinvoke is an integer overflow leading to a heap overflow.

    Now, I'm sure the authors of that API would have swore blind it's "not over a security boundary", and at the time it wasn't. But it is now. Now it's stealing people's credit cards in IE and Firefox because there's a heap overflow accessible from the Internet.

    Too many credit cards have been lost to people taking the attitude of dismissing functions that fail to uphold their own contractual violations as "not a bug worth fixing" and waiting until someone eventually exposes it over a security boundary (how many "not a security boundary" bugs were "won't fixed" in DirectX right up until WebGL and Miracast? How many "not a security bug"s were in the C/C++ compiler before they put it on the Internet as a service in Roslyn?).

    Sending data in breach of a contract is undefined behavior. It's a bug. You should fix it. And if it's not a security bug now, it will be when you, your collegaues or your successors decide that exposing it to the Internet in future.

  24. Joker_vD says:

    @Matt: Wait-wait-wait. Let's take that C/C++ compiler example. If its run under one user account, it can't do anything to meddle with others, unless there is an elevation bug… but that's not the bug inside the compiler, right?

    And anything that takes input from a web-server better be run under a user who has absolutely minimum permissions.

  25. foo says:

    In the beginning the universe was created. This has made a lot of people angry and been widely regarded as a bad move.

  26. Matt says:

    @Joker_vD: What if the compiler is warning you that an object (like, maybe the result of an allocation by ExAllocatePoolWithTag) is being passed to a function that might dereference it because the parameter is marked as __in, instead of __in_opt? That's a bug.

    And in Windows7, if a remote attacker or a local user-mode caller can arrange for that function to be called, that's either a local elevation issue or a remote denial of service vulnerability. The attacker just calls the function in a loop until eventually the bug triggers. Then the attacker has triggered a NULL dereference in the kernel from outside of the kernel, and can use that to take control of the kernel locally (by mapping the zero page) or to blue-screen a computer remotely if the function is exposed over a network.

    And I didn't say "web server". I said over a network. Think more along the lines of RDP, SMB and RPC than HTTP, although even there there is a bunch of parsing going on in TCPIP.sys and HTTP.sys.

  27. John says:

    @Matt: "What if the compiler is warning you that an object (like, maybe the result of an allocation by ExAllocatePoolWithTag) is being passed to a function that might dereference it because the parameter is marked as __in, instead of __in_opt? That's a bug."

    Are you saying its Microsoft's responsibility to fix developer's bugs? The compiler did its job and said "Hey bro, this might be a problem".

    The first person to invent a compiler that fixes programmers bugs will be heralded as a God amongst men and responsible for the termination of several (if not all) programmers.

  28. Gabe says:

    Matt: You don't have to worry about Roslyn. It's a C# and VB compiler, not a C/C++ compiler. While the pre-Roslyn C# compiler was written in C++, and thus potentially subject to buffer overflows, Roslyn is a complete rewrite in C#.

    Regardless, they didn't put it on the Internet as a service. Well, they put it up on Codeplex as a service to mankind, but not as a web service. When they call it "compiler as a service", that means you can invoke individual parts of it as a library (while previously you could just invoke the whole thing as a separate process). It does not mean some sort of cloud-based remote compilation.

  29. Matt says:

    @Gabe: OK, it's not called Roslyn, but Microsoft does have a cloud-provided compiler: It takes attacker-provided code and then compiles it using the old C++ compiler into native ARM code for the Windows8 AppStore, so the point still stands. Also your point about C# not having buffer overflows makes some rather large assumptions that the JIT, GC and Pinvokes are all using platform APIs that adhere to their C/C++ contracts and not exposing buffer overflows when certain parameters (like a long string) are passed to them.

    @John: If the developer works at Microsoft and just compiled a bug that allows an attacker running in low-integrity IE to run Ring0 kernel-mode code in Windows, then yes, that's Microsoft's problem.

    Fixing annotations and adhering to contracts doesn't magically fix all security problems, but that's a strawman argument like saying giving our troops in theater bullet-proof vests doesn't stop the Taliban from shooting their legs and hence not worth investing in at all.

    Adding and correctly implementing annotations via code-contracts allows tools developed by Microsoft and elsewhere to reliably find actual vulnerabilities that steal credit cards and destroy computers remotely with low false-positive rates, but if your code doesn't adhere to your contract, real bugs go undetected by such tools until it's too late.

    If your function declares an __in_opt parameter and crashes when you pass it a NULL, change the contract, or change the code.

  30. ChrisR says:

    @Matt:  Why are you speculating and arguing about all these random tangents?  Driver verifier?  Web services?  Marshalling into separate processes on computers halfway around the world?  Raymond said, for the purpose of this discussion, that the function in question could be HeapQueryInformation, and that the call was in-process.

  31. Joshua says:

    @Matt, etc.

    The GNU guys declared the heap buffer overflow in bash to be a security bug. Gee, I wonder why.

  32. Matt says:

    @ChrisR: Because HeapQueryInformation is a public interface, not a program, and besides, I'm more disturbed by the cavalier "code that violates code-contracts is OK so long as its not explicitly exposed over a security boundary" notion.

    Public APIs _are_ a security boundary. Look at this CVE (http://www.cvedetails.com/…/CVE-2013-3195) rated 10.0 – the highest possible rating – because COMCTL.dll!DSA_InsertItem had a public API that ended up being exposed to anyone that rendered a RTF – including Internet Explorer (via Xbaps), Firefox (via Xbaps), Word (via crafted Word files), ASP.NET via the RichEditControl and so on.

    If HeapQueryInformation is stating that it contractually handles NULLs in a parameter and it doesn't, that's a bug, and it should be fixed.

  33. ChrisR says:

    @Matt: You are so far out in space and I don't feel like arguing with you.  The referenced CVE has nothing to do with HeapQueryInformation.  Nor is it a DoS attack against a server process via an access violation caused by some imagined contract violation in HeapQueryInformation.  Of course the bug referenced by the CVE is a bad bug, but why are you arguing about things nobody else is talking about?  Just to prove some point about how much you care about security?  Go get your own blog please.

  34. Matt says:

    @Chris: This post is about HeapQueryInformation, which defines a public interface which it fails to uphold – specifically it announces that it contractually succeed if passed any valid pointer to a SIZE_T or NULL, and fails if you pass it a NULL.

    If it is ever exposed over a security boundary – say an RPC boundary – that is a security bug. And in the meantime, it's just a bug that makes static analysis tools ignore real bugs (false negatives) and report false bugs (false positives), which has real security implications on your codebase.

    If you disagree, please tell me your company so I can avoid using your software, because probably 90% of actual security vulnerabilities are caused by API violations that were originally not over a security boundary and became a critical security bug when someone later put it over a security boundary.

    That is all.

  35. ChrisR says:

    @Matt: Indeed it is about HeapQueryInformation; I'm glad you finally understand that.  The part you still don't get is that the bug reporter is talking about a local process AV, not a DoS via some RPC, so there is no security vulnerability in the bug report (as you state yourself).

    I won't respond to your ad-hominem attacks against me or the company I work for, but thanks for playing!

  36. Mark says:

    Matt: dude, you're sounding a lot like Nicholas Lemonias here. You just said that it's Microsoft's problem if one of their developers compiles a bug. Yes, of course it is! But it has *nothing* to do with the problem described here, and there's no way that could make the problem described here a security bug.

  37. GregM says:

    Nowhere in the original post does it say either that the last parameter is allowed to be null, or that the function is HeapQueryInformation.  That function name only came up in a comment by later Raymond: "Let's say it's Heap¬≠Query¬≠Information".  That's just one possible function.

Comments are closed.