Asking questions where the answer is unreliable anyway


Here are some questions, each followed by an explanation of why you couldn't do anything meaningful with the answer even if you could get one in the first place.

"How can I find out how many outstanding references there are to a shared memory object?"
Even if there were a way to find out, the answer you get would be instantly wrong anyway because the microsecond after you ask the question, somebody can open a new handle. This is an example of "Meaningless due to unavoidable race condition."

"How can I find out whether a critical section is free without entering it?"
Again, once you get an answer, the answer could instantly become wrong if another thread decides to enter the critical section immediately after you checked that it was free.

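
The usual way out of this particular race is to stop asking and instead attempt the operation in a way that cannot block, so that the check and the action are a single step. Here is a minimal sketch in Python, using `threading.Lock` as a stand-in for a critical section (on Win32 the analogous call would be `TryEnterCriticalSection`). The helper name `run_if_free` is made up for illustration, and unlike a critical section a `threading.Lock` is not reentrant.

```python
import threading


def run_if_free(lock, work):
    """Run work() under the lock only if the lock can be taken right now.

    Returns True if work() ran and False if somebody else held the lock.
    There is no separate "is it free?" query, so there is no window in
    which the answer can go stale before we act on it.
    """
    # A non-blocking acquire asks and enters in one atomic step.
    if not lock.acquire(blocking=False):
        return False
    try:
        work()
    finally:
        # Always leave the lock, even if work() raises.
        lock.release()
    return True
```

A caller would write `run_if_free(lock, do_work)` and branch on the result. A `False` only says the lock was busy at that instant, which is all anybody can honestly report.
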
"How can I tell whether there is a keyboard hook installed in the system?"
This suffers from the same problem yet again: The instant you get the answer ("all clear"), somebody can install a hook.

This is actually even worse because people who ask this question are typically interested in secure keyboard access. But if somebody has a keyboard hook installed, that means that they have already injected code into your process (namely, the hook itself). At which point they could easily patch the imaginary IsKeyboardHooked() function to always return FALSE.

Now when your program asks if the keyboard is hooked, the answer is a happy "no" and you proceed, blithely confident that there are no hooks. Just because somebody said so.

You cannot reliably reason about the security of a system from within the system itself.

It's like trying to prove to yourself that you aren't insane.

The system may itself have already been compromised and all your reasoning therefore can be virtualized away. Besides, your program could be running inside a virtual PC environment, in which case the absence of a keyboard hook inside the virtual PC proves nothing. The keyboard logging could be happening in the virtual PC host software.

From a UI standpoint, the desktop is the security boundary. Once you let somebody run on your desktop, you implicitly trust them. Because now they can send your program random messages, inject hooks, hack at your window handles, edit your menus, and generally party all over you.

That's why it is such a horrible mistake to let a service interact with the desktop. By joining the interactive desktop, you have granted trust to a security context you should not be trusting. Sure, it lets you manipulate objects on that desktop, but it also lets the objects on that desktop manipulate you. (There's a Yakov Smirnoff joke in there somewhere, but instead I will quote Nietzsche: Wenn du lange in einen Abgrund blickst, blickt der Abgrund auch in dich hinein. "If you gaze long into an abyss, the abyss also gazes into you.")

If you're a service, you don't want to start letting untrusted programs manipulate you. That opens you up to a Shatter attack.

Comments (39)
  1. You cannot reliably reason about the security of a system from within the system itself. It’s like trying to prove to yourself that you aren’t insane.

    Since you quoted Nietzsche, I’ll quote René Descartes: "Je pense, donc je suis" / "cogito ergo sum" / "I think, therefore I am" – sometimes you can draw meaningful conclusions from within the system to which you are referring.

  2. Shattered says:

    I thought the release you linked to was retracted by the MS02-071 security bulletin.

    [quote]

    I saw a posting Microsoft authored shortly after this issue was reported, in which you said the problem was that processes with differing levels of privilege were running on the interactive desktop. It sounds like you’ve changed your opinion.

    We have. When we initially examined the situation, we concluded that the problem here lay solely in the fact that highly-privileged and lower-privileged processes were both present in the interactive desktop. We pointed out that, by design, all processes on the interactive desktop are peers, and stated that we believed the real solution was to not mix processes of varying privileges.

    However, upon deeper investigation, we determined that the real answer is somewhat more complicated. It’s possible for a highly privileged process to coexist safely with less privileged processes on the interactive desktop, provided that it’s been properly designed to vet all requests before acting on them. However, the flaw in WM_TIMER would undermine these safeguards even if they were present. As a result, although we still recommend that developers use extreme care before writing a process that has high privileges and runs in the interactive desktop, we believe that in this case the real culprit is the flaw in WM_TIMER.

    [/quote]

    That doesn’t mean that processes with different privileges should exist on the same desktop, but I thought the shatter vulnerability could be (or is) fixed with proper message marshalling. True or false?

  3. Dilip says:

    > You cannot reliably reason about the

    > security of a system from within the system

    > itself. It’s like trying to prove to

    > yourself that you aren’t insane.

    I think most people already know this but Kurt Gödel conclusively proved this ages ago, breaking the entire foundation of Russell/Whitehead’s Principia Mathematica.

  4. Cooney says:

    > Even if there were a way to find out, the answer you get would be instantly wrong anyway because the microsecond after you ask the question, somebody can open a new handle.

    So what? Perhaps I want to see if I have a bug in my code – 100 refs might be a bit higher than I expect.

    > This suffers from the same problem yet again: The instant you get the answer ("all clear"), somebody can install a hook.

    Suppose I were performing a scan for trojans. A keyboard hook for an unknown program would be a positive indicator.

    > You cannot reliably reason about the security of a system from within the system itself. It’s like trying to prove to yourself that you aren’t insane.

    Seconding Steve: you can’t be certain, but you can get 90%. You can certainly catch some stuff.

    > From a UI standpoint, the desktop is the security boundary. Once you let somebody run on your desktop, you implicitly trust them. Because now they can send your program random messages, inject hooks, hack at your window handles, edit your menus, and generally party all over you.

    Wouldn’t it be nice if we had a fine grained level of control over what a service was allowed? Security is, after all, the number one priority at Microsoft.

  5. Raymond Chen says:

    "Perhaps I want to see if I have a bug in my code" – There are things documented as "for diagnostic purposes only" but people use them in production code. "Control_RunDLL" for example. The return value of IUnknown::Release for another.

    "Suppose I were performing a scan for trojans". Any decent trojan would have patched the system so that IsKeyboardHooked doesn’t count the trojan itself as a hook! All you get is a false sense of security.

    "You can’t be certain, but you can get 90%." And it’s the other 10% that kills you.

    "… what a service was allowed": Services cannot interact with the desktop by default; you have to select that option explicitly (SERVICE_INTERACTIVE_PROCESS). Unclear what fine-grained control gets you, since the service author would just turn it on anyway.

  6. Anonymous Coward says:

    Ok, so how come there is an API to find out free disk space, since in the millisecond it is answered, another process could consume most of what is left?

  7. Keith Moore [exmsft] says:

    Continuing this line of reasoning, I suggest MS remove the "DIR" command from cmd.exe. After all, just because a file exists at the time the command was executed does not mean it exists now…

  8. Anonymous says:

    You cannot reliably reason about the security of a system from within the system itself.

    So what is the point in "http://www.microsoft.com/whdc/driver/kernel/64bitpatching.mspx" except for killing 3rd party tools?

  9. mpz says:

    The possibly already-obsolete information provided by the free disk space reporting function or the DIR command does not (usually) lead to unwarranted assumptions about system security. Unless you’re braindead.

    Concentrate on what matters, people.

  10. Owen Cunningham says:

    Another one: I once asked MS Premier Support for a way to get the suspend count of a thread without actually having to call SuspendThread/ResumeThread (both of which return the thread’s NEW suspend count). They said "oh it’s unreliable." I wrote a kernel-mode driver that can be queried via DeviceIoControl and indexes into position 429 of the ETHREAD block (which it locates using the undocumented NTOSKRNL export PsLookupThreadByThreadId). It has yet to behave unreliably (although I understand this is brittle if NTOSKRNL ever stops exporting PsLookupThreadByThreadId, or if MS ever moves the suspend count out of ETHREAD offset 429).

  11. Skywing says:

    Yes, and what happens if somebody suspends or resumes the thread after you query the ETHREAD but before you return from the IOCTL IRP? The suspend count you return is "dead on arrival".

  12. James Harlow says:

    Steve, if "cogito ergo sum" was correct then Descartes would have slain the entire Empiricism movement before it was even born.

    "Cogito", of course, assumes the existence of the thought and the thinker.

  13. Marcel says:

    Then I’m just glad that even though the answer will immediately be wrong MS still put some routines in to read the current time ;-)

  14. Cooney says:

    Any decent trojan would have patched the system so that IsKeyboardHooked doesn’t count the trojan itself as a hook! All you get is a false sense of security.

    Apparently virus writers hew to a higher standard than commercial software. IsKeyboardHooked is something that should be patched, but you always want to check it anyway, just like you check that your PC is plugged in before calling tech support.

    Did you really think I was going to write a security scanner and do nothing more than check for a declared keyboard hook?

  15. Raymond Chen says:

    So far as I can tell the only people who want to look for hooks are people looking for spyware. What’s the point of adding a function whose goal is to help find spyware if it can be hacked by spyware anyway? It’d be effective for about two weeks before all the spyware authors issue upgrades that hack this new function. And then the tech press will write articles saying "Microsoft spyware-detection functions easily circumvented. More proof that they’re a bunch of morons."

  16. Also, keyboard hooks are often present on things that are not worms, except perhaps by the widest possible definition. AIM, and similar programs, hook the keyboard in order to check if the keyboard is idle… and even if you could verify, beyond a shadow of a doubt, that you don’t have a keyboard hook in userspace, that doesn’t matter. There could be one in kernelspace. There could be one between the keyboard and the socket. There could be a well-placed camera.

  17. Mike Dimmick says:

    Another example is ‘how do I find out if I can write to a file?’. Don’t ‘find out’, just open the file with GENERIC_WRITE access. If you can write, the open will succeed. If you can’t, it won’t and you can check GetLastError() to find out why.

  18. adeht says:

    Yet another example is using temporary files. You’re not supposed to just generate a filename then create it, because it might get created before you try. You’re supposed to do both as one operation.

  19. adeht says:

    (the operation should be performed by the OS, of course)

  20. Thomas says:

    You cannot reliably reason about the security of a system from within the system itself. It’s like trying to prove to yourself that you aren’t insane.

    If only our governments would understand this…

  21. Norman Diamond says:

    Reading the first 75% of the base note, I thought I was seeing Microsoft’s excuses for not letting programmers debug some characteristics of their programs. Then the focus changed to security. Well sure, answers to most of those questions would be useless from a security point of view, but answers that can help simplify a lot of debugging problems would still be highly useful.

    Here’s another example:

    "What process has this folder locked?"

    "We’re not going to tell you because in one more millisecond the process might unlock the folder, and/or other processes might lock it."

    "Well I sure do wish that in one more millisecond the process will unlock the folder. I sure don’t plan to run any more processes that will lock the folder, and I’ve been trying for the last 3 weeks to delete the folder. I already deleted all the files in it."

    "Well, we’re not going to tell you which process locked it, we’re only telling you it’s locked."

    "But in one more millisecond it might not be locked."

    "Oops right. We’re only telling you it was locked at each of the 30 times you’ve tried deleting it during the past 3 weeks."

    11/15/2004 2:35 PM Mike Dimmick

    > Another example is ‘how do I find out if I

    > can write to a file?’. Don’t ‘find out’,

    > just open the file with GENERIC_WRITE access.

    Bad example. Sometimes people look at the last-updated date of a file in order to guess whether they updated it recently. Of course this information isn’t completely reliable, it only helps as a rough guess, just like in debugging. Your example destroys information for no useful reason.

  22. Raymond Chen says:

    For the locked file thing I think the real reason is that the filesystem doesn’t know either. All it sees is a nonzero lock count. (It’s like asking a COM object, "Who still has a reference to this object?" The COM object doesn’t know. All it knows is that its refcount is nonzero.)

  23. asdf says:

    I’ll always side with being able to answer the question even if it is unreliable because having *some* metric is better than having none at all. Just because crappy programmers ignore documentation and API writers can’t add concurrency to most functions doesn’t mean these sorts of diagnostic functions shouldn’t be written.

  24. Cooney: You can control what a service (and everybody else on your system) can do in the Local Security Policy MMC.

    But, be very careful since you can end up in a lot of stupid situations from changing variables there… The same disclaimer you read about changing registry variables applies here as well ;)

  25. Actually the filesystem has three counts (for share read, share write and share delete), but essentially that’s the case.

    For locked ranges in the file, more information is tracked, but even that’s questionable.

    Oh, and as for the file opening example, MS-Mail tried looking at the last-updated date to see if a file had changed. And this failed miserably when you tried running MS-MAIL’s over the network, because the last-updated date is no longer as reliable as it is locally.

  26. asdf says:

    There is a very useful app that finds out which programs have a lock on your files: http://www.dr-hoiby.com/WhoLockMe/

  27. Mike Weiss says:

    > Even if there were a way to find out, the

    > answer you get would be instantly wrong

    > anyway because the microsecond after you ask

    > the question, somebody can open a new handle.

    OK! I’ll just add one to the count **just in case** this happened! ;)

  28. Classic example is psapi.dll.

    Since psapi is reading other processes’ memory, it is almost certain that the data it returns are not reliable. But it is still an invaluable tool.

  29. Factory says:

    "Even if there were a way to find out, the answer you get would be instantly wrong anyway because the microsecond after you ask the question, somebody can open a new handle."

    Or in other words: "Any public attribute of a concurrent system is subject to change after one has read the status of that attribute, thus we will never tell anyone what the status is."

    Kinda reminds me of the 100% secure system which has no functionality. :)

  30. Robert Moir says:

    "Any public attribute of a concurrent system is subject to change after one has read the status of that attribute, thus we will never tell anyone what the status is."

    – Sounds silly that way until you realise that when Microsoft start supporting a way of returning these ‘secret’ values, people will start criticising them and blaming everyone else but themselves when they write code that relies on that value being stable when it isn’t.

    It doesn’t matter how good you are at picking a path through a minefield, you’ll still never be quite as safe as the person that avoids the minefield in the first place.

  31. This is an example of "Meaningless due to unavoidable race condition."

    Sounds like Heisenberg had it nailed down pretty well.

  32. anonymous coward says:

    Why I love this blog:

    – nietzsche

    – descartes

    – gödel

    – russell

    – whitehead

    – heisenberg

    …all in a post about computer security.

  33. Norman Diamond says:

    11/15/2004 9:20 PM Larry Osterman

    > Oh, and as for the file opening example,

    > MS-Mail tried looking at the last-updated

    > date to see if a file had changed. And this

    > failed miserably when you tried running

    > MS-MAIL’s over the network, because the

    > last-updated date is no longer as reliable

    > as it is locally.

    Oh no, then I wrote a highly defective program earlier this year. A customer specified that when they wrote files in a particular directory, my program was supposed to notice and act on them. Each machine being watched had one directory being watched, and my program was on a separate machine. One designated filename was kind of a controlling file, and other files were various data to be interpreted. I used FindNextChangeNotification and related tools. When the controlling file appeared to have been updated, I read its contents and obeyed its orders in interpreting other files. When the controlling file did not appear to have been updated, I ignored whatever other changes had occurred so far in the directory, and waited for the next change notification. If the last-updated date is unreliable, there is no hope for this application to work as designed.

    How long has it been known that the information returned by GetFileAttributesEx for files over a LAN is unreliable? Why doesn’t the MSDN page for GetFileAttributesEx say so?

  34. Raymond Chen says:

    "Why doesn’t the MSDN page for GetFileAttributesEx say so?"

    Because GetFileAttributesEx just reports what the filesystem gives it. If the filesystem redirector reports stale data, then you get stale data from GetFileAttributesEx.

    The crumminess of the data is a property of the redirector, not of GetFileAttributesEx. GetFileAttributesEx is just the messenger. Don’t blame the messenger.

    (And of course you can’t expect the messenger to have a comprehensive list of all crummy redirector behaviors.)

  35. Norman Diamond says:

    11/16/2004 5:31 PM Raymond Chen

    > If the filesystem redirector reports stale

    > data, then you get stale data from

    > GetFileAttributesEx.

    Are these redirectors on the server side or the client side? Is there a list of known reliable and known unreliable and unknown filesystem redirectors, for example server side Windows 2000 being reliable, or client side Windows 98 being unreliable and client side Windows XP SP1a not being tested, or something along these lines?

    Of course the net result for each configuration would be the weakest link in the chain (unless some kinds of interaction can yield even worse results).

  36. Foolhardy says:

    About shatter attacks and windows on the same desktop being able to send messages to each other: Job objects (lookup SetInformationJobObject) have a certain limit called JOB_OBJECT_UILIMIT_HANDLES. It prevents processes in the job from getting handles to windows owned by processes not in the job. If you can’t get a handle to the window, you can’t send messages to it. Put your untrusted processes on the interactive desktop in a new job with this limit, and shatter attacks from it are prevented.

    It’s quite a nice solution: windows that need to can send messages to each other, but they can be sandboxed as needed. Unfortunately, Microsoft provides almost no tools for creating jobs. The only one I know of is in Datacenter Server, although the functionality is a part of 2000 and later (workstation and server).

    If anyone’s interested, I created a command-line program just for this purpose. It creates a new job (or re-opens a named job) with the limits you specify and adds the processes you list to the job. The source is included if you want to compile it yourself or see how it works.

    ftp://68.62.27.206/pub/jobprccur.zip

  37. Ed says:

    With regard to the first example (how many references to a shared memory object), then there is a case where the answer can be meaningful, assuming that the handles work the way I think they do.

    If the answer is one, and you are the one holding the reference, and there’s no way for any other process to get a handle on the resource without your cooperation (e.g. by duping the handle or some similar mechanism), then you know that your process is the only one still holding a reference and proceed accordingly.

    If the answer is greater than one, then it may drop to one a microsecond after you checked, but that doesn’t prevent the case above from being useful.

  38. Raymond Chen says:

    The original question was in the context of a named shared memory object, in which case somebody could call OpenFileMapping and bump the refcount to 2.

  39. Tom Canham says:

    It seems like people can’t/don’t want to believe the old adage that the only 100% secure system is one that is disconnected from the network, with the power button in the "off" position. Security is always a matter of degree, not absolutes.

    Reliability/correctness is the same sort of situation; *no* application is 100% bug free. "Bug free" is actually a meaningless term, since depending upon the domain of definitions for "bug" you use, you can *always* find some bugs in a program.

    Matters of degree, levels of confidence. I think we coders don’t like "messy" answers, but the fact is that life is messy. Deal with it.
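
Two suggestions from the comments above follow the same "don't ask, just try" pattern: open the file for writing rather than asking whether it is writable, and create a file in one step rather than picking a name and then creating it. What follows is a rough sketch in Python rather than Win32, and the function names `open_for_update` and `create_exclusive` are invented for this illustration. For genuine temporary files, the standard library's `tempfile.mkstemp` already picks the name and creates the file as one operation.

```python
import os
import tempfile


def open_for_update(path):
    """Try to open an existing file for writing.

    Returns (file, None) on success or (None, error) on failure. The
    error plays the role GetLastError() plays in the Win32 version: it
    says why the open failed, after the fact, instead of predicting
    whether it would succeed.
    """
    try:
        # "r+b" opens an existing file for reading and writing without
        # truncating it.
        return open(path, "r+b"), None
    except OSError as exc:
        return None, exc


def create_exclusive(path):
    """Create path only if it does not already exist.

    Mode "x" asks the OS to fail if the file is already there, so the
    existence check and the creation cannot be pulled apart by another
    process. Returns the open file, or None if the name was taken.
    """
    try:
        return open(path, "xb")
    except FileExistsError:
        return None
```

Neither function reports anything about the future. Each one either hands back an open file or says that the attempt failed just now.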

Comments are closed.
