What’s wrong with this code, Part 7: The answers.

As I mentioned in yesterday’s post, there are two intentional bugs in the code.

The first bug is a huge one.  Remember my comment about the context of the API: “a service author decided that…”.  The problem is that the HKEY_CLASSES_ROOT predefined registry key they used to open the classes database cannot be used in a service (or in an application that impersonates users). 

The problem is that the HKEY_CLASSES_ROOT key isn’t a real registry key – it contains registry entries from both HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER.  To retrieve the values from HKEY_CURRENT_USER, advapi32 opens this key under the covers and caches the result until the process is terminated.

And opening HKEY_CURRENT_USER is a big no-no when you’re running as a service.  The problem is that the registry handle for HKEY_CURRENT_USER (or any of the user hives) keeps a reference to the user’s token.  For the NT authentication scheme this extra token isn’t horrendously bad (although it WILL cause major problems with roaming profiles, and thus should be avoided at all costs).  But there are other authentication mechanisms available for Windows that require that a user only be logged onto a single machine at a time.  This means that the token handle that’s being kept open by your service prevents that user from logging onto ANY other machine on the network, and will continue to prevent the user from logging on until your service terminates.

To fix this problem, as the documentation for HKEY_CLASSES_ROOT states, you should NEVER use HKEY_CLASSES_ROOT from a service (or from an interactive application that impersonates users other than the logged on user).  Instead, you should use RegOpenUserClassesRoot to open the per-user class database, and HKEY_LOCAL_MACHINE\Software\Classes when opening the per-machine class database.

Btw, all of the COM APIs know about this restriction, and they correctly respect the HKLM vs HKCU split.  So if you’re using the documented COM APIs, you won’t ever have a problem.

And that leads to the second bug in the code.  The second bug is somewhat more hypothetical, but no less significant.  Essentially, the structure of the data under HKEY_CLASSES_ROOT is considered to be internal-only.  The COM registry entries are fully documented to allow a user to add a COM component, but that documentation does not guarantee that you can determine which DLL is going to be loaded by reading those entries yourself.  For example, if a COM object is distributed as a side-by-side assembly, then the rather simplistic rules for InprocServer32 activation that I wrote about in my “How does COM activation work anyway?” post don’t completely describe the process by which the DLL is loaded.

If you use CoCreateInstance, then the right thing will happen, but if you attempt to assume that you know what COM is going to do to activate your code, then at some point in the future, you’re likely to discover that you got it wrong and have to hotfix your code to resolve the issue.

So there are two takeaways for this “What’s wrong with this code”.  The first is “Don’t use HKCR unless you KNOW that your code will NEVER be invoked in a service”.  The second one is “The COM APIs are your friend, trust COM, it knows what it’s doing, do not presume that you can do a better job than it does, for therein lie dragons”.

Now for kudos and mea culpas:

I’m not surprised nobody found the second bug – it was an incredibly subtle bug, and not at all clear.

Once again, Mike Dimmick nailed the first bug.  In addition, he pointed out one of the major problems with HRESULTs – people use SUCCEEDED to check for success, when S_FALSE is a success return code that means “I didn’t do what you wanted me to, but it wasn’t your fault” – S_FALSE is effectively a failure lurking under the success class of return values.
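To make Mike’s point concrete, here’s a minimal sketch in C.  The HRESULT typedef and the S_OK, S_FALSE, E_FAIL, SUCCEEDED and FAILED definitions are local stand-ins that mirror the Windows SDK values so the sketch builds without windows.h.  LookupValue and the two caller functions are hypothetical names made up for this illustration; they aren’t from the original code.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the Windows SDK definitions (assumption:
   we are building without <windows.h>; on Windows, include that instead). */
typedef int32_t HRESULT;
#define S_OK          ((HRESULT)0)
#define S_FALSE       ((HRESULT)1)
#define E_FAIL        ((HRESULT)0x80004005)
#define SUCCEEDED(hr) (((HRESULT)(hr)) >= 0)
#define FAILED(hr)    (((HRESULT)(hr)) < 0)

/* Hypothetical API: returns S_OK and fills *value when the item exists.
   When it does not exist, returns S_FALSE (a success code) and leaves
   *value untouched. */
HRESULT LookupValue(int exists, int *value)
{
    if (value == NULL) {
        return E_FAIL;
    }
    if (!exists) {
        return S_FALSE;
    }
    *value = 42;
    return S_OK;
}

/* Returns nonzero if this caller would go on to use the value.
   SUCCEEDED() accepts S_FALSE, so the untouched value slips through. */
int UsesValueWithSucceeded(int exists)
{
    int value = -1;
    HRESULT hr = LookupValue(exists, &value);
    return SUCCEEDED(hr);
}

/* Same call, but only an exact S_OK counts as "the value was filled in". */
int UsesValueWithExactCheck(int exists)
{
    int value = -1;
    HRESULT hr = LookupValue(exists, &value);
    return hr == S_OK;
}
```

With a missing item, UsesValueWithSucceeded(0) reports success even though nothing was written, while UsesValueWithExactCheck(0) does not.  Whether an exact S_OK comparison is the right test depends on the API you’re calling; some APIs document other success codes that you do want to accept.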

Paul Winwood pointed out that I should have used a tighter restriction than KEY_READ in my security flags.

Peter da Silva and Mo pointed out that if the code were used for its stated purpose, there would be a race condition that could cause registry entries to be written twice.

Mike R asked if CoTaskMemFree should be SysFreeString.  The answer is NO, since StringFromCLSID returns an LPWSTR, not a BSTR – SysFreeString can only be called on a BSTR, and the two are different (see this most excellent post by Eric Lippert for details).

Simon Cooke pointed out that I could save a registry round trip.  He also brought up issues with Win9x (which is usually outside the scope).

Chui Tey pointed out that RegOpenKeyEx might change the value of its out parameters in the failure case.  That’s possible, but HIGHLY unlikely, simply because of the possibility of breaking applications.

Pavel Lebedinsky pointed out (correctly) that if the service isn’t impersonating anyone, using HKCR is safe.  This is true on its face, but it needs to be made ABSOLUTELY clear that nothing enforces this condition, so…  And given the potential for people to misuse HKCR, it’s safer to just avoid it.

Edit: I think this didn’t show up on people’s aggregators, so…

Comments (4)

  1. Anonymous says:

    > But there are other authentication mechanisms available for Windows that require that a user only be logged onto a single machine at a time.

    That’s something I’ve never seen. I’ve had problems with roaming profiles copying a start menu and stuff from one client PC where they were appropriate to another client PC where they caused all kinds of problems, but it didn’t matter if there were concurrent logons or not. When a logoff on one machine was followed by a later logon on a different machine, breakage was imposed. That "one at a time" rule does nothing to solve it. When was a "one at a time" scheme created and for what purpose?

  2. Anonymous says:

    There’s at least one very popular commercially available networking system whose authentication mechanism only allows a user to be logged on to one computer at a time. This means that if someone leaks a token, their service effectively performs a DoS against that user, because the user doesn’t get logged out of their authentication system until their token goes away.

    I’m not going to say which one it is, sorry 🙂

    But trust me, there is such a system.

  3. Anonymous says:

    I’ve used in the past (talking Win3.1 days, here) such authentication systems. NT 4.0 had long since been released and the system was in wide deployment in certain sectors.

    (And no, I’m not going to name names, either).

    Machines crashing (as they inevitably did) caused all sorts of fun; especially as the restriction applied (in a way sensibly, though impractical) to workstation accounts, too. Machines would boot and fail to be able to access ANY networked resources, because they were already "logged on". It was a truly wonderful piece of engineering. I honestly hope the QA people responsible were shot 🙂

  4. Anonymous says:

    Sorry, that was unclear – I meant "NT 4.0 had long since been released, BUT the system was STILL in wide deployment…"