What possible use are those extra bits in kernel handles? Part 2: Overcoming limited expressiveness


Last time, we saw how those extra bits can be used to develop safe sentinel values. That is a special case of a more general problem: How do you pack 33 bits of information into a 32-bit value? Whereas last time, we weren’t forced into the use of a sentinel value because we could develop a (cumbersome) helper class and switch people over to the helper class (or to pass two parameters to every function that used to take one), there are places where you are forced to try to squeeze 33 bits of information into a 32-bit value, and the helper class simply isn’t going to work. (I’m going to assume 32-bit Windows for concreteness, but the same considerations apply to 64-bit Windows. Just make the appropriate changes.)

Suppose you have a window message that does some work and returns a HANDLE, but it can fail, and when it does, you want to return an error code. In other words, you want to return either (TRUE, HANDLE) or (FALSE, HRESULT). But that’s 33 bits of information, and you can return only 32 bits. How can you provide 33 bits of information with only 32 bits?

Well, it turns out that you don’t actually have 33 bits of information to return. Since handle values are multiples of four, the bottom two bits are always zero and therefore convey no information. A kernel handle is really only 30 bits. Similarly, a COM error code in the form of an HRESULT always has the top bit set—after all if the top bit were clear, it would be a success code! Therefore, there are only 31 bits of information in an HRESULT error code.

Okay, so it turns out that (TRUE, HANDLE) is only 1 + 30 = 31 bits of information, and (FALSE, HRESULT) is only 1 + 31 = 32 bits of information. We can fit them inside a single 32-bit value after all!

LRESULT PackHandleIntoLresult(HANDLE Handle)
{
  LRESULT Lresult = (LRESULT)Handle;

  // if this assertion fires, then somebody tried to
  // pack an invalid handle!
  assert((Lresult & 1) == 0);

  return Lresult;
}

LRESULT PackErrorHresultIntoLresult(HRESULT Hresult)
{
  // if this assertion fires, then somebody tried to
  // pack a success code!
  assert(FAILED(Hresult));

  return ((DWORD)Hresult << 1) | 1;
}

The bottom bit is our boolean that tells us whether we have a success or failure. If the bit is clear, then the operation succeeded and the entire value is the handle, relying on the fact that valid handles always have the bottom two bits clear. On the other hand, if the bottom bit is set, then we have an error code, and the remaining 31 bits give us the significant bits of the HRESULT. Unpacking the values would then go like this:

BOOL IsLresultError(LRESULT Lresult)
{
  return Lresult & 1;
}

HANDLE UnpackLresultToHandle(LRESULT Lresult)
{
  assert(!IsLresultError(Lresult));
  return (HANDLE)Lresult;
}

HRESULT UnpackLresultToHresult(LRESULT Lresult)
{
  assert(IsLresultError(Lresult));
  return (HRESULT)(0x80000000 | ((DWORD)Lresult >> 1));
}

In pictures (since people like pictures):

Success:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x|0|0| HANDLE
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
 v v v v v v v v v v v v v v v v v v v v v v v v v v v v v v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x|0|0| LRESULT
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |
 v v v v v v v v v v v v v v v v v v v v v v v v v v v v v v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x|0|0| HANDLE
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Failure:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e| HRESULT
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
 v v v v v v v v v v v v v v v v v v v v v v v v v v v v v v v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e|1| LRESULT
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \
   v v v v v v v v v v v v v v v v v v v v v v v v v v v v v v v
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e e| HRESULT
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

That bottom bit tells us whether the upper 31 bits are the meaningful bits from a HANDLE or the meaningful bits from an HRESULT. Once we know which case we are in, we can take those upper bits and put them back into meaningful parts of the source data.

If you want to put this trick on a more formal footing, you could express the multiplexing in the form of a discriminant in a union:

// Type-specific value for HANDLE is the upper 31 bits
LRESULT TypeSpecificValueFromHandle(HANDLE Handle)
{
  LRESULT Lresult = (LRESULT)Handle;

  // if this assertion fires, then somebody tried to
  // pack an invalid handle!
  assert((Lresult &1) == 0);

  // discard the bottom bit since we know it is zero
  return Lresult >> 1;
}

HANDLE HandleFromTypeSpecificValue(LRESULT Lresult)
{
  // regenerate the bottom bit which we know is zero
  return (HANDLE)(Lresult << 1);
}

// Type-specific value for HRESULT is the lower 31 bits
LRESULT TypeSpecificValueFromHresult(HRESULT Hresult)
{
  // if this assertion fires, then somebody tried to
  // pack a success code!
  assert(FAILED(Hresult));

  // discard the top bit since we know it is 1
  return (DWORD)Hresult & 0x7FFFFFFF;
}

HRESULT HresultFromTypeSpecificValue(LRESULT Lresult)
{
  // regenerate the top bit which we know is 1
  return (HRESULT)(Lresult | 0x80000000);
}

// Oh boy, let's pack and unpack these puppies
#define TYPE_HANDLE  0
#define TYPE_HRESULT 1

typedef struct PACKEDLRESULT {
 int Type:1;
 LRESULT TypeSpecificValue:sizeof(LRESULT)*8-1;
} PACKEDLRESULT;

typedef union PACKEDLRESULTHELPER {
 PACKEDLRESULT Structure;
 LRESULT Lresult;
} PACKEDLRESULTHELPER;

LRESULT PackLresult(int Type, LRESULT TypeSpecificValue)
{
  PACKEDLRESULTHELPER Helper;
  Helper.Structure.Type = Type;
  Helper.Structure.TypeSpecificValue = TypeSpecificValue;
  return Helper.Lresult;
}

int GetPackedLresultType(LRESULT Lresult)
{
  PACKEDLRESULTHELPER Helper;
  Helper.Lresult = Lresult;
  return Helper.Structure.Type;
}

HANDLE GetHandleFromPackedLresult(LRESULT Lresult)
{
  PACKEDLRESULTHELPER Helper;
  Helper.Lresult = Lresult;
  return HandleFromTypeSpecificValue(Helper.Structure.TypeSpecificValue);
}

HRESULT GetHresultFromPackedLresult(LRESULT Lresult)
{
  PACKEDLRESULTHELPER Helper;
  Helper.Lresult = Lresult;
  return HresultFromTypeSpecificValue(Helper.Structure.TypeSpecificValue);
}

This more explicit form may be more helpful if you have more than just two types to discriminate among, but in our case, the extra typing probably just confuses the matter more than clearing it up.

This type of trick is actually quite common. For example, the LresultFromObject function uses a variation of this technique in order to pack a marshallable object and a COM error code into a single 32-bit value. It’s also common in lisp systems, where it is known by the name tagged pointers.

Comments (22)
  1. John says:

    “Suppose you have a window message that does some work and returns a HANDLE, but it can fail, and when it does, you want to return an error code.”

    Congratulations.  You’ve convinced me that Aaargh! was correct: you guys are in need of a major asswhooping.

    [Feel free to use WPF if you find window messages absurd. -Raymond]
  2. John says:

    I don’t find all window messages absurd, just the kind that behave like this.  I can’t remember ever coming across one, though, so I suppose I’ll manage to survive.

  3. andy says:

    John: If I understand you right, you’d like to either receive a valid handle or INVALID_HANDLE as a result. But what’s so horrible about adding more context to the result with a HRESULT?

  4. Nicholas says:

    Seriously people, please put this into perspective.  Raymond is not writing about a system the Microsoft just recectly designed.  He is writing about a legacy system developed a long time ago, when computers were underpowered (yet very expensive).  Packing more than one piece of information into one variable was the norm.

    Congratulations to Aaargh!, Bao, and Levicki.  I would bet money that if these guys developed Windows then it would have been worse.  At best they would have built something on par.

    That’s the way it is folks.  If you started using computers in the late 90’s then you really have no context or understanding as to why software in the early 90’s and 80’s was written the way it was written.

    But please, don’t let me stop all of you with your insightful hindsight-is-20/20 criticisms.

  5. Mark Sowul says:

    Recall the comment responding to Aargh’s rant:

    "Clearly none of you [complaining] has ever read (or just plain has not understood) a single historical article Raymond has written, because every time, some clown says, "why was this done in such a stupid way" and every time the answer is "because it had to run on a toaster with a 2 MHz processor and 5 bytes of RAM."  Bonus points if the person also complains about how bloated software is nowadays."

  6. DrkMatter says:

    I’m not sure I see how "history" considerations fit into this. The function has a codomain that includes "handles" and "errors". You need to be able to express them all, hence you divide the set of possible return values in such a way that they are easily distinguishable. What’s strange, wrong, weird, or unnatural about that? It’s how it has been done since the dawn of time, and it has always worked perfectly fine.

  7. John says:

    Aaargh!, Bao, and Levicki are heroes.

  8. Slippery Slope Guy says:

    Sure, stuffing handles and errors into the same return value might seem harmless.  But it’s a slippery slope.  Next thing you know, cats and dogs will be mating with each other and there’s not a damn thing you can do about it!

  9. Maurits says:

     // if this assertion fires, then somebody tried to

     // pack an invalid handle!

     assert((Lresult & 1) == 0);

    Or even assert((Lresult & 3) == 0);

  10. Dean Harding says:

    "Aaargh!, Bao, and Levicki are heroes."

    Poor Norman… how quickly we forget.

  11. Addict says:

    ..after I finish smoking my crack pipe. I’ve seen a lot of code in my life, but I’ve almost never seen a situation where this kind of trick has been needed, and in the places where it has been used, its use has been questionable.

    Why??

  12. Messiant R says:

    This reminded me of the uninitialised floating point article you wrote not so long ago.

    If we think of the hidden bit and the signaling NaN, the situation is quite similar as in this article: encoding different types of data inside a container type, and rely on the representation of the container data to indicate the type of the stored data.

    Does it make for beautiful, easy to understand code? Not really, but we could simply see it as taking advantage of the contract that has been supplied to us. Which begs the question: if something that looks hackish is in fact reliable, is it really a hack? If yes, when does it move into the trick department?

  13. Jules says:

    "Which begs the question: if something that looks hackish is in fact reliable, is it really a hack? If yes, when does it move into the trick department?"

    Yes, it is.  It becomes a trick when it becomes difficult for somebody looking at the code without already knowing what’s going on to understand it.

    This stuff clearly falls into that category.

    I mean, OK, I’ve used this kind of stuff before (I once wrote a garbage collector that relied on using the empty two bits at the bottom of a pointer to an aligned memory block), but that doesn’t make it good practice, IMO.

  14. Nicholas says:

    Let’s be clear on something.  I am not saying that the programming tricks described by Raymond should be used today.  It is not the case that "it worked in the past so let’s keep doing it that way."  Clearly with Moore’s law giving us the gigs of clock cycles and RAM we can afford to write code for humans instead of computers (knowing that the code will be bigger and slower but also knowing that the increase doesn’t impact performance requirements).

    However, rewind the calendar.  This is how stuff was done back then because the alternative was that your program did’t run on the hardware of the day.

    In fact, look at Part 3 of this series.  Raymond shows a piece of code that is something most of us would write to deal with local and remote handles.  He has an interface and two subclases to encapsulate the two kinds of handles and to provide type safety for those handles.  But, look at how he critiques the design:

    "Your memory requirements have tripled and you added another level of indirection (costing you locality), just for that 1% case"

    Now pay real close attention.  Given that it is the year 2008 AD, computers have become so powerful that the above performance hit described above is probably acceptable.  So we can justify the design.  However, go back a couple of decades and the above performance hit is absolutely unacceptable.

    If you are going to continue arguing this then you are a lost cause.  Engineering is about difficult trade offs.  The complainers on this blog are clearly not engineers.  They are just plain old programmers who think they have a clue.

  15. Medinoc says:

    That’s nice, but I think the best method is a variation on the two examples: Still pack all in one value (like the first example), but use the mapping of the second one (the one where the HANDLE is stored right-shifted) and use the high bit as the discriminating bit.

    That way, we have either:

    * a failed HRESULT, usable as-is.

    * a succeeded HRESULT, whose lowest 31 bits are the HANDLE’s highest ones.

    And the SUCCEEDED() and FAILED() macros work on them.

  16. DrkMatter says:

    "knowing that the code will be bigger and slower but also knowing that the increase doesn’t impact performance requirements"

    Clearly, it does. You just said it got bigger and slower. I would much rather my additional RAM, clock speed and cores go toward making my application faster, than for them to be wasted on your "highly engineered" code.

    As has been stated in the comments to part 3, it’s not premature optimization unless it’s harder or more obscure to implement. And honestly, what Raymond has been describing here isn’t rocket science.

  17. SuperKoko says:

    From Raymond’s article:

    "How can you provide 33 bits of information with only 32 bits?"

    For 33 bits or more, pointers to a malloc’ed memory block are a solution.

    However:

    1) It requires the caller to free() the block, otherwise there’re memory leaks. That’s not possible if the message has to be backward compatible with some previous version of the software.

    2) The caller has to be in the same process as the callee, which is likely, as handles are rarely valid in more than one process (except inherited handles).

  18. Igor Levicki says:

    Why people keep mentioning me in negative context here even when I say nothing whatsoever?

    Andy said : "If I understand you right, you’d like to either receive a valid handle or INVALID_HANDLE as a result. But what’s so horrible about adding more context to the result with a HRESULT?"

    Few ideas off the top of my head:

    1. Because HANDLEs were supposed to be opaque and non-interpretable by the application and HRESULT is a value?
    2. Because you are attempting to interpret the value of a HANDLE when you should only check whether you have INVALID_HANDLE_VALUE (or NULL)?

    3. Because it defeats type-checking?

    4. Because it makes your interface ambiguous?

    5. Because there is a SetLastError()/GetLastError() API?

    6. Because there is SetWindowLong()/GetWindowLong() API?

    7. Because a message can contain a separate field for a full-blown error code?

    Pick one.

  19. Extending the namespace to cover new object types.

  20. SuperKoko says:

    "1. Because HANDLEs were supposed to be opaque and non-interpretable by the application and HRESULT is a value?"

    A HANDLE is opaque, except the last 2 bits. This is conceptually ugly, but the only real stress it puts on the Windows implementation is a limit to 1073741824 kernel handles per process. Not a big deal.

    "3. Because it defeats type-checking?"

    Yep.

    "5. Because there is a SetLastError()/GetLastError() API?"

    Doesn’t work between threads.

    "6. Because there is SetWindowLong()/GetWindowLong() API?"

    Race condition if there’re concurrent threads sending messages to the same window.

    "7. Because a message can contain a separate field for a full-blown error code?"

    Do you mean using LPARAM as a pointer to an error code set on output?

    Yes, that’s ok, however, I assume Raymond’s example is to be put in a context where the message parameters & result must conform to a predefined interface, in that case, it’s not possible to redefine your parameters as you wish.

    There’s also an inter-process communication issue, fortunately, it’s irrelevant as HANDLEs aren’t typically passed through processes boundaries.

  21. Dave Harris says:

    Messiant: "Which begs the question: if something that looks hackish is in fact reliable, is it really a hack?"

    I wouldn’t call it reliable. It’ll fail if someone tries to return the result of GetCurrentProcess(), or any other sentinal value. This article and the previous one both assume the two bits are available for them to play with, and they play with them in conflicting ways. An elegant solution would be scalable and not conflict with other code.

    Sometimes hacks are justified, eg for performance reasons, but let’s not pretend they aren’t hacks.

  22. Medinoc says:

    "There’s also an inter-process communication issue, fortunately, it’s irrelevant as HANDLEs aren’t typically passed through processes boundaries."

    Except for IPC object handles: Anonymous pipes, anonymous shared memory, etc. In brief, the very reason DuplicateHandle() can do interprocess. And very easy to pass through a Windows message.

Comments are closed.