Closing holes in the update notification pattern


Suppose you have a function that is registered to be called the next time something gets updated, and suppose that the notification is a one-shot notification and needs to be re-armed each time you want to wait for the next notification. (For example, the Reg­Notify­Change­Key­Value function behaves this way.) Consider the following code fragment:

void onUpdateThing()
{
 // get the updated properties of the thing
 getThingProperties();

 // ask to be called back the next time it updates
 registerUpdateCallback(onUpdateThing);
}

mainProgram()
{
 // get the thing's initial properties
 // and register for updates
 onUpdateThing();
}

There is a race condition here if the thing updates twice in rapid succession. On the first update, your onUpdateThing function is called. If the second update occurs while get­Thing­Properties is running, then your call to register­Update­Callback will be too late, and you will miss the second update.

The solution is to register for the next update before studying the previous one.

void onUpdateThing()
{
 // ask to be called back the next time it updates
 registerUpdateCallback(onUpdateThing);

 // get the updated properties of the thing
 getThingProperties();
}

That way, if a second update comes in while you're studying the first one, your update callback will be called because you already registered it. (I'm assuming you're only interested in the last update.)

Of course, this assumes that update requests are queued if the receiving thread is busy. If updates can be received during the execution of get­Thing­Properties, then you will end up in a bad re-entrant situation: During the processing of one update, you start processing a new update. Then when the nested update finishes, you return to the original update, which is now actually performing the second half of the second update.

Suppose your update code wants to keep the colors of two additional objects in sync with the color of the thing:

void getThingProperties()
{
 Color currentThingColor = getThingColor();
 object1.setColor(currentThingColor);
 object2.setColor(currentThingColor);
}

If the set­Color method creates a re-entrancy window, you can have this problem:

  • Thing changes color to red.
  • on­Update­Thing begins.
  • Register update callback.
  • get­Thing­Properties reads current color as red.

  • get­Thing­Properties sets object 1's color to red. The set­Color method creates an opportunity for re-entrancy by some means. (For example, it may send a message to another thread, causing inbound sent messages to be processed.)

    • Thing changes color to blue.
    • on­Update­Thing begins.
    • Register update callback.
    • get­Thing­Properties reads current color as blue.

    • get­Thing­Properties sets object 1's color to blue.

    • get­Thing­Properties sets object 2's color to blue.

    • get­Thing­Properties returns.
    • on­Update­Thing returns.
  • get­Thing­Properties sets object 2's color to red. (Oops.)

  • get­Thing­Properties returns.
  • on­Update­Thing returns.

One solution is to use a sequence number (also known as a change counter) that gets incremented each time the thing changes. If there is only one thread which updates the thing, you can try to update it atomically. For example, if the information is in the registry, you can put all the information into a single registry value or use registry transactions.

If you can associate a change counter with the data, then you can use the following algorithm:

// start with a known invalid value
// (If you have multiple listeners, then this naturally
// needs to be instance data rather than global.)
LONG lLastChange = 0;

void onUpdateThing()
{
 bool finished = false;
 do {
  // record the most recent change we've processed
  lLastChange = getThingChangeCount();

  getThingProperties();

  // ask to be called back the next time it updates
  registerUpdateCallback(onUpdateThing);

  // did it change while we were busy?
  LONG lNewChange = getThingChangeCount();

  finished = lLastChange == lNewChange;
  if (!finished) {
   // cancel the update callback because we don't
   // want to be re-entered
   unregisterUpdateCallback(onUpdateThing);
  }
 } while (!finished);
}

Another solution would be to detect the re-entrancy and just remember that there is more work to be done after the previous update finishes.

// 0 = not busy
// 1 = busy
// 2 = busy, and a change occurred while we were busy
// (If you have multiple listeners, then this naturally
// needs to be instance data rather than global.)
int iBusy = 0;

void onUpdateThing()
{
 // ask to be called back the next time it updates
 registerUpdateCallback(onUpdateThing);

 if (iBusy) {
   iBusy = 2;
 } else {
  iBusy = 1;
  do {
   getThingProperties();
  } while (--iBusy);
 }
}

Note that all of the above examples assume that the on­Update­Thing function has thread affinity.

Comments (19)
  1. MItaly says:

    :%s/execption/execution/g

  2. saveddijon says:

    The whole "one-shot and then re-register" model is broken.

    That's why SVR3-style Unix signal handling (one-shot) was replaced with BSD style signal handling (not necessarily one-shot, and can mask signals to avoid reentrancy issues).

  3. Smouch says:

    One might ask, why do I have to continue to register, why doesn't the notification simply stay in force until I de-register.

  4. Adam Rosenfield says:

    Execption is what you get when you cross execution with an exception.

  5. Smouch: "why do I have to continue to register, why doesn't the notification simply stay in force until I de-register."

    Because it's a one-shot. If it weren't then you have the same re-entrantcy problem you get if you re-register. I.e. It updates while you're processing the previous update.

  6. SpecLad says:

    It seems like the second example is still susceptible to the same race condition as the first one – what if the second update happens before registerUpdateCallback is entered?

    [Then getThingProperties gets the second updated value, so everything works out. The assumption here is that you merely want to be up to date; missing intermediate values is no big deal. -Raymond]
  7. 640k says:

    The one-shot/re-register model is broken because there's always a race condition even before the register-call, the thread can be preemted before the register call has even initiated.

  8. Joshua S. says:

    Why not create a Reg­Notify­Change­Key­ValueEx that takes a handle to an IOCP plus a completion key (or OVERLAPPED*) instead of an event handle? Then you get every single change notification through your IOCP without loss, and presumably better scalability.

  9. Anonymous Coward says:

    Smouch: Why is it a one shot?

    Brian: Because it is.

    Smouch makes a valid point. Generally if you're interested in a key value you won't lose interest just because it changed once, I should think. And according to the documentation, NT – XP will behave sanely as long as you keep the key open. (I don't know/care about Vista+.) Note that this doesn't cause any reentrancy problems beyond those we already have and can avoid, since the application is calling the function and can wait for the event and call it at its leisure.

    However, regardless of whether the function were to behave as documented, you cannot guarantee that you'll receive all key changes. Raymond's suggestion might make the window smaller, but it doesn't eliminate the problem. I think this is because the function wasn't meant for detecting the changes as such, but to check if the key's value has changed (one or more times) from what you thought it was, and then the suggestion helps.

    Scenario 1: re-register after the check

    • Register

    • Key is modified

    • Handler enters

    • Key may be modified again (but we don't care)

    • We read the value

    • The value might change (that's bad!)

    • Re-register

    ⇒ We might miss the change after we read the value. Our state is now inconsistent with the registry value.

    Scenario 2: re-register before the check

    • Register

    • Key is modified

    • Handler enters

    • Key may be modified again (but we don't care)

    • Re-register

    • Key may be modified again → handler will execute again when we're done

            Note: we have to actually wait for the event so it won't be called recursively or in parallel.

    • We read the value

    • The value might change → handler will execute again when we're done

    ⇒ We might miss some intermediate values, but if the barrage of updates stops, eventually we'll get it right.

    However it needs to be stressed again that Raymond's suggestion won't work in the general case and is to be avoided in other situations to prevent bugs caused by missed updates.

    I have worked with the BSD model before and I cannot advertise it as a replacement. The drop-what-you're-doing-and-handle-it-now model is a nightmare to work with in complex applications and the bit-mask signal model is very limiting. It's also too easy in certain situations to make the same error as in scenario 1 while thinking you're using the API as intended. On that note, the RegNotifyChangeKeyValue MSDN page should contain a code sample showing how to get it right, but at present the only sample there just shows how to wait for a single change, is overly verbose and suffers from the ‘i++ //increment i’ disease.

    Of all the event models I've encountered, the VB event model is the easiest to work with. At its core is the Windows message queue, but rather than having to cast the arguments of the window procedure to the proper type depending on message code, every event type gets its own procedure.

  10. Ian Boyd says:

    Wow, people really confused the problem with the example. Shouldn't have mentioned the registry, then people would have been forced to focus on the useful code pattern.

  11. Anonymous Coward says:

    Ian, the problem is precisely that the code pattern works for this particular problem (getting notifications about a registry key) but not in general.

  12. Jerome says:

    Again, I'm in South Africa so I'm commenting nearly 12 hours after everone else. :(

    I love the way so many take every part of the example so literally. It's intended to be general, right? Raymond has provided a problem, and a possible solution that he deliberately left a hole in, to make us think about the problem.

    My managed code solution:

    Always register for the change immediately. Then fire off my async code to read the properties. The async code is serialized via a SemaphoreSlim (actually I use something similar – an AsyncLock, which is sample code from the Parallel programing blog.)

    I also pass the reading properties code a CancellationToken, which is in a class-level property. If there's another change, the currently running task is cancelled and the token is reinitialized for the next task, and so on. However many changes happen while the code is running doesn't matter, all but the last one will be cancelled, and the last one will save the correct property values.

  13. alexcohn says:

    Anonymous Coward: thanks for the detailed post!

  14. Smouch says:

    Indeed, Raymonds "solution" isn't a solution at all, regardless of whether the notification is a one shot, continuous.  The fact is, as soon as you allow for the possibility of re-entrance, you need to guard the callback with some kind of lock, or program the callback so that re-entrancy isn't an issue.

    My preference is the latter, so I almost always simply enqueue the notification, and do nothing else in the callback (except re-register in this case).  This helps to ensure that the "notifier" isn't blocked, which is also frequently a serious issue.

    Another thread (not even necessary, if you know how to write a proper application) can be working on the messages, and then has the opportunity to ensure messages are processed in order, conflated, or discarded as desired.  

    Working on messages inside a callback usually bites one in the tuchas.

  15. John Doe says:

    @Smouch, you're almost right. You should at the very least fetch the new registry value in the callback, because between the time you queue the notification and the time you'll read the value, the value may not be the same anymore.

    Or better yet, only keep the latest notification.

    In the registry's specific case, there are a few nuances:

    • changed made by RegRestoreKey are not notified
    • only NT/2K/XP notify changes between two calls while the handle is still open

    • only from 8 does it support no thread affinity

    • you have to manage your own one-shot notification handling algorithm, and all presented by Raymond still have problems

    But in the end, for what it's worth, you're probably overthinking it. Shouldn't the registry be some sort of lightweight DB for settings? Shouldn't it be something that an application's instance reads in the beginning and saves when the user applies or saves settings? Shouldn't it be that if multiple instances need to synchronize settings, then you'll need IPC and/or an out-of-process settings manager instead of local "solutions" (hacks)?

  16. Smouch says:

    I wasn't talking about the registry in particular, rather the concept of callbacks, be they one-shot, or continuous.

  17. dave says:

    The APC provides a nice model for callbacks, but alas, it seems to be seldom used.

    (I come from a pre-NT Cutlerian culture where the ancestor of the APC, the AST, was the primary program-organization principle, at least amongst the cognoscenti.  Of course, we didn't have to worry about languages with complicated runtime systems then.)

  18. Gabe says:

    John Doe: If you have multiple instances of a program communicating via the Registry, you're probably doing something wrong. Odds are more likely that you want your program to be able to respond to configuration changes made by hand (via Regedit) or done remotely (via Group Policy?).

  19. John Doe says:

    @Gabe, that's my point exactly. I wasn't suggesting that the registry be used for communication. Such mis-fuctionality would be called what, registry I/O or registry IPC?

    What I suggested is that if this notification is necessary in multipel processes, there probably should be an intermediate one that checks this centrally and notifies interested (i.e. registered) applications.

Comments are closed.