The case of the SRWLock violation in a thread pool work item


A customer encountered a failure detected by Application Verifier and wanted some assistance understanding what happened and how they can fix it.

The verifier failure was "invalid SRWLock owner", and the failure details were as follows:

  • Lock = (the address of an SRWLock)
  • Local thread ID = (the thread trying to release the lock)
  • Owner thread ID = (some thread pool thread)
  • Acquire stack trace = (see below)

The acquire stack trace looked like this:

verifier!AVrfpRtlAcquireSRWLockShared+0x5e
contoso!Microsoft::WRL::Wrappers::SRWLock::LockShared+0x16
contoso!CDoodad::NotifyListenersAsync+0xb2
contoso!CDoodad::OnPropertyChanged+0x20a
contoso!CChannel::ProcessNotification+0x20a
ntdll!RtlpTpWaitCallback+0x9b
ntdll!TppExecuteWaitCallback+0x9b
ntdll!TppDirectExecuteCallback+0xb9
ntdll!TppWorkerThread+0x497
kernel32!BaseThreadInitThunk+0x14
ntdll!RtlUserThreadStart+0x21

The thread pool processed a notification from the channel, and the channel handed the property-change notification to the doodad, and the doodad took a shared lock so it could notify the listeners.

Here's the stack that is trying to release the lock:

ntdll!RtlReportException+0x9d
verifier!VerifierCaptureContextAndReportStop+0xc0
verifier!VerifierStopMessageEx+0x750
verifier!AVrfpVerifySRWLockRelease+0x126
verifier!AVrfpRtlReleaseSRWLockShared+0x42
contoso!Microsoft::WRL::Wrappers::Details::SyncLockShared::{dtor}+0x12
contoso!Doodad::Notifier::`scalar deleting destructor'+0x20
contoso!Doodad::Notifier::DispatchNotification+0x11a0d
ntdll!TppWorkpExecuteCallback+0x131
ntdll!TppWorkerThread+0x43e
kernel32!BaseThreadInitThunk+0x14
ntdll!RtlUserThreadStart+0x21

The customer was kind enough to share the code for the classes in question. Here's what the code is trying to do: When the doodad learns from the channel that a property has changed, it notifies all the listeners who subscribed to that change. To avoid blocking the channel, the code queues a work item to deliver the notifications.

But there's a race condition: What if the doodad is destroyed while the work item is still waiting in the thread pool? When the work item gets to run on the thread pool, it will try to access a freed doodad. (You also have this problem if the doodad is destroyed while the work item is running.)

The code solves this problem like this:

// Code in italics is wrong.
class Doodad
{
 ...
private:
  ~Doodad()
  {
    auto ensureNoNotifications = m_notifyLock.LockExclusive();
    ... other cleanup ...
  }

  class Notifier
  {
  public:
    static void QueueNotification(CDoodad* doodad)
    {
      auto notifier = std::make_unique<Notifier>(doodad);
      if (!QueueUserWorkItem(DispatchNotification, notifier.get(), 0)) {
        throw some_error();
      }
      notifier.release(); // work item owns the Notifier now
    }

  private:
    Notifier(CDoodad* doodad) :
      m_doodad(doodad),
      m_sharedLock(doodad->m_notifyLock.LockShared())
    {
    }

    static DWORD CALLBACK DispatchNotification(void* parameter)
    {
      // ensure that the Notifier gets deleted
      auto notifier = std::make_unique<Notifier>(
                        reinterpret_cast<Notifier*>(parameter));

      ... notify the listeners ...
    }

  private:
    CDoodad* m_doodad;
    Microsoft::WRL::Wrappers::SRWLock::SyncLockShared m_sharedLock;
  };

  void NotifyListenersAsync()
  {
    Notifier::QueueNotification();
  }

  Microsoft::WRL::Wrappers::SRWLock m_srwlock;
};

In words: When we want to notify the listeners, we create a Notifier object and queue it onto the thread pool. The Notifier object has a reference to the Doodad, as well as a shared lock. Since a shared lock can be acquired multiple times, this means that multiple notifications can be in flight.

When the work item is dispatched, it notifies all the listeners and then destroys the Notifier, which in turn releases the shared lock.

Finally, when the Doodad is destructed, it takes an exclusive lock. Since an exclusive lock cannot be acquired while there are still any shared locks active, this will wait until all of the notification work items have been retired. That way, no notification will operate on a destroyed Doodad.

Do you see the problem?

If you know what's good for you, you put RAII object that represents a lock on the stack, or inside an object whose lifetime is tied to the stack, because the lifetime of the lock needs to be tied to the thread.

But this code doesn't do what proper deity-fearing code does. Instead, it saves the lock in a member variable of an object that is not destroyed before the function returns. This means that the lifetime of the shared lock is not tied to a thread. The shared lock is acquired by the thread that queues the work item, and the shared lock is released by the thread that processes the work item. If you're lucky, they are the same thread and nobody gets hurt. But if you're not lucky, then they are different threads, and you violated the lock rules and have entered the world of undefined behavior.

The developer here thought they were being clever by abusing the SRWLock, but in fact they were getting themselves into trouble.

To fix the problem, they switched to a custom synchronization object built out of Wait­On­Address.

class Doodad
{
 ...
private:
  ~Doodad()
  {
    WaitForValueByAddress(m_notificationCount,
      [](auto&& value) { return value == 0; });
    ... other cleanup ...
  }

  class Notifier
  {
  public:
    static void QueueNotification(CDoodad* doodad)
    {
      auto notifier = std::make_unique<Notifier>(doodad);
      if (!QueueUserWorkItem(DispatchNotification, notifier.get(), 0)) {
        throw some_error();
      }
      notifier.release(); // work item owns the Notifier now
    }

  private:
    Notifier(CDoodad* doodad) : m_doodad(doodad)
    {
      InterlockedIncrement(&doodad->m_notificationCount);
    }

    ~Notifier()
    {
      if (InterlockedDecrement(&doodad->m_notificationCount) == 0) {
        WakeByAddressSingle(&doodad->m_notificationCount);
      }
    }

    // Not copyable or movable
    Notifier(const Notifier&) = delete;
    Notifier(Notifier&&) = delete;
    Notifier& operator=(const Notifier&) = delete;
    Notifier& operator=(Notifier&&) = delete;

    static DWORD CALLBACK DispatchNotification(void* parameter)
    {
      // ensure that the Notifier gets deleted
      auto notifier = std::make_unique<Notifier>(
                        reinterpret_cast<Notifier*>(parameter));

      ... notify the listeners ...
    }

  private:
    CDoodad* m_doodad;
  };

  void NotifyListenersAsync()
  {
    Notifier::QueueNotification();
  }

  LONG m_notificationCount = 0;
};

Wait­On­Address requires Windows 8 or higher, and the customer was okay with that. If the customer needed something that ran on Windows Vista, they could have accomplished something similar with a condition variable.

class Doodad
{
 ...
private:
  ~Doodad()
  {
    auto lock = m_notifierLock.LockExclusive();
    SleepConditionVariableExclusiveSRWUntil(&m_notifierCV,
        &m_notifierLock,
        [](auto&& value) { return value == 0; });
    ... other cleanup ...
  }

  class Notifier
  {
  public:
    static void QueueNotification(CDoodad* doodad)
    {
      auto notifier = std::make_unique<Notifier>(doodad);
      if (QueueUserWorkItem(DispatchNotification, this, 0)) {
        notifier.release(); // work item owns the Notifier now
      }
      throw some_error();
    }

  private:
    Notifier(CDoodad* doodad) : m_doodad(doodad)
    {
      auto lock = m_doodad->m_notifierLock.LockExclusive();
      &doodad->m_notificationCount++;
    }

    ~Notifier()
    {
      auto lock = m_doodad->m_notifierLock.LockExclusive();
      if (--doodad->m_notificationCount == 0) {
        WakeConditionVariable(&doodad->m_notificationCV);
      }
    }

    // Not copyable or movable
    Notifier(const Notifier&) = delete;
    Notifier(Notifier&&) = delete;
    Notifier& operator=(const Notifier&) = delete;
    Notifier& operator=(Notifier&&) = delete;

    static DWORD CALLBACK DispatchNotification(void* parameter)
    {
      // ensure that the Notifier gets deleted
      auto notifier = std::make_unique<Notifier>(
                        reinterpret_cast<Notifier*>(parameter));

      ... notify the listeners ...
    }

  private:
    CDoodad* m_doodad;
  };

  void NotifyListenersAsync()
  {
    Notifier::QueueNotification();
  }

  LONG m_notificationCount = 0;
  Microsoft::WRL::Wrappers::SRWLock m_notificationLock;
  CONDITION_VARIABLE m_notificationCV = CONDITION_VARIABLE_INIT;
};
Comments (16)
  1. Pietro Gagliardi (andlabs) says:

    Why not just allocate a single SRW lock (Microsoft::WRL::Wrappers::SRWLock *)? It seems to me like the real problem is that the lock is copied rather than shared… Or does WRL do magic to avoid this scenario?

    And I’ve never understood the zealousness of demanding that the same thread that acquires a lock be the one that releases the lock, especially for a slim lock where you couldn’t even store data for thread affinity. Where is the contract that enforces that rule for SRW locks? Is it specific to the WRL wrapper (you did say RAII, after all) or part of the base SRWLOCK system?

    1. Not sure what your question is asking. There is a single SRW Lock (in the Doodad) that is never copied. The lock guard is moved into a member, and then the guard is released from the wrong thread. Thread affinity is one of the rules of SRWLock.

  2. Peter Doubleday says:

    An interesting example of how C++11 (and later) optional restrictions on things like constructors and operators (ie the = delete) bit) can be used to enforce a low-level contract. It’s basically RAII on steroids.

    I wonder how many C++ shops have actually left the 20th century and embraced this stuff?

    1. McBucket says:

      There’s plenty of 20th century code that works fine without C++11 additions. Moreover, since C++11 came about some time after the turn of the century, 2000 not a particularly good line of demarcation for archaic vs. “modern” C++ practices anyways. But to answer the question I think you’re asking, we adopt features as appropriate and when they’re available in the compiler that we’re using, one which has historically been slow at adopting C++ standards, though that’s changing (can you guess which one it is?), and that compiler choice is dictated / affected by availability of 3rd-party library options that we rely on. Even so, rewriting large portions of the code base to take on C++11 and higher features “just because” isn’t always a practical move either.

  3. pm100 says:

    // Code in italics is wrong.

    all the code you give from the customer is in italics. Is that a joke or is it a styling error

    1. pc says:

      It’s not uncommon for Raymond to post large sections of code in italics, when the approach the code is taking is wrong.

    2. Scarlet Manuka says:

      Probably there’s not much point showing us the correct bits of code? The focus of this post is “this is the wrong way to do this, here is the right way”, so the incorrect parts of the customer code are the parts we need to see.

    3. That’s how it’s always been. The entire code fragment is marked “wrong”, and then we discuss what’s wrong with it, and then we present a correct (or at least “less wrong”) version.

    4. DWalker07 says:

      Even the code itself says

      // Code in italics is wrong.

      The same with all of the “bad” code in this same blog. It is supposed to prevent people from copying and reusing an example of bad code. But who knows, the bad code might get copied and reused anyway.

      1. Scarlet Manuka says:

        Complete with “// Code in italics is wrong” comments, no doubt.

  4. Joshua says:

    Personally, I like locks that don’t have the “must be released on same thread” rule.

    From time to time you really do want to lock an object and move it’s ownership to another thread.

    On the other hand, for reference counting I don’t lock at all. I directly use InterlockedIncrement and InterlockedDecrement and the guy who sees InterlockedDecrement return zero is the guy who frees the object.

    I haven’t been able to work out what the shared lock is actually protecting. Maybe it’s in code not present in the example.

    1. What you’ve got there my friend is a semaphore. (Also, the shared lock is protecting destruction of the Doodad.)

  5. Voo says:

    The same problem exists in .NET or Java too where locks also have an owner.

    The only use case for that is to enable recursive locking, i.e. if the thread that owns the lock tries to acquire it out succeeds and increases a lock count instead of causing a deadlock.

    While this sounds like a clever idea on first glance the repercussions are enormous.
    – Any kind of asynchronous code can’t work with it.
    – It’s inefficient to drag that amount of extra data with you and support efficient recursive locking (look at the lengths to which HotSpot goes)
    – But by far the worst: it leads to really, really bad designs. If you’re writing concurrent code where you don’t know what locks you’re holding you have a big design problem. Or as in this case you don’t have a sane lifetime for your objects and have to come up with workarounds.

    We’d be much better off if they weren’t the default but alas too late.

  6. DWalker07 says:

    Doodads! We need more doodads.

  7. anonymous says:

    Synchronization in a destructor is usually a bad idea. Might work in this case, as long as there’s no classes that derive from doodad, but a better solution is to use a reference count.

    1. True, but that would require the doodad to have an independent lifetime, which means a separate heap allocation.

Comments are closed.

Skip to main content