Wait, you never said that I had to initialize the object before I used it!


A customer reported that they were having trouble creating slim reader/writer locks at runtime. They simplified the issue to a short program:

#include <windows.h>
#include <iostream>

using namespace std; // this is just a quick test app

int a = 10;

// This version works
int working_version()
{
 SRWLOCK lock;
 AcquireSRWLockExclusive(&lock);
 cout<<"Acquired exclusively"<<endl;
 a++;
 ReleaseSRWLockExclusive(&lock);
}

// This one doesn't
int broken_version_1()
{
 SRWLOCK *lock = new SRWLOCK;
 AcquireSRWLockExclusive(lock);
 cout<<"Acquired exclusively"<<endl;
 a++;
 ReleaseSRWLockExclusive(lock);
 // ignore the memory leak - this is just a quick test app
}

// This one doesn't either
int broken_version_2()
{
 SRWLOCK *lock = new SRWLOCK[2];
 AcquireSRWLockExclusive(&lock[0]);
 cout<<"Acquired exclusively"<<endl;
 a++;
 ReleaseSRWLockExclusive(&lock[0]);
 // ignore the memory leak - this is just a quick test app
}

int main(int argc, char **argv)
{
 switch (argv[1][0]) {
 case '0': working_version(); break;
 case '1': broken_version_1(); break;
 case '2': broken_version_2(); break;
 }

 cout<<"a="<<a<<endl;

 return 0;
}

“What is the correct way of creating an SRWLOCK via the new operator?”

It wasn’t long before somebody noted that nowhere in the code is the function Initialize­SRW­Lock called.

“Oh, yeah, thanks for catching that. It looks like one needs to initialize SRW locks which are created via the new operator. Otherwise it’s not required.”

No, the function is always required. It’s just that you got lucky in the local variable case and the initial stack garbage looks enough like an initialized SRW lock that you don’t notice the problem.

MSDN doesn’t say “You must initialize an SRW lock before using it” because the statement was believed to be so obvious that it never occurred to anybody that somebody would think the opposite was true. I mean, what’s the point of having an Initialize­SRW­Lock function if initialization is not required? Think of it as one of the ground rules for programming: If an object has an initialization method, you must initialize the object before using it.

But just to be sure, I’ve submitted a documentation change request to add the requirement.

Bonus chatter: A common coding pattern is to wrap the low-level C-style object inside a C++style RAII-style object.

Bonus chatter 2: If you’re creating a highly-concurrent system, then you should probably put each lock on its own cache line.

Comments (29)
  1. 640k says:

    if(!lucky)

    {

    AcquireSRWLockExclusive(…);

    }

  2. RP says:

    'If an object has an initialization method, you must initialize the object before using it.'

    can be generalised to

    'If you want to use something, you must initialize it first.'

    That way, you cover not just objects but all types.

  3. henke37 says:

    It would have taken less code to properly free the memory than the comment about omitting that.

    [Good job, you found it. (Less code but more typing.) -Raymond]
  4. Masklinn says:

    > If you want to use something, you must initialize it first.

    On the other hand, this is C++ code and the `new`-ed the type to instantiate it, so it was not entirely abnormal to expect the object to be initialized at the end of the constructor: that's the normal behavior of C++ object (and Microsoft::WRL::Wrapper::SRWLock provides exactly that).

    On the other hand, the usage of C-style functions to acquire and release the lock (rather than method) hinted at SRWLOCK not being a C++ class.

    [(1) Constructors are run both for newed variables and local variables, so that doesn't explain the author's confusion. (2) If newed variables didn't need to be initialized, and local variables didn't need to be initialized, then what's the point of InitializeSRWLock? In the author's mental model, you would never need to call it! -Raymond]
  5. Mike says:

    The real issue here is having to call initializers at all. We should invent a language where things are, say, 'constructed' for you when you new them up…

    Bonus Chatter: I know I'm just being silly and slightly tongue in cheek and understand why C APIs are (sometimes!) necessary. I all seriousness tho, he may've come from a background of one of those languages and not even considered it, especially new programmers who learn Java and C# as their first main stream language unlike us old fogeys. So in this case docs explicitely saying this are no bad thing.

  6. I all seriousness tho, he may've come from a background of one of those languages and not even considered it, especially new programmers who learn Java and C# as their first main stream language unlike us old fogeys.

    Agreed. I actually ran into exactly this problem for exactly this reason once :)

  7. RangerFish says:

    Pit of success, anyone?

  8. waleri says:

    How do I "put each lock on its own cache line"?

  9. Mason Wheeler says:

    "If you're creating a highly-concurrent system, then you should probably put each lock on its own cache line."

    If you're creating concurrency primitives for people to use in highly-concurrent systems, you should pad the object with enough filler bytes that it's large enough that each lock will necessarily end up on its own cache line. :P

  10. No says:

    SRWLOCK_INIT is defined as zero, and SRWLOCK_INIT is part of the eternally-supported Windows ABI. You don't have to call Initialize­SRW­Lock — you can just zero-initialize your SRWLOCK instead. Globals and statics are automatically zero-initialized, so you only really have to worry about initializing SRWLOCKs in dynamic storage.

  11. "I tested the code without initialization and it worked. Why the documentation doesn't say that initialization is not required?"

  12. Gabe says:

    Mason Wheeler: Cache line sizes aren't known until run time, but padding is specified at compile time. The only way to make sure there's no false sharing is to use the largest possible padding. Why should a 4-byte field take 64 bytes? And then what happens when a CPU comes out with 128-byte cache lines?

    What if you have millions of objects that each need their own RW lock? Instead of a 4-byte field that might fit in with other padding (thus taking up 0 additional bytes), you have over 64MB of additional memory use with 60MB of it wasted! A little bit of cache line reloading will be orders of magnitude faster than the extra page faults incurred by all that padding.

  13. Mason Wheeler says:

    @Gabe: If you have millions of objects (or even hundreds, IMO) that each need their own RW lock, all instantiated at the same time, you've got bigger problems to worry about! :P

  14. dave says:

    <smug>

    If you don't know how to go about putting each lock in its own cache line, you shouldn't be doing designs that depend on putting each lock in its own cache line.

    </smug>

    Really: solutions need to be understood before being used.

  15. Eugene says:

    On the subject of documentation. The documentation for SetCurrentProcessExplicitAppUserModelID function that you mentioned recently neglects to say that it uses stdcall calling convention. Strictly speaking it gives an invalid prototype for the function. I know it's not hard to guess, but it cost me some time yesterday.

    [If you look at other functions, you'll see that MSDN has a general policy of suppressing the calling convention details to remove clutter. Just include the proper header file and the header file will declare the calling convention. -Raymond]
  16. @Gabe:

    Cache line size for x86 and x64 is fixed at max 64 bytes, and the OSes have that assumption hardcoded. It's not likely to change ever for IA processors.

    @Eugene:

    Are you writing in ASM?

  17. Douglas says:

    @Raymond

    Header files don't work when you're p/Invoking =)

    [If you want to write a p/invoke you need to study the header file since that's what actually happens. The MSDN prototype doesn't include A/W redirection either. -Raymond]
  18. Gabe says:

    Mason Wheeler: What problems do I have to worry about in a system with large numbers of objects with locks? Indeed, one would presume that MS wouldn't have bothered to come up with such a "slim" RW lock if systems with hundreds of them (let alone millions) had such problems.

    Have you ever used the JVM or CLR (.Net runtime)? Each object in those systems has a lock, and you can have systems with billions of objects if you have enough RAM. Of course having a lock per object may be a questionable design decision, but it isn't a critical weakness. The locks in question are actually monitors rather than RW locks, but I think the point stands.

    If you would like an example with RW locks, how about an in-memory database? Each row could have its own RW lock and you would certainly expect there to be millions or billions of rows. Even if you had only one lock per table, with typical databases having hundreds of tables you would still have hundreds of locks!

    [If you have millions of locks, then you are also unlikely to have high contention, so false sharing isn't that big a deal. -Raymond]
  19. @No: You're kinda assuming Initialize­SRW­Lock doesn't do anything else, either now or in some hypothetical future version of Windows.

    @alegr: And if your app ends up still running on some future version of Windows on some future x86 variant where that assumption no longer applies?

    Honestly, you'd think after reading this blog for any length of time the practice of blindly assuming that what is true today will remain so forevermore would've been thoroughly stomped out. 640K really isn't enough for everyone!

  20. Eugene says:

    @Raymond

    I need my application to work on WinXP, so I cannot link directly to Win7-only functions. I have to GetProcAddress it, so prototype doesn't help much.

    And MSDN used to include calling convention in prototypes. Take any old function: LoadLibrary, CreateFile, EnterCriticalSection etc. Or SRW functions mentioned in this post — all have WINAPI, which is a calling convention macro. In fact, the only functions without calling convention that I can find in a couple of minutes are from Shell32.dll. So this looks more like Shell Team policy rather than general MSDN policy.

    [if you're going to re-prototype a function, get it from the header file. That's where the compiler gets it. (Or use decltype.) The MSDN documentation is presented for exposition, not for handing directly to a compiler. (SAL annotations are often stripped down or removed entirely, and separate prototypes are not given for A/W variants.) -Raymond]
  21. Carl D says:

    @Gabe

    While the CLR and JVM logically have a lock per object, neither of them is actually implemented that way.  Both use a pool of locks that are associated with objects on an as-needed basis.  Since the vast majority of objects are never waited on, this saves resources on an enormous scale.

    A quality design for an in-memory database would likely involve a similar level of indirection and not actually allocate a lock per row.

  22. @Eugene

    Even still, it is well known that the MSDN is for documenting, and if you have found some functions that do it then it is best not to assume that everyone will put in the calling convention. Oh, and it isn't related to the shell team either, look at the WDK documentation, none of them include the calling convention either. So even though it is their policy, not all of the teams stick to it.

    I agree with Raymond, if you want to re-prototype a function, get the prototype from the header, that is the only way you are guaranteed accuracy. This is how I normally do it when I need to access a small amount of backwards incompatable functions. If I need to use a larger amount, I tend to write DLLs that wrap the behaviour I want and dynamically load it after a version check.

  23. Matt says:

    On the subject of Pinvoking, the default calling convention is WINAPI (which is __stdcall on x86), and it transparently handles the W/A case (under the hood it tries GetProcAddress(module, name) – if that fails, then it tries GetProcAddress(module, name+W) and sets the conversion of strings to unicode, and if that fails it tries GetProcAddress(module, name+A) and sets the conversion mode to ascii).

    BONUS: This logic is the same as the logic used by rundll32.exe :)

  24. Matt says:

    @Eugene When in doubt, Microsoft uses the WINAPI calling convention. There are some exceptions (like printf) but by-and-large, expect all Microsoft functions to be WINAPI.

  25. "If newed variables didn't need to be initialized, and local variables didn't need to be initialized, then what's the point of InitializeSRWLock? In the author's mental model, you would never need to call it!"

    If they're using a higher level wrapper, it seems reasonable to assume the wrapper calls InitializeSRWLock for you – after all, you aren't explicitly calling any memory management functions either, but they are still documented and involved behind the scenes.

    In this particular case, the 'memory leak' could pass for an optimization if it had a good PR team: it gets disposed of implicitly when the process exists anyway, and that automatic disposal could well be more efficient. Not an issue here, but I've seen a few programs spent long enough to be quite irritating cleaning up all their internal state before exiting. (In particular, swapping pages back in just to dispose of their contents, when the OS could just have marked those pages free on exit without the excess I/O.)

  26. John Doe says:

    «MSDN doesn't say "You must initialize an SRW lock before using it" because the statement was believed to be so obvious that it never occurred to anybody that somebody would think the opposite was true. I mean, what's the point of having an Initialize­SRW­Lock function if initialization is not required? Think of it as one of the ground rules for programming: If an object has an initialization method, you must initialize the object before using it.»

    @Raymond, not everyone knows the Windows API as you do. I dare to say most people don't, actually. It may seem very logic to you that the mere presence of an Init* function in the reference is enough to imply that it must be used. I could say the same about the API's I or my colleagues develop to people outside the department or even the company.

    I say it should be explicitly there. What's not there is not to take for granted. Of course, there are limits to this, such as passing strings that are not zero-terminated to C API's that deal with char * (and the likes of it). But imagine that there was no string convention for C; then at least the latter would have to be specified somewhere, either in each function reference or in an introductory chapter stating that "every char * parameter and return value is a zero terminated string".

    I regard the MSDN library as a very good quality documentation library, actually one of the best you can read and navigate online. It often states how to initialize many structures, mainly the first field of structures that are expected to evolve between versions of Windows. These don't have an initialization function, so they kind of have to say how to initialize. Others don't need initialization, such as an array of structs that is filled by some API function. Yet others have initialization functions, such as VariantInit:

    «Before use, all VARIANTARGs must be initialized by VariantInit.»

    Why should the synchronization section be any different from others in this regard? Why not say how and when to initialize the objects it handles? The user doesn't know what's required unless it's stated. Try to put yourself on the other side.

    Disclaimer: I belong to that group of people that doesn't know the Windows API that intimately, so there may be other sections that also lack documentation about required initializations. I myself may have taken some initializations for granted just because I've seen an Init* function in the reference section.

    Disclaimer: My intention is not to point a finger to the MSDN library or the synchronization section, I'm just keeping the subject of conversation.

  27. arm ram says:

    @alegr1: Good luck cooding on WindowsRT.

  28. Joshua says:

    @Carl D. Last I checked, the slab locks that I prefer to use require only zero initialization and no cleanup, and when not being waited on way two pointers in size (actually an atomic_t followed by a pointer but padding will push it to two pointers in the rare architecture where that's different).

  29. Why did they make every object in .NET / Java an object, anyway?  It seems like that needlessly complicates the runtime, when a class in the base library called "Monitor" could have done the job just as well.  As an added bonus, it forces the user to specify that functionality in a scope.  As it is, who knows what objects are being used as locks and what are not?

Comments are closed.