Software Contracts, Part 3 – Sometimes implicit contracts are subtle


I was planning on discussing this later on in the series, but "Dave" asked a question that really should be answered in a complete post (I did say I was doing this ad-hoc, it shows).

 

Let's go back to the PlaySound API, and let's ask two different questions that can be answered by looking at the API's software contract (the first one is Dave's question):

I am happy to fulfill my contractual obligations but I need to know what they are. If you don't tell them, how is the caller to know that you need their memory until the sound finishes playing?

If I call PlaySound with the SND_ASYNC flag set, how can I know if the sound's been played?

As I implied, both of these questions can be answered by carefully reading the API's contract (and by doing a bit of thinking about the implications of the contract).

Let's take question 2 first.

The explicit contract for the PlaySound API states that it returns TRUE if successful and FALSE otherwise.  If you specify the SND_ASYNC flag, what does that TRUE/FALSE return mean though?  Well, that's not a part of the explicit contract, so it must be a part of the implicit contract.

Remember that the PlaySound API only has three parameters (the sound name, a module handle and a set of flags).  All of these parameters are INPUT parameters - there's no way to return the final status in the async case.  Since there's no other way for the API to report whether or not the sound successfully played, the only way the return value could indicate the success or failure of playing the sound is if the call waited for playback to finish, which would mean that the SND_ASYNC flag didn't actually do anything.  And that violates the principle of least surprise - if the SND_ASYNC flag was a NOP, it would be a surprise.

And in fact all the call to PlaySound does is to queue the request to a worker thread and return - the success/failure code refers to whether or not the request was successfully queued to the worker thread, not to whether or not the sound actually played.
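To make the distinction concrete, here is a minimal sketch in C of a "queue it and return" call. It is a toy model, not PlaySound's implementation: the names `play_queue`, `queue_play_async` and `drain_queue` are invented for illustration, and the "worker" is an ordinary function rather than a real thread.

```c
#include <stddef.h>

#define QUEUE_CAPACITY 4

/* Hypothetical request queue standing in for the worker thread's inbox. */
struct play_queue {
    const char *names[QUEUE_CAPACITY];
    size_t count;
};

/* Returns 1 if the request was queued, 0 if the queue is full.
   Nothing has been played yet when this returns. */
int queue_play_async(struct play_queue *q, const char *sound_name)
{
    if (q->count == QUEUE_CAPACITY)
        return 0;
    q->names[q->count++] = sound_name;
    return 1;
}

/* Stand-in for the worker: handles every queued request and returns how
   many actually "played". */
size_t drain_queue(struct play_queue *q)
{
    size_t played = 0;
    for (size_t i = 0; i < q->count; i++) {
        /* Pretend an empty name is a sound that cannot be opened. */
        if (q->names[i] != NULL && q->names[i][0] != '\0')
            played++;
    }
    q->count = 0;
    return played;
}
```

In this model, queueing a request for an unplayable sound still returns 1; the failure only shows up later in the worker, and the original caller has no channel through which to see it.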

 

Now for Dave's question...

First off: One critical part of interpreting software contracts is:  If you have a question about whether or not a function behaves in a specific manner, if it's not specified in the explicit contract, assume the answer is 'no' unless otherwise specified.

Since the contract for PlaySound is currently silent about the use of memory in combination with the SND_ASYNC flag, you should always make the most conservative assumptions about the behavior of PlaySound.  Since the API documentation doesn't say explicitly that the memory can be freed while the sound is playing, you should assume that it can't.  And that means that the memory handed to the PlaySound call must remain valid until PlaySound has finished playing the sound.

 

But even without that, with a bit of digging, you can come to the same answer.

Here's how my logic works. Both of the givens below are either explicit or implicit in the contract.

  1. You own the memory handed to PlaySound - you are responsible for allocating and freeing it. You know this because PlaySound is mute about what is done with the memory, thus it has no expectations about what happens to the memory it uses (this is an implicit part of the contract).
  2. The default behavior for PlaySound is synchronous (you know this because the documentation states that the SND_SYNC flag is the default behavior) (this is an explicit part of the contract).

 

You can also assume that the SND_ASYNC flag is implemented by dispatching some parts of the PlaySound call to a background thread.  This is pretty obvious given the fact that something has to execute the code to open the file, load it into memory, and play it.  You can verify this trivially by using your favorite debugger and looking at the threads after calling PlaySound with the SND_ASYNC flag.  In addition, there are no asynchronous playback calls in Windows, so again, it's highly unlikely the playback is done using some kind of interrupt time processing (it's possible, but highly unlikely - remember that PlaySound was written for Windows 3.1).  I actually went back to the Windows 3.1 source code for PlaySound and checked how it did its work (there were no threads in Windows 3.1) - on Windows 3.1, if you specified the SND_ASYNC flag, it created a hidden window and played the sound from that window's wndproc.

But even given this, we're not done.  After all, it's possible that the PlaySound code makes a private copy of the memory passed into PlaySound before returning from the original call.  So the decision about whether or not the memory passed into the PlaySound API can be freed when specifying SND_ASYNC really boils down to this: If PlaySound makes a private copy of the memory, then the memory can be freed immediately on return, if it doesn't, you can't.
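The two possibilities can be sketched side by side in C. This is only a model of the design choice, with invented names (`pending_sound`, `submit_borrowed`, `submit_copied`, `finish`); it says nothing about how PlaySound is actually written.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical pending request held by an asynchronous player. */
struct pending_sound {
    const unsigned char *data; /* what the worker will read later */
    unsigned char *owned_copy; /* non-NULL only when a private copy was made */
    size_t length;
};

/* Borrowing variant: stores the caller's pointer. The caller must keep
   `data` valid until the request is finished or cancelled. */
void submit_borrowed(struct pending_sound *p, const unsigned char *data, size_t length)
{
    p->data = data;
    p->owned_copy = NULL;
    p->length = length;
}

/* Copying variant: costs a second allocation the size of the original,
   but the caller may free its buffer as soon as this returns.
   Returns 1 on success, 0 if the copy could not be allocated. */
int submit_copied(struct pending_sound *p, const unsigned char *data, size_t length)
{
    unsigned char *copy = malloc(length ? length : 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, data, length);
    p->data = copy;
    p->owned_copy = copy;
    p->length = length;
    return 1;
}

/* Finishes or cancels the request and releases any private copy. */
void finish(struct pending_sound *p)
{
    free(p->owned_copy);
    p->data = NULL;
    p->owned_copy = NULL;
    p->length = 0;
}
```

With `submit_borrowed`, the caller has to call `finish` before freeing its buffer; with `submit_copied`, the buffer can go away immediately, at the price of the extra allocation and copy.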

This is where you need to step back and make some assumptions.  Up until now, pretty much everything that's been discussed has been a direct consequence of how the API must work - SND_ASYNC MUST be implemented on a background thread, you DO own the memory for the API, etc.

So let's consider the kind of data that appears in the memory for which the PlaySound API is called.

Remember that most WAV files shipped with Windows (before Vista) were authored as 22kHz, 16 bit sample, mono files (for Vista, the samples are all stereo).  That means that each second of audio takes up 44K of RAM.  That means that all nontrivial WAV files are likely to be more than 64K in size (this is important).  Again, consider that the PlaySound API was written for Windows 3.1 where memory was at a premium, especially huge blocks of memory (any block larger than 64K of RAM had to be kept in "huge" memory allowing the blocks to be contiguous).
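The arithmetic behind those figures is easy to check. The sketch below assumes that "22kHz" means the standard 22,050 Hz sample rate and that the samples are uncompressed PCM; the function names are made up for the example.

```c
#include <stddef.h>

/* Bytes of uncompressed PCM data needed for one second of audio. */
size_t pcm_bytes_per_second(size_t sample_rate_hz,
                            size_t bits_per_sample,
                            size_t channels)
{
    return sample_rate_hz * (bits_per_sample / 8) * channels;
}

/* Whole milliseconds of audio that fit in a block of `block_bytes` bytes. */
size_t pcm_millis_in_block(size_t block_bytes, size_t bytes_per_second)
{
    return (block_bytes * 1000) / bytes_per_second;
}
```

For 22,050 Hz, 16 bit, mono, `pcm_bytes_per_second` gives 44,100 bytes, and a 64K (65,536 byte) block holds a little under a second and a half of audio, so anything longer than that crosses the 64K line.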

If Windows were to take a copy of the memory, it would require allocating another block the size of the original block.  And on a resource constrained OS like Windows 3.1 (or Windows 95) that would be a big deal.

Also remember my 2nd point above - the default behavior for PlaySound is synchronous.  That means that the PlaySound call assumes that it's going to be called synchronously.

Given the fact that PlaySound was originally written for Windows 3.1 and given that the default for PlaySound is synchronous, and given the size of the WAV files involved, it thus makes sense that the PlaySound API would not allocate a new copy of the memory for the .WAV file and instead would use the samples that were already in memory - why take the time to allocate a new block and copy its contents over when it was already available?

Now this is a big assumption to make - it might not even be right.  But it's likely to be a reasonable assumption.

So you should assume that PlaySound doesn't take a copy of the memory being rendered, and thus you need to ensure that the memory is valid across the life of the call.

 

Btw, I just was told by the doc writers that they're planning on making this part of the contract explicit at some point in the future.

 

Tomorrow: Let's look at some explicit contracts.

Comments (32)

  1. Anonymous says:

    I think part of the mystery with PlaySound is that there is no documented way of knowing when the sound has finished playing.  Of course the way to handle this is to spin up your own thread, call PlaySound synchronously and then free the buffer, but that seems like a high price to pay for a very common scenario.  I ended up writing my own PlaySound built on top of DirectSound to compensate for some of these issues (among others of course) and was able to design the thing to properly manage buffers transparently.

    One more question, by your reasoning above we can only assume that if we specify SND_RESOURCE | SND_ASYNC to PlaySound that the API will manage freeing the resource (I know its just a mmap’d section of an executable image and doesn’t need much freeing, but…) when playback has finished.  How’s that for deductive reasoning?

  2. Lonnie, there are two major cases for PlaySound.

    The first is calling PlaySound with an alias or filename – in that case, all you need to do is ensure that the memory containing the name of the alias or filename in question remains valid.  If you use the alias IDs, then you don’t even need to do that.

    The second case is when you call PlaySound with a chunk of memory.  It turns out that you can determine the length of the file from the contents of the FMT section and the contents of the DATA section.  

    Even without that, if you call PlaySound(NULL, …), you’ll stop the playback.  So just call PlaySound(NULL, …) before freeing the memory and you’ll be just fine.
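As a rough illustration of the "length from the FMT and DATA sections" point, here is a sketch in C that walks the RIFF chunks of a WAV image in memory and estimates the playback time. It assumes uncompressed PCM data and uses the average-bytes-per-second field of the "fmt " chunk; `wav_duration_ms` and `read_le32` are names invented for this example, not part of any Windows API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a little-endian 32-bit value from a byte buffer. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the playback length in milliseconds of a WAV image held in
   memory, or -1 if the image is malformed or lacks "fmt "/"data" chunks. */
long wav_duration_ms(const unsigned char *wav, size_t size)
{
    if (size < 12 || memcmp(wav, "RIFF", 4) != 0 || memcmp(wav + 8, "WAVE", 4) != 0)
        return -1;

    uint32_t byte_rate = 0;  /* average bytes per second, from "fmt " */
    uint32_t data_bytes = 0; /* size of the sample data, from "data" */
    int have_fmt = 0, have_data = 0;

    size_t pos = 12;
    while (pos + 8 <= size) {
        const unsigned char *chunk = wav + pos;
        uint32_t chunk_size = read_le32(chunk + 4);
        size_t remaining = size - pos - 8;

        if (memcmp(chunk, "fmt ", 4) == 0) {
            if (chunk_size < 16 || remaining < 16)
                return -1;
            byte_rate = read_le32(chunk + 8 + 8);
            have_fmt = 1;
        } else if (memcmp(chunk, "data", 4) == 0) {
            /* Clamp to what is really present in the buffer. */
            data_bytes = chunk_size <= remaining ? chunk_size : (uint32_t)remaining;
            have_data = 1;
        }
        if (chunk_size > remaining)
            break;
        pos += 8 + (size_t)chunk_size + (chunk_size & 1); /* chunks are word-aligned */
    }

    if (!have_fmt || !have_data || byte_rate == 0)
        return -1;
    return (long)(((uint64_t)data_bytes * 1000) / byte_rate);
}
```

A caller could keep the buffer alive for at least that long after an asynchronous play request, though stopping playback explicitly before freeing is the simpler guarantee.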

  3. Oh, and Lonnie, you’re right – if you specify SND_RESOURCE, then PlaySound calls LoadResource etc so it guarantees that the memory is still valid during the course of playback.  Similarly, for SND_FILENAME it allocated a block of memory and frees it when done (all behind the scenes).  

    Again, this behavior can be deduced from the API and some common sense.

  4. Anonymous says:

    Whoa, did my RSS reader accidentally cross-link Raymond Chen’s feed with yours???

    (That’s meant as a compliment, BTW.  :p)

  5. Anonymous says:

    Thanks, Larry! I agree that a program can be much more bulletproof if you make the worst-case assumptions about the functions you call. The depressing thing about doing that is that it can significantly complicate the code and obscures its goal. So I think a lot of us tend to extend the explicit contract with our own principle of least surprise contract terms. It would be good if that documentation was upgraded as well as the code to turn implicit into explicit when possible. I know Microsoft moves slowly nowadays, but it can’t be that hard to update MSDN to say "on ASYNC, you must not free the buffer until the sound has finished playing."

  6. Anonymous says:

    "If you have a question about whether or not a function behaves in a specific manner, if it’s not specified in the explicit contract, assume the answer is ‘no’ unless otherwise specified."

    Does PlaySound() access the memory passed in after the call completes?

    Say what? No? Oh, OK then…. 😉

    Yeah, that’s facetious. But…

    "Since the contract for PlaySound is currently silent about the use of memory in combination with the SND_ASYNC flag, you should always make the most conservative assumptions about the behavior of PlaySound."

    …is an argument *against* the actual behaviour of PlaySound().

    IMO, as far as *any* C API goes, you should *always* assume that if the caller passes a caller-allocated memory buffer to a function, any function, then the caller may be free to do what it likes with that buffer after the function has returned, *unless the documentation says otherwise*.

    I therefore contend that the MSDN documentation for PlaySound(), specifically the part describing the SND_ASYNC flag, is defective by being incomplete in this regard.

    And that you’re just making excuses for it.

    You’re just invalidating almost every other line of C code ever written otherwise. Hmmm…..strlen() doesn’t say whether it uses the buffer I passed it after the function returns, therefore I must hang onto it and keep it unmodified for the remainder of my program in order to "always make the most conservative assumptions about [its] behavior".

    🙂

  7. Adam, you’re right – that’s why I’m having the documentation fixed.  But to answer your question: strlen doesn’t also say that it works asynchronously.

    On the other hand, the ReadFile API’s documentation doesn’t say anything about the memory pointed to by lpbuffer being valid from the time the API is called to when it completes.  

    It’s another example of an implicit API contract – even though ReadFile doesn’t say it, its contract is: you provide a buffer and ReadFile fills it in.  If the read is asynchronous, the contents of the buffer aren’t valid until the read completes, but until it completes, the buffer MUST remain valid (this falls out of the fact that the buffer is effectively an OUT parameter – out parameters must remain valid from the time an API is invoked until it completes).

    The PlaySound API with the SND_ASYNC flag behaves the same as ReadFile behaves with an LPOVERLAPPED parameter – the buffer passed in must remain valid until the API completes.  There is absolutely no difference in the contracts.

    Having said that, there IS one significant difference: ReadFile provides a mechanism (the LPOVERLAPPED) that can be used to determine when the ReadFile API has completed, the PlaySound API (as has been discussed earlier in this thread) doesn’t.

    On the other hand, the PlaySound API DOES provide a mechanism for canceling any outstanding asynchronous call to PlaySound – calling PlaySound with a NULL filename is documented as terminating any and all outstanding calls to PlaySound, so there IS a safe way of ensuring that the API is completed.

  8. Anonymous says:

    Yes, you’re clever that you can use the fact that it’s written for Windows 3.1 to work out it must require the memory block not to be freed, but 10 years after Windows 95, who will think of such things?

    Also, was not copying the memory a good design decision given that it makes it much harder for the caller to free the memory after the sound has been played (meaning memory is less likely to be freed after being passed to PlaySound)?

  9. Tim, as I pointed out earlier: The reason you can’t free the memory is that the API doesn’t say that it’s ok to free the memory.  And in general, memory passed to APIs must be valid until the API has completed (even when the API is asynchronous).

    But the same logic (not copying multi-hundred K blocks of memory) applies to Win9x as well.

    And yeah, 20/20 hindsight is wonderful, it would have been great if the API had been defined differently.  But as Raymond Chen likes to say: Time machines haven’t been invented yet.

    This behavior has been the behavior of PlaySound since 1991 when the API was originally added to Windows 3.x.

  10. Anonymous says:

    If you are interested in this subject, I suggest that you take a look at "Design by Contract" (http://en.wikipedia.org/wiki/Design_by_contract) and the Eiffel language which heavily relies on it.

  11. Anonymous says:

    A question just popped in my head after reading your entry:

    Following the principle of "least surprise", wouldn’t ASYNChronous calls, by nature, have an "implicit contract" by definition since:

    – It requires some way of signaling the caller that the operation has ended and the allocations are no longer required (so you can do your "house cleaning")?

    – Or, to remove any need of signaling, an alternative way would be, to segregate any memory allocations in order to remove any dependencies from the calling function?

    My apologies if this is just another dumb question.

  12. jugger: If this API hadn’t behaved in this manner for 15 years now, I agree – the async code should copy the memory.  But time travel hasn’t been invented yet.

  13. Anonymous says:

    But couldn’t this behaviour be "fixed" in future Windows releases? Surely if Vista had just buffered ASYNC calls to PlaySound then the issue of applications releasing memory too early would eventually go away.

  14. Anonymous says:

    Larry, it’s true that the time machine hasn’t been invented yet. But what harm would be done if we created a new version that DOES copy the memory, and ran an old program that does not expect this on it? (Anyway, would someone write a program that modifies the buffer as it plays?)

    I’d say that it’d be a non-breaking change, and your teams should consider fixing it this way.

  15. Anonymous says:

    how about a simple memlock inside PlaySound API ? Wouldnt that re-affirm the implicit contract ?

  16. I could be wrong but…: What’s a "memlock"?  You mean locking the pages into memory?  I’m certain that would mess up more than just allocating memory and copying the data.

    Cheong:  Maybe.  But in the scheme of things changing this behavior is not high on the list of priorities.  As I said – it’s behaved this way since 1991, so I’d have a hard time convincing management that we should spend development time (and take a  perf hit) to fix a 15 year old bug that’s not causing any customer pain (I’ve looked – there are currently no OCA hits in the PlaySound API so nobody’s submitted this issue to MS)…  Also, fixing this would involve making a number of other more trivial changes (for instance, if you do:

    szFoo = malloc(sizeof("c:\\windows\\media\\ding.wav"));
    strcpy(szFoo, "c:\\windows\\media\\ding.wav");
    PlaySound(szFoo, NULL, SND_ASYNC | SND_FILENAME);
    free(szFoo);

    you’ll hit exactly the same problem – it’s harder to hit this version of the problem though).  If I were to fix the SND_MEMORY case, I’d also have to fix the SND_FILENAME case.

    It’s FAR easier to document the issue than it is to fix it :(.

    In the future, if we ever rewrite PlaySound, this will probably be fixed, but…

  17. Anonymous says:

    Any halfway experienced c/win32/x86 programmer would instantly realize the implicit demands of this API.

    I think the detractors in this thread are being deliberately obtuse, or at least I hope so for their sake.

    Also, copying non-trivial sized memory blocks around just to save the lazy programmer from himself is a sure way to performance hell.

  18. Anonymous says:

    Cheong: In that case, I can imagine a programmer writing a program, testing it in Vista and releasing it, and then it crashes on XP.

    The correct way to do this would be to introduce a PlaySoundEx function with proper async behavior – having a mechanism to know that the sound ended, maybe even freeing the memory automatically using an explicitly stated allocator, etc.

  19. Anonymous says:

    Ah – sorry. The way you’d worded the doc change sentence made me think that "the documentation might get fixed at some point, which was probably going to happen anyway", not "I’ve made them aware this is an error, so they’re definitely fixing it".

    As for ReadFile(), if what you say is the case then its documentation is also bad, and also needs fixing.

    Get the MSDN folks to read the documentation for some non-Microsoft asynchronous APIs. They could start with the POSIX aio stuff (e.g. http://www.opengroup.org/onlinepubs/009695399/basedefs/aio.h.html  , http://www.opengroup.org/onlinepubs/009695399/functions/aio_read.html , etc…)

    "And yeah, 20/20 hindsight is wonderful, it would have been great if the API had been defined differently."

    It doesn’t have to be designed differently, it just needs to be documented properly. That it’s taken 15 *years* to do this is … stunning.

  20. Mike Dimmick says:

    You can effectively use SND_NOSTOP to poll to see whether the sound has finished playing, but that’s not much fun and will of course use lots of CPU (depending on how you poll). I think the only other alternative to spinning up a thread is using the waveOut APIs directly, which is even worse.

  21. Anonymous says:

    Larry, the hard way: http://en.wikipedia.org/wiki/Tachyon 🙂

    Now seriously, we can see that it is not possible to change the code, so you change the information about the code (in an "it’s not a bug, it’s a feature!" kind of way).

    But a question remains: If the documentation states that you can free the allocation only after the sound has been played, how can you write code that checks that the sound has stopped playing?

  22. Anonymous says:

    One thing that I’m sometimes noticing in blog posts from both you and Raymond Chen is that you assume that everyone has your inside-Microsoft knowledge. *You* know that you, as an implementor of PlaySound, would use threads to get async behavior.

    A couple of years ago, before there were any MS blogs, I just had no clue at all how the OS calls were implemented. For all I knew you would be using something kernel-specific that was totally different than user-mode threads to implement async stuff, so I didn’t assume *anything* about the behavior of a function other than what is written explicitly in the documentation.

    What you’ve written about PlaySound is easy for you to think of because you know how it works, and how other API calls are implemented behind the scenes. That doesn’t mean that it’s also easy to figure that out if you haven’t worked on Windows internals for tens of years. 🙂

  23. Anonymous says:

    Gordias > You honestly think that a function that accesses a caller-supplied memory buffer, after it has returned to the caller, even if it exhibits asynchronous behaviour, should not mention this in its documentation?

    You could make a similar point about localtime() – it should be obvious that the returned buffer is static and that the result will be overwritten by subsequent calls, but *every single manual or book I’ve ever read* that describes what localtime() does has still pointed this out.

    There are a number of "implicit" rules when it comes to calling C functions. One is that pointer arguments should not be NULL unless the documentation says otherwise. One is that functions should be reentrant unless the documentation says otherwise. And one is that functions do not affect user-supplied memory after they’ve returned, *unless the documentation says otherwise*. No doubt there are more. But if *your* function breaks any of these rules, *document it* as such. It’s not hard – a short sentence or two will suffice. This is not an onerous burden to place on library writers.

  24. Anonymous says:

    I don’t know, Larry: usually I agree with you, but not this time.  This way lie memory leaks and other hard-to-track-down problems.  I appreciate that fixing the documentation, checking back later and calling with a NULL before freeing the memory, and such are necessary workarounds for a bad contract.  But it *is* a bad contract.

    Any API that takes an input buffer and does something asynchronously with it has to also do one of the following things:

    1. Copy the contents to its own buffer synchronously, before it returns.

    2. Take ownership of the buffer, freeing it itself when it’s done.

    3. Provide for some sort of callback so it can *tell* the caller when it’s done.

    An API that doesn’t do one of those is, simply put, bad.  ReadFile isn’t quite the same, since, as you point out, the buffer is an output parameter.  Even so, I’d say that the API should provide a callback so the caller can find out when it’s done without having to poll.

    Consider a parallel to PlaySound, where you want me to hear some favourite music that you have on CD.  You can invite me over (Thank you!) and play it for me (synchronous).  Or you can give me your CD (asynchronous).  In the latter case, I can copy your CD (and fend off the DRM police), you can just give it to me and not want it back, or I can drop by and give it back to you when I’m done.

    Phoning me every 20 minutes to ask if I’m done with it yet… isn’t really a viable option.

  25. Anonymous says:

    Jonathan: Totally agreed. Just like ReadFileEx() and WriteFileEx(), which were released exclusively to deal with async read/write. 🙂

  26. Anonymous says:

    One question that kept on coming up during my earlier post was "How long is it going to take to play

  27. Anonymous says:

    I suspect the first point Tim Lovell-Smith tried to make was not that you should have invented a time machine and fixed the design to copy the data (which I think would have been a mistake anyway) but that the method by which you arrived at your conclusion simply isn’t available to those of us who came to the platform only recently. While I think your suggested method is sound in principle, the example may not be the best. (64K? Oh yeah, I guess I remember that from 20 years ago in my DOS days, but Windows programmers had to deal with that, too? Insane.)

    (To me, it seems obvious that when you tell a function to process some memory asynchronously that you can’t free the memory until whatever logic is spawned by the function finishes. But then I’m used to old-school asynch Mac OS parameter blocks. And patching the functions that traffic in them. And far viler things you don’t even want to know about. At least we had a flat address space. 🙂

  28. Anonymous says:

    A fix to this api would be to enable a callback, or wait mechanism when the sound has finished.

  29. Anonymous says:

    The lack of a robust way to know when the sound has finished playing is what has led developers to assume things that aren’t in this contract.

    You have two options to fix this:

    1. Make the documentation more robust.

    2. Make the API more robust (as suggested in a previous post, provide a way to know when the sound has finished; overlapped I/O is a good example of this).

  30. MrFixIt: That would change the function signature for the API which would break all the callers.

    Thex nailed it on the head.

  31. Anonymous says:

    Larry,

    Are you on drugs? (only 50% joking)

    Trying to compare Win 3.1 with Vista is like comparing a fuel-efficient moped with a Boeing 7[47]7.

    Vista requires engines (CPU’s) and fuel tanks (disk, not to mention RAM) that could easily swallow *thousands* of 3.1’s. Seriously, for the on-disk price you pay for Vista you could install *literally* over one thousand 3.11’s. To in this context talk about 64KB, when the OS you mention requires *over one thousand times* as much memory just to boot… I don’t know, I find it an insult against all developers [the ones that still care about size and efficiency].

    If MS cared about 64KB… Vista would have to be rewritten basically from scratch (save, possibly, the kernel and its components).

  32. Anonymous says:

    I’m more discombobulated than usual on this series, I totally missed the third article in the series
