Once upon a time, I watched the TV show The Paper Chase. As I recall it, the feared and respected Professor Kingsfield taught contract law, which could be thought of as the organic chemistry of the legal profession (organic chem was the class that the chemical engineers at Carnegie-Mellon used to dread, because it was so difficult).
Software contracts are almost as complicated as legal contracts. And we use them every single day without thinking about them.
Every function that’s ever been written specifies a contract. It doesn’t matter if the function is public or private; every single function has a contract.
And just like in contract law, there’s a HUGE body of work that exists to help specify the contract for a function. This is a big topic, and I suspect it will take several posts to cover it (and I’m sure I can’t do it justice). Much like the “programming with style” series, I’m mostly going to be making this up as I go along – I have a huge number of thoughts about software contracts, and it may take a drunkard’s walk before I can finally get them all out.
What is a software contract?
“Let’s start at the very beginning, a very good place to start”.
A function’s software contract, simply put, defines the behavior of a function. It allows the caller of a function to know what the function does, how it fails, and (in general) what happens when the function executes.
Contracts can be either explicit or implicit: an explicit contract is expressed directly, while an implicit contract is never written down.
A function’s explicit contract takes many forms and is expressed in different places. It’s embodied in the documentation of the function (either in official published documentation or just in the comments in the function header). It can be embodied in annotations applied to the function definition (SAL annotations are a great example of this). It can be embodied in the names of the parameters to a function. And some contracts are expressed by well-known conventions.
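To make that concrete, here is a minimal sketch of an explicit contract carried by a header comment and by parameter names. CopyBytes is a made-up function for illustration, not a real API, and the specific rules in its comment are my own assumptions about what such a function might promise.

```cpp
#include <cstddef>

// Hypothetical function, invented for illustration.
//
// CopyBytes: copies `byteCount` bytes from `source` to `destination`.
//
// Contract:
//   - `destination` must point to at least `destinationCapacity` writable bytes.
//   - `source` must point to at least `byteCount` readable bytes; it is never modified.
//   - The two buffers must not overlap.
//   - Returns true on success. Returns false, and writes nothing, if either
//     pointer is null or if `byteCount` exceeds `destinationCapacity`.
bool CopyBytes(unsigned char* destination, std::size_t destinationCapacity,
               const unsigned char* source, std::size_t byteCount)
{
    if (destination == nullptr || source == nullptr) {
        return false;
    }
    if (byteCount > destinationCapacity) {
        return false;
    }
    for (std::size_t i = 0; i < byteCount; ++i) {
        destination[i] = source[i];
    }
    return true;
}
```

Even without reading the body, the names "destination", "destinationCapacity" and "source", plus the const on the source pointer, tell the caller most of what the comment spells out.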
Implicit contracts, on the other hand, are never expressed. Implicit contracts are embodied in something that is called “The principle of least surprise”. These are the unwritten parts of a function’s contract that fall into the general area of “common sense”.
The principle of least surprise.
Simply stated, the principle of least surprise is: “Whatever happens when a function executes, it should not behave in a manner that is unexpected”.
So CreateFile shouldn’t reformat the hard disk if it fails. Or, to use a somewhat less drastic version: If CreateFile fails, the filesystem is left in the state it was in before the CreateFile call was made. In other words, if you did a DIR of the filesystem before the failed call to CreateFile, and did the same DIR afterwards, you would get the same results. This isn’t to say that no changes were made to the filesystem under the covers; that’s possible. But the externally visible behavior should remain the same.
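Here is one way a function can keep that promise, sketched against a toy in-memory "directory" rather than a real filesystem. The Directory class and its CreateAll and List methods are hypothetical names I made up for this example; the technique is simply to do the work on a private copy and commit only once everything has succeeded.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a directory; not a real filesystem API.
class Directory {
public:
    // Creates every name in `names`, or none of them.
    // Fails (returns false) if any name is empty or already exists.
    bool CreateAll(const std::vector<std::string>& names)
    {
        std::set<std::string> updated = entries_;    // work on a private copy
        for (const std::string& name : names) {
            if (name.empty() || !updated.insert(name).second) {
                return false;                        // entries_ was never touched
            }
        }
        entries_.swap(updated);                      // commit only after every insert succeeded
        return true;
    }

    // The "DIR" of this toy directory: a sorted listing of its entries.
    std::vector<std::string> List() const
    {
        return std::vector<std::string>(entries_.begin(), entries_.end());
    }

private:
    std::set<std::string> entries_;
};
```

If you call List, then make a CreateAll call that fails partway through, then call List again, the two listings compare equal. Copying the whole set is wasteful for a large directory, but it keeps the sketch short.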
Similarly, an API that retrieves some data shouldn’t modify that data. Or an API that takes an input parameter shouldn’t modify that parameter without notification. As I’ve mentioned before, CreateProcess fails this. If your function DOES violate the principle of least surprise, you need to make that violation explicit in the contract for your function (usually in the documentation, as was done in the CreateProcess example).
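In C++ the signature itself can carry part of this. The sketch below uses two made-up functions (CountWords and TrimTrailingSpacesInPlace are hypothetical names, not from any library): the first takes its input by const reference, so the compiler holds it to the "doesn't modify the parameter" contract, and the second announces the modification in both its name and its non-const parameter.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function: counts space-separated words.
// The const reference is a promise that the caller's string is left alone.
std::size_t CountWords(const std::string& text)
{
    std::size_t count = 0;
    bool inWord = false;
    for (char c : text) {
        if (c == ' ') {
            inWord = false;
        } else if (!inWord) {
            inWord = true;   // first character of a new word
            ++count;
        }
    }
    return count;
}

// Hypothetical function: removes trailing spaces from the caller's string.
// The non-const reference and the "InPlace" suffix make the surprise explicit.
void TrimTrailingSpacesInPlace(std::string& text)
{
    while (!text.empty() && text.back() == ' ') {
        text.pop_back();
    }
}
```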
The enforcement of a function’s contract varies. For example, when the RPC runtime library marshals a function call from one process/machine/thread to another process/machine/thread, it enforces the contract specified in the MIDL annotations on the function. So when you’re writing the server stub for a function with the function prototype “HRESULT Function([out] int * returnedInt);” you don’t have to check for the “returnedInt” parameter being null – the RPC runtime library guarantees that the server side of the function will always have valid storage to hold that value. Other times, the code of a function enforces the contract – it checks that non-optional parameters are non-null, for example. And sometimes there are external tools that check for contract enforcement (the SAL annotations mentioned above are perfect examples of that – the compiler (with the /analyze switch) ensures that the contract specified in the annotation is met).
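For the middle case, where the function's own code enforces the contract, a rough sketch might look like the following. Everything here is hypothetical: Status stands in for a platform type such as HRESULT so the example stays portable, and GetAnswer and its parameters are invented for illustration.

```cpp
// Hypothetical status codes, standing in for a platform type such as HRESULT.
enum class Status { Ok, InvalidArgument };

// Hypothetical function.
// Contract: `returnedInt` is required and must not be null.
//           `optionalCallCount` may be null; if it is not, it is incremented
//           on every successful call.
Status GetAnswer(int* returnedInt, int* optionalCallCount)
{
    // Enforce the non-optional parameter before doing any work.
    if (returnedInt == nullptr) {
        return Status::InvalidArgument;
    }

    *returnedInt = 42;

    // An optional parameter is checked and simply skipped when absent.
    if (optionalCallCount != nullptr) {
        ++*optionalCallCount;
    }
    return Status::Ok;
}
```

Note that the null check runs before anything is written, so a call that violates the contract fails without touching the optional counter either.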
Why do we care about software contracts?
We care about software contracts because they help us write better code. In order to interact with a function, you MUST know its contract. In addition, knowing a function’s contract allows you to make certain assumptions, and allows you to avoid pitfalls. For example, the problem in this example becomes blindingly obvious once you know that CoUninitialize’s contract says it destroys the apartment on the last call to CoUninitialize, and that destroying an apartment destroys all the objects created within that apartment (btw, I couldn’t find any documentation that spelled this out – bad Microsoft :().