I mentioned yesterday that the C/C++ language was defined to operate on an abstract machine. At the time I didn't know of an online reference to the C or C++ language standard, but a little birdie pointed me to this, which is a draft of the C language specification.
In section 126.96.36.199, you find:
- The semantic descriptions in this International Standard describe the behavior of an
abstract machine in which issues of optimization are irrelevant.
- Accessing a volatile object, modifying an object, modifying a file, or calling a function
that does any of those operations are all side effects, which are changes in the state of
the execution environment. Evaluation of an expression may produce side effects. At
certain specified points in the execution sequence called sequence points, all side effects
of previous evaluations shall be complete and no side effects of subsequent evaluations
shall have taken place. (A summary of the sequence points is given in annex C.)
- In the abstract machine, all expressions are evaluated as specified by the semantics. An
actual implementation need not evaluate part of an expression if it can deduce that its
value is not used and that no needed side effects are produced (including any caused by
calling a function or accessing a volatile object).
- When the processing of the abstract machine is interrupted by receipt of a signal, only the
values of objects as of the previous sequence point may be relied on. Objects that may be
modified between the previous sequence point and the next sequence point need not have
received their correct values yet.
- The least requirements on a conforming implementation are:
* At sequence points, volatile objects are stable in the sense that previous accesses are
complete and subsequent accesses have not yet occurred.
* At program termination, all data written into files shall be identical to the result that
execution of the program according to the abstract semantics would have produced.
* The input and output dynamics of interactive devices shall take place as specified in
7.19.3. The intent of these requirements is that unbuffered or line-buffered output
appear as soon as possible, to ensure that prompting messages actually appear prior to
a program waiting for input.
- What constitutes an interactive device is implementation-defined.
- More stringent correspondences between abstract and actual semantics may be defined by
<The standard goes on and gives examples and clarifications associated with those examples>
There are more details scattered throughout the document, but this is the section that defines the machine. A couple of things that I find quite cool in it:
Section 2 which states that side effects of previous evaluations must be complete at sequence points (for example at the ; at the end of a statement) What that means is that a compiler is free to reorder operations in any way it chooses but it can't reorder them across sequence points IF the operations have side effects.
So if you have:
*p++ = p[i++];
printf("p is: %p", p++);
the compiler could generate about 4 different versions (depending on what order in which the post increments occur), but the first statement can't affect the printf statement.
Section 3 states that a compiler can reorder code if there are no side-effects. It also says that the compiler can discard code if it figures it's not used.
What's also fascinating is that the standard says nothing about concurrency or multiple execution units - the C language definition defines what happens within ONE execution unit. It is also explicitly mute about other execution units. Why is this important?
Consider the following code:
static int myFirstSharedValue, mySecondSharedValue;
myFirstSharedValue = 5;
mySecondSharedValue = 42;
if (mySharedValue == 5)
The compiler could optimize the assignment of mySecondSharedValue to occur AFTER the if test (assuming that nothing in <do stuff> depends on the value of mySecondSharedValue. The compiler also might reorder the assignment to put the second assignment first!
What's worse, the processor might chose to reorder how the data was saved in memory. As long as the read of mySecondSharedValue for that execution unit returns the value of 42, it doesn't actually matter when the value of 42 is saved in memory. It might easily be before the first value is written. As long as you've only got one thread running, it doesn't matter.
On the other hand, if you have multiple threads that read those values, it would be easy to depend on the write to myFirstSharedValue happening before the write to mySecondSharedValue - after all, that's what the code says, and that's what the language defines.
But the language is defined for the abstract execution unit above, and that might not match the real system.
This is why people who try to write correct lock-free programming end up tearing their hair out, and why it's so hard to write lock free code.
Btw, Herb Sutter's been chairing a working group that's chartered to define the abstract machine definition for Windows programs, some of his work can be found here.