How are BitBlt raster opcodes calculated?


Commenter R P asks what the low-order 16 bits of the BitBlt raster opcodes mean.

The documentation explains that the high-order 16 bits of the raster opcode are the zero-extended 8-bit value that represents the result of the raster operation given the 8 combinations of three binary inputs (pattern, source, and destination). This is the easy part to understand.

The documentation also says that the low-order 16 bits are an operation code, but gives no information as to how the operation code is determined.

Okay, let's dig through the history.

Initially, the BitBlt raster opcodes were only 16-bit values, consisting of the operation codes. The higher-order 16 bits were added later because it turns out that people preferred table lookups to parsing operation codes.

Okay, that's enough beating around the bush. How do I parse the operation codes?

The operation code is interpreted as follows:

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
op5 op4 op3 op2 op1 op6 template bias

Operations 1 through 5 select one of the following logical operations:

Value Operation Abbreviation
0 not n
1 exclusive or x
2 or o
3 and a

Operation 6 encodes an optional final not operation.

Value Operation Abbreviation
0 no operation  
1 not n

The operations imply the number of input parameters required. Each binary operation pops two arguments off the stack and pushes the result back onto the stack, for a net reduction of one. The unary not operation pops one argument off the stack and pushes the result back onto the stack, for no net change. And when we're finished, we want one final result on the stack. Therefore, the number of items that need to be placed onto the stack is one more than the number of binary operations.

The template and bias tell you what those parameters are. First, the template selects one of the following sequences of elements.

Value Template
0 SPDD
1 SPD
2 SDP
3 (not used)
4 (not used)
5 SSP*DS
6 SSP*PDS
7 SSD*PDS

Two of the templates are not currently used and are reserved for future expansion.

The bias specifies how many initial template elements to ignore. After that, take the template elements until you have consumed one more than the number of binary operations. Also take the asterisk, if you encounter one. And if you run out of template elements, then start over from the beginning.

Okay, now we put all the pieces together.

  • For each template element:
    • If it is a letter, then push that input onto the stack.
    • If it is an asterisk, then perform the next operation.
  • After you have used up all the template elements, perform all of the remaining operations.
  • When you're done, there should be one value left on the stack. That is the result of the BitBlt operation.

Consider the raster opcode 0x00010289. The operation index is 1, which decodes as follows:

P 1 1 1 1 0 0 0 0
S 1 1 0 0 1 1 0 0
D 1 0 1 0 1 0 1 0
Result 0 0 0 0 0 0 0 1

The operation code is 0x0289, which decodes as follows:

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
op5 op4 op3 op2 op1 op6 template bias
0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 1
0 0 0 2 2 0 2 1
n n n o o   SDP 1

There are two binary operation codes (the two or operations), so we will need three parameters.

  • The bias tells us to skip the first template element, so we skip the S.
  • The next element in the template is a D, so we push the destination.
  • The next element in the template is a P, so we push the pattern.
  • We have run out of template elements, so we wrap around and see that we have an S, so we push the source.
  • Now to use up the remaining operations, in order:
  • Operation 1 tells us to pop the top two values from the stack, or them together, and push the result back onto the stack.
  • Operation 2 tells us to pop the top two values from the stack, or them together, and push the result back onto the stack.
  • Operation 3 tells us to pop the top value from the stack, bitwise-not it, and push the result back onto the stack.
  • Operation 4 tells us to pop the top value from the stack, bitwise-not it, and push the result back onto the stack.
  • Operation 5 tells us to pop the top value from the stack, bitwise-not it, and push the result back onto the stack.
  • Operation 6 tells us to do nothing.
  • The final value on the stack is the result of the BitBlt operation.

In RPN, this encodes compactly as DPSoonnn

It looks like a lot of work, but that's because I spelled it out in painstaking detail. A shorter version would be

  • The template says SDP, and the bias is 1, so we start at the D and take three parameters (wrapping around if necessary), which gives us DPS.
  • Then we append the operations, which is oonnn.

Observe that nn bitwise-negates the top item on the stack, and then bitwise-negates it again. The two operations cancel out, which means that any nn sequence can be optimized out. In practice, these nn operations will appear at the end. If you're encoding operations, and you have an even number of leftover operation slots, then you pad out the unused operations with not. If you have an odd number of leftover operation slots, then you put not in all but the last slot. (Operation 6 is the only one that can be left empty.)

Removing the redundant nn leaves us with DPSoon, which matches the value given in the table.

Let's try one of the more complicated operations. Operation index 0xD4 is 0x00D41D78. The operation code is 0x1D78:

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
op5 op4 op3 op2 op1 op6 template bias
0 0 0 1 1 1 0 1 0 1 1 1 1 0 0 0
0 1 3 1 1 1 6 0
n x a x x n SSP*PD 0

There are four binary operations, so we need five parameters. There is no bias, so we start at the beginning with SSP. Next is an asterisk, so we replace it with the first operation, which is x. Then we continue with the template, which gives us PD. And then we perform the leftover operations, which are xaxnn. The resulting RPN is SSPxPDxaxnn, which after canceling out the nn leaves SSPxPDxax. And this matches the value in the table.

If you write a program to apply this algorithm to all of the raster operations, you'll find that decoding the operation code results in the published RPN, with the exception of operation indices 0 and 255.

Let's decode operation index 0, whose operation code is 0x0042.

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
op5 op4 op3 op2 op1 op6 template bias
0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0
0 0 0 0 1 0 0 2
n n n n x   SPDD 2

This decodes as DDxnnnn, which simplifies to DDx, but the published RPN is just 0. Aha, but we also know that anything xor'd with itself is zero, so DDx simplifies further to 0.

Similarly, operation index 255 has operation code 0x0062 which decodes as follows:

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
op5 op4 op3 op2 op1 op6 template bias
0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0
0 0 0 0 1 1 0 2
n n n n x n SPDD 2

This decodes as DDxnnnnn, which simplifies to DDxn, and since we learned that DDx is 0, this gives us 0n, which simplifies further to just 1, which matches the published RPN.

Back to history: As I noted earlier, the raster opcodes were initially 16-bit values, and the idea was that BitBlt implementations would parse and execute the expressions encoded in the operation code. In practice, people just looked up the value in a table of precomputed operation codes and used that to decide how to perform the operation. As a result, the operation index was added to the raster opcode, so that implementations who want to use a table lookup have a single byte to look up instead of having to do a binary search on the 16-bit operation code. Both the operation index and operation code are present so that the values work both with drivers that use lookup tables and drivers which parse the operation code and execute the miniature expression language.

Over time, the drivers that executed the miniature expression language died out. All anybody cares about nowadays is the operation index.

Comments (7)
  1. Antonio Rodríguez says:

    After introducing the index, it became clear that you can only use those 256 combination of opcodes. But, before that, was it legal to use any combination, or were the options limited from the beginning?

    1. Fabian Giesen says:

      There are 3 binary inputs (source, pattern, destination). Therefore, there are 2^3 = 8 possible input bit patterns, and the 8-bit (256-value) op code is sufficient for a full truth table that specifies the (1-bit) output for all possible combinations of inputs. All binary-valued functions of 3 binary inputs can be written in this form, and since the older encoding only lets you produce functions of those 3 input bits, everything the expression-style encoding can produce has a corresponding truth table.

      It’s not immediately obvious to me whether the expression-like encoding is sufficient to express all possible functions of the 3 bits (although it probably is). The easiest way to do so would probably be to enumerate all 2^16 options and determine the truth table for all of them, and then you can see whether you hit all 256 possible truth tables.

      The 8-bit encoding of 3-input truth tables turns turns out to be convenient for hardware – the implementation is regular, quite small, and fast. It shows up in other places like the Amiga blitter http://amigadev.elowar.com/read/ADCD_2.1/Hardware_Manual_guide/node011C.html or, much more recently, the “VPTERNLOGD” / “VPTERNLOGQ” instructions in Intel’s AVX-512. Larger truth tables (4 or 6 bits input, typically) are used as primary building blocks in FPGAs.

      1. Clearly the expression-based version was able to encode all 256 operations, seeing as that’s how it was done before indices were added, and there’s a proof via explicit enumeration.

    2. Neil says:

      For each bit there are two values for each of P, S and D. You can therefore build a truth table with 8 rows showing the resulting bit for the input bits of P, S and D. (If you design the truth table correctly, you can then read up the result column to obtain the binary raster operation index.) Since there are 2 output values for each of those 8 rows, there are therefore 256 different truth tables and raster operation indexes.

    3. You had to use the official operation codes, because, as I noted in the article, drivers in practice did not parse the operation code. They just looked it up in a table. So your operation code had better be in the table.

  2. Joker_vD says:

    I am so glad I don’t have to know any of this stuff, so I can read it for a mere entertainment value, to learn how people of the yore had to blit their bits uphill both ways in the snow, and they liked it!

    1. R P (MSFT) says:

      In practice, most of us people of yore just used the predefined values like SRCCOPY or DSTINVERT or whatever. There’s almost no call for any other ROP codes, but if we needed them we could find them in that table.

      So we didn’t need to know this stuff, which was good because we didn’t know this stuff. If it was documented anywhere it was only in the DDK (which was more difficult to get in those days.)

Comments are closed.

Skip to main content