Message Framing, Part 2

Although there are several different legal sequences of records in the framing format, the first few records after establishing a connection are always the same. The first record describes the version of the framing protocol, the second record describes the mode of the framing protocol, the third record describes the via used for delivery, and the fourth record describes the message encoding.

A version record starts with the byte 0 followed by a byte for the major version and a byte for the minor version. The difference between a minor version and a major version is that a minor version cannot introduce any new record types. We've only used major version 1, minor version 0 so far.
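As a rough illustration of the layout just described, here is a minimal Python sketch that builds and parses a version record. The function and constant names are made up for this example and are not part of any framing library.

```python
from typing import Tuple

VERSION_RECORD_TYPE = 0x00  # a version record starts with the byte 0


def encode_version(major: int, minor: int) -> bytes:
    """Build the three-byte record: record type, major version, minor version."""
    return bytes([VERSION_RECORD_TYPE, major, minor])


def decode_version(data: bytes) -> Tuple[int, int]:
    """Return (major, minor) read from the first three bytes of data."""
    if len(data) < 3:
        raise ValueError("a version record needs 3 bytes")
    if data[0] != VERSION_RECORD_TYPE:
        raise ValueError(f"expected record type 0, got {data[0]}")
    return data[1], data[2]


record = encode_version(1, 0)
print(record.hex())  # 000100
```

For major version 1, minor version 0, the record is simply the three bytes 00 01 00.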

A mode record starts with the byte 1 followed by a byte for the mode. Mode 1 indicates that the framing protocol is being used to exchange a single message in a request-reply fashion. This is used for streamed TCP connections. Mode 2 indicates that the framing protocol is being used to exchange several duplex messages. This is used for buffered TCP connections. Modes 3 and 4 are used for packaging MSMQ messages. You'll see those modes if you look at the messages in the queue when sending messages using the NetMsmqBinding.
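The mode record can be sketched the same way. The enum member names below are invented labels for this example (the text above only gives the numeric values), so treat them as assumptions rather than official names.

```python
from enum import IntEnum

MODE_RECORD_TYPE = 0x01  # a mode record starts with the byte 1


class FramingMode(IntEnum):
    # Member names are illustrative only; the values come from the text above.
    SINGLE_MESSAGE = 1  # one request-reply message, streamed TCP
    DUPLEX = 2          # several duplex messages, buffered TCP
    MSMQ_MODE_3 = 3     # used for packaging MSMQ messages
    MSMQ_MODE_4 = 4     # used for packaging MSMQ messages


def encode_mode(mode: FramingMode) -> bytes:
    """Build the two-byte record: record type, then the mode byte."""
    return bytes([MODE_RECORD_TYPE, int(mode)])


def decode_mode(data: bytes) -> FramingMode:
    """Parse a mode record; unknown mode values raise ValueError."""
    if len(data) < 2:
        raise ValueError("a mode record needs 2 bytes")
    if data[0] != MODE_RECORD_TYPE:
        raise ValueError(f"expected record type 1, got {data[0]}")
    return FramingMode(data[1])


print(encode_mode(FramingMode.DUPLEX).hex())  # 0102
```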

A via record starts with the byte 2 followed by the size of the via string and followed by the via itself. The via is a standard URI encoded using UTF-8. To prevent a client from tricking the server into reading an excessively long via, there's a default limit of 2 KB on the via length. A server can use a different limit if it likes, or no limit at all, although we recommend picking some reasonable limit based on the length of addresses that you expect.
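Here is a hedged sketch of how a server might read a via record while enforcing the limit before it reads the via body. It assumes the size counts UTF-8 bytes, that the size field carries 7 bits per byte with the low-order group first, and that 2 KB means 2048 bytes. The helper names and the example address are hypothetical.

```python
import io

VIA_RECORD_TYPE = 0x02          # a via record starts with the byte 2
DEFAULT_MAX_VIA_LENGTH = 2048   # assumed byte count for the 2 KB default


def read_size(stream) -> int:
    """Read a size field, 7 bits per byte (low-order group first is assumed)."""
    size, shift = 0, 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a size field")
        size |= (chunk[0] & 0x7F) << shift
        if chunk[0] & 0x80 == 0:   # high bit clear: this was the last byte
            return size
        shift += 7
        if shift > 28:             # refuse sizes longer than 5 bytes
            raise ValueError("size field is too long")


def read_via(stream, max_length=DEFAULT_MAX_VIA_LENGTH) -> str:
    """Read one via record, rejecting an oversized via before reading it."""
    if stream.read(1) != bytes([VIA_RECORD_TYPE]):
        raise ValueError("expected a via record")
    length = read_size(stream)
    if max_length is not None and length > max_length:
        raise ValueError(f"via of {length} bytes exceeds limit {max_length}")
    raw = stream.read(length)
    if len(raw) != length:
        raise EOFError("stream ended inside the via")
    return raw.decode("utf-8")


# Hypothetical address; it is short enough that its size fits in one byte.
via_bytes = "net.tcp://localhost:8000/service".encode("utf-8")
record = bytes([VIA_RECORD_TYPE, len(via_bytes)]) + via_bytes
print(read_via(io.BytesIO(record)))
```

Passing `max_length=None` corresponds to a server that chooses to have no limit at all.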

The sizes in a variable-sized record are encoded to optimize for messages that use smaller sizes. Each byte of the encoded size contributes 7 bits to the size value and uses its most significant bit to indicate whether more size bytes follow. When the most significant bit is 1, the next byte in the stream continues the size. When the most significant bit is 0, the size is done. Therefore, a small size, such as 100 (or anything up to 127), can be encoded in a single byte while a larger size uses multiple bytes. This is similar in spirit to how UTF-8 uses a variable number of bytes to encode characters.
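The size encoding can be sketched in a few lines of Python. One detail the description above leaves open is the order of the 7-bit groups; this sketch assumes the least significant group is written first, which is the usual convention for this style of encoding. The function names are hypothetical.

```python
from typing import Tuple


def encode_size(size: int) -> bytes:
    """Encode a non-negative size, 7 bits per byte, low-order group first."""
    if size < 0:
        raise ValueError("size must be non-negative")
    out = bytearray()
    while True:
        group = size & 0x7F        # take the next 7 bits
        size >>= 7
        if size:
            out.append(group | 0x80)   # high bit set: more bytes follow
        else:
            out.append(group)          # high bit clear: last byte
            return bytes(out)


def decode_size(data: bytes) -> Tuple[int, int]:
    """Return (size, number of bytes consumed) from the start of data."""
    size = 0
    for index, byte in enumerate(data):
        size |= (byte & 0x7F) << (7 * index)
        if byte & 0x80 == 0:
            return size, index + 1
    raise ValueError("size field is truncated")


print(encode_size(100).hex())     # 64
print(encode_size(2048).hex())    # 8010
print(decode_size(b"\x80\x10"))   # (2048, 2)
```

A size of 100 fits in the single byte 0x64. A size of 2048 needs two bytes: the low seven bits are all zero, so the first byte is 0x80 (continuation bit only), and the remaining bits give 0x10.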

Next time I'll continue by talking about the use of message encodings.