Why are there two incompatible ways of specifying a serial port baud rate?


One of my colleagues tracked down a bug in their code that communicates over a serial port. (Remember serial ports?)

The DCB structure specifies the baud rate as an integer. To request 2400 baud, you set the BaudRate to 2400. There are some convenient defined constants for this purpose.

#define CBR_110             110
#define CBR_300             300
#define CBR_600             600
#define CBR_1200            1200
#define CBR_2400            2400
#define CBR_4800            4800
#define CBR_9600            9600
#define CBR_14400           14400
#define CBR_19200           19200
#define CBR_38400           38400
#define CBR_56000           56000
#define CBR_57600           57600
#define CBR_115200          115200
#define CBR_128000          128000
#define CBR_256000          256000
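
For concreteness, here is a minimal sketch of how those constants get used, assuming <windows.h> is included and a handle hCom was already obtained from CreateFile on something like "COM1" (hypothetical names; error handling omitted):

DCB dcb = { 0 };
dcb.DCBlength = sizeof(dcb);
GetCommState(hCom, &dcb);   // start from the port's current settings
dcb.BaudRate = CBR_2400;    // a plain integer: 2400
SetCommState(hCom, &dcb);   // apply the new configuration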

Meanwhile, the COMMPROP structure also has a way of specifying the baud rate, but it is done by setting the dwMaxBaud member to a bitmask:

#define BAUD_075          ((DWORD)0x00000001)
#define BAUD_110          ((DWORD)0x00000002)
#define BAUD_134_5        ((DWORD)0x00000004)
#define BAUD_150          ((DWORD)0x00000008)
#define BAUD_300          ((DWORD)0x00000010)
#define BAUD_600          ((DWORD)0x00000020)
#define BAUD_1200         ((DWORD)0x00000040)
#define BAUD_1800         ((DWORD)0x00000080)
#define BAUD_2400         ((DWORD)0x00000100)
#define BAUD_4800         ((DWORD)0x00000200)
#define BAUD_7200         ((DWORD)0x00000400)
#define BAUD_9600         ((DWORD)0x00000800)
#define BAUD_14400        ((DWORD)0x00001000)
#define BAUD_19200        ((DWORD)0x00002000)
#define BAUD_38400        ((DWORD)0x00004000)
#define BAUD_56K          ((DWORD)0x00008000)
#define BAUD_128K         ((DWORD)0x00010000)
#define BAUD_115200       ((DWORD)0x00020000)
#define BAUD_57600        ((DWORD)0x00040000)
#define BAUD_USER         ((DWORD)0x10000000)

My colleague accidentally set the DCB.BaudRate to a BAUD_xxx value, and since these values are untyped integers, there was no compiler warning.
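
To reconstruct the mistake (a made-up illustration, not the actual code), the bad assignment looks perfectly plausible and compiles without a peep:

dcb.BaudRate = BAUD_9600;   // oops: BAUD_9600 is 0x00000800 = 2048, so this asks for 2048 baud
dcb.BaudRate = CBR_9600;    // intended: the plain integer 9600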

My colleague asked for the historical background behind why there are two easily-confused ways of doing the same thing.

The DCB structure dates back to 16-bit Windows. It tracks the feature set of the 8250 UART, since that is what came with the IBM PC XT.¹ In particular, there is no need to ask what baud rates are supported by the serial chip because you already know what baud rates are supported by the serial chip: The 8250 and 16550 support baud rates that are divisors of 115200.²

Enter Windows NT. This operating system wanted to run on things that weren't IBM PCs. Crazy. In particular, those systems may have serial communications chips that support a different set of baud rates. That's where the COMMPROP structure came in: It reports baud rates as a bitmask that is filled out by the GetCommProperties function. That way, the program that wants to do serial communications can find out what baud rates are supported by the current hardware. And since it's reporting a set of values, a bitmask seems the natural way of representing it.

The program inspects the bitmask, decides which of the available baud rates it wants to use, and puts the desired value (as an integer, not a bitmask) in the BaudRate member of the DCB.
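
Roughly, that flow looks like this sketch (same assumed hCom handle as above; current SDK headers expose the settable-rate bitmask as the dwSettableBaud member, and error handling is again omitted):

COMMPROP prop = { 0 };
GetCommProperties(hCom, &prop);          // ask the driver what the hardware supports
if (prop.dwSettableBaud & BAUD_19200) {  // bit test against the BAUD_xxx mask
    DCB dcb = { 0 };
    dcb.DCBlength = sizeof(dcb);
    GetCommState(hCom, &dcb);
    dcb.BaudRate = CBR_19200;            // the integer CBR_xxx value, not the BAUD_xxx bit
    SetCommState(hCom, &dcb);
}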

That's my attempt to reverse-engineer the history of the two incompatible ways of representing baud rates.

¹ The PS/2 line introduced the 16550 UART which is backward-compatible with the 8250. In particular, it supports the same baud rates.

² Other baud rates like 110 are approximations. For example 110 is really 115200 ÷ 1048 = 109.92 baud. This article claims that microcontrollers "rarely offer an internal oscillator that has accuracy better than ±1.5%," so an error of 0.07% is easily lost in the jitter.
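
If you want to play with the arithmetic, here is a tiny sketch (illustrative only; the UART itself just counts down a programmed divisor, it doesn't do floating point):

double RateFromDivisor(unsigned divisor)
{
    return 115200.0 / divisor;   // e.g. RateFromDivisor(1048) is about 109.92, roughly 0.07% shy of 110
}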

Comments (24)
  1. xcomcmdr says:

I remember serial ports all too well. The GPS on some laptops can only be accessed through a (virtual) serial port.

It's also a USB device listed as a network card.

Why so complicated?!

    1. The MAZZTer says:

Well, if you're making a device to talk over a wire, serial is a lot simpler to code than a full networking stack, I imagine. It depends on how lightweight the device is going to be and whether you can stuff Linux or some other OS on it to abstract those problems away, I would assume.

Having the USB interface impersonate a generic device means you don't have to write your own drivers, and it will work out of the box everywhere.

    2. parkrrrr says:

      See the comment below about NMEA-0183. If your GPS receiver already needs to speak that protocol, which it pretty much does if you plan to sell it in certain markets, then it's just plain cheaper/easier to layer some other protocol on top of that one.

  2. morlamweb says:

    Serial ports are never far from my mind. One of the products that I support transfers data from small lab instruments - balances, pH meters, titrators, and the like - to PCs. The most common port for these instruments is still RS232c with a DB9 socket. Nowadays, many instruments come with an array of modern ports, but you'll still see a DB9 socket in the mix. We use a serial-to-ethernet converter to hook these instruments up to the network, which simplifies things on the PC side, but we still have to work with the messy details of serial ports.

    1. Brian_EE says:

Pet peeve - there is no such thing as a DB-9 connector. D stands for D-shell (because of the shape), and the second character is the shell size. B is the shell size for a 25-pin connector (a la the printer parallel port).

      The correct nomenclature is DE-9. E size shell is for the standard density 9 pin connector, or the 15 pin high density connector.

      1. morlamweb says:

        Off to the nitpickers corner with you.

      2. bhtooefr says:

        Of course, then you've got the 19-pin 2-row standard-density appropriately-sized-shell connectors that Apple used for floppy drives, and Atari used for ACSI (their SASI/SCSI-adjacent protocol), which isn't a standard D-sub size at all.

        A lot of people just call that DB-19, even though it's not in a B-size shell - it's smaller. But, it's not DA-19 either - that only has room for 15 pins. I usually go for "19-pin D-subminiature" and don't specify the shell size...

        Of course, this is getting really tangential to the article...

  3. We still have to deal with them a lot. NMEA 0183 talkers still inherently use them (though they're generally RS422, rather than RS232).

  4. SimonRev says:

I can attest to the jitter bit; they really are fairly forgiving. When I was an intern, I used VB6 to talk to a device over the serial port. The device used a baud rate of 57600, but VB6's serial port used an enum (presumably modeled after the COMMPROP structure) that only offered 56k.

    The code worked for years without any apparent problems. I eventually brought it up with one of the EEs who said that usually you got at least 5% jitter before errors started to creep in.

    1. Ben Hutchings says:

      In an asynchronous serial protocol like RS-232, every transition between 0 and 1 allows the receiver to resynchronise its receiving bit clock. RS-232 specifies a stop and start bit between each byte (up to 8 bits) of user data, allowing resynchronisation every 10 bit times at worst. If the receiver samples the value in the middle of each bit time, that gives you 5% margin.

Why would you divide 115200 by 1048 = 2^3 * 131? Was it a confusion of 1024 and 2048? Wouldn't it make more sense to divide 115200 by 1024 = 112.5?

    1. parkrrrr says:

      Probably because 1048 is the divisor that gives you a value close to 110. 110 baud as a standard predates the 8250 by quite a few years.

    2. Brian_EE says:

      That, and this isn't C code doing the counting. It's generally implemented as a counter that loads the value from a register, and counts down to 0. At 0, a timing pulse is sent to the enable of the shift register to clock the bit in.

      And I'll correct Raymond a bit - it wasn't that the 8250 only supported those rates - the divisor value was completely programmable. Those were just the common baud rates everyone used, and the driver enumerated those and programmed the appropriate values into the chip registers.

  6. The MAZZTer says:

Alright, I have a question. Why are constants so heavily favored in the Windows API for such things? Wouldn't it make more sense to use an enumeration? Especially with modern tools, an enum gets you IntelliSense or similar functionality, and the compiler will detect when you make mistakes like this. I would imagine there must be some downside if the Windows team went the constants route.

    1. SimonRev says:

      I think that Raymond answered that one way way back (like in the first year of this blog). I cannot find the post now, but IIRC, it went something along the lines that originally Windows headers had to work with both Pascal and C (or maybe just Pascal) and #defines were the lowest common denominator. There might also have been something along the lines of not all C compilers supported enums back in the mid eighties or at least there was no standard way for enums to work back then.

      1. McBucket says:

        "originally Windows headers had to work with both Pascal and C (or maybe just Pascal) and #defines were the lowest common denominator."

Pascal doesn't have "#define" macros, as far as I know. Older versions of the Windows API were defined using the '__pascal' calling convention (https://en.wikipedia.org/wiki/X86_calling_conventions#pascal), but that's no longer true.

        "There might also have been something along the lines of not all C compilers supported enums back in the mid eighties or at least there was no standard way for enums to work back then."
I used the Lattice C compiler starting in 1983, and enums were supported, as far as I can recall. Enums are just integral constants, so I'd be hard-pressed to understand how they would work in any way that would not be compatible with a, um, non-standard way...

        1. Joshua says:

          Enums used to be chars.

        2. ErikF says:

          Enumerations are signed integers, which makes them not so great for bitfields: without casting, you likely will get compiler complaints (and you have to look out for sign extension!)

      2. parkrrrr says:

        I think "not all C compilers supported enums" might be the right answer. K&R 1st edition mentions #define, but not enum, so it likely wasn't a viable option until ANSI C was widely available.

        1. mikeb says:

I'm pretty sure the main reason for using #define instead of enums is that C programmers back in the day simply were more accustomed to using #define for manifest constants. It still seems to be preferred by many. Maybe because if you want to give a string literal a zero-overhead name, #define is the simplest way to go even today, so why not do the same thing for numbers?

          Another reason (though I think this probably happened more after-the-fact than by plan) is that you can preprocess #defines for use in assembly code pretty easily and the same is not true for enums. Remember that MS-DOS was 100% (or pretty close to 100%) assembly language, and I wouldn't be surprised if a decent chunk of 16-bit Windows was too.

    2. Pierre Bisaillon says:

The main reason is that enum had no defined storage size; you can't say whether it will be 1 byte, 2 bytes, or 4 bytes, which can be a huge problem when defining a fixed interface for methods.

  7. Neil says:

    I've always needed a convenient constant to represent 5 minutes... Now I can use CBR_300!

  8. Karellen says:

That's where the COMMPROP structure came in: It reports baud rates as a bitmask that is filled out by the GetCommProperties function. That way, the program that wants to do serial communications can find out what baud rates are supported by the current hardware. And since it's reporting a set of values, a bitmask seems the natural way of representing it.

    Wait, it reports a set of values, as a bitmask, in a field called dwMaxBaud?

    1. I'm pretty sure proper API design is an NP-Hard problem in the real world.

Comments are closed.
