How can I generate a consistent but unique value that can coexist with GUIDs?


A customer needed to generate a GUID for each instance of a hardware device they encounter:

The serial number for each device is 20 bits long (four and a half bytes). We need to generate a GUID based on each device, subject to the constraints that when a device is reinserted, we generate the same GUID for it, that no two devices generate the same GUID, and that the GUIDs we generate not collide with GUIDs generated by other means. One of our engineers suggested just running uuidgen and substituting the serial number for the final nine hex digits. Is this a viable technique?

This is similar to the trap of thinking that half of a GUID is just as unique as the whole thing. Remember that all the pieces of a GUID work together to establish uniqueness. If you just take parts of it, then the algorithm breaks down.

For this particular case, you're in luck. The classic Type 1 GUID uses 60 bits to encode the timestamp and 48 bits to identify the location (computer). You can take a network card, extract the MAC address, then smash the card with a hammer. Now you have a unique location. Put your twenty bits of unique data as the timestamp, and you have a Type 1 GUID that is guaranteed never to collide with another GUID.
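
The layout described above can be sketched in a few lines of Python. This is only an illustration, not the customer's driver code: the `RETIRED_NODE` value is a made-up placeholder for the MAC address of the card you destroyed, the function name `device_guid` is hypothetical, and the clock sequence is arbitrarily fixed at zero. The serial number is stored in the 60-bit timestamp field, split across the three time fields the way a Type 1 GUID lays them out.

```python
import uuid

# Placeholder for the MAC address read from the retired network card.
# This value is invented for illustration; substitute the real one.
RETIRED_NODE = 0x001122334455


def device_guid(serial: int, node: int = RETIRED_NODE) -> uuid.UUID:
    """Build a Type 1 GUID whose timestamp field holds the serial number."""
    if not 0 <= serial < (1 << 60):
        raise ValueError("serial number must fit in the 60-bit timestamp field")

    time_low = serial & 0xFFFFFFFF            # low 32 bits of the timestamp
    time_mid = (serial >> 32) & 0xFFFF        # middle 16 bits
    time_hi = (serial >> 48) & 0x0FFF         # high 12 bits
    time_hi_version = time_hi | (1 << 12)     # version nibble = 1

    clock_seq_hi_variant = 0x80               # RFC 4122 variant, clock sequence 0 (assumed)
    clock_seq_low = 0x00

    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


if __name__ == "__main__":
    print(device_guid(0xABCDE))
```

Calling `device_guid` twice with the same serial number yields the same GUID, and two different serial numbers differ in the timestamp field, which is what the customer's constraints ask for.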

If you have more than 60 bits of unique data, then this trick won't work. Fortunately, RFC 4122 explains how to create a so-called name-based UUID, which is a UUID that can be reliably regenerated from the same source data. Section 4.3 explains how it's done. The result is either a Type 3 (MD5) or Type 5 (SHA-1) UUID, depending on which variant of the algorithm you choose.
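
As a rough sketch of the name-based approach, Python's standard `uuid` module exposes `uuid3` and `uuid5`, which combine a namespace UUID with a name and hash them. The `DEVICE_NAMESPACE` value below is an invented placeholder (in practice you would generate one namespace GUID once and hard-code it), and `name_based_guid` and the choice to use the hex string of the serial number as the name are assumptions made for this example.

```python
import uuid

# Invented placeholder namespace for this family of devices.
# Generate your own once and keep it fixed forever.
DEVICE_NAMESPACE = uuid.UUID("6f1c2a9e-3b4d-4e5f-8a7b-9c0d1e2f3a4b")


def name_based_guid(serial: bytes) -> uuid.UUID:
    """Derive a Type 5 (SHA-1) UUID from a serial number of any length."""
    # The name is the serial number rendered as lowercase hex (an assumption;
    # any stable, unambiguous encoding of the serial number would do).
    return uuid.uuid5(DEVICE_NAMESPACE, serial.hex())


if __name__ == "__main__":
    print(name_based_guid(bytes.fromhex("0abcde")))
```

The same serial number always produces the same result, with no database and no state to keep coherent. Swapping `uuid.uuid5` for `uuid.uuid3` gives the MD5-based Type 3 variant instead.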

Comments (18)
  1. Falcon says:

    "You can take a network card, extract the MAC address, then smash the card with a hammer. Now you have a unique location."

    To be on the safe side, buy a brand new card for this purpose – hopefully, its MAC address has never been used to generate a GUID. Then, to get more value for your money, find a creative way to destroy the card and film it!

  2. Stephen Cleary says:

    Or just use a crypto-random MAC address with the broadcast bit set (no need to destroy hardware). I'd recommend the name-based approach, though. Easier to clean up afterwards.

    nitoprograms.blogspot.com/…/few-words-on-guids.html

  3. Joshua Ganes says:

    The customer requirements don't give me enough information to confirm or deny whether an alternative approach would work. The fact that they are generating a GUID for a (supposedly) already unique serial number seems a bit strange to me. It is not clear what purpose this will serve.

    I would suggest storing a mapping from serial number to a GUID in a database. When looking up a device, you can find its GUID very simply by querying the database. If no entry exists, simply generate a new one.

    [That introduces race conditions and coherency issues. Oh, and the component in question happens to be a kernel mode driver, so it's not like you can load up SQL Server. -Raymond]
  4. laonianren says:

    "20 bits long (four and a half bytes)"

    What is the C language octadecimal prefix?

  5. Brian says:

    @Joshua: What if the hardware device is used on different machines, all of which must generate the same Guid but which are not always in communication with each other or a database?

  6. Barbie says:

    Ah if only MACs were really unique… (e.g. forum-en.msi.com/index.php). It's pretty sad to see NIC providers mess that tenet up.

  7. Joshua Ganes says:

    @Brian: Precisely. That's why I can't confirm the alternative approach based on the current information. I always prefer the simple approach to the complicated one. The question still remains — why can't they just use the serial number as a unique identifier?

    [The context here is a driver for a PnP device, and I'm assuming that Windows wants PnP devices to be assigned a unique GUID. -Raymond]
  8. acq says:

    @Brian: the customer is in control of the process, he can write the software which gives guid based on the inserted hardware. "Access to the database" in a sense "connect to SQL server" is just not needed. The question was simply how they can produce always the same "guid" (128 bits) from the fixed 20 bits they have (map their 20 bits to the 128 bits that are then unique GUID).

  9. Joshua Ganes says:

    Raymond, thanks for the additional context information. When you can, do it simple. When you can't, listen to Raymond Chen.

  10. blah says:

    20 bits is 2.5 bytes.

    [I'm better at math than arithmetic. Didn't occur to me that I'd have to check the customer's arithmetic too. -Raymond]
  11. Isn't everyone using Type 4 GUIDs nowadays?  The replace-the-last-20-bits technique should work just fine for those.

    [No, because that's not random any more. -Raymond]
  12. True, replacing the last 20 bits of a type 4 GUID does not preserve the type 4-ness of the GUID.

  13. Ben says:

    @Falcon after filming it, upload the video to youtube and take the unique video id and append it to your guid!

  14. Mike Dunn says:

    The "smash a NIC with a hammer" story is one of my favorite stories ever.

  15. NL says:

    Manufacturers re-use MAC addresses.  They are therefore not precisely unique.

  16. Dmitry Kolosov says:

    "take a network card, extract the MAC address, then smash the card with a hammer"

    Is there a way to find out which one of my three network cards (two Ethernet adapters, one WiFi) I should destroy?

    By the way, can't you fake a card's MAC address programmatically?

  17. pkorona says:

    UUIDGEN will also produce a range of sequential GUIDs. You can use that to provide a base address for SN#0.

  18. Stephen Cleary says:

    @pkorona:

    Sorry, but that approach won't work. It is very wrong, in fact.

    A quote from the link I posted above: "Sequential GUIDs are not actually sequential. In normal circumstances, GUIDs being generated by the same computer will have gradually increasing Timestamp fields (with the other fields remaining constant). However, the Timestamp field is not in the least-significant bit positions of the GUID, so if the GUID is treated as a 128-bit number, it does not actually increment."
