QueryPerformanceCounter is not a source for unique identifiers


This article happened to catch my eye:

I needed to generate some unique number in my application. I could use GUID, but it was too large for me (I need to keep lots of unique identifiers). I found something like this:

[System.Runtime.InteropServices.DllImport("Kernel32.dll")]
private static extern int
    QueryPerformanceFrequency(ref System.Int64 frequency);

[System.Runtime.InteropServices.DllImport("Kernel32.dll")]
private static extern int
    QueryPerformanceCounter(ref System.Int64 performanceCount);

public static long GenerateUniqueId()
{
      System.Int64 id = 0;
      QueryPerformanceFrequency(ref id);
      QueryPerformanceCounter(ref id);
      return id;
}

This code generates Int64 (long) unique number (at least I hope it is unique). The uniqueness is in the scope of process. So two processes can generate the same number, but it should be always unique in a single process (I am not sure about two threads invoking the same GenerateUniqueId() method.

QueryPerformanceCounter retrieves the current value of the high-resolution performance counter, but there is no guarantee that every call to the function will return a different number.

The frequency of the high-resolution performance counter is determined by the HAL. You might think that the RDTSC instruction would be perfect for this purpose, since it returns the number of CPU clock ticks, a value that always increases at a very high rate. But there are many problems with RDTSC. For example, variable-speed processors mean that the rate at which CPU clock elapse varies over time. A million clock ticks might take one millisecond when the computer is running on wall power, but two milliseconds when running on battery power.

If the HAL can't use RDTSC, what does it use instead? Well, as I said, it's up to the HAL to find something suitable. Older motherboards have to make do with the programmable interval timer which runs at 1,193,182 ticks per second (approximately 0.8 microseconds per tick). Newer motherboards can use the ACPI timer which runs at 3,579,545 ticks per second (approximately 0.3 microseconds per tick).

One of the machines in my office uses the ACPI timer for its high-resolution performance counter, so I threw together a quick program to see how close I can get to outracing the ACPI timer by calling QueryPerformanceCounter in rapid succession. With a 1.80GHz processor, the computer manages to call QueryPerformanceCounter quickly enough that only four ticks of the ACPI timer elapse between consecutive calls. We're getting into shouting range of being able to call QueryPerformanceCounter twice and getting the same value back from the ACPI timer. Of course, if the computer had been using the programmable interval timer, it would have been within spitting distance, and upgrading to a 3GHz processor would have taken us over the top.

In other words, you may be lucky today that your CPU isn't fast enough to call QueryPerformanceCounter twice and get the same value back, but it sure looks like we're threatening to get there soon.

Then again, all this back-of-the-envelope calculation is superfluous. All you need is a machine with multiple processors. Get two of the processors to call QueryPerformanceCounter at the same time (or nearly so), and they'll get the same timer value back.

If you want to generate unique 64-bit values, you can just use InterlockedIncrement64.

Comments (29)
  1. - says:

    I know you don’t own the article you linked, but it suggests using SetThreadAffinityMask, which is terrible, unless that thread is only used to gather timing information. If it’s a thread that does grunt work ("Typically, this is the main game thread"), you just have made an application that won’t scale with multiple CPUs… even if it’s launched several times. If every application did that, they all would run in just one core.

  2. At the risk of pointing out the obvious…

    A GUID isn’t big.  It’s only 128 bits; there’s only a 50% space savings using a 64-bit unique identifier over a GUID.

  3. Reinder says:

    I would avoid "just generate a bunch of bits and hope they are unique" schemes such as GUID-based ones. Even with 128 bits, "once in a billion is next Tuesday" if you really have to generate lots of keys.

    InterlockedIncrement64 gives stronger guarantees about the uniqueness of the numbers it generates (you only get a duplicate once every 10^18 or thereabouts keys), and because it is easy to detect that it has generate a duplicate.

  4. Greg Beech says:

    Reinder – GUID generation is hardly "generate and hope", it’s based on a pretty sophisticated algorithm.

    And with 3.40282367 × 10^38 possible values, no matter how many you generate, that’s hardly next tuesday. Even if you generate a billion a second, it would still take over a thousand trillion trillion years before it was likely to generate a duplicate value.

  5. andy says:

    Reinder: depends on your calendar, I guess, when next Tuesday is.

    From RFC4122:

    "Since a UUID is a fixed size and contains a time field, it is possible for values to rollover (around A.D. 3400, depending on the specific algorithm used)."

    Not very often you’ll get to enjoy birthdays with that calendar :)

  6. = says:

    So, it’s IMPOSSIBLE, completely without doubt, to have two idential GUIDs generated? Is there a mathematical algorithm which suggests this? What algorithm is used when I use the Create GUID tool?

    Excluding accidents: GUIDs have always confounded me since who’s to stop someone from <i>intentionally</i> generating an identical GUID with the expressed purpose of causing trouble?

    [Since a GUID includes a timestamp, you need to wait for clock rollover to get a duplicate. Of course, nothing will stop somebody from intentionally generating duplicates, but the consequences depend on what the GUID is used for. It’s like saying “What’s to stop somebody from intentionally generating an identical random number?” -Raymond]
  7. niklas says:

    Before anyone begin screaming "birthday paradox", let me do a preemptive dismissal of their concerns.

    The probality of generating a collision when drawing n numbers out of k is (1-n/k)^n which is approximately (1-n^2/k), all the vanishingly small later terms are dropped. If k is 2^128, n needs to come close to 2^64 to make a collision likely. so if you generate less than 2^61 numbers or so, you should worry more about winning on the lottery than generating a collision.

  8. niklas says:

    Though it is always of course a good idea to not have your app crash and burn if it found a pair of identical GUIDs.

  9. mikeb says:

    The really nice thing about GUIDs is that you don’t have to worry about which machine/process/whatever is responsible for generating the unique value.

    The only thing I don’t like about GUIDs is that they hurt my eyes.

  10. > Since a GUID includes a timestamp, you need to wait for clock rollover to get a duplicate.

    There’s two problems here.

    1) See RFC4211 4.1.4 – UUIDs don’t include timestamps in later versions.  This is a good thing.

    2) The granularity is only a tenth of a microsecond… so GUIDs generated in the same tenth of a microsecond would have the same timestamp.  GUIDs generated using a low-res system clock would be vulnerable to the resolution of the clock.

    [That’s what 4.1.5 is for. -Raymond]
  11. s/tenth of a second/tenth of a microsecond/

  12. mirobin says:

    The clock sequence is only a portion of the GUID (60bits); it is not required to be unique per generation, just reasonably unique for a single source generating a GUID.  In other words, it is designed to reduce the chances of a collision, not eliminate it.

  13. mikeb says:

    There is no bug in RFC 4122 4.1.5

    The RFC specifies an algorithm that can generate up to 10 million UUIDs per second.

    The RFC suggests a few ways to prevent generating duplicate UUIDs if multiple requests are made within the same timestamp interval, right down to requiring the generator to issue an error or stall if no other option exists:

    4.2.1.2.  System Clock Resolution

      The timestamp is generated from the system time, whose resolution may

      be less than the resolution of the UUID timestamp.

      If UUIDs do not need to be frequently generated, the timestamp can

      simply be the system time multiplied by the number of 100-nanosecond

      intervals per system time interval.

      If a system overruns the generator by requesting too many UUIDs

      within a single system time interval, the UUID service MUST either

      return an error, or stall the UUID generator until the system clock

      catches up.

      A high resolution timestamp can be simulated by keeping a count of

      the number of UUIDs that have been generated with the same value of

      the system time, and using it to construct the low order bits of the

      timestamp.  The count will range between zero and the number of

      100-nanosecond intervals per system time interval.

      Note: If the processors overrun the UUID generation frequently,

      additional node identifiers can be allocated to the system, which

      will permit higher speed allocation by making multiple UUIDs

      potentially available for each time stamp value.

  14. Mr. Black says:

    Riiight ….

    2^128 = 10^35.5

    1000000 * 60 * 60 * 24 * 365.25 * 1000000 = 10^19.5

    In other words, if you generated a GUID every microsecond for one million years, you would still have used one thousand-trillionth of the available keys.

  15. Nix says:

    public static long GenerateUniqueId()

    {

       System.Int64 id1 = 0, id2 = 0;

       QueryPerformanceCounter(ref id1);

       do {

           QueryPerformanceCounter(ref id2);

       } while(id2 == id1);

       return id2;

    }

    [Works great until you create a second thread. -Raymond]
  16. > RFC 4211

    s/4211/4122/ – sorry.

    I see that 4.1.5 will solve the problem across generators.

    But GUIDs generated in the same tenth of a second *by the same generator* are still going to have the same timestamp… and as I read the RFC, they’ll still have the same clock sequence.  So 4.1.5 doesn’t solve the problem.

    (I consider this a bug in the RFC… the sentence “the UUID generator can not be sure that no UUIDs were generated with timestamps larger than the value to which the clock was set,” should probably read “the UUID generator can not be sure that no UUIDs were generated with timestamps *equal to or* larger than the value to which the clock was set,”)

    But anyway, guidgen uses v4*, which has no timestamp at all (or node identifier either) and is the closest of all the UUID schemes to a random number.

    * not necessarily conclusive proof that v4 is better than v1

    [Yeah, should be “equal to or larger than”. It seems clear that the intent was to use the clock to protect against time standing still as well as time going backwards. -Raymond]
  17. Phaeron says:

    There are some dual core systems that have unsynchronized clocks such that QueryPerformanceCounter() returns inconsistent counter values, making it more likely that an identical value can be seen by a thread calling QPC() as the kernel scheduler moves the thread to different cores. It’s even possible for the time value to go backwards.

  18. Drak says:

    Is the internal GUID generator used in VB.Net succeptible to doubles if many are generated? Asking because a quick Google turned up no usable results, just a bunch of pages where people made their own GUID generators in VB…

  19. Mike says:

    Recent motherboards also have HPET support (http://www.intel.com/hardwaredesign/hpetspec_1.pdf) that have a minimum frequency of 10MHz (100ns – coincidentally the same as the resolution of a FILETIME), and is allowed to implement up to 10^15Hz (that’s 1000 tera Hz, I think it’ll suffice for a while :-) ).

    Doesn’t recent MS HAL’s use the HPET functionality (for QueryPerformance*) if it’s available?

    Not that it’s any good for a unique identifier, that indeed should be handled by InterlockedInc, just thought I’d mention it as timers were mentioned.

  20. Mihailik says:

    Don’t be too smart :-)

    Good old ++ operator beats performance counters in virtually any field.

  21. Peter Bassett says:

    What about values that have to be reasonably unique (within a set time period) but non sequential? I.e. they must not enable the user to guess another id based on the one they have been assigned.

    Like the Session ID in asp.net.

    Currently I use the .net RNGCryptoServiceProvider class to generate unique 32 bit ids that give no information about other ids in use in the system. Each id is typically in use for less than 30 minutes so I feel this is a valid solution.

    I’d be interested in what you think.

    Pete

  22. ac says:

    After listening to the SecrityNow podcast it seems like TPM is the perfect solution

  23. Mike says:

    As for GUID’s, it should be noted that they are neither Globally Unique (impossible) nor fully generated ID’s. IIRC "Mario" is a hand-hacked "GUID" identifier part.

    Also, using GUID’s for interfaces can not be trusted. MS themselves have changed interfaces having the same GUID from OS to OS (proof provided upon request).

    I think the CORBA people were one to something, and seemingly MS thikgs so to due to the use of the "lookup by name instead of GUID"…

  24. Mike says:

    Oleg, operator++ for POD types is not atomic. For a single-threaded environment it indeed is superior in performance, but for multi-threaded scenarios you need an atomic operation – where InterlockedInc happens to be the fastest (if inlined).

    As for Peter Bassett’s question, that is indeed an interesting question – one that cryptographers and mathematicians spent considerable time researching. For a likely really, really good implementation I’d suggest having a look at OpenBSD’s network stack. I don’t dubt Vista is also way better than older Windows versions, but last I checked the source code wasn’t available. ;-)

    Basically, what is wanted is a PRGN that doesn’t repeat itself too often (preferrably "never"), and can’t be reproduced without knowing the initial state. For something that must be unique for at least 30 minutes, I’d probably consider also "salting" with at least a millisecond timer, preferrably faster, modulo timer speed vs. value lifetime could suffice. (in terms of 30 minutes vs. QueryPerf*, I’d consider "salt = perf_count % (perf_freq * 1800)"). But how to actually calculate your "unique" PRNG ID… That’s an excerise I leave to you. :-)

    I’d expect that RNGCryptoServiceProvider to be at least decent. Not super-good cryptography stuff, but "good enough" (at least much, much better than MS’s C runtime lib’s rand() function, that even today only gives 15 bits of "randomness", and a child with VB could reproduce the sequences :-) ).

    Wortless knowledge:

    With 2^32 possible values with required unique lifetime of 30 minutes, one could hand out over 2.386.092 unique ID’s every second before a collision occured.

    A gigabit (using 10^9 bits/s here, maybe it’s in reality is 2^30?) NIC could theoretically produce 31.250.000 such 32-bit keys/s, exhausting the keyspace in around 137 seconds.

    A 2GHz CPU making a loop of inc+jmp every clock cycle would wrap around after roughly two seconds.

    Writing the numbers on paper would give you "COBOL fingers".

  25. Steve Nuchia says:

    It took me quite a while to get past the suspicion that GUID weren’t really globally unique.  The timestamp part of the algorithm gets all the attention; it’s actually pretty good but only ensures *local* uniqueness.

    The job of making them globally unique is subcontracted out to the IEEE.  48 of the 128 bits are taken from the MAC address of one of your network interfaces.  How it chooses which interface is not specified.

    So, if all of your network interfaces are using their IEEE-derived globally unique hardware addresses the GUIDs you generate are guaranteed not to duplicate any GUIDs generated anywhere / anywhen else *by any conforming generator*.  If you do not have any NICs or some of them are using software-assigned MAC addresses you’re playing roulette.

  26. Gabest says:

    GUIDs also have their own scope, even if there are two identical in the world there is very little chance of being used for the same thing.

  27. steveg says:

    "One in a million chances happen nine times out of ten" — Terry Pratchett.

  28. The benefit of a GUID over an incrementor is that two different generators can be run at the same time without having to sync up.

    The birthday paradox is always a concern with random generators, but 128 bits is probably enough randomness to serve us well.

    To illustrate the birthday paradox:

    With 2^32 possible values with required unique lifetime of 30 minutes, one could hand out over 2.386.092 unique ID’s every second before a collision occured.

    This is true for an incrementor, which can use all the values.  But for a random generator, this is false… there’s a square-root law in effect if you throw darts at your state space instead of methodically combing it.  So you could probably only get away with about 1400 unique IDs a second using 32-bit random numbers.

    128-bit random numbers, on the other hand, are much much much much better than 32-bit random numbers.  (It’s easy to get 128-bit random numbers; just get four 32-bit random numbers and glue them together.)  I’d even go so far as to say they’re better than GUIDs.

  29. code-o-rama says:

    If you need to accurately time operations in Windows, you’re usually directed to the QueryPerformanceCounter

Comments are closed.