As random as I wanna be: Why cmd.exe’s %RANDOM% isn’t so random


Somebody on my team reported that a particular script in our project's build process would fail with bizarre output maybe once in fifty tries. This script was run from a makefile, and the result was a failed build. Rerunning make fixed the problem, but that's not much consolation when the build lab encounters it approximately every other day. The strange thing about the bizarre output was that it appeared to contain a mix of two different runs. How could the output of two runs be mixed into one output file?

The script was a batch file, and it generated its output in a few different steps, storing the intermediate output in randomly-named temporary files, taking advantage of the %RANDOM% pseudovariable to generate the name of those temporary files. (They were %TEMP%\%RANDOM%.tmp1, %TEMP%\%RANDOM%.tmp2, you get the idea.)

Cutting to the chase: The reason for the mixed output was that the %RANDOM% pseudovariable wasn't random enough. If two copies of the script start within the same second, they get the same "random" number and end up mixing their output together. (And running multiple builds at the same time is something the people in the build lab are wont to do.)

It turns out that the Windows command processor uses the standard naïve algorithm for seeding the random number generator:

   srand((unsigned)time(NULL));

Since time has a resolution of one second, two command prompts launched in rapid succession have a good chance of seeding the random number generator with the same timestamp, which means that they will have the same random number stream.
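
Here is a rough sketch of the effect in Python rather than C. It assumes the C runtime's rand() is a linear congruential generator with the widely reported constants 214013 and 2531011; that is an assumption about the runtime, not something taken from the command processor's source. The name msvc_rand_stream is made up for illustration.

```python
import time


def msvc_rand_stream(seed, count):
    """Return `count` values from an LCG using the constants commonly
    reported for the Microsoft C runtime's rand() (assumed, not verified here)."""
    state = seed & 0xFFFFFFFF
    values = []
    for _ in range(count):
        # Advance the 32-bit state, then expose bits 16..30 as the result.
        state = (state * 214013 + 2531011) & 0xFFFFFFFF
        values.append((state >> 16) & 0x7FFF)
    return values


# Two "processes" launched within the same second read the same time() value,
# so they seed identically and produce identical streams.
now = int(time.time())
first_process = msvc_rand_stream(now, 5)
second_process = msvc_rand_stream(now, 5)
print(first_process == second_process)  # prints True
```

The point is only that the stream is a pure function of the seed: same second, same seed, same "random" file names.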

C> copy con notsorandom.cmd
@pause
@echo %RANDOM%
^Z
        1 file(s) copied.

C> for /l %i in (1,1,3000) do @cmd /c notsorandom.cmd
// hold down the space bar
Press any key to continue . . .
14153
Press any key to continue . . .
14153
Press any key to continue . . .
14153
Press any key to continue . . .
14153
Press any key to continue . . .
14156
Press any key to continue . . .
14156
Press any key to continue . . .
14156
Press any key to continue . . .
14156
Press any key to continue . . .
14156
Press any key to continue . . .
14160
Press any key to continue . . .
14160
Press any key to continue . . .
14160

Notice that the %RANDOM% pseudovariable generates the same "random" number until the clock ticks over another second. (Notice also that the "random" numbers don't look all that random.)
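
The "don't look all that random" part can be sketched as well. Assuming (and it is only an assumption) that the generator behind %RANDOM% is the C runtime's linear congruential rand() with multiplier 214013 and increment 2531011, bumping the seed by one second moves the first output by only 3 or 4, modulo 32768. That would be consistent with the 14153, 14156, 14160 progression above. The function names here are hypothetical.

```python
def first_rand_for_seed(seed):
    """First value produced after seeding the assumed LCG with `seed`."""
    state = (seed * 214013 + 2531011) & 0xFFFFFFFF
    return (state >> 16) & 0x7FFF


def first_value_steps(start_seed, count):
    """Differences (mod 32768) between the first outputs of consecutive seeds."""
    firsts = [first_rand_for_seed(start_seed + i) for i in range(count + 1)]
    return [(b - a) % 32768 for a, b in zip(firsts, firsts[1:])]


# Seeds one second apart: each step is 3 or 4, because
# 214013 = 3 * 65536 + 17405 and only the upper bits of the state are returned.
print(first_value_steps(1_300_000_000, 10))
```

So two scripts started a second apart don't collide, but their first "random" numbers are practically neighbors.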

We fixed the script so it generated its temporary file in the project's output directory rather than in the (shared) %TEMP% directory. That way, even if two copies of the project are building at the same time, they will generate their temporary files in different directories and not step on each other.
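
For comparison, here is a sketch of the same idea in Python, not what our batch script does. It keeps temporary files inside the build's own output directory and additionally lets the standard library pick a name and create the file exclusively, so a name collision cannot silently reuse somebody else's file. The function name create_build_temp and the directory layout are hypothetical.

```python
import os
import tempfile


def create_build_temp(output_dir, suffix):
    """Create a uniquely named, empty temp file inside this build's own
    output directory and return its path."""
    os.makedirs(output_dir, exist_ok=True)
    # mkstemp creates the file exclusively, so it never opens an existing one.
    fd, path = tempfile.mkstemp(suffix=suffix, dir=output_dir)
    os.close(fd)
    return path


if __name__ == "__main__":
    # Demo in a throwaway directory standing in for a project's output directory.
    with tempfile.TemporaryDirectory() as project_root:
        out = os.path.join(project_root, "obj")
        print(create_build_temp(out, ".tmp1"))
        print(create_build_temp(out, ".tmp2"))
```

Separate directories already keep concurrent builds of different projects apart; exclusive creation only matters if two steps of the same build could ever race for a name.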

Exercise: There is much subtlety in that for command. Describe alternative formulations of the for command, both those that work and those that don't. To get you started: Explain the output of this variation:

for /l %i in (1,1,300) do @(pause&echo %RANDOM%)

Obligatory batch file bashing: Every time I write an entry about batch files, you can count on people complaining about how insane the batch programming language is. The batch language wasn't designed; it evolved. (And according to commenter Daev, it followed a form of parallel evolution from what most people are familiar with.) I doubt anybody actually enjoys writing batch files. At best you tolerate it. I'm just trying to make it slightly more tolerable. I bet these are the same people who complain to their tax preparer about the complexity of tax law.

Comments (52)
  1. Marquess says:

    Every time a batch file is launched, it should display a message (in red) to use powershell already.

  2. akamal says:

    But why does this seem to work without generating the same number twice?

    echo %random%  & echo %random% & echo %random% & echo %random% & echo %random% & echo %random% & echo %random% & echo %random%

    & echo %random% & echo %random% & echo %random% & echo %random% & echo %random% & echo %random% & echo %random%

  3. akamal says:

    Ah, I think Johannes's explanation of

    for /l %i in (1,1,300) do @(pause&echo %RANDOM%)

    has already answered my question!

  4. Thomas Léo Horn says:

     for /l %i in (1,1,300) do @(pause&echo %RANDOM%)

    will display the same number every time as %RANDOM% gets evaluated before for.

  5. Peter da Silva says:

    Microsoft should just be including Interix as a standard part of all versions of Windows. It's not that big, it's practically a one-click install, and bash is far more powerful than "powershell".

  6. Jeff says:

    Couldn't you use microseconds for the seed instead of seconds?

  7. Carmen says:

    Peter da Silva: I have to believe, with that statement, you are saying you have not used PowerShell much for serious batching requirements.  The idea that going back to a *nix style stream batching system is more powerful than the object-model that PowerShell uses is fairly outrageous to anyone who has done serious work with both.  You can PREFER the Interix environment because it is familiar and you can get YOUR work done faster, but to say it is more powerful is going too far.

  8. laonianren says:

    With delayed environment variable expansion enabled (cmd /V:ON) you can write:

    for /l %i in (1,1,300) do @(pause&echo !RANDOM!)

  9. Doug says:

    If you want %RANDOM% to be re-evaluated each time the loop iterates, you'd need to enable delayed variable expansion.  There seems to be a lot of ways to do so, from "setlocal ENABLEDELAYEDEXPANSION", to launching cmd.exe with the /v flag.

  10. Alexandre Grigoriev says:

    Where is an obligatory XKCD reference?

  11. The Cygwin bash pseudovar $RANDOM does not seem to have this problem:

    $ for (( i = 0; i < 5; i++ )); do bash -c 'echo $RANDOM'; done

    17548

    22540

    2492

    15387

    30852

    Documentation here: http://www.gnu.org/…/bashref.html

    Looking at the source (bash-4.1/variables.c), it seeds the PRNG with the XOR of seconds, microseconds, and PID (the former two coming from gettimeofday()).

  12. James Gayhart says:

    Oblig XKCD Ref: http://xkcd.com/221/ (For Alexandre) :)

  13. blah says:

    The tax preparer is a valid target of frustration. His industry exists only because the US tax code is on par with Cisco source code. You scratch my back, I scratch yours.

  14. Robert Konigsberg says:

    Can you get a hold of the current process id in batch? If so that should be sufficiently distinct.

  15. James says:

    >  are wont to do

    did you mean "won't do" or "want to do"?

    [I really mean the word "wont". -Raymond]
  16. Michael Entin says:

    .NET Random class has similar problem, it uses  GetTickCount() to seed random values – slightly better than time(), but not much. One would still get into trouble if he creates many Random objects at about the same time.

  17. keithmo says:

    @aaawww if you use a random directory for temp files, you can close the "if exist/type" race:

    set _retry=100

    :Loop

    set _tmpdir=%tmp%\%random%

    mkdir "%_tmpdir%" >nul 2>&1

    if not errorlevel 1 goto :GotIt

    set /A _retry -= 1

    if %_retry% GTR 0 goto :Loop

    set _tmpdir=

    echo FAIL

    goto :EOF

    :GotIt

    echo _tmpdir = "%_tmpdir%"

    goto :EOF

  18. James Schend says:

    Other James: It's usually a good idea to consult a dictionary before criticizing the spelling of smart people. :) Your post reminded me of the Maddox article where he complains about people "correcting" his spelling of "inane" ("it's spelled insane!")

  19. James Schend says:

    Michael Entin: Yes, but the documentation for it: msdn.microsoft.com/…/system.random.aspx is very clear on that point, so (hopefully) it shouldn't be a source of confusion. It flat-out says you should either use a single Random object, or provide your own seed to avoid this problem. Plus it gives a code sample of the wrong way of using it.

    Of course that doesn't stop people who don't read the documentation but… what can you do.

  20. mikeb says:

    @James (not Schend): http://www.merriam-webster.com/…/wont

    Obligatory (well, maybe not) snarks:

    I'll tell you what – batch file bashing is much more fun than bash file batching. And it won't do to complain about what complainers are wont to do. It may actually be insane (but not inane) to do so.

    OK, I'm done. Hopefully for a long time.

  21. Tom says:

    Hey, don't be hating on the batch language again, Raymond.  It's a great language that has well withstood the test of time.  It's one of the least changed and longest lasting things in the history of PCs.  Sure it's gained some nice new capabilities.  But I can still write batch files today that are just like ones I wrote 20 years ago, just as quick and easy to create, and just as useful.  How many other things in computing can you say that about?  It's not like other languages that have been replaced by vastly better alternatives, such as BASIC, QBASIC, VB, C, C++, and others.

    Long live batch files!  Reliably getting the job done without a fuss for decades.

  22. Timothy Byrd says:

    "First you install PowerShell."

    Now you have two problems.

    I've been using 4DOS/4NT/TakeCommand for years. I'd love to wean myself off of them and go to PowerShell. But every time I try to do something useful in PS, it's like smacking into a wall. Hmmm… Do I spend two hours figuring out how to do this in PowerShell or do I spend 2 minutes getting it done in a batch script? (Typically this revolves around copying files, and the JPSoft products let me specify date ranges among other things.)

    And my solution to the random number issue would be to use the %_pid variable as part of the filename…

  23. Gabe says:

    The reason Powershell is not set to execute scripts by default (and incidentally why Interix isn't installed by default) is because it's just a giant attack surface. For 99% of users, the only Powershell (or bash) script they would ever execute is one that gets disguised as some other kind of file in an email attachment. By making a small, affirmative step to make it easy to execute scripts, the 1% of users who want them can easily get them, while protecting the other 99% of users for whom its existence is only harmful.

    And I agree that an object-based shell is orders of magnitude more powerful than a simple text-stream-based shell. This should be obvious the first time you do "ps -aux | grep httpd" and don't get any column headers but do get a line for the grep command!

  24. Alexandre Grigoriev says:

    @Gabe:

    Let's proceed with bashing the "hide extensions" thing.

  25. JamesNT (neither Schend nor the other James) says:

    Thanks, Raymond, this actually solves a problem for me.  You are quite correct in that batch programming isn't fun, but it is a useful tool and gets things done.  Evolution is a messy thing.

    JamesNT

  26. Johannes says:

    Marquess: Except that you can't use PowerShell on older Windows versions. And it only comes with the very latest version of Windows preinstalled. And it's by default set to *not* execute scripts. Face it: Batch files are one of the most reliable ways to automate stuff on nearly every Windows machine – provided the person writing them is proficient enough to avoid common pitfalls and errors.

    Raymond: I enjoy writing them. If you see it as an esolang it can be quite fun to solve problems in it, actually. And as far as those go it's one of the easier ones to write programs in.

    As for the command

     for /l %i in (1,1,300) do @(pause&echo %RANDOM%)

    that's plain obvious: You're using the same PRNG instance every time and with the LCG that cmd (or rather the C runtime) uses the number last generated is the next seed.

  27. Uli Gerhardt says:

    > Every time a batch file is launched, it should display a message (in red) to use powershell already.

    Yes. On *your* screen.

  28. Teo says:

    Heh, just *installing* Powershell is an epic quest. But batch files are just there, ready for use. Same for jscript+wsh, which can use COM objects so they have access to XML/SQL/http/etc. And using HTAs you have quite nice UI capabilities. Which CAN be grokked by a mere mortal, in contrast to WPF madness.

  29. Marquess says:

    “.NET Random class has similar problem, it uses  GetTickCount() to seed random values – slightly better than time(), but not much. One would still get into trouble if he creates many Random objects at about the same time.”

    .NET has Guid.NewGuid(). That should help.

  30. aaawww says:

    unserializableone.blogspot.com/…/create-unique-temp-filename-with-batch.html

    call :GETTEMPNAME

    echo "Temp file name is %TMPFILE%"

    goto :EOF

    :GETTEMPNAME

    set TMPFILE=%TMP%\mytempfile-%RANDOM%-%TIME:~6,5%.tmp

    if exist "%TMPFILE%" GOTO :GETTEMPNAME

    type NUL >  %TMPFILE%

    :EOF

    if you *really* want that unique temporary name (yes collision may happen if they start within the same centisecond, the type may not prevent them)

    I wonder if there is a way to create a mutex around the if exist and the type commands

  31. siddi says:

    Still, you could make it work in the boundaries of bat world. Use setlocal to enable delayed expansion so that RANDOM gets evaluated at the execution time. Try this:

    setlocal enabledelayedexpansion

    for /l %%i in (1,1,5) do @(pause&echo !RANDOM!)

    endlocal

  32. Jonathan says:

    Of course, the cardinal sin was placing temp files outside of the project's output directory ($(O)). I'm glad it was fixed correctly, instead of hacking pseudo-unique temp filenames.

  33. Keep on Truckin' says:

    Windows script host with WMI is very powerful.

    It takes effort and ability to do a good job as this technology is not trivial.

  34. Mc says:

    Of course random means, random.    The sequence 5,5,5,5   could come out of the random number generator,  not likely perhaps but it's still a perfectly valid random sequence.   So even if the seed etc. was being handled correctly it doesn't guarantee that they still wouldn't have got "unlucky" sometimes.

  35. fail says:

    > 1. The reason Powershell is not set to execute scripts by default is because it's just a giant attack surface

    > 2. You can't use PowerShell on older Windows versions

    Two contradicting arguments against using powershell.

    If powershell is "a giant attack surface" then batch should definitely have been disabled by default, because then batch is a *more than* giant attack surface.

    If the attack surface is giant because most people have powershell installed *), then there's no reason to use batch for gaining any significant more compatibility.

    *) larger install base makes more malware – according to windoze zealots which usually complain there's more malware for windows because windoze has a larger install base (not because windows has tons of security vulnerabilities).

  36. Cheong says:

    @Mc: Theoretically correct, but most random number generators that claim to be "statistically random" or "true random" actually do make an effort to skew away from giving repetitive / consecutive sequence numbers (because of the "Kolmogorov randomness" definition).

  37. jsc42 says:

    To get %random% to expand every time through the loop without setting delayed expansion in your environment, quote the % and use back quotes in an additional nested loop. E.g.

    for /l %i in (1,1,300) do for /f "usebackq" %j in (`echo ^%random^%`) do echo %j

  38. Judago says:

    Ok I think my comment didn't go through(scripts disabled).

    I think the subtlety is "cmd /c" vs "call". "cmd /c" starts a new process that presumably doesn't know about the last, while "call" causes inheritance.

    My guess is that cmd tries to be more random than the current second if called more than once, but the first call is simply the second. Each new process doesn't know about the last so they all settle for the current second.

  39. John says:

    Haha, I suppose I'm maybe one of the few remaining people that actually enjoy writing batch files.  The rest hang out at alt.msdos.batch.nt.  So much so that I created the Batch Library in Progress (blip) library on Sourceforge.

    Just goes to show that there are strange people out there.

  40. Judago says:

    @john

    I think you need to add a delimiter:

    which "findstr=a"

  41. ERock says:

    Pretty unexpectedly loud bashing of Powershell. (unintended pun)

    I can no longer target Windows 3.1 with Visual Studio. Does that mean I shouldn't be writing applications? That Powershell does not work with versions of Windows that are at or near EOL is a silly complaint. Anything you wanted to use Powershell for on older environments you probably already did with a combination of batch files and custom executables.

    Add to that Powershell DOES run scripts by default: signed ones from a trusted source. It does not run unsigned scripts by default. The big issue with batch files in a production environment is that the sysadmins have full access to the batch file and can change things. Maybe that doesn't matter in your environment, but I'm happy that there is separation of duty between the developers that have the keys to sign cmdlets, the sysadmins that have rights to keep the lights on, and the security folks that maintain group policies around those registry keys.

  42. doug.kavendek says:

    @Judago: "but the first call is simply the second."  — hah, got me hung up for a moment trying to parse that…

  43. David Walker says:

    @ERock: Windows 3.1?  Huh?  

  44. Gabe says:

    Peter da Silva: If you think bash is a dataflow engine and Powershell isn't, you obviously haven't used Powershell. To call bash a "dataflow engine" is being quite generous with the term. At best it works if all of your data is in a trivial newline-delimited text format, which means you can't even deal with CSV files because they can have commas (the field delimiter) and newlines (the record delimiter) inside of fields. Useful manipulation of complex formats like httpd.conf files or email folders is out of the question.

    So, say you want to do something simple, like getting the stats on all httpd processes. Getting even a simple "ps -aux | grep httpd" to do what you expect takes several iterations before you end up with something like "ps -aux | head -1; ps -aux | grep httpd | grep -v grep". In Powershell it would just be "get-process -name httpd" (or using the "dataflow" aspect, "get-process | where {$_.Name -eq 'httpd'}").

    Now say you want to do something slightly harder, like finding out how much CPU time the oldest httpd process has used. In Powershell it's at least as simple as "(get-process -n httpd | sort starttime)[0] | select cpu". I'd love to see how you'd write a "dataflow" version of that task in UNIX.

    As for the security aspect, one problem is the users. By default PS1 files open in Notepad, so you don't have to worry about users double-clicking email attachments and inadvertently running malicious code. However, that wouldn't stop attackers. They would just send emails like "To see this famous celebrity sex tape, save this attachment to your desktop, then drag it to powershell.exe" along with the attachment. So why isn't this a problem with BAT files and EXEs? It is: it's nearly impossible to email a file with a .bat or .exe extension even among non-Windows users. If they gave Powershell unrestricted script execution by default, it would just be another type of file you couldn't email.

  45. Peter da Silva says:

    Two responses:

    1. I don't use Powershell for the same reason that I don't write scripts in Basic, or install a DCL emulator.

    The UNIX shell is a dataflow engine. The UNIX pipe-filter dataflow model is incredibly powerful, in a way that a conventional Algol-style scripting language can't match… whether it's object-oriented or not. The typical example of "cool powershell scripts" that I've seen touted here are typically 30-40 lines of dense code to replicate something that's a one-liner as a pipeline.

    2. The idea that you deal with the problem of "active content" in email by not installing scripting engines is bizarre.

    First of all, you can embed an executable application in a printable text string that'll pass through email or web undamaged, so someone can feed you a poison pill that's native code. What are you going to do, disable EXE files?

    Second, you deal with "active content" in email by not implementing a mechanism to run "active content" directly from email in the first place. Don't treat *any* untrusted content as trusted. Don't launch applications from the mail reader or web browser to display files because you "trust" their MIME-type or extension. Don't use the same set of applications you normally run from the desktop (windows explorer) as handlers and helpers for applications that are supposed to be dealing with untrusted content.

    The whole design of the Microsoft HTML control and the apps and applets that wrap around it boggled my mind when I first saw it in 1997. I responded by banning Outlook and IE at our office. That made me a hero when the flood of email viruses and malware taking advantage of what they originally called "Active Desktop" showed up in the next few years.

    Instead of hiding useful tools, ship them. If there's design flaws in other applications that might make it a little easier to exploit the system if you install them… fix those… and fix them properly. Sheesh.

  46. Judago says:

    @doug.kavendek

    second was unexpected at this time.

  47. MadQ1 says:

    The problem with %RANDOM% is that it's random. If several temporary files are in use at the same time, you'll eventually run into a collision. In one test, 32768 sequential uses of %RANDOM% produced only 12068 unique numbers. If you really want to use %RANDOM%, I would suggest something like this (in the spirit of the original example):

    C> copy con sortarandom.cmd

    @pause

    @set /a number=%number% + 1

    @echo %number%

    ^Z

           1 file(s) copied.

    C> set number=%RANDOM%

    C> for /l %i in (1,1,3000) do @call sortarandom.cmd %number%

    Using the project output directory solved half the problem. This will give you sequential numbers, so you avoid name collisions. It's still far from perfect, though, and you might as well start out with 0, or any other arbitrary number instead of %RANDOM%.

    If you absolutely insist on random but unique file names, use GUIDs instead:

    for /l %I in (1,1,3000) do @for /f "delims=" %J in ('uuidgen') do @echo %J.tmp

  48. John says:

    I also have enjoyed writing batch files, probably the largest system I maintained that was controlled by batch files amounted to around 4000 lines of cmd.  By the end of my tenure I was so engrossed with it I often considered writing my own implementation of cmd but sanity got the better of me in the end.

  49. Yves says:

    If I take the description of the windows help: "If Command Extensions are enabled, …  These variable values are

    computed dynamically each time the value of the variable is expanded. … %RANDOM% – expands to a random decimal number between 0 and 32767. …"

    In Unix/Linux terms, the fact that a second invocation in the same second does not generate a new number would be considered as a bug.  In windows terms, you are given the opportunity to upgrade to PS.  But I still consider it as a bug.

  50. Dan says:

    @Gabe: do not bash what you do not understand (pun intended). Your "complex" PS one-liner is possibly even simpler with Bash (in Linux):

    ps -C apache2 -o bsdtime=|sort -nr|head -1

    PS and bash/sh/csh just approach problems with a different philosophy. Which one is better, well, I do not know, but personally, for complex problems I would use a different scripting language anyways.

  51. Anon says:

    I like the use of "wont" to trip up nitpickers.

  52. Gabe says:

    Dan: I must say that, while it points to an interesting answer, your solution doesn't do what I wanted. My program returned the CPU time of the oldest process, whereas yours returns the one that has used the most CPU. However, as far as I can tell, the "bsdtime" column is formatted as "MMM:SS" which will cause the sort program to not work quite right (does "1000:00" come after "999:59"?). Of course, you left off the column header also, as it's hard to get that sorted properly.

    My reading of the GNU ps man page (more like 18 pages: unixhelp.ed.ac.uk/…/man-cgi) indicates that "ps -C apache2 -o bsdtime=CPU --sort start|head -2" should do it. In other words, the only way to do it is to go against the UNIX philosophy of "Write programs that do one thing and do it well" and instead put all the functionality you need directly into each program. Since text-based pipelines are useless for this task, you have to put the grouping, formatting, column selection, row selection, and sort functions all into the main ps program. Is there any good reason a process listing program needs to know the size of my terminal window?

    Fortunately, Powershell properly embodies the UNIX philosophy. The get-process cmdlet does nothing but return processes (with a few common options); the where cmdlet does only filtering; the sort cmdlet does only sorting; and the select cmdlet does only selection. A PS user can learn the simple select/where/group/sort cmdlets and apply them to files, processes, or anything else. A UNIX user has to learn the grep/cut/sort commands, then learn that they can't apply to processes the way he wants, and has to read 18 pages of documentation on ps to get the job done (assuming he has GNU ps installed, and not some lame old version of ps that just lists processes).

Comments are closed.