Why is the syntax for touching a file from the command prompt so strange?


The magic incantation for updating the last-modified date on a file is

COPY /B FILE+,,

What strange syntax! What's with the plus sign and the commas, anyway?

The formal syntax is the much more straightforward

COPY /B A+B+C+D

This means to start with the file A, then append the files B, C, and D, treating them all as binary files.
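As a rough illustration of what the concatenation form does, here is a minimal Python sketch. It is not how COPY is implemented. The function name append_binary and the file names are made up for the example, and it only mimics the "start with A, then append the rest as raw bytes" behavior.

```python
import os
import tempfile


def append_binary(first, *others):
    """Append the raw bytes of each file in `others` to `first`, in order."""
    # "ab" = binary append: bytes are copied as-is, with no text translation.
    with open(first, "ab") as out:
        for name in others:
            with open(name, "rb") as src:
                out.write(src.read())


# Demo with throwaway files (A, B, C, D are placeholder names).
workdir = tempfile.mkdtemp()
paths = {}
for name, data in [("A", b"a\x1a"), ("B", b"b"), ("C", b"c"), ("D", b"d")]:
    paths[name] = os.path.join(workdir, name)
    with open(paths[name], "wb") as f:
        f.write(data)

append_binary(paths["A"], paths["B"], paths["C"], paths["D"])

with open(paths["A"], "rb") as f:
    combined = f.read()
print(combined)
```

The 0x1A byte in the sample data is deliberate: it is the kind of byte a text-mode copy could treat specially, whereas a binary append passes it through untouched.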

If you omit the B+C+D part, then you get

COPY /B A+

This means "Start with A, then append nothing." The side effect is that the last-write time gets updated, because the command processor opens A for append, writes nothing, then closes the handle.

That syntax has worked since at least MS-DOS 2.1 (the earliest version I still have a virtual machine for).
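For comparison, the net effect of COPY /B A+ (a newer last-write time, same contents) can be sketched in Python. This is only an analogy under stated assumptions: it sets the timestamp directly with os.utime rather than opening the file for append the way the command processor does, and touch_existing is a made-up name.

```python
import os
import tempfile
import time


def touch_existing(path):
    """Set the last-modified time of an existing file to now; contents are untouched."""
    # With times=None, os.utime stamps the current time. It raises
    # FileNotFoundError if the file is missing, so nothing is created.
    os.utime(path, None)


# Demo: create a file, backdate it by an hour, then touch it.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"unchanged contents")
os.close(fd)

an_hour_ago = time.time() - 3600
os.utime(demo_path, (an_hour_ago, an_hour_ago))
before = os.stat(demo_path).st_mtime

touch_existing(demo_path)
after = os.stat(demo_path).st_mtime

with open(demo_path, "rb") as f:
    contents = f.read()

os.remove(demo_path)
```

After the call, the modification time is newer than the backdated one and the file's bytes are the same as before.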

I don't know where the two-comma version came from, but it most likely exploited a parsing glitch in COMMAND.COM, and somehow this variant gained traction and became the version everybody used (even though the other version is two keystrokes shorter). As a result, this weird syntax has been grandfathered in as a special case in the CMD.EXE parser. Here's some actual code from the part of CMD.EXE which parses the arguments to the COPY command:

if (parse_state == SEEN_TWO_COMMAS)
    copy_mode = TOUCH;
Comments (38)
  1. AC says:

    Somehow this post gives me the feeling the code consists of 90% back-compat crutches piled on top of each other all the way back to V1.0, 5% error handling, and 5% actual functionality.

  2. 12BitSlab says:

    I wrote a compiler in 1994 for a client that generated documents from a description language.  The client wanted me to update it in 2007 to speed things up, since they use this compiler to generate about 30,000,000 pages a month.  When working on it, I found a bug in the original code, but I couldn't fix it.  There were too many pieces of "source code" that depended on the bug in the compiler.  The good news is that V2 of the compiler shaved about 85% of CPU time off of a compile, but the bug is still there.  Drives me nuts, but the cost of fixing the source descriptions would have been way too high.

  3. Joshua says:

    @AC: You should see the incredible amount of back-compat junk modern UNIX carries around. However, in this case there's a decent chance that removing a piece of back-compat junk on a modern UNIX box causes it to not work (/bin/sh: who uses that anymore, I'll just get rid of it (or be smart and link it to bash) oops no boot (bash needs libraries in /usr, sh doesn't)).

  4. Rick C says:

    Seems kind of odd that after all this time, nobody at Microsoft actually wrote a touch utility to replace this hack.

  5. Mathieu Garstecki says:

    > Somehow this post gives me the feeling the code consists of 90% back-compat crutches piled on top of each other all the way back to V1.0, 5% error handling, and 5% actual functionality.

    This is the case for any software I get to maintain, at least any that is more than a year old. At least the code seems readable here !

    This kind of magic trick in cmd.exe is the reason why I use PowerShell or install sh.exe any time I have to script something on Windows.

    Pre-emptive snarky comment : of course, now I have 2 (or 3) problems.

  6. @Joshua: actually, on Linux the problem is usually the reverse: for a long time sh has been symlinked to bash by default, so many scripts with the #!/bin/sh shebang actually exploited features of bash without anyone noticing. For this reason, when Debian (and derivatives) switched /bin/sh to dash (a recently written, stripped-down POSIX-sh-conformant interpreter, hopefully faster and with a smaller memory footprint than bash), it uncovered a lot of "bashisms" buried in many scripts.

  7. alegr1 says:

    Another quirk of COPY: append is done in "text" mode by default. This is why /B switch is there.

  8. > Somehow this post gives me the feeling the code consists of 90%

    > back-compat crutches piled on top of each other all the way back

    > to V1.0, 5% error handling, and 5% actual functionality.

    Which is why, when they turn off the very last desktop, it will be running code that contains the echoes of DOS v1.0.

    In many cases, the alternative to throwing away backward compatibility is to not have your code being run on anything at all :-).

  9. BJC says:

    Is it correct that this method only works if the file being touched is in the current directory? If the filename includes a path to a directory other than the current directory, the file is simply copied to the current directory (and the modified time isn't changed). If the filename doesn't have a path, and the file is in the current directory, then the operation works exactly as I'd expect.

    Certainly that's what I found previously when trying this from a batch file. I've just tried it again with the demo batch file in the referenced KB article.

    Am I doing something wrong, or is this expected behaviour?

  10. kinokijuf says:

    > Somehow this post gives me the feeling the code consists of 90%

    > back-compat crutches piled on top of each other all the way back

    > to V1.0, 5% error handling, and 5% actual functionality.

    Case in point: the AT-compatible BIOS. Except there is no error handling.

  11. Joshua says:

    Incidentally, neither is a valid touch, because both will, if given path\to\file, copy the file to the current directory.

  12. BJC says:

    @BJC – Ok just realised the mistake.

    It is necessary to specify a destination folder, so the format is as follows:

       copy /b D:\mypath\test.file +,, D:\mypath

    Sorry to have raised some confusion.

  13. JM says:

    Well, I learned something new today. The touch.exe I have on my system is actually unnecessary. And given my fluency with PowerShell, I guess the rest of the GNU utils could go as well. Still, sed and I have too much nostalgia going on to just discard it.

  14. henke37 says:

    This reminds me of a "feature" I discovered yesterday. It's not an error if one of the files is missing.

  15. Gabe says:

    Just recently I wanted to look for the cmd equivalent to touch, and all I could find was the "copy" variant. Apparently everybody forgets that touch will not only set the time on the file to "now", but it will also create the file if it doesn't exist.

    Since I only wanted to create the file if it doesn't exist (and don't care about its timestamp), the copy command was useless. What I ended up using was "type nul >> file.to.create" (which will have no effect on an existing file).

    So a more complete version of touch would have to have both commands.

  16. dave says:

    >somehow this variant gained traction and became the version everybody used

    Sort of like the way many people type "*.*" to mean "all files, even ones that don't have a dot in the name".

    Me, I just type "*"

    (Yes, I understand the whys and wherefores: it starts with the fact that the "." was not actually on disk in Ye Olde Fatte Phyle Systemmes)

  17. Brad Westness says:

    > alegr1: Another quirk of COPY: append is done

    > in "text" mode by default. This is why /B switch is there.

    That seems entirely reasonable though, doesn't it? After all, concatenating two binary files will most often result in something useless. However, concatenating two text files will result in a perfectly readable text file.

  18. Silly says:

    @Brad Westness: Agreed, and it would be a pain to have to type something like an extra "/T" whenever I need to combine multi-part uuencoded files.

  19. Killer{R} says:

    AFAIK, "," is an argument separator, just like a space, but visible. So ",," could be a visual hint for humans that there is an "empty" (nothing) argument.

  20. Mark Y says:

    I am going to ask this question whenever I get a chance.  Since someone mentioned .. in FAT, here is my chance: how did FAT implement them?  It had no "hardlinks" or similar, so what did it do?

  21. Falcon says:

    @Myria:

    The Win32 layer does this. Win32 file names have an optional implicit dot at the end, if that makes sense. I just tested this in CMD with the MORE command with stdin redirection – even if the file name has dots in it, you can add many dots at the end and you'll still be referring to the same file. I made sure that redirection did not work with a wildcard name (wouldn't make sense if it did).

    You can find an explanation in the comments of this post:

    blogs.msdn.com/…/1211409.aspx

  22. Gabe says:

    Mark Y: The . and .. directories are implemented in FAT the same way any file or directory is implemented. A directory entry consists of 8 bytes for the filename, 3 bytes for the extension, some flags, the starting cluster of the file/directory, and a bunch of other metadata.

    The "." entry has the filename set to ".", the extension blank, the directory flag set, and the starting cluster set to whatever cluster that directory starts in. The ".." entry is the same, but has the filename set to ".." and the starting cluster set to whatever its parent directory's starting cluster is.

    This is very similar to how it works in UFS (directory entries contain a name and an inode number, which performs the same function as the starting cluster for these purposes). The reason Unix has "hard links" and FAT doesn't is that UFS has a "link count", so it knows when a file has no directory entries pointing to it and the space can be reclaimed.

    FAT doesn't have a count, so it always assumes that there is only one entry and the space is marked free whenever a directory entry is deleted. Creating a "hard link" in FAT is as simple as creating a directory entry with another file's starting cluster. That would work fine up until you went to delete one of the links.

    Since FAT doesn't keep a reference count, it doesn't know how many links there are to a given directory, but it does know when a directory has entries in it. So it simply prevents you from removing a directory unless it's empty.

  23. Myria says:

    @dave: I think . and .. have been in the FAT filesystem for quite a while, perhaps back to the advent of directories in DOS.

    The *.* thing is because DOS considered filenames and extensions to be separate entities, so you had to match them separately.  NT doesn't give special meaning to . anymore, so * works.  I bet that *.* matches dotless filenames in NT just because of a kernel32 hack in FindFirstFileW.  I'm used to typing * instead of *.* by this point.

    I wish that the special meaning of "com1", "aux", "prn", etc. would go away unless a new compatibility shim is applied.  A colon or \. would be required in newer programs.  I am sick and tired of directories named "aux" getting committed to our Subversion repository and having to clean up the mess so Windows can update Subversion again.  I also wish open-source groups weren't so hostile to Windows and unwilling to rename something trivial to help out Windows users – I'm looking at you, libunwind.

  24. cheong00 says:

    [After all, concatenating two binary files will most often result in something useless]

    I think one of the compression archivers got its first SFX extension in stub EXE form (much like the loader of certain viruses found nowadays). If you wanted to create a self-extracting archive, you just compressed it and appended it to the EXE file, and you were done.

    It wasn't until the next version that they added SFX as an archiver option (and officially included SFX as one of their features).

  25. Neil says:

    @Gabe: That's a good one. I always used to use rem > file on DOS and copy nul file on NT to create an empty file (neither works on the other).

  26. ErikF says:

    Raymond (or whoever has these), how well does the DOS 2.1 VM work and which VM software do you use? I have found that some of the older DOS versions seem to misbehave in v86/emulation, so just wanted your thoughts.

  27. Joshua says:

    @ErikF: New CPUs can't turn off the A20 gate anymore. Try Bochs.

  28. Karellen says:

    What if you want to append the files "FILE" and ",,"?

  29. Yuhong Bao says:

    @Joshua: False. It is still there, though no physical pins are used in modern Intel CPUs, instead Virtual Legacy Wires are used.

  30. Yuhong Bao says:

    @Joshua: Actually, reading the 8-series datasheet looks like they finally ditched it.

  31. Yuhong Bao says:

    @Joshua: In fact, looks like they ditched it in the 5-series chipset generation:

    http://www.intel.eu/…/5-and-3400-chipset-specification-update.pdf

  32. Yuhong Bao says:

    @Joshua: What is funny is this info is in a specification update and was only put there in Dec 2011. I wonder what happened.

  33. dave says:

    @Myria

    >@dave: I think . and .. have been in the FAT filesystem for quite a while, perhaps back to the advent of directories in DOS.

    No, I meant that the "." in "foo.bar" is not stored on the disk in the original FAT (FAT12/FAT16) directory structure.  Therefore "foo." and "foo" cannot be distinguished, for example.  And therefore the pattern "*.*" matches "foo", even though by all that is decent it should not.  People have just got into the DOS-induced habit of expecting "*.*" to match all filenames, even those that don't have a dot.

    I am not talking about the specific entries "." and "..".

  34. David Heffernan says:

    On my machine the syntax is

       touch filename

    Gotta get those GNU tools installed……..

  35. Joshua says:

    [After all, concatenating two binary files will most often result in something useless]

    That's what you think. On Windows, I have more occasion to do this on binary files than text files. On Unix, not so much.

  36. Mark Y says:

    Gabe: Thank you for answering.  I don't suppose you're still checking this thread, so my thank you comes too late.  But your answer made me realize my question was not specific enough.  Directories have metadata, such as size, last-modified date, read-only status, etc.  This is stored in the directory entry I believe.  How are all entries kept in sync?  The simplest thing I can think of is to have "." have the valid info, and the rest don't, but is that really what was done?

    [A little experimenting with this web page should answer your question. -Raymond]
  37. Mark Y says:

    I tried posting yesterday, but my comment did not go through.  I am trying again.  The simulator is awesome, but I don't know DOS 2 well enough to figure this out.  The dir command is less flexible than the modern version.  The attrib command doesn't exist at all.  I don't know what to try.

    [The only commands you need are DIR with no options, and MD. Notice that the timestamp on . and .. are not kept in sync with their antecedents. -Raymond]
