How did wildcards work in MS-DOS?

The rules were simple but led to complicated results.

MS-DOS file names were eleven characters long, with an implicit dot between characters eight and nine. Theoretically, spaces were permitted anywhere, but in practice they could appear only at the end of the file name or immediately before the implicit dot.

Wildcard matching itself was very simple. The program passed an eleven-character pattern; each position in the pattern held either a file name character (which had to match exactly) or a question mark (which matched anything). Consider the file "ABCD····TXT", where I've used · to represent a space. This file name would more traditionally be written as ABCD.TXT, but I've written it out in its raw 11-character format to make the matching more obvious. Let's look at some patterns and whether they would match.

Pattern      Result    Explanation
ABCD····TXT  Match     exact
???????????  Match     all positions are wildcards, so of course they match
ABCD????···  No match  space (position 9) does not match T
A?CD····???  Match     perfect match at A, C, D, and the spaces; wildcard match at the question marks

The tricky part is converting the traditional notation with dots and asterisks into the eleven-character pattern. The algorithm used by MS-DOS was the same one used by CP/M, since MS-DOS worked hard at being backwards compatible with CP/M. (You may find some people who call this the FCB matching algorithm, because file names were passed to and from the operating system in a structure called a File Control Block.)

  1. Start with eleven spaces and the cursor at position 1.
  2. Read a character from the input. If the end of the input is reached, then stop.
  3. If the character just read is a dot, then set positions 9, 10, and 11 to spaces, move the cursor to position 9, and go back to step 2.
  4. If the character just read is an asterisk, then fill the rest of the pattern (from the cursor onward) with question marks, move the cursor to position 12, and go back to step 2. (Yes, this is past the end of the pattern.)
  5. If the cursor is not at position 12, copy the input character to the cursor position and advance the cursor.
  6. Go to step 2.
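
Here is a minimal Python sketch of the six steps exactly as written above. The function name parse_fcb_pattern is made up for the example, and the code follows this description rather than any verified MS-DOS or CP/M source, so treat it as an illustration of the stated rules only.

```python
def parse_fcb_pattern(text: str) -> str:
    """Convert dot-and-asterisk notation into an 11-character pattern.

    Follows the steps in the article; the cursor is 1-based, and a
    cursor of 12 means "past the end of the pattern".
    """
    pattern = [" "] * 11          # step 1: eleven spaces
    cursor = 1                    # step 1: cursor at position 1
    for ch in text:               # step 2: read until input is exhausted
        if ch == ".":             # step 3: blank the extension, jump to 9
            pattern[8:11] = [" "] * 3
            cursor = 9
        elif ch == "*":           # step 4: fill the rest with question marks
            for pos in range(cursor, 12):
                pattern[pos - 1] = "?"
            cursor = 12
        elif cursor != 12:        # step 5: copy and advance
            pattern[cursor - 1] = ch
            cursor += 1
        # otherwise the cursor is at 12 and the character is dropped
    return "".join(pattern)


def show(text: str) -> str:
    """Render spaces as middle dots, matching the article's notation."""
    return parse_fcb_pattern(text).replace(" ", "·")


print(show("ABCD.TXT"))      # ABCD····TXT
print(show("ABCDEFGHIJKL"))  # ABCDEFGHIJK
print(show("A*B.TXT"))       # A???????TXT
```

Tracking the cursor as a 1-based number, with 12 as the "past the end" value, keeps the code in step with the wording of the rules.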

Let's parse a few patterns using this algorithm, since the results can be surprising. Each row of the tables shows the pattern as it stands after the input character has been processed.

First, let's look at the traditional "ABCD.TXT".

Input  Pattern      Cursor  Description
       ···········  1       Initial conditions
A      A··········  2       Copy to cursor and advance the cursor
B      AB·········  3       Copy to cursor and advance the cursor
C      ABC········  4       Copy to cursor and advance the cursor
D      ABCD·······  5       Copy to cursor and advance the cursor
.      ABCD·······  9       Blank out positions 9, 10, and 11 and move cursor to position 9
T      ABCD····T··  10      Copy to cursor and advance the cursor
X      ABCD····TX·  11      Copy to cursor and advance the cursor
T      ABCD····TXT  12      Copy to cursor and advance the cursor

The final result is what we expected: ABCD····TXT.

Let's look at a weird case: the pattern is ABCDEFGHIJKL.

Input  Pattern      Cursor  Description
       ···········  1       Initial conditions
A      A··········  2       Copy to cursor and advance the cursor
B      AB·········  3       Copy to cursor and advance the cursor
C      ABC········  4       Copy to cursor and advance the cursor
D      ABCD·······  5       Copy to cursor and advance the cursor
E      ABCDE······  6       Copy to cursor and advance the cursor
F      ABCDEF·····  7       Copy to cursor and advance the cursor
G      ABCDEFG····  8       Copy to cursor and advance the cursor
H      ABCDEFGH···  9       Copy to cursor and advance the cursor
I      ABCDEFGHI··  10      Copy to cursor and advance the cursor
J      ABCDEFGHIJ·  11      Copy to cursor and advance the cursor
K      ABCDEFGHIJK  12      Copy to cursor and advance the cursor
L      ABCDEFGHIJK  12      Do nothing since cursor is at position 12

Sure, this was extremely boring to watch, but look at the result: What you got was equivalent to ABCDEFGH.IJK. The trailing L was discarded because the cursor had already reached position 12, and the dot turns out to be optional if it comes after exactly eight characters!

Next, let's look at the troublesome A*B.TXT.

Input  Pattern      Cursor  Description
       ···········  1       Initial conditions
A      A··········  2       Copy to cursor and advance the cursor
*      A??????????  12      Fill rest of pattern with question marks and move to position 12
B      A??????????  12      Do nothing since cursor is at position 12
.      A???????···  9       Blank out positions 9, 10, and 11 and move cursor to position 9
T      A???????T··  10      Copy to cursor and advance the cursor
X      A???????TX·  11      Copy to cursor and advance the cursor
T      A???????TXT  12      Copy to cursor and advance the cursor

Notice that the result is the same as you would have gotten from the pattern A*.TXT. Any characters other than a dot that come after an asterisk have no effect, since the asterisk moves the cursor to position 12, at which point nothing changes the parse state except for a dot, which clears the last three positions and moves the cursor.

I won't work it out here, but if you stare at it for a while, you'll also discover that *.* is the same as * by itself.

In addition to the rules above, the MS-DOS command prompt had some quirks in its parsing. If you typed DIR .TXT, the command prompt acted as if you had typed DIR *.TXT; it silently inserted an asterisk if the first character of the pattern was a dot. This behavior was probably by accident, not intentional, but it was an accident that some people came to rely upon. When we fixed the bug in Windows 95, more than one person complained that their DIR .TXT command wasn't working.

The FCB matching algorithm was abandoned during the transition to Win32 since it didn't work with long file names. Long file names can contain multiple dots, and of course file names can be longer than eleven characters, and there can be more than eight characters before the dot. But some quirks of the FCB matching algorithm persist into Win32 because they have become idiom.

For example, if your pattern ends in .*, the .* is ignored. Without this rule, the pattern *.* would match only files that contained a dot, which would break probably 90% of all the batch files on the planet, as well as everybody's muscle memory, since everybody running Windows NT 3.1 grew up in a world where *.* meant all files.

As another example, a pattern that ends in a dot doesn't actually match files which end in a dot; it matches files with no extension. And a question mark can match zero characters if it comes immediately before a dot.

There may be other weird Win32 pattern matching quirks, but those are the ones that come to mind right away, and they exist to maintain batch file compatibility with the old 8.3 file pattern matching algorithm.

Comments (48)
  1. Puckdropper says:


    I just tried it in DOS 6.22 on a virtual machine, and got the same results you did.  I wonder if DOS 5 handles it differently?  Unfortunately, I don’t have a copy handy.  Somewhere I’ve got DOS 3.3 on a floppy (don’t we all?) but no room to set up a machine to run it on.

    C:\WFW311>dir soundrecexe



    C:\WFW311>dir soundrec.exe


  2. John says:

    Unfortunately, no one can be told how wildcards worked in MS-DOS. You have to try it for yourself.

    • Morpheus
  3. Steve says:

    Interesting. Out of habit, I still use "del ." to delete everything in the current directory. I never realized that behavior was accidental.

  4. Joseph S. says:

    I seem to remember that one could also type "dir.txt" which was equivalent to "dir .txt". Can anyone with a copy of DOS 6.22 in a VM try it out?

  5. Neil says:

    Was the Christmas reference intentional?

  6. Anders Tellander says:

    My favourite quirk was that if you typed "dir.exe" (yes, without the space) this would not actually invoke an executable called "dir.exe", but was in fact the same as writing "dir *.exe".

    The reason for this was of course that dir was not an exe file, but rather built into the command interpreter itself.

  7. strik says:

    Steve: I do not think that "del." works out of an accident. "." is a valid filename: The current directory. This is how that del works.

  8. Sandeep says:

    If you typed DIR .TXT, the command prompt acted as if you had typed DIR *.TXT

    Heh, that was one "feature" I depended on.

  9. Gazpacho says:

    Each DOS program that needed filename matching had to pass the pattern to FindFirstFile. Matching wasn’t built into the shell, as in Unix.

    The fun thing about CP/M was that it had no definition, even an implicit one, of what characters were allowed in a filename. Some of the core file utilities used different rules, so you could create a file with one and be unable to use it with another.

  10. Gazpacho says:

    Looking at the CP/M 2.2 manual, apparently there were filename rules, but I distinctly remember reading that the utilities had that problem. Maybe they weren’t following the rules, or the spec was added in v2, or someone just made it up.

  11. Nowadays, you have to type


    to list the file (or use wildcards), but I seem to remember, in MS-DOS it sufficed to type


    to list the exact same file.

    P.S. Slightly offtopic: one other feature I miss from Win9x, the "cd …" command. In WinXP you have to type "cd …."

  12. ATZ Man says:

    I’m looking forward to the day when extensions aren’t needed because Posix has a widely-supported metadata API for files, backported to CP/M 2.2

  13. Gazpacho says:

    "No, it means any file with a dot in the name."

    But since the effect of the lookup rules is to put an implicit dot in every name that doesn’t have an explicit one, that is the same as every file. This ensures that old programs that hardcoded the dot into the pattern still work.

  14. BryanK says:

    Aaargh!:  That would work great, if Windows had such a thing as a standard place to store the MIME type in the filesystem metadata per file.  You’re thinking BeOS, or maybe OS X, or maybe also some other OS I don’t know of.


    (Now, yes, you could put it in an alternate data stream — but that’s not standard.)

  15. Gabe says:

    Aaargh!: adding more metadata is not an easy solution, unless you think that it would be easy to update millions of programs and protocols to understand this change.

    I mean, sure, it’s an easy feature to add, at least to Windows. NTFS already stores all kinds of metadata, you would just need a standard way to store and retrieve it. The problem is that only new programs would understand the new file type. In order to keep old programs functional, you would still have to store the extension also. It’s not just old programs though, it’s old file servers, old CDs, and so on.

    There are things like digital cameras that believe all JPEG files to have a filename that ends in .JPG and MP3 players that look for a certain extension. How would you deal with CD-ROMs? You can’t go back and change ISO 9660, you can only extend it. Then there are protocols like FTP which don’t know anything about filesystem metadata. You would have to manually set the filetype of any file transferred. And archive formats like tar don’t support arbitrary metadata either.

    Maybe 20 years ago when most computers were used standalone you could get away with creating a feature like this. As a matter of fact, Apple did just that. They stored the filetype as an invisible piece of metadata, and even kept other resources related to the file in a separate stream. Nowadays most Mac files use extensions to indicate filetype and have their resources in a separate file.

    I’m not saying that you should store MIME types in filesystem metadata, I’m merely suggesting that you will never be able to get rid of extensions.

  16. steveg says:

    metadata: pffft! No-one ever gets it right (if there’s a manually-entered component). At least a file extension is correct.

    I’d guess metadata-based filesystems are inherently less efficient than an extension-based system for certain operations (whether it’d be measurable or not, I don’t know). This’d be at the CMD/API level; I doubt Explorer would be any slower. E.g.: "dir {TextFiles}".

  17. Roger Binns says:

    For win32 matches are against the long and short filenames.  For example you could have a directory of .html files but they will match the wildcard *.htm as that is the short filename extension.

    There are so many nooks and crannies, special cases etc when trying to be compatible with Windows wildcard matching.  In the CIFS/SMB I worked on over a decade ago I ended rewriting the code 4 times!

    The Samba guys got even more frustrated.  In the end Tridge wrote a "proxy" server that took a wildcard pattern and sent it to various versions of Windows to see which files they returned.  IIRC no two Windows versions behaved exactly the same, and in some cases behaved differently depending on what the client operating system was.

  18. ChrisMcB says:

    [Is there still any need for this extension weirdness ? It used to be a hack to indicate the type of file, but nowadays it’s much easier to just store e.g. mime-type in the filesystem metadata.]

    Seems to me to be WAY easier to store this metadata with the name of the file. Perhaps we can stick it on the end of the file name. Perhaps "extend" the filename?

  19. Igor Levicki says:

    Now, yes, you could put it in an alternate data stream — but that’s not standard.

    Yes, and it brings up the question of why they were added in the first place if they weren’t meant to be used for anything except Explorer’s "unsafe file that came from web" dialog and Kaspersky AV checksums.

    adding more metadata is not an easy solution…

    CreateFile() could have performed the work behind the change. For example, if you passed "blah.avi", it could’ve just created file "blah" and set its metadata based on the extension. Then when you want to open "blah.avi", it would just strip the ".avi" and convert it into proper metadata and open the matching "blah" file.

    Apart from the fact that it would not work with FAT/FAT32, there is only one problem which hasn’t been solved so far in any of the filesystems I know of:

    What if you had two "blah" files? Sure they would differ by metadata, but you would still have the name conflict. So far no file system supports the creation of several files with the same name which I believe is not coherent with real life.

    So, it all boils down to the simple question — why we can’t have multiple files with the same name when it is possible to implement it?

    I believe that in order to break free from the extension hell filesystems have to start using something else than filename to differentiate between files (for example some GUID).

  20. Goplat says:

    I’m getting different behavior than described here, both in COMMAND.COM’s “dir” as well as in a program I wrote to show the parsed filename at address 5Dh. (The two seem mostly the same, but “dir” allows any extension if you don’t use any dots – probably fills it with question marks initially)

    First, I can’t get that effect where ABCDEFGHIJK would run into the extension. “dir scandiskexe” shows both the .EXE and the .INI, so the extension part of the mask didn’t get touched.

    Second, I can’t get just a single * to fill out the whole mask – though in “dir” there’s no way to tell since it initializes the extension to question marks anyway, if you pass * to a program it receives in address 5D just eight question marks, and “del *” only deletes files without an extension.

    Third, I can’t get the dot to move the position backwards to the start of the extension – * acts identical to *.foo.

    Of course there are other DOS programs that do their own mask parsing and might work differently but this is what I’ve gotten from COMMAND.COM in both DOS and the Windows VDM. What program and version were you using?

    [I tested on CP/M 2.2; I never noticed any discrepancy between MS-DOS’s parser and CP/M’s but I guess there is. -Raymond]
  21. Puckdropper says:

    A*B.txt = A*.txt

    Yep, I remember stuff like that.  I was quite happy to find out that Windows command line interpreted A*B to mean A{anything in between}B rather than A{Anything} .

    Interesting to find out how it worked.

  22. Roger Binns says:

    See the first 2.5 pages of this from 2000.  Basically if you don’t get wildcards exactly right then various programs break.  It also turns out that there are 5 wildcard characters, not just * and ? as detailed above and in the CIFS spec.

    Metadata can be stored elsewhere.  You have alternate data streams in NTFS, extended attributes in OS/2 and data forks in Mac/HFS.  BeOS also used a similar concept to store mime type information.  

    The problem is that interchange with others treats files as a stream of bytes plus some meta information – name and dates usually.  So you could make your email clients, zip programs, web servers etc also use/provide the extra information but people on other platforms will either not see that extra information, or be unable to process the files correctly.  (Ever see HQX files outside of the Mac world?)

    MacOS X tried to solve the problem in a compatible fashion with .DS_store folders.  See and Google it to see how much grief it also causes.

    Something else they did nicely on MacOS X is to make extensions meaningful on directories (actually borrowed from Next).  Applications are basically a directory structure where the top level directory has an extension of .app.  That allows you to trivially move it around, uninstall by deleting it etc.

  23. KJK::Hyperion says:

    Roger: the extra wildcard characters are, IIRC, "DOS dot", "DOS asterisk" and "DOS question mark", which emulate the DOS quirks Raymond explained. They are documented in the IFS SDK (

  24. dave says:

    steveg: "At least a file extension is correct."

    I want to move to your planet.

  25. edlin ftw says:

    "dir.txt" worked because "." was a delimiter in the command interpreter. Similarly, "echo." printed a blank line.

    I was bitten by this in cmd too :p

  26. ScottR says:

    I *so* miss being able to type "dir .txt".

  27. Dean Harding says:

    KJK::Hyperion: I think Roger’s point is that the implementation differs from the documentation. Have a look at the document he posted: "The CIFS documentation dismisses these three extra wildcard characters by providing a very simple mapping between these characters and the usual * and ? characters. Unfortunately this simple mapping is so simple because it is completely incorrect, as a few minutes of testing confirms."

  28. Aaargh! says:

    Where did the globbing take place ?

    IIRC ‘dir’ was a built-in command, but what happened if you did "someapp.exe *.txt"

    Would someapp.exe see "*.txt" as its first argument or would it see "1.txt" "2.txt" (whatever was in the directory) as its arguments ?

  29. Oh, I forgot to mention my favorite CP/M wildcard parsing feature:

    Guess what happened if you typed


    on the command line?

    Yep, it invoked the first executable file found in the current directory. Made for a thrilling life. Needless to say, *that* particular feature was not ported to QDOS/PC-DOS.

  30. jcoby says:

    I’ve seen other oddities in filename matching.  Recently, we had a HDD formatted as NTFS with 99 directories, each with about 45k files, all with numeric filenames.  This, of course, is very slow to deal with.  So I tried to break up the directories even further by moving files with the same first digit into their own sub-directories.  To figure out what to move, I ran:

    dir 1*.jpg

    This, for whatever reason, is equivalent to:

    dir *1*

    But apparently only when you hit some magical threshold for number of files in a directory (smaller test cases worked perfectly).

    [Interesting but incorrect theory. I discussed this two years ago. -Raymond]
  31. anonymous says:

    Why is the behaviour in Win32 considered to be quirks?

    *.* simply means exactly this: any filename, any extension. Of course this includes all files with no extension. The dot simply serves as delimiter between the filename and the extension, and it’s only natural that this separation is implied in the Win32 subsystem.

    [That may be what you think it means, but a literal reading of the wildcard rules (before the quirks are applied) says that “*.*” means any string (*), followed by a period (.), followed by any string (*). The non-quirk wildcard rules don’t know what an “extension” is. -Raymond]
  32. Aaargh! says:

    "*.* simply means exactly this: any filename, any extension. Of course this includes all files with no extension. The dot simply serves as delimiter between the filename and the extension, and it’s only natural that this separation is implied in the Win32 subsystem."

    No, it means any file with a dot in the name. Wasn’t the whole filename dot extension thing dropped in Win95 when LFNs were introduced ? A filename is just a string and it could have one or more dots in it.

    Is there still any need for this extension weirdness ? It used to be a hack to indicate the type of file, but nowadays it’s much easier to just store e.g. mime-type in the filesystem metadata.

  33. jcoby says:

    [Interesting but incorrect theory. I discussed this two years ago. -Raymond]

    Interesting.  If memory serves me, it did the same thing for 2*, 3*, up to 9*.  There was a very good chance for 8-char conflicts (each file was a 7-digit id, followed by a dash, followed by a sequence number).  Each of the 99 directories broke the file names up by the last two digits of the 7-digit ID.

    In the end, we could not find any way to break the contents up by filename from the command line.  Using Explorer wasn’t an option – that HDD brought every computer it touched to its knees.

    25GB of images at about 12k each across 99 directories.  NTFS wasn’t really happy with it.

    [NTFS inherently is fine with that much data. You may have had better luck if you had disabled 8.3 filename generation. -Raymond]
  34. foxyshadis says:

    "I’d guess meta-data based filesystems are inherently less efficient than an extension-based system for certain operations"

    Macs do fine with resource forks, it’s not a big deal. Looking shell handlers up in the registry and initializing them is orders of magnitude slower. The real crying shame is that attributes aren’t extracted to ADS as soon as they’re scanned – photo, music, and movie metadata in particular is crazy slow because it has to be re-extracted every time explorer’s cache is invalidated.

    Hmm, sounds like a good idea for a custom shell extension, overriding the default shmedia.dll.

    "You can’t go back and change ISO 9660, you can only extend it."

    Thankfully UDF has completely displaced ISO in all but the oldest equipment, even if most discs still have ISO/Joliet fallbacks.

  35. not gary kildall says:

    "MS-DOS worked hard at being backwards compatible with CP/M."

    That’s putting it rather charitably.

  36. Anon says:

    @not gary kildall

    It’s not illegal to reverse engineer something and then emulate it, unless it’s patented. Which CPM wasn’t, as far as I can tell.

  37. Miral says:

    My (admittedly possibly fuzzy) recollection has it that *.* and * behaved differently in *some* version of DOS — although I’m certain that they were identical in DOS 7 (aka Win95).

    Then again, maybe I’m just getting it confused with "*.", which I’m certain meant "files with no extension" (and agrees with the rules Raymond’s posted).

    The space character thing was quite useful; I recall a common trick for copy-protection and the like was to create a file with a space or char-255 (which looked like a space) somewhere in the middle of the filename.  DOS didn’t care, it could let a program open and read the file just fine, but it made it trickier to deal with it on the command line, since many utilities wouldn’t accept wildcards and the filename-quoting thing hadn’t become generally established yet.

  38. Gazpacho says:

    "So, it all boils down to the simple question — why we can’t have multiple files with the same name when it is possible to implement it?"

    – It makes command-line file utilities impossible.*

    – It breaks file operations in every programming language that has them.*

    * Unless you’re prepared to use a filename syntax even worse than the one in VMS.

    In case you’re wondering who’s _really_ to blame for all this "filename extension" business:

    – Gary Kildall got it from the DEC PDP operating systems.

    – DEC got it from CTSS, which created the modern concept of a file system.

  39. Gazpacho says:

    Oh yes please, let’s interrupt this discussion of Windows technical details to rehash myths and legends about what happened in 1980.

  40. Reinder says:

    It always is good to learn something, but in this case, I will stick to what i already knew as the answer to "Q: How did wildcards work in MS-DOS?". It is shorter, reasonably correct for anybody who did not use MS-DOS that much, and way easier to remember: "A: Not" :-)

  41. John Elliott says:

    Further to Goplat’s comment, the parsing in CP/M 2.2 and MS-DOS 5 seems to be the same, but not what is described in the original article. (I’m using CP/M 2.2 under the z80pack emulator.)

    Using underlines instead of spaces:


    * goes to ????????___

    *.* goes to ???????????

    .TXT goes to ________TXT

    (This is looking at the parsed filename at 5Ch.)

    Other notes:

    In CP/M 2.2, DIR .TXT behaves like a naked DIR and lists all files (because if the first character in the FCB is a space, it replaces the whole FCB with ???????????). So that behaviour must have been introduced in MS-DOS.

    Source for the CP/M parser is actually available (os2ccp.asm in on the above site; start at setname:). It has separate loops for the first 8 characters and the last 3.

    In CP/M 2, the parser wasn’t available to application programs, so if a program wanted more than the two filenames at 5Ch and 6Ch it would have to roll its own, which probably didn’t follow the same rules.

  42. Dan says:

    I’m glad to see other commenters picked up on some of the things I noticed (* matches the same as *., not *.* in MS-DOS 6, if a dot isn’t present in a filename it’s implied at the end, before the dot is clipped to 8 chars, after the dot is clipped to 3).

    One thing that wasn’t is "spaces were permitted anywhere".  I’m pretty sure they weren’t, because there was no quoting to enclose path names with spaces in those days.  Short file names generated from LFNs have the spaces stripped out.

    Once, my brother managed to save a file on an MS-DOS system with a space in the name.  It created havoc!  Any program that tried to open the file could not (which was why he asked for help).

    I tried both MS-DOS commands and Windows 3.1 File Manager.  Both DIR and FILEMAN would display the file correctly, but interacting with it would result in an error.  DEL assumed you were trying to pass it TWO filenames and would of course not find either (or did it just ignore the second parameter?  Whatever).  Fileman probably did something similar internally.

    I don’t believe substituting a ? for the space worked either, although I’m not sure now (happened a looong time ago).  I think I eventually got rid of it by making a mask with a * at the end before or at the space’s position.  IE if it was SOM FILE.EXT I did "del so*.ext")

  43. Neil says:

    So, at the risk of being obvious, one of Raymond’s test cases was ABCDEFGHIJKL, which turned into ABCDEFGH.IJK, thus, No(e)L.

    Merry Christmas!

  44. sascha says:


    "spaces were permitted" on the file system, and they were still accessible to the normal APIs.

    Of course, all command line parsers were unable to handle such a name.

    having a file named "con" on the filesystem was worse ^^

  45. Dan says:

    Actually neil it turns into ABCDEFGH. :(  When no dot is specified, it’s assumed to be on the end.  And of course anything before it is cropped to three characters.

    sascha: there’s "nul" and "com#" and "lpt#" and "prn" too (prn maps to lpt1 IIRC).

    nul was good for detecting directory existence in batch files (IF EXIST dirname\nul etc).  prn/lpt# was for redirecting output to the printer.

    Con was also useful for input as well as output (copy con file.ext == a more usable text editor than vi! </hatingvi> CTRL+Z to signal end of file to copy.  copy file.ext con == type file.ext).

    And by "API" I assume you mean BIOS interrupts or however the OS did its magic, because all you really had was your standard C (which for the filesystem meant fopen() etc).  No WinAPI, or DOSAPI for that matter.  IIRC.  I didn’t really do a lot of C when I was 8.

  46. Gabe says:

    Dan: DOS APIs in this case refer to Int 21h, which was how you called into that part of the OS.

  47. Jonathan says:

    And for terminally stuck files, there was always Norton Disk Edit, which let you edit the dir entries byte-by-byte. And the FAT. And the partition table. It really helped me when I had to tend to 20 CIH-virus victims (which overwrote the MBR).

    I miss the simpler times, when a binary disk editor was all you needed to really understand what’s going on.

  48. no space says:

    "dir.txt" was very useful, too bad cmd doesn’t have this capability.
