A discovered quirk is just a few steps away from becoming a feature


Commenter Cherry wonders who invented all those strange syntaxes, like set " to show all environment variables, including the hidden ones.

An interesting historical note is the origin of the convention in unix that files whose names begin with a dot are hidden by default (here's the relevant portion). That article highlights how a discovered quirk is just a few steps away from becoming a feature.

As Master Yoda might put it: Discovery leads to dissemination. Dissemination leads to adoption. Adoption leads to entrenchment. Entrenchment creates a compatibility constraint.

As I've noted many times, the batch language was not designed. It simply evolved out of the old CP/M program SUBMIT, which was an even more rudimentary batch processor. (The original SUBMIT.COM didn't have conditional branches. It merely ran every line in your batch file one after another.)

One of the consequences of something that old is that any quirk, once discovered, can turn into a feature, and from there it becomes a support burden and compatibility constraint. We've seen this many times before: Counting the number of lines in a file by exploiting a buffer underflow bug in FIND.COM. Updating the last-modified time of a file by using a magic sequence of punctuation marks. Echoing a blank line by typing ECHO. with a trailing period. All of these were accidentally discovered behaviors (just like unix dot files) which became entrenched. Even when the underlying program was completely rewritten, these special quirks had to be specifically detected and painstakingly reproduced because so many programs (i.e., batch files) relied on them.

For set ", it's a case of taking advantage of two quirks in the implementation: The first quirk is that a missing close-quotation mark is forgiven. That means that set " is logically equivalent to set "".

You are therefore asking for a filtered list of environment variables, but passing the logical equivalent of no filter. Specifically, you're asking for all environment variables which begin with the empty string, and it so happens that every string begins with the empty string. The second quirk is that when an explicit filter is applied, the set command disables its default filter of "Hide environment variables whose names begin with an equals sign."

In other words, the code goes like this:

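// Illustrative only; needs System and System.Collections.
foreach (DictionaryEntry entry in Environment.GetEnvironmentVariables()) {
 var name = (string)entry.Key;
 // An explicit filter replaces the default "hide names starting with =" filter.
 if (prefixFilter != null ?
     name.StartsWith(prefixFilter) :
     !name.StartsWith("=")) {
  Console.WriteLine("{0}={1}", name, entry.Value);
 }
}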

Perhaps this is a bug, and it should have been written like this:

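// Illustrative only; needs System and System.Collections.
foreach (DictionaryEntry entry in Environment.GetEnvironmentVariables()) {
 var name = (string)entry.Key;
 // Names starting with "=" stay hidden whether or not a filter is given.
 if (!name.StartsWith("=") &&
     (prefixFilter == null || name.StartsWith(prefixFilter))) {
  Console.WriteLine("{0}={1}", name, entry.Value);
 }
}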

But it's too late to fix it now. People have discovered the quote trick, so it's now a feature and therefore a compatibility constraint.
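
To see the difference between the two versions concretely, here is a minimal runnable sketch in Python. It is not the actual command processor code: the function names, the sample environment, and its values are all made up for illustration, and the prefix match is case-sensitive here purely for simplicity. The first function applies either the prefix filter or the hide-the-equals-sign filter; the second always applies the hide filter and treats the prefix as an additional condition.

```python
# Hypothetical sample environment. Names beginning with "=" stand in for
# the hidden variables; the values are invented for illustration.
SAMPLE_ENV = {
    "=C:": r"C:\Users\example",
    "=ExitCode": "00000000",
    "PATH": r"C:\Windows",
    "PROMPT": "$P$G",
}


def list_as_shipped(env, prefix_filter=None):
    """An explicit prefix filter replaces the default hide-"=" filter."""
    return [
        name
        for name in env
        if (
            name.startswith(prefix_filter)
            if prefix_filter is not None
            else not name.startswith("=")
        )
    ]


def list_as_perhaps_intended(env, prefix_filter=None):
    """The hide-"=" filter always applies; the prefix is an extra condition."""
    return [
        name
        for name in env
        if not name.startswith("=")
        and (prefix_filter is None or name.startswith(prefix_filter))
    ]


if __name__ == "__main__":
    # No filter at all: both versions hide the "=" names.
    print(list_as_shipped(SAMPLE_ENV))
    # Empty-string filter, the logical equivalent of set "":
    # every name starts with "", so the first version shows everything.
    print(list_as_shipped(SAMPLE_ENV, ""))
    # The second version keeps the "=" names hidden even with a filter.
    print(list_as_perhaps_intended(SAMPLE_ENV, ""))
```

With an empty-string filter, the first function returns all four names, including the two that begin with an equals sign, while the second returns only PATH and PROMPT. That gap is the whole trick.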

Comments (24)
  1. Anon says:

    It might be argued that set " is a quirk NOT because of any of the above… but because '"' is not an empty string. "''" is. '"' is just a quotation mark.

    It might further be argued that this was intentional behaviour, as there's no other way to get "set" to produce a full list of environment variables, it was just not documented at the time.

    [There is no indication in the code that this behavior is intentional. It just looks like a bug to me. -Raymond]
  2. Gabe says:

    You'd have to imagine that if somebody were designing the SET command to have a way to show all variables, they would just create a parameter like SET /A or SET /ALL rather than have some secret backdoor.

  3. Herbie says:

    I was unaware of this quirk until today. Now I will use it everywhere I can…

  4. Joshua says:

    [There is no indication in the code that this behavior is intentional. It just looks like a bug to me. -Raymond]

    Unless the purpose was to escape being discovered before release.

  5. anonymous says:

    what is the meaning of the '=::=::' at the start of the environment?  

  6. Marc says:

    Out of curiosity, I just tried it in a DOSBox window.  Doesn't work.  WAHHHHH!  Incompatibility!  (Yes, I do know that DOSBox is not a Microsoft product AND doesn't promise full compatibility.  Still – it's like they're not even trying!  ;)  )

  7. SimonRev says:

    @Marc — I still use 4NT for my command prompt at work and it doesn't work there either. In fact I forgot that 4NT would handle the set command and was wondering why I didn't have any hidden environment entries.

  8. Joshua says:

    @Marc: Given its purpose and the purpose of the hidden variables, DOS wouldn't have it.

  9. Dominic says:

    @anonymous: since the next entry tracks the currently logged directory for the C drive, I expect that "=::" somehow controls the default directory for drives you haven't logged onto yet.

  10. James Curran says:

    I'd say the current winner in this category is JavaScript, where the abuse of an implementation detail (functions as first class objects) has allowed it to be considered an "Object-oriented" programming language.

  11. amroamroamro says:

    @anonymous: those variables whose names start with an equal sign are used as a trick to pass the current working directory to a newly created process

  12. 640k says:

    @amroamroamro:

    =ExitCode does only contain =00000000, not any directory.

  13. Karellen says:

    @James Curran: Can you expand on that a bit? What is the particular bit of abuse[0] you are referring to, and what particular facet of "Object-oriented" programming languages do you consider so vital[1] that without it JS would otherwise fail to qualify?

    [0] If there's one thing JS doesn't lack, it's abuse ;-)

    [1] If there's one thing OO theory doesn't lack, it's facets like typing, encapsulation, data hiding, interface inheritance, implementation inheritance, etc…

  14. amroamroamro says:

    @640K: I was referring to variables like "=C:" and "=D:"

    see this post:

    blogs.msdn.com/…/10008132.aspx

  15. Dominic says:

    @Karellen: JS's object model is based on calling a function (the constructor) which adds function references to its activation record (this) and returns a reference to the activation record. In that respect it very closely follows Simula 67's approach, and Eich probably didn't do it by accident, and if he did he probably wasn't surprised by the implications which are widely known in the lisp/scheme community.

  16. yuhong2 says:

    An example in HTML is legacy color parsing. I personally figured out where legacy color parsing is in the Netscape classic source: stackoverflow.com/…/12630675

    It is so subtle even Netscape's own Gecko rewrite did not get it completely right the first time: bugzilla.mozilla.org/show_bug.cgi

  17. Azarien says:

    One quirk that did *not* get reproduced was that in MS-DOS 6 and Windows 9x executing

       dir.exe

    was the same as

       dir *.exe

    which was a nice shortcut saving three keystrokes, but it doesn't work in NT-based Windows.

    [Yup. Because otherwise how would you deal with a file called ".exe"? -Raymond]
  18. voo says:

    @Dominic: Eich did it because he was forbidden from implementing a "serious" language by management (source: an interview in Coders At Work, but I doubt he only said that once) but still wanted something resembling a useful language (conjecture).

    A prototype based approach was pretty much the best he could do in that situation, but really it makes optimizing JavaScript needlessly complicated and if you actually use the features offered by prototyping that you don't easily get with a class-based approach you end up with a pretty hard to maintain mess in my – limited – experience. Also did I mention that it makes writing an optimizing VM for JS even worse than it already is?

  19. DWalker59 says:

    To see all variables, I just type SET in a command box.  I never knew about this '' or " thing.  :-)

  20. John says:

    I enjoy these types of posts the most. Keep em coming Raymond!

  21. evacchi says:

    @Dominic:

    >In that respect it very closely follows Simula 67's approach, and Eich probably didn't do it by accident, and if he did he probably wasn't surprised by the implications which are widely known in the lisp/scheme community.

    Do you have references I could look up, both about how Simula implements object constructors and about these "implications" in the Lisp/Scheme community you speak of? I would like to read more on that! Thanks!

  22. Dominic says:

    Regarding Simula, see these lecture slides:

    carlstrom.com/…/simula-smalltalk.pdf

    Regarding Scheme/Lisp, see "Structure and Interpretation of Computer Programs" chapter 3

  23. Ken says:

    Sometimes you run into a "feature" so bad, your only reaction is "Thank God they finally removed "THAT" feature"

  24. e.vacchi says:

    @Dominic

    thanks for the pointers, those were enlightening in many ways!
