Aargh, Part Five: Comment Rot

Gripe #6: Comment Rot

If you’ve been reading my SimpleScript code you might have noticed that there are very few comments in my code.  That’s deliberate. 

Why do we have comments in the first place?  We have comments because sometimes the semantics of the code are not clear from the syntax.  Remember, the syntax tells you what the code does, the semantics tell you why the developer wrote that code, what the purpose is. 

I try to write code so that the semantics are clear just by reading the code, no comments required.  Code with one comment for every line drives me nuts!  Imagine if people wrote books that way — that their English was so unclear, so hard to understand that they needed to have a footnote on every sentence and paragraph explaining in some other language what the intended meaning was!  It would be awful.

Worse, excessive comments cause “comment rot”.  I don’t know how many times I’ve been reading a piece of code, read a comment that said “there is such and such a problem with this code, I’ll come back and fix it later”, and sure enough, there is no problem anymore.  Or there is a problem, but it’s a different problem.  People change the code, compile it, test it, and never fix the comments.  The more comments there are, the more likely it is that this will happen.  Wrong comments are usually much worse than no comments at all.

Slightly less horrid than wrong comments are useless comments — comments that do not explain the semantics, but restate the syntax.  How many times have you seen this:

j = j + 1  ‘  Increment j

Thank you so much, developer, for letting me know what j = j + 1 does!  In my advanced old age I was beginning to forget the first thing I ever learned about programming!  Why do people do this?  This would be better, as it explains the semantics:

j = j + 1  ‘  Go to the next line

but surely this is an opportunity to eliminate the comment entirely:

currentLine = currentLine + 1

When I write a piece of code with comments I want those comments to stand out, not get lost in a sea of triviality!  If I write a comment, you’d better believe that it’s usually because the implementation of the intentions simply cannot be clearly expressed by the code, or because something really bizarre and important is going on.  (Like the weird garbage collection protection I mentioned earlier today.)  Comments should be signposts, your eye should be drawn to them because they’re important.  Commenting style should train people to read the comments as they come across them in the code.  Writing lots of irrelevant comments trains people to ignore the comments!

Earlier today I was talking about the default property semantics of IDispatch, and earlier I commented that I’d blog about a conversation Matt Curland and I once had about the differences between VBScript’s and VB’s default property resolution in their respective Intellisense engines.  Now’s a good chance to kill a few birds with one stone; one of my favourite comments I’ve ever written, and certainly one of the longest, is found in the VBScript Intellisense code that I wrote for Visual Interdev:

        // [Eric Lippert 1998-03-27] — see Scripting bug 847
        // We now have the following problem.  We have “foo.bar(“.
        // What if bar returns an object that has a default property?
        // Suppose bar is a collection, for instance, with a default
        // property “item”.  Then this should show statement completion
        // information for the “item” method, not for “bar” — provided
        // that bar takes no arguments.  If bar takes arguments, then
        // we should show statement completion for bar.
        // This generalizes to chains of defaults — if bar is the default
        // property of foo, and bar takes no arguments, and bar
        // returns an object that has default property baz that takes
        // an argument blah, then
        // foo(
        // foo.bar(
        // foo.bar.baz(
        // should all return statement completion information for baz(blah).
        // So here’s what we’ll do:
        // First we determine if “bar” returns an object.  If it does,
        // we check and see if it has a default property chain.  If so, we
        // get type info for the final available function on the default
        // property chain.
        // Second, if bar is a function that takes arguments, we return
        // function information for bar.
        // If bar does not take arguments and bar returns an object with
        // a default property, then we return info on the default property.
        // Otherwise, we return information on bar.
        // This is not actually what Visual Basic does.  VB does the
        // following: (from email by Matt Curland, 1998-03-27)
        // “If no parameters are supplied never call the default unless
        // you’re in an assignment statement without a Set. If you’re
        // in a non-ending statement of a call chain then only do
        // the default resolution if the specified function doesn’t have
        // its own parameters and a parameter is actually specified.”
        // That is not exactly what we’re doing here, but it’s close enough
        // as far as I’m concerned.

Gross, gross, gross.  And actually I’ve cut it short — the comment then goes on to describe even more weirdnesses with the case where foo.bar(blah)( gets Intellisense on a method that returns a collection.  This is very complicated code; I wanted anyone who was about to change it to understand fully what the meaning behind it was.

The other thing that amuses me greatly about this comment is that apparently there were only 847 bugs in the scripting database at the time!

Comments (24)

  1. Pete Cole says:

    Allegedly, there used to be a guy at Cambridge who would only comment code, at appropriate points, with the following:

    // Watch this bit; its tricky.

  2. mike says:

    Jeez, Eric, now I have to go through all my code, look at the comments, and ask "What Would Eric Do?"


  3. Eric Lippert says:

    Well, in your case Mike, the audience for the code is slightly different. Not only should the code already be a model of clarity, but you also have to deal with people cutting and pasting snippets out of the doc directly into applications.

    Also, snippets exist independent from their larger context.

    I think a lot of sample code benefits from comments that I’d consider prolix in a "development" codebase.

  4. Eric Lippert says:

    One other thing: I meant to give credit where credit is due in the blog entry but I forgot. Most of these ideas on comments I got from a lecture by Brian Kernighan when I was an undergrad — he has a very engaging lecture style and a lot of the material stuck with me. I vividly remember him saying "Pop quiz everyone: what gets run, the CODE or the COMMENTS?"

  5. Eric Lippert has a great post about comment rot. Eric found a way to very eloquently state something I have been trying to say for a very long time. My problem with the majority of comments are: The state the…

  6. Anon E. Mouse says:

    Oh man, my cow-orkers do that all the time (don’t fix comments, I mean).

    When I start a coding task and intend for someone else to finish it off, I’ll leave "TODO: Jack put your stuff here" comments so he knows where to look. Then when I’m in that code later, I’ll see Jack’s code, along with the TODO comment still sitting there. ARGH!

    Since no one else on the team even writes comments, I would expect them to get to my stuff and say "whoa, what’s all that green stuff? oh, comments! I think I’ve heard of those" but I guess even that is asking too much.

  7. Frans Bouma says:

    Bad comments don’t make ‘no comments’ a good thing.

    Your lengthy example tells me 1 thing:

    there is no design document where this information is discussed / described and solved.

    Comments indeed have to describe the semantics of the code. If a complex piece of code is not described with SOME comments, the programmer is simply lazy.

    The balance has to be found between: when do I put my description in a design document and when do I put my description in the code. I think the general descriptions of how the functionality works / is constructed should be in a technical document, and the comments should guide the reader, with that document in hand, where to find the code which describes a given piece of functionality in the technical document and vice versa.

  8. Eric Lippert says:

    I have no idea where the design documents for the script engines are now. There are a couple checked into the source code tree, but the majority of them were tracked by separate file management systems. I don’t know where the old bug databases are either. Someone knows this stuff, but not me.

    Don’t get me wrong — specs are great, and we need them, but when I’m reading a complicated piece of code, the last thing I want to do is call someone up and ask where they stored their design documents seven years ago so that I can dig it out and try to understand it.

    No, the code has to stand on its own. When I write code I assume that the only documentation people are going to be able to find is the code itself.

  9. Tirion says:

    [ Waste Book ] 04-06-05

  10. russ says:

    another comment phenomina I’ve noticed is ‘non-normalised’ comments. a side effect of a ‘anaemic anti-pattern’ where a method appears in n tiers just passing things straight thru. so the same comment appears on the pass-thru comments (i’m guilty of this). but when something changes there 3 comments exactly the same that need be changed!

  11. Here’s a question I recently got about VBScript, where by "recently" I mean August 28th, 2003. This…

  12. Here’s a question I recently got about VBScript, where by "recently" I mean August 28th, 2003. This…

  13. Tom B. says:

    Very well said.

  14. Bill says:

    I really dislike comments written in the first person plural. To whom or what does "we" refer? If it’s the programmer it should be "I". I prefer not to use personal pronouns in comments at all, unless I’m obviously referring to myself.

  15. John L says:

    I agree that sometimes comments are overused to compensate for sloppy coding. For example, when coding User.Load(), the programming has various opportunities to clarify his ideas:

    1. the object name – the programmer conveys this is the object that handles users.

    2. the method name – it is a method that loads a user into the system

    3. the convention – the overall system should adhere to proper convention, so Load() should have similar meanings across different objects.

    4. the code itself – properly named variables, properly formatted code. They all facilitate easier code understanding.

    I would usually prefer to let the programming language convey my meaning for me, since it’s sometimes a better language than English. Comments are used to convey the overall purpose, or some hard to understand code, but I think good comment does not compensate for confusing/inconsistent overall design.

  16. Alex S says:

    should I comment the following? :

    if (!(myFloat  >= 0)) myFloat = 0;

    this seems equivalent to

    if (myFloat  < 0) myFloat = 0;

    but isnt because it catches Nans as well as negative numbers.

    Now I know that, but then I would, wouldnt I. (MDRA)

    The young turk tasked with "sorting out" my code might not.

    What can one expect of the reader? How can we best help the reader?

    As pointed out our most asiduous reader is the compiler, and she wont care.

    Should we link every line to a specification?

    Why is someone reading my code?

    Looking for a bug?

    (if there is a bug, it is not often pointed out in the comments)

    Implementing something new?

    Trying to learn?

    I like some (very few) comments, but I suspect the optimum level depends on the reader.

  17. EricLippert says:

    Transforming a NaN into a non-NaN is in my opinion an "unusual" thing to do. If the intention of the code is to do so, and if the logic depends upon that happening and continuing to happen, then I would consider commenting that.

    However, every comment is an opportunity to say "how could I write this more clearly without the comment?"  That would probably lead me to write:

    // Turn all nonpositive nonzero doubles (including NaN and -Infinity) into zero

    inline double Normalize(double d)

    { return !(d>=0) ? 0 : d }

    and then the call site becomes

    myFloat = Normalize(myFloat);

    That makes the call site easy to read because the control flow is abstracted away. The code now becomes about the semantics and not the control flow.

  18. Alex S says:

    I like that answer. It lights up the heart of the matter.