Code is not self documenting


Nothing revolutionary about that statement.  Yet I keep reading the opposite on various comment threads and message boards so I thought it a good idea to explore it again.

Code is not self documenting.

The "code is self documenting" argument often comes up when commenting conventions, patterns and overall usage are discussed.  People who are against writing more than the existing set of comments will throw out the argument "if we need a comment then the code should be rewritten to be self documenting."

To an extent I agree with this.  Code should not be obfuscated and the usage should be clear.  I personally strive to make my implementations as clear as possible and enforce that belief on anyone who asks me for a code review.

Yet while you can write code so that an individual algorithm or function is close to self documenting, you cannot write it in such a way that it will explain its greater purpose in a program.  Only comments can do that.  Self describing code can only describe itself, not its purpose in the bigger picture.

Comments serve to both 1) explain the algorithm and 2) explain the greater purpose of the algorithm in the program.
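
As a rough sketch of the two kinds of comment, here is a made-up C++ function.  The retry scenario, the sync service it mentions and every name in it are invented for illustration; they are assumptions, not code from a real project.  The point is only that the body can be made readable on its own while the "where does this fit" part has nowhere to live except a comment.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical example: all names and the scenario are invented.
//
// (2) Greater purpose: the (imaginary) sync service waits this long between
// failed upload attempts, so that many clients recovering from the same
// outage do not all retry against the server at the same moment.
//
// (1) Algorithm: exponential backoff. The delay starts at 100 ms, doubles
// with each attempt, and is capped at 30 seconds.
std::uint32_t RetryDelayMs(std::uint32_t attempt) {
    constexpr std::uint32_t kBaseDelayMs = 100;
    constexpr std::uint32_t kMaxDelayMs = 30000;

    // Clamp the shift so the doubling cannot overflow, however large
    // the attempt counter gets.
    const std::uint32_t shift = std::min<std::uint32_t>(attempt, 16);
    const std::uint64_t delay = static_cast<std::uint64_t>(kBaseDelayMs) << shift;

    return static_cast<std::uint32_t>(std::min<std::uint64_t>(delay, kMaxDelayMs));
}
```

Calling `RetryDelayMs(0)` gives 100 and `RetryDelayMs(3)` gives 800; from the ninth attempt on the result stays at the 30000 cap.  A good name and tidy arithmetic cover most of comment (1), but nothing in the function body can tell a reader who calls it or why the cap exists.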

Yet people still cling to the "code is self documenting" mantra.  In my experience there are several reasons for this belief.  The first is that people have only worked on projects small enough that for most purposes they can be kept entirely in their mind [*].  Until you work on a big enough program, #2 is not even a factor because you intimately understand how every function fits into the big picture.

Another is that they have never worked on a project with people they weren’t very familiar with.  People you don’t know well or haven’t worked with before will likely have ways of approaching a problem which you have not encountered before.  What is obvious to them won’t be obvious to you.  The bridge between these approaches is comments.

I’ve seen DRY (don’t repeat yourself) brought up as well [**].  The argument goes that the code clearly says what it does, so adding a comment is just repeating yourself.  I find that to be patently untrue.  If we’re even having the conversation then the code is clearly not self documenting.  Also, in the cases when an individual function is self documenting you will still run into #2.
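
To make the DRY point concrete, here is a small C++ sketch with two comments on the same line of code.  The `Student` type, the `CountEnrolled` function and the transcript rule are all hypothetical, made up purely for illustration.  The first comment really does repeat the code; the second says something the code cannot.

```cpp
#include <string>
#include <vector>

// Hypothetical type, invented for this example.
struct Student {
    std::string name;
    bool enrolled;
};

// Counts the students who are currently enrolled.
int CountEnrolled(const std::vector<Student>& students) {
    int count = 0;
    for (const Student& student : students) {
        // Repeats the code (the "what"): add one if the student is enrolled.
        //
        // Does not repeat the code (the "why"): withdrawn students are kept
        // in the list so their transcripts stay reachable (an assumed rule
        // for this example), so they have to be skipped when counting.
        if (student.enrolled) {
            ++count;
        }
    }
    return count;
}
```

Deleting the first comment loses nothing.  Deleting the second leaves the next reader guessing why non-enrolled students are in the list at all, which is exactly the #2 problem.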

Commenting your code benefits both people who are reading your code and yourself.  You will eventually come to a point where you’ve forgotten what a particular piece of code did or how it fit into your bigger program.  Your comments will save you.

[*] The temptation to say "kept in memory" here was huge but I avoided it.
[**] In general I think that DRY is a great approach to programming but I feel it's being taken too far here.
Comments (17)

  1. Dave says:

    I agree that comments need to be included to the extent that they help make the code more clear.

    However, two points need to be made in defense of "the code is the comment" mantra.

    First, before we had long variable names, comments became mandatory only because the code couldn’t be clear.  Many managers STILL believe that ALL code should be commented,  even if all you are saying is, "this line increments X by 1" (which hopefully is clear from the code… and hopefully, they didn’t use X as a variable name).  So, if a manager is asking for the comments, I’m a lot more likely to argue that the code IS the comment.  If another programmer is asking for a comment, I’d first ask, "what exactly isn’t clear about the code?" That would allow me to address that specific issue either with my code, so the code becomes the comment again, or by adding a comment.

    Second, in the 20+ years I’ve been programming, I have NEVER worked on a program where the comments describe what the code currently does.  So, even when I was working on programs with 10 character variable names, I always had to rely on the code as the ultimate authority on what it was doing.  In fact, I currently ignore comments in the code unless I just can’t figure out what the code is doing, and even when I do look, I only use it as a historical annotation (where the code came from, not necessarily what it is doing now.)

    The fact of the matter is, comments are difficult to keep in sync with the code, while code will always stay in sync with itself.

    Instead of arguing for or against comments, what we SHOULD be arguing for is code that can be maintained.  If that means comments are required, so be it.

  2. ~AVA says:

    You did not provide any example of comments that 1) explain the algorithm and 2) explain the greater purpose of the algorithm in the program, so I can only guess what you mean.

    From my experience, comments like:

    // Initialize the list of students
    bool Init( HWND ctl);

    // Turn the lamp on
    on = TRUE;

    // engine states
    enum State { off, warming, working };

    can be transformed to names like:

    bool InitializeTheListOfStudents(HWND listControl);

    TurnLampOn();

    enum EngineState { engineIsOff, engineIsWarming, engineIsWorking };

    Comments explaining the interface of some class or function indicate the weakness in the design: if the purpose of a function ("server") is not expressed by the name, you have to repeat the comment with an explanation in each place where the function is called ("client code").

    On the other hand, comments explaining the implementation of some algorithm are OK:

    // collect responses
    while ( total_descriptors_in_the_set )
    {
        FD_ZERO( &socket_set );
        int number_of_open_sockets = ::snmp_select_info(
            &total_descriptors_in_the_set,
            &socket_set,
            &timeout,
            &need_block
        );
        // …

    Typically, calls to OS APIs, 3rd-party libraries, or your published functions, whose names cannot be changed, require more attention (and comments) than your private functions. You have full control over the names of your private functions, so you can make them self-explanatory.

    Comments are also good for cross-references pointing to other documents, like:

    // Adapted from Big Book of C++ Tricks, p.274

    MakeJuice( apple ); // <SomeHeader.h>

    // See "Project Code Conventions.pdf" for details

  3. DevTopics says:

    A programmer can write code in such a way that it is "almost" self-documenting.  In other words, by using descriptive local variables, by breaking complex expressions into multiple simpler statements, etc.  But even though the code itself can convey the "what" it’s trying to do, rarely can it explain the "why".  This of course requires explicit comments.

    http://www.devtopics.com/13-tips-to-comment-your-code/

  4. dclayton says:

    Comments should only exist to point out what is not obvious in the code.  In any programming assignment, an assumption must be made as to the base competence of whoever will be reading the comments subsequently.  Only document that which cannot be assumed to be clear to a subsequent developer of moderate skills.

  5. jaredpar says:

    @~AVA,

    True, I didn’t give specific algorithms which I documented one way or another.  I thought the post was long enough without adding explanations of algorithms that readers wouldn’t have context for :)

    You can use the names of functions to make them explanatory but only to a degree.

    I think DevTopics hit the nail on the head though.  If you want them to be truly self documenting you have to include the "why" in addition to the "what".

    Take threading for example.  Extremely clean code can potentially identify the "what" part of a threading question but it’s very unlikely to include the "why."

    1) Am I in the background for speed?

    2) Am I avoiding blocking the UI?

    3) Do I need to switch my COM apartment?

    4) Am I accessing a thread affinitized object?

    5) All of the above.

    You can cram that into a function name but it’s not going to be very easy on the eyes.

  6. Timmy Jose says:

    Refreshing read! Most well-meaning folks (who are no push-overs when it comes to coding) who generate hundreds of thousands of lines of code for an enterprise product do not even have the freakin’ patience/ inclination/ common sense to even have a pretense of a documentation. And the code is anything but self-documenting! I have a gala time reverse engineering the "code" to understand what it does and where it is actually used! And I am talking about a Fortune 10 company!

  7. I think no one really suggests writing no comments or documentation when following this mantra. Self documenting means, rather, expressing the flow of the code properly. Of course you still need documentation on the class, the module and the conceptual model of the application. But you should not clutter your code.

    See http://vafer.org/blog/20050323095453

  10. jaredpar says:

    @Tortsen,

    Unfortunately I’ve had that very argument made to me on several occasions and the most recent of which inspired me to write this post.  I don’t think this is the majority case though.

  11. giku says:

    This problem is analog… So both black and white are wrong; it all depends on everything.

  12. luweewu says:

    Good code is self-documenting.

    No amount of commenting will help reading bad code.

  13. ncloud says:

    I try to name my variables appropriately so they make sense in their given context.  I find that for most obvious operations, this is sufficient.  However, there are cases where just knowing "what" is happening is not enough — the developer needs to know "why" it is happening.  Just today I was reviewing code that I wrote nine months ago, and while the code was completely legible and all the variables had obvious names, I couldn’t figure out *why* I had written a specific branching statement, and I had not bothered to comment it at the time, so now I’m stuck.

    Another good reason to comment your code is to justify changes that were made at given points in time.  On two separate occasions, my manager has asked me why certain behavior was being manifest in our application, and I was able to pull up the code and note that, in both cases, he had made the decision to implement that specific behavior (I also included the exact date in the comments to be more specific).

  14. The statement is taken too literally and, imho, misinterpreted to an extreme. Self documenting code can be made better by annotating it with comments (think of it as inserting little notes inside your MS-Word document). Strip the comments and if the code is still readable and roughly understandable I would say it is self documenting. Throw in a few comments and you just improved its readability – you didn’t suddenly end up documenting the code.

    Another interpretation is that if, along with all the annotations, the resulting generated documentation (e.g. javadocs) can help you get the hang of the design and usage without having to consult any other documents, that would also be self documenting in a different sense (i.e. all the documentation requirements are serviced by the code files themselves).

  15. M.C. says:

    erm… I think the argument is that code should be written to be as self-documenting as possible, not that code is self-documenting.

  16. jaredpar says:

    @M.C.

    I’m a huge fan of clear code and enforce that principle in my code reviews.  Just because code is not self documenting doesn’t mean it can’t be readable and easily understood.  Code should be both readable and documented.

  17. I Hate IT says:

    Code is "self documenting" in at least one respect.

    If you replace a human driven task (say, compilation and deployment) with a programmed script, the script is typically more "self documented" than the human process (barring well established SOPs, followed precisely).