What Do Programmers Really Do Anyway? (aka Part 2 of the Yardstick saga)

01/04/2006

Way back in 2002 when we started working on Whidbey, I captured my thoughts on the direction we should take for C# and Visual Studio in two large emails. In the first email, The Yardstick, I spent a lot of time saying that you must evaluate features against the amount of time they save versus the amount of time they cost. A big saving in an infrequent task is not worth even a small cost in a frequent task, and more importantly, even a small saving in a frequent task will have a much more significant effect than a huge saving in an infrequent task.

This follow up email captures my thoughts on "Where should we as tool developers focus to provide the most benefit for the programmers that use our tools?" To answer this question we must first answer the question:

What do professional programmers really do with their time anyway?

Here is the result of my observation way back in August of 2002, and after reviewing it, I find that it still rings true today. I'd be delighted to hear how your experiences as programmers stack up...

So after taking out the time spent playing Halo and consuming mass quantities of caffeine, here's what I do all day:

Design Code

Write New Code

Understand Existing Code

Modify Existing Code

Verify Existing Code Still Works

Let me expand on each of these tasks, to explain precisely what I mean.

Design Code

Design involves analyzing a new problem, and mapping out the broad flow of code which will be used to solve the problem. To date the #1 tool for this activity is a big whiteboard (note the 8' x 4' monster on my North wall) and 1 to 3 engaged brains focused on the problem. There are some tools which attempt to formalize the design process using visual designers to view and edit the relationships in your code. My whiteboard is pretty effective at this task however, and I have yet to really want to even investigate a formal tool for an informal process of brainstorming and design.

Write New Code

For me, writing new code means typing in code in a virgin source file and getting it to a compilable state. I'll also include writing a new method from scratch in this category. I do not include adding code to an existing method, or copying code from an existing method and modifying the copy to do something new. I'll leave those activities in the modifying existing code task. Intellisense is THE power tool for writing new code. It has easily increased the speed at which a programmer can write new code by a factor of 2x to 5x. Visual designers (like the Winforms and Data designers) and code wizards are useful for getting some common coding patterns started.

Understanding Existing Code

Understanding existing code means taking a look at some code to understand precisely what is going on. Answering questions like: Exactly what are the inputs to this method? What is the output of this piece of code? Exactly what do the callstack and data look like when this particular code path gets hit? Why is this code doing what it should? Why does this code not do what I want it to? Note that this includes understanding code that I have source for (usually written by me, or someone in my organization) but it also includes understanding the details of libraries (like the BCL and WinForms) that I don't have source for. Lastly this also includes understanding the overall structure of an existing library or application. Answering questions like: What are the main data structures and how do they interact?

The primary tool used in this activity is the editor - think 'lots of time staring at source code'. Code outlining (collapse to definitions) is a great feature in this area. A lot of time is spent tracing up and down call stacks, trying to understand the flow of the bits. In theory class view and symbolic navigation (goto definition and goto reference) would be very useful, but I still find myself using find in files as my primary navigation tool for source code. The Object Browser, ildasm and Help are the primary tools for understanding libraries of managed code. Looking at the code spit out by Visual Designers is a great way to understand how the underlying code libraries work. Perhaps the most useful tool is the debugger. The 'find the bug' part of debugging fits entirely in the understanding code activity. Symbolic debuggers aid greatly in understanding code flow, but perhaps more importantly the debugger shows the shape of the real data structures that your code is operating on. Still, most of understanding code happens without the debugger, and relies on the editor.

Modifying Existing Code

Modifying code is related to writing new code, but it is different enough to call out separately. Modifying existing code is either bug fixing or adding new features. When fixing a (simple) bug in a method there is some existing behavior in the code which must be preserved. The coder must ensure that the fix doesn't break any of the existing non-buggy behavior. Adding new features to code is different from writing new code in that it never involves just adding code. New feature work almost always starts with redesigning the existing code base so that it is sympathetic to the new feature area. Only once the code has been refactored can the adding of new code begin. Once the new code has been written it must still be hooked into the existing code, which requires some modification of existing code. As an example, I am currently working on adding generics to the C# compiler. This requires a fundamental change in the data structures used by the compiler which I've been working on for most of the last week, and will continue well into the next. When I'm done with this change I will have changed about 5% of the lines of code in the compiler (some 4,500 lines of code) while adding exactly no new code. Only once the refactoring is complete can the real work of adding the individual parts of generics into the compiler begin. Not all features require this dramatic a refactoring, but it is not atypical either. Currently the tools used to modify existing code include intellisense and find in files. Visual designers can also help here, provided that the code was authored in the designer and the designer is not confused by all that grungy real world code which you've added to your project.

Verifying Existing Code

Verifying existing code means writing and running test suites to ensure that recent bug fixes haven't caused regressions. It also means stepping through recently changed code in the debugger to verify the new behavior. Industry thought leaders have recently made a big fuss about regression testing, and have championed testing frameworks like JUnit. It has been my experience that regression testing always pays off, but that is another mail entirely.

Priorities

Of the above tasks, VS is the primary tool for three: writing new code, understanding existing code, and modifying existing code. The real question is "What percentage of time do real developers spend in each of these three activities?". The answer may surprise you, so think hard before continuing.

My answers are:

New Code: 2%

Modifying Existing Code: 20%

Understanding Code: 78%

Now, I fit solidly in the Einstein user profile. I code all day every day. I'm a C++ power user. I read x86 assembly natively and can decipher the raw bytes of machine code in a pinch. I've been working on the same code base for 3.5 years, and will likely continue to work on that code base for another 3.5 years. A better question is, what do these numbers look like for Elvis? Now, many, many moons ago I used to be Elvis. I had a spanking new 386 on my desk. My primary tool for writing code was a pencil. Browsing code involved a printer and a highlighter. Yes, yes the year was 1988, and I'm writing a Futures and Options accounting system for Citibank in DBASE III+ as an intern... Oh my god I'm wearing the most god-awful blue pinstripe suit.

So what are Elvis's numbers like:

New Code: 5%

Modifying Existing Code: 25%

Understanding Code: 70%

No, I am not making this up.

Let me explain where these numbers come from. First: Why is 5 times more time spent modifying code than writing new code? The answer is that new code becomes old code almost instantly. Write some new code. Go for coffee. All of a sudden you've got old code. Brand spanking new code reflects at most only the initial design; however, most design doesn't happen up front. Most development projects use the iterative development methodology. Design, code, test, repeat. Repeat a lot. Only the coding in the first iteration qualifies as all new code. After the first iteration coding quickly shifts to be more and more modifying rather than new coding. Also, almost all code changes made while bug fixing fall into the modifying code category. Look at VS: our stabilization (aka bug fixing) milestones are as long as our new feature milestones. Modifying code consumes much more of a professional developer's time than writing new code.

Secondly, why does understanding code take 3 times more of a developer's time than modifying code? The answer here is that before modifying code, you must first understand what it does. This is true of any refactoring of existing code - you must understand the behavior of the code so that you can guarantee that the refactoring didn't change anything unintended. When debugging, much more time is spent understanding the problem than actually fixing it, and once you've fixed it, you need to understand the new code to ensure that the fix was valid. And lastly, even when coding new code, you never start from scratch. You will be calling existing code to do most of your work: either user written code or a library supplied by Microsoft or a third party for which no source is available. Before calling this existing code you must understand it in precise detail. When writing my first XML enabled app, I spent much more time figuring out the details of the XML class libraries than I did actually writing code. When adding new features you must understand the existing features so that you can reuse where appropriate. Understanding code is by far the activity at which professional developers spend most of their time.

Visual Studio is not Focused on Real Coders

I recently asked a couple of PM friends of mine (who will remain nameless to protect the innocent) where they figured developers spent their time. Their estimates ranked writing and modifying code above understanding code. Looking at the feature list for Whidbey I think that this is consistent with many folks across the division. It sounds reasonable that writing code is the primary activity for folks who write code for a living but it is in fact very far from the truth. I think there are several reasons why our current focus is on writing code rather than understanding code.

Many of our new features are designed by folks who write small demo apps. Let's face it, PMs aren't professional developers. They need to come up with code snippets and examples which are trimmed down for presentations and conferences. This results in features which demo well, but will in fact provide little or even negative benefit to the real coding task of understanding code. Staring at code for hours just doesn't demo well.

Our usability studies are run over a timeline of 3 to 4 hours. This includes problem definition, problem solution and post mortem. The last 2 usability studies I saw were basically "write this self contained dumbed down piece of code from scratch". This does not even remotely resemble real world professional coding. The last time I had a coding project like that I was in college. Early in college. A much more representative task would be to point a coder at an existing piece of code that they'd never seen, that was undocumented, badly written, badly architected and had several bugs. Then tell them to add a new feature while maintaining the existing behavior as much as possible. It may be difficult to get anything useful out of a short study, but it would be much more representative of real professional development. Even in the usability studies that I have seen, the users have spent a ton of time in help trying to understand our class libraries (aka understanding existing code).

Many of our ideas for new features are a response to questions from users on our newsgroups. Again, this skews the data towards the new user, writing their first program. Most of our questions on the newsgroups come from new users who have just installed the product. Well, guess what? No matter how good the product is they will always ask 'dumb' questions while they try and get their head around this monster new product they've just installed. Once they've gotten a little familiar with the product their usage of the product will change drastically. The things which they spend time on in the first few weeks will be dramatically different from the things that they spend their time on after they have learned the basics of the product. We really don't get any good feedback from newsgroups on the real day to day usability of our product.

In The Yardstick, I outlined a system for evaluating features. You count up the time saved and compare that against the time cost to give a net time saved. Let's run a couple of examples here:

Example 1:

Time Saved: 10% of new code time

Time Cost: 1% of understanding code time

This looks like a pretty attractive feature. Saving 10%, costing 1% looks good. But now factor in the fact that much more time is spent understanding code than writing new code (using Elvis's numbers: 5% new code, 70% understanding code) and try again:

Real Time Saved: 10% of new code time = 0.5% of Total Dev Time

Real Time Cost: 1% of understanding code time = 0.7% of Total Dev Time

Net Time Saved: -0.2% of Total Dev Time

Now we realize that even this small sacrifice of understanding code results in a net loss in productivity for our customers.

Example 2:

Time Saved: 10% of understanding code time

Time Cost: 10% of writing new code time

At first glance this looks like a wash, no saving, but factor in the relative time again:

Real Time Saved: 10% of understanding code time = 7% of Total Dev Time

Real Time Cost: 10% of writing new code time = 0.5% of Total Dev Time

Net Time Saved: 6.5% of Total Dev Time

Again, because much more time is spent in the understanding code task this feature is in fact a significant usability win.
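For anyone who wants to rerun the arithmetic in the two examples, here is a minimal sketch in Python. It assumes Elvis's time split from above (5% new code, 25% modifying, 70% understanding); the function name `net_time_saved` and the dictionary layout are made up for illustration and are not part of The Yardstick itself.

```python
# Elvis's time split from the article, as fractions of total dev time.
ELVIS_SHARE = {"new": 0.05, "modify": 0.25, "understand": 0.70}


def net_time_saved(share, saved, cost):
    """Return the net saving as a fraction of total dev time.

    `saved` and `cost` map an activity name to the fraction of that
    activity's time which a feature saves or costs (hypothetical layout).
    """
    gained = sum(share[activity] * fraction for activity, fraction in saved.items())
    lost = sum(share[activity] * fraction for activity, fraction in cost.items())
    return gained - lost


# Example 1: save 10% of new code time, cost 1% of understanding code time.
example_1 = net_time_saved(ELVIS_SHARE, saved={"new": 0.10}, cost={"understand": 0.01})

# Example 2: save 10% of understanding code time, cost 10% of new code time.
example_2 = net_time_saved(ELVIS_SHARE, saved={"understand": 0.10}, cost={"new": 0.10})

print(f"Example 1 net: {example_1:+.1%} of total dev time")
print(f"Example 2 net: {example_2:+.1%} of total dev time")
```

Swapping in a different time split (for instance the Einstein numbers) only requires changing the dictionary; the weighting logic stays the same.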

So now the question becomes: where should we spend our resources to improve VS in ways which will most benefit our customers? Clearly we should focus our efforts on making it easier for users to understand existing code, but the conclusions are in fact much more dramatic than that. If we could reduce the amount of time spent understanding code by a mere 10%, that would save the user more time than a 100% reduction in the amount of time spent writing new code. Think about what that really means for a moment. Suppose we spent a ton of work making intellisense, designers and wizards so good that writing new code took no time at all. Zero time. The ESP coding interface. That would still have less developer impact than a 10% reduction in the amount of time developers spend understanding the code base they are working in. The other conclusion to draw from the above is that any new feature which impedes the developer's primary focus of understanding code will almost certainly be a net usability loss. In fact I'll go one step further and say that cutting existing features which impede the user's ability to understand existing code will result in more productive programmers.
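The "10% of understanding beats 100% of new code" claim can be checked against both sets of numbers given earlier. The sketch below is only an illustration: the profile percentages are the ones from this mail, while `break_even_reduction` is a hypothetical helper that works out how small a cut in understanding time would already match eliminating new code time entirely.

```python
# Time splits quoted earlier in the article, as fractions of total dev time.
PROFILES = {
    "Einstein": {"new": 0.02, "modify": 0.20, "understand": 0.78},
    "Elvis": {"new": 0.05, "modify": 0.25, "understand": 0.70},
}


def break_even_reduction(share):
    """Fraction of understanding time that must be saved to equal
    the saving from making new code take zero time (hypothetical helper)."""
    return share["new"] / share["understand"]


for name, share in PROFILES.items():
    ten_percent_of_understanding = 0.10 * share["understand"]
    all_of_new_code = 1.00 * share["new"]
    print(
        f"{name}: 10% of understanding = {ten_percent_of_understanding:.1%}, "
        f"100% of new code = {all_of_new_code:.1%}, "
        f"break-even cut in understanding = {break_even_reduction(share):.1%}"
    )
```

Under these assumed splits the break-even point sits below 10% for both profiles, which is why the 10% figure in the text is comfortably enough.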

Peter
