Lines Of Code (LOC) As A Metric

I posted this on an internal DL today and I thought it was worth sharing…

There are two important things about LOC:

  1. The only thing it is a measure of is the size of your codebase.
  2. Size isn’t necessarily a good thing.

My personal opinion is that LOC is completely useless for project management. Its only real use is for being able to say things like:

  1. Module A is twice the size of Module B.
  2. Module C has grown 25% over the last 2 months.
  3. We seem to have a lot of bugs in Module D given its size.
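
As a rough sketch of those three comparisons in Python (the module sizes and bug counts are made-up numbers for illustration, and the helper names are hypothetical, not from any real tool):

```python
# Hypothetical figures, invented purely for illustration.
loc_now = {"A": 20_000, "B": 10_000, "C": 12_500, "D": 4_000}
loc_two_months_ago = {"C": 10_000}
open_bugs = {"A": 40, "D": 36}


def size_ratio(a, b):
    """How many times larger module a is than module b."""
    return loc_now[a] / loc_now[b]


def growth_percent(module):
    """Percentage growth of a module since the earlier snapshot."""
    before = loc_two_months_ago[module]
    return (loc_now[module] - before) / before * 100


def bugs_per_kloc(module):
    """Open bugs per 1,000 lines of code."""
    return open_bugs[module] / (loc_now[module] / 1000)


print(f"Module A is {size_ratio('A', 'B'):.1f}x the size of Module B")
print(f"Module C has grown {growth_percent('C'):.0f}%")
print(f"Bugs per KLOC: A = {bugs_per_kloc('A'):.1f}, D = {bugs_per_kloc('D'):.1f}")
```

With these invented numbers, Module D has far more bugs per thousand lines than Module A, which is the kind of signal LOC is good for: a reason to go and look, not a verdict.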

For project management purposes you’re better off looking at feature/story delivery rates and the like because they are more meaningful. Someone could add 1,000 lines of code to a code base and still not have delivered a feature; conversely, someone could remove 1,000 lines from the code base and deliver three features. Negative LOC is (usually) the result of refactoring done as part of implementing a new feature, and it can have a significant impact on the size of the code base.

Now that I’ve told you that LOC is useless, here is some data from Code Complete, 2nd Edition by Steve McConnell on lines of code per staff-year (Cocomo II nominal values in parentheses):

Project Size (LOC)    LOC per Staff-Year
1K                    2.5K – 25K (4K)
10K                   2K – 25K (3.2K)
100K                  1K – 20K (2.6K)
1,000K                0.7K – 10K (2K)
10,000K               0.3K – 5K (1.6K)

The most interesting thing to note here is that the number of LOC a developer can write per year decreases as the project size increases. This is because complexity increases, impact increases, testing increases, bug counts increase, etc. So the ideal developer has a high feature-to-LOC ratio: they deliver value to the customer while keeping the code base small, which enhances the productivity of all of the developers on the team.
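
To put the nominal values from the table in perspective, here is a naive back-of-the-envelope calculation in Python. It simply divides project size by the nominal LOC per staff-year. That division is my own simplification for illustration, not the Cocomo II model itself, and the function name is hypothetical.

```python
# Cocomo II nominal LOC per staff-year, keyed by project size in LOC,
# copied from the table above.
NOMINAL_RATE = {
    1_000: 4_000,
    10_000: 3_200,
    100_000: 2_600,
    1_000_000: 2_000,
    10_000_000: 1_600,
}


def implied_staff_years(project_loc):
    """Naive effort estimate: project size divided by the nominal rate."""
    return project_loc / NOMINAL_RATE[project_loc]


for size, rate in NOMINAL_RATE.items():
    years = implied_staff_years(size)
    print(f"{size:>10,} LOC at {rate:,} LOC/staff-year = {years:,.2f} staff-years")
```

On these figures the largest project is 10,000 times the size of the smallest but takes 25,000 times the implied effort (6,250 staff-years against 0.25), which is the cost of size that a small code base avoids.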