Coding with Style

AKA, Software archaeology, a working example.

I've touched on this this a couple of times before, but I'd like to talk a bit about style.

This isn't an issue for people working on a single person project, but once you start working in a team, the issue of coding style comes up often.

The problem is that everyone has their own style, and they're usually pretty different.  K&R lays out several different styles, and Kernighan and Plaugher lay out still more.  Style encompasses everything from where the braces go, to how far they're indented (and where they're indented), to the names of identifiers (both variables and function names), to where the variables are declared.  Even if source files contain spaces or tabs (and what the tab setting is) is an aspect of programming style.

Most programmers learn a particular style when they start to learn how to program, and then they tend to stick with that style for their career (with minor modifications).  One thing to keep in mind about style is that it's personal.  Nobody's style is any better than anyone else's style, they're all ultimately a matter of personal choice (having said that, some personal style choices can be rather distracting, for a simple example, see this Daily WTF post).

When you're laying out a project at the beginning, you essentially have two ways to go when determining the style for your code.  The first is to get all the developers together in a meeting, hash out the details, and write a specification for the coding style for your project.  And then you've got to be diligent in enforcing that style, especially if the style you're enforcing is different from that of an individual developer in your group.

The other way is to be somewhat more ad-hoc in coding style - typically each developer owns one or more components in the system, you let them develop those components with their own style, and trust that the final product will be harmonious.  If you're using paired programming, or if there is more than one owner of a piece of software, then clearly the individual developers need to come to consensus on their style.

There are times that the ad-hoc coding style is effectively forced on an organization.  This happens a lot when dealing with legacy code - often times the code was written by developers who have long left the organization, and thus they are no longer maintaining the code.

There's one critical thing to deal with when you're dealing with disparate styles - you should never, ever change existing code to match your personal style.  I've seen this happen in real life, two developers on a team each have their own idea of what a variable should be named, and as the two developers work on the module, each of them goes through and makes sweeping changes to rename the variables to meet their personal naming convention.  The biggest problem with this is that sweeping changes like this cause huge issues with code reviews - each gratuitous change to the code shows up as a change when diff'ing source files, and they increase the signal-to-noise ratio, which reduces the effectiveness of the code review - the code reviewer can easily miss a critical error if it's buried deep in a sea of variable name changes.

For those of you who have worked on long running projects, how many times have you run across code like this?

typedef int CB;typedef wchar * SZ;#define length 20void MyFunctionName(SZ szInput, CB cbInput){    CB stringlength = strlen(szInput);    char myBuffer[length];    if (stringlength < length)     {        if (szInput[0] == 'A')            {                <Handle the case where the input string starts with 'A'>            }        else {        if ('B' == szInput[0]) {            <Handle the case where the input string starts with 'B'>        }        }    }}

What happened here?  Well, the code was worked on by four different developers, each of whom had their own style, and who applied it to the source.  The one that authored the routine used strict Hungarian - they defined types for SZ and CB, and strictly typed the parameters to the routine.  The next developer came along, and renamed the local variables to match their own personal style, and added the check for case "A".  The second developer used the following style for their if/then statements:

if (if condition, constants in conditionals appear on the right)    {        <if statement>    }else     {        <else statement>    }

Now, a 3rd developer came along, this developer added the check for case "B".  The third developer used the following style for their if/then statements:

if (if condition, constants in conditionals appear on the left) {    <if statement>}else {    <else statement>}

And then, finally the 4th developer came along and added the buffer overflow case.  Their if/then style was:

if (if condition, constants in conditionals appear on the left) {    <if statement>}else{    <else statement>}

None of these styles is wrong, they're just different.  But when put together, they turn into a totally unmaintainable mess.

So it's utterly critical, if you're working on a project that uses ad-hoc coding standards that you adopt your coding style to match the code.  The future maintainers of the code will thank you for it.  Even if your project has strict coding standards, the instant you have to deal with legacy code, you MUST adopt the coding standard of the original author of the code, even if it doesn't match your groups standards.