A friend of mine pointed out an interesting post by Scott Hanselman that used a clever phrase: “having a High Bus Factor.” That is to say, if the original developer of a bit of code is ever hit by a bus, you are toast.
The example that Scott gave was a particular regular expression that I just have to share. To understand the context, read his blog.
private static Regex regex = new Regex(@"\<[\w-_.: ]*\>\<\!\[CDATA\[\]\]\>\</[\w-_.: ]*\>|\<[\w-_.: ]*\>\</[\w-_.: ]*\>|<[\w-_.: ]*/\>|\<[\w-_.: ]*[/]+\>|\<[\w-_.: ]*[\s]xmlns[:\w]*=""[\w-/_.: ]*""\>\</[\w-_.: ]*\>|<[\w-_.: ]*[\s]xmlns[:\w]*=""[\w-/_.: ]*""[\s]*/\>|\<[\w-_.: ]*[\s]xmlns[:\w]*=""[\w-/_.: ]*""\>\<\!\[CDATA\[\]\]\>\</[\w-_.: ]*\>",RegexOptions.Compiled);
I must admit to having developed code, in the (now distant) past, that had a similarly high bus factor. Nothing as terse as the above example, thank goodness, but something kinda close. On two occasions, actually. I look back and hope that I have learned, but I’m not certain that I have.
The trick here is that I do not know the developer who follows me. He or she will know some basic and common things. The problem lies deeper… It is where my expertise exceeds the ability of a maintenance developer to understand my code… that is where the break occurs.
So how do we avoid this? How does a good developer keep from creating code with a High Bus Factor?
It isn’t documentation. I have been using regular expressions for decades (literally) and the above code is wildly complicated, even for me. No amount of documentation would make that chunk of code simple for me to read or maintain.
Pithy advice, like “use your tools wisely,” won’t help either. One could argue that regular expressions were not being used appropriately in this case, and in fact, the blog entry describes replacing this one because it wasn’t performing well when larger files were being scanned. That isn’t the point.
I would state that any sufficiently powerful technique (whether regex, or the use of an advanced design pattern, or the use of SQL XML in some clever way, etc.) presents the risk of exceeding the ability of another developer to understand, and therefore maintain, it.
Where does the responsibility lie for ensuring that a dev team, brought in to maintain a bit of code, is able to understand it? Is it the responsibility of the development manager? The dev lead? The original developers? The architects or code quality gurus? The unit tests?
Is it incumbent upon the original dev team to make sure that their code does not have a High Bus Factor? If so, how?
I’m not certain. But it is an interesting issue.