If you’ve ever tried to create a software application of any decent size, you have probably realized that what works with small tools more often than not does not scale well to larger sizes. There are whole books written about how to scale up and out, new languages and technologies invented to help manage it, and millions of dollars in educational budgets spent every year to achieve that elusive goal.
And yet, the landscape is still rife with examples of poorly designed software.
A significant part of the problem is not that engineers are not smart enough–it’s that we outsmart ourselves. The conventional wisdom is that humans can hold 7 ± 2 primitives (e.g. numbers) in short-term memory at once. The number skews higher for strongly associated items and lower for weakly associated ones. We can hold fewer abstract items, often only 2 or 3 at a time. The issue is not one of comprehension; it’s one of focused retention. When the number of data items in a class exceeds our capacity to retain them at once, the brain tends to move the fields and properties it perceives as less important into peripheral storage. This, I think, leads to not fully comprehending the interactions in the class. We engineers like to think of ourselves as intelligent and able to understand very complex ideas. This may be true, but it misses the point: the point of software engineering is to create reliable, performant, maintainable software. It is not to create complex applications.
Well, there’s the problem, isn’t it? In order to create the really cool applications, you must have complexity. It’s simply unavoidable, especially given the advanced competition for consumer dollars. You can’t very well create a next generation RPG that has realistic physics, with a “real” economy, kickass graphics and brutally smart AI without complexity.
Or can you?
Actually, I believe the answer is that you can, at least without the bad kind of complexity that causes the pain. The key is differentiating between behavioral complexity and system complexity. Behavioral complexity is the aforementioned features like smart AI, or realistic physics. Those sorts of things are monstrously difficult to get right, but that’s not what holds back the very large application development projects. The major cause of schedule slips, regressions, out-and-out bugs, and general angst is system complexity.
Before I go on, I’d like to define what I mean (yes, I do get to hijack words like this) by system complexity:
System complexity: a property of a system that is directly proportional to the difficulty one has in comprehending the system at the level and detail necessary to make changes to the system without introducing instability or functional regressions.
In other words, the higher the system complexity, the harder it is to upgrade, maintain, and develop the system. Whether the system is a single class, an object library, an inheritance hierarchy, or an entire application, someone “schooled in the art” of software engineering should be able to understand it, given a reasonable time to examine the system. This ability should scale through the system. When one examines a particular object, it should be apparent how to use it without examining the private implementation details. From top to bottom and from bottom to top, the entire system should be accessible to the non-super-geniuses out there. What goes for good UI design also goes for good C++, C# and COM design: your target audience should be able to use it without extensive systems training.
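To make the "apparent without examining the private implementation details" idea a little more concrete, here is a minimal C++ sketch. The `Inventory` class and all of its member names are hypothetical, invented purely for illustration; the point is only that the public section reads as a complete description of how to use the object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: a fixed-capacity inventory for a game character.
// A caller should be able to use it after reading only the public section.
class Inventory {
public:
    explicit Inventory(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);  // an inventory that can hold nothing is a caller bug
        items_.reserve(capacity);
    }

    // Adds an item. Returns false, and changes nothing, when the inventory is full.
    bool add(int itemId) {
        if (isFull()) return false;
        items_.push_back(itemId);
        return true;
    }

    // True if at least one item with this id is held.
    bool contains(int itemId) const {
        for (int id : items_) {
            if (id == itemId) return true;
        }
        return false;
    }

    std::size_t size() const { return items_.size(); }
    std::size_t capacity() const { return capacity_; }
    bool isFull() const { return items_.size() >= capacity_; }

private:
    // Storage details live here; they can change without touching any caller.
    std::size_t capacity_;
    std::vector<int> items_;
};
```

Nothing here is clever, and that is rather the idea: five small public members, each answering one question, and no need to know that a `std::vector` sits underneath.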
System complexity is what makes The Mythical Man-Month true. It’s what keeps that guy who wrote the accounting system from scratch in 1986 employed. It’s what causes games, productivity applications, and even (some might say especially) large operating system projects to slip their schedules.
So what can you do about it? The first step to recovery is admitting you have a problem. Pedantic, but true. Getting back to the point I tried to make earlier about engineers–we like to think we are smart. Egos loom large in the software industry, and often with good reason. One of the smartest developers I have come across worked on a critical piece of the code for several years, and guarded his territory jealously. Make no mistake, this guy was brilliant–and he knew it. To make a long story short, when he left, the code he left behind was an engineering marvel [he says wryly] that no one else understood. Oh, we know what it does, but there’s a crucial difference between understanding what something does and grokking fully how it does it. Tracing execution paths through this code is difficult, and 1200-line methods named “Run” are common.
The author understood it all, or said he did, but my opinion is that he was fooling himself, too. Once you get a piece of the system working, after all, you don’t need to touch that piece again unless the design changes, a bug is found, or something like that. Over time, though, people started to notice that even small fixes in this particular area would touch a dozen or more files and churn lots of code. That’s a symptom of high system complexity: he would go in to make the fix, then discover that 100 variables influenced the behavior, or that the implementation in one area was strongly dependent on the implementation in another, or, most often, both. But since the author had traveled this area so much, he was able to turn around the fixes in an acceptable timeframe, so he “got away” with it.
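For contrast, here is a small C++ sketch of the opposite shape. It is not the code from this story; the order-pricing domain, `LineItem`, and every function name are hypothetical. What it tries to show is a `run` that reads as a short list of named steps, where each step takes exactly the values that influence it as parameters rather than reaching into shared state.

```cpp
#include <cassert>
#include <vector>

// Hypothetical domain: pricing an order. Amounts are in cents to avoid
// floating-point rounding.
struct LineItem {
    int quantity;
    long unitPriceCents;
};

// Depends only on the items passed in.
long subtotalCents(const std::vector<LineItem>& items) {
    long total = 0;
    for (const LineItem& item : items) {
        total += item.quantity * item.unitPriceCents;
    }
    return total;
}

// Depends only on the subtotal and the discount percentage.
long discountCents(long subtotal, int percentOff) {
    assert(percentOff >= 0 && percentOff <= 100);
    return subtotal * percentOff / 100;
}

// Depends only on the taxable amount and the rate (in basis points, 1/100 of 1%).
long taxCents(long taxable, int taxBasisPoints) {
    assert(taxBasisPoints >= 0);
    return taxable * taxBasisPoints / 10000;
}

// The top-level routine is just the sequence of steps; every input that can
// change the result appears in its signature.
long run(const std::vector<LineItem>& items, int percentOff, int taxBasisPoints) {
    const long subtotal = subtotalCents(items);
    const long discounted = subtotal - discountCents(subtotal, percentOff);
    return discounted + taxCents(discounted, taxBasisPoints);
}
```

A fix to the tax rule in a layout like this touches `taxCents` and nothing else, which is roughly the property the code in this story lacked.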
It’s no surprise that when the author decided to move on to other things, as employees often do, we had a tough decision to make. Do we maintain this code that’s working? Do we refactor? In the end it came down to pragmatism and schedule: the problem was recognized, and we chose to refactor the system in measured stages, minimizing the risk to the product and getting back to a good, maintainable state.
That’s all I have time for today… my build is done, so back to work I go. Windows Vista waits for no man!