Brad's Carl Sagan-style post on code size stirs up some familiar questions about complexity management here at MacBU. With every release of Office, we remove, rewrite and add new code. Often we can remove significant amounts of code without losing functionality – say, remove use of a custom HTML renderer when the OS offers a built-in one instead. I recall how many of us took it as a badge of pride during the Carbon transition of Office v.X that after a year of intensive coding our net contribution in terms of new lines of code was significantly negative. Those were the days, hacking and rewriting. But of course, as the total functionality of the suite increases over time, the total volume of code used to build it tends to inevitably follow. There has been a lot of functionality added in twenty years.
Is it necessary? It's pretty easy to find people who agree that Office has more functionality than they personally need. Where the agreement tends to fall apart is when you ask what the first feature to be cut should be, at which point one person's "bloat" is another's "only feature I use every day". It's also nice to know the functionality and compatibility is there, as you find the need or the advantages of it. I use all of our apps every day, not to test but just to get work done – Exchange-based mail and calendaring in Entourage, analyzing data and lists in Excel, IMing over corp chat – it's an interesting job being able to use the software every day to talk about the job of making the software. My Office skills were pretty minimal when I got here, but (and some coworkers might smirk at this) they've come a long way. You may not have heard of (say) pivot tables, but once you start using them for certain kinds of analysis you can't imagine how you ever lived without them.
Of course, our goal isn't to just pile features on top of features, but to harness the increasing power in the suite with an effective and focused user experience that puts functionality in reach and in context. In some cases that can be pretty easy, but generally it's actually pretty hard, and something we have to think about with everything we do.
Is it slow? Generally, there's no linear relationship, where adding 10% code means everything is 10% slower. There's quite a lot of engineering happening down at the metal to swap in the currently executing code as the application runs. The problem is in how much work the application is trying to do – if it's trying to do a lot, it's necessarily going to take longer to do. When we work on performance bottleneck areas (such as boot speed) that is generally what we are looking at, i.e. do we really need to be doing all this now? For this release, running native on Intel is orders of magnitude more important for speed than trying to trim raw code size, for example.
Is it sustainable? It's important to understand that while the code has been written and expanded over the past twenty plus years, it has also been tested and validated for twenty plus years. Insane amounts of testing. Stabilized or "baked" code provides a baseline for correctness and quality – as new changes introduce unintended problems (regressions), we know we just need to hunt down which new change is the culprit and understand how it needs to be redone. The real impact is on productivity: while correct, that baked code can vary wildly in how well factored it is (well factored being the opposite of "spaghetti"), and therefore how sensitive to regression. We definitely have some pasta mixed in as part of our overall diet, and we have to develop strategies for dealing with that. Sometimes it's best to simply say "do not disturb the water", but that's not always an option (say, for example, you decide to blow out the Excel cell table.)
Shouldn't we just rewrite it all from scratch, and "do it right" so that *all* of the code is perfectly well factored according to modern sensibility? JoelOnSoftware made some great observations on that topic in this article: Things You Should Never Do, Part I. (I believe link-master Brian sent that to me originally. Pyramid was before my time here so I'll leave it to others to comment on whether Joel's observations on that project are fair and accurate or not.)
Some people keep souvenir copies of tech editorials from ten years ago predicting that Microsoft will be unable to release any further versions of Excel, because the code base has simply gotten too big and Microsoft will have to rewrite it from scratch. That was quite a few releases of Excel ago.
Joel points out that code doesn't rust, but the complexity problem is real because massive change is inevitable if you are trying to keep your software on the leading edge – especially on the Mac. We plan to continue shipping Mac software into the foreseeable future, so it is easy for us to make major architectural decisions with the very long term in mind. Where it gets hard is balancing this against short term customer need – what resources can we apply to deep rearchitecture or local refactoring, given that these resources will not be available to solve the most pressing problems that are actually visible to customers? While we made significant investments in future capability earlier in the current product cycle, right now that customer need is pretty extreme and we're 100% focused on getting this release out into the market.
Finding this sort of balance overall, and dealing with this level of complexity in general, is clearly our defining challenge with Office. It takes some time to get your head wrapped around it all. We might go days without seeing a particular new developer, and we'll have to send a veteran search and rescue squad into their area of the code to pull them out. Of course, it doesn't just impact developers, it's in the density of the test cases, the maintenance of the automation, the range of functionality and interactions our designers have to understand, the scope of documentation, the pace of change. Managing complexity is where we live. It's not for everyone, but fortunately I find it all pretty fun at the end of the day... gives my head something to do.