Do we need a new measure of software complexity to calculate the TCO of a portfolio?

A few days back, I blogged about a formula I'd suggest to measure the TCO of the software portfolio.  One responder asked "how do you measure complexity, since it is a large part of your formula?"

To be honest, I was thinking about the traditional models of complexity measurement that are often built into static analysis tools.  Things like Cyclomatic Complexity are useful if you have a codebase that you are maintaining, and you want to estimate the risk of changing it.  I would argue that risk is proportional to cost when it comes to maintenance.  That said, the complexity should get as low as possible, but no lower.
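As a rough illustration of the kind of number those tools produce, here is a minimal sketch that approximates Cyclomatic Complexity for a piece of Python source as one plus the number of decision points.  The choice of which syntax nodes count as a decision point is my own simplifying assumption, and the sample function and its names are made up; real static analysis tools use more careful rules.

```python
import ast

# Node types treated as one decision point each (an assumption of this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra short-circuit paths.
            decisions += len(node.values) - 1
    return 1 + decisions


# Hypothetical sample code; it is only parsed, never executed.
SAMPLE = '''
def ship(order):
    if order.paid and order.in_stock:
        for item in order.items:
            if item.fragile:
                pack_carefully(item)
    else:
        notify(order)
'''

print(cyclomatic_complexity(SAMPLE))  # prints 5
```

The sample scores 5: two 'if' statements, one 'for' loop, and one 'and', plus the base path.  Note that nothing in this count knows or cares how the function gets wired to its collaborators.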

However, after reading the question, and rereading the entry on Cyclomatic Complexity on the SEI site, I realized that the definition on that site is wildly out of date.  The page references a number of methods, but not one had any notion of whether an app's complexity goes down if it is assembled from configured components (See Fowler's paper on Dependency Injection and Inversion of Control).

In addition to the advent of patterns, we have made great strides in removing code complexity by placing the business rules external to the code.  In some respects, this pays off by reducing the cost of ownership.  On the other hand, you have to account for the complexity of how the business rules are encoded and/or maintained.  Rules encoded as VBScript are hardly less complex than code.  But they may be less complex (to maintain) than rules encoded as a hand-built static linked list or tree structure stored as database records.

We have also removed complexity from code by placing some of it in the orchestration layer of an integrated system.  In fact, this can be a problem, because complexity in orchestration can be quite difficult to manage.  I've seen folks install multiple instances of expensive server software just because they felt that they could better manage the proliferation of messaging ports and channels if they dedicated entire instances of integration server software to a specific subset of the channels. 

This is not done for performance.  It may even make deployment more difficult.  But if your messaging infrastructure is a single flat address space, then fixing a single message path is like trying to find a single file in a directory with 12,000 files, each having a GUID for a filename, and the sort option is broken.

So complexity in the orchestration has to be taken into account.  Remember that we are talking about the complexity of the entire portfolio.  If you say that neither App one nor App two owns the orchestration between them, then are you saying that the orchestration itself is a new app, called App three?  How will THAT affect your TCO calculations?

Most of the really old complexity measures are useless for capturing these distinctions.

Of course, you could just measure lines of code or Function Points.  While I feel that Function Points are useful for measuring the size of the requirements, the TCO is not derived from the size of the requirements.  It is derived from the size of the design.  And while I feel that LOC has a place in application measurement, I do not feel that it is a useful basis for estimating the total cost of owning the application.  A well architected system may require more total lines of code, but it should reduce the amount of 'cascading change', since well architected systems reduce coupling.

On the other hand, complexity, to be useful, must be measurable by tools. 

I'm not sure if I can whip up a modern complexity calculation formula to replace these older tools in the context of a blog entry.  To do this topic justice would require the time it takes to write a master's thesis.

That said, I can describe the variables I'd expect to capture and some of the effects I'd expect each of these variables to have on total complexity.  Note: I would view orchestrations to be 'inside the boundary' of an application domain area, but if an area of the architecture has a sizable amount of logic within the connections between two or more systems, then I'd ask if the entire cohesive set could be viewed, for the sake of the complexity calculation, as a single larger application glued together by messaging.

Therefore, within this definition of application, I'd expect the following variables to partake in some way in the function:

Variables for a new equation for measuring complexity

Number of interfaces in proportion to the number of modules that implement each interface: an interface that has multiple children shows an intent to design by contract, which is a hallmark of good design practice.  That said, each interface has to have at least two modules implementing it to achieve any net benefit, and even then, the benefit is small until a sizable amount of the logic is protected by the module.

Total ports, channels, and transformations within the collaboration layer divided by the number of port 'subject areas' that allow for grouping of the data for management: the idea here is that the complexity of collaboration increases in an S curve.

Total hours it takes to train a new administrator on how to perform each of the common use cases to a level of competence that does not require oversight.

Total number of object generation calls -- in other words, each time the 'new' keyword is used, either directly or indirectly.  By indirectly, I mean that each call to a builder should count as the same (or slightly less) complexity as a call to the 'new' keyword.

Total count of the complexity as it is measured by coupling -- There are some existing tools that appear to do a fine job of measuring the complexity by measuring the module coupling. 

I'm sure I'm missing some more obvious ones, because I'm tired. 
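To make the list above a little more concrete, here is a minimal sketch of how the raw variables might be gathered into one record per application.  Everything in it is an assumption for illustration: the AppModel structure, the field names, the sample figures, the builder weight of 0.8 (standing in for "the same or slightly less"), and the use of a simple count of dependency edges as a stand-in for whatever a real coupling tool would report.

```python
from dataclasses import dataclass


@dataclass
class AppModel:
    """Hypothetical inventory of one application, orchestration included."""
    implementations: dict[str, list[str]]  # interface -> implementing modules
    ports: int
    channels: int
    transformations: int
    subject_areas: int
    training_hours: float
    new_calls: int
    builder_calls: int
    dependencies: dict[str, set[str]]      # module -> modules it depends on


def raw_variables(app: AppModel, builder_weight: float = 0.8) -> dict[str, float]:
    """Collect the variables from the list; builder_weight is an assumed value."""
    interfaces = len(app.implementations)
    implementers = sum(len(mods) for mods in app.implementations.values())
    # Interfaces with fewer than two implementers give no net benefit.
    with_contract = sum(1 for mods in app.implementations.values() if len(mods) >= 2)
    artifacts = app.ports + app.channels + app.transformations
    return {
        "interfaces_per_implementer": interfaces / max(implementers, 1),
        "contract_share": with_contract / max(interfaces, 1),
        "artifacts_per_subject_area": artifacts / max(app.subject_areas, 1),
        "training_hours": app.training_hours,
        "creation_calls": app.new_calls + builder_weight * app.builder_calls,
        # Stand-in for a coupling tool: total number of dependency edges.
        "coupling": float(sum(len(deps) for deps in app.dependencies.values())),
    }


# Made-up sample figures, not measurements from any real system.
example = AppModel(
    implementations={"IPricing": ["FlatPricing", "TieredPricing"], "IAudit": ["DbAudit"]},
    ports=30, channels=12, transformations=18, subject_areas=4,
    training_hours=16.0, new_calls=200, builder_calls=50,
    dependencies={"Orders": {"Pricing", "Audit"}, "Pricing": {"Audit"}, "Audit": set()},
)

for name, value in raw_variables(example).items():
    print(f"{name}: {value:.2f}")
```

The point of keeping these as separate raw numbers is that each one can come from a different tool (a static analyzer, the integration server's configuration, the training schedule) before anyone argues about how to combine them.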

Even once the list is understood and generated, creating a formula that models the actual data isn't simple.  That said, I'm sure that we need to do it.
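Just to show the shape such a formula might take, and not as a proposal for the real thing, here is a sketch that combines the variables with a weighted sum.  The collaboration term is passed through a logistic function, which is one common way to get the S curve I mentioned for the collaboration layer.  Every weight, the midpoint, the steepness, and the sample values are placeholders I invented; real values would have to be fitted against actual cost data.

```python
import math

# Placeholder weights, invented for illustration only.
WEIGHTS = {
    "collaboration": 40.0,
    "training_hours": 0.5,
    "creation_calls": 0.1,
    "coupling": 0.2,
    "contract_share": -10.0,  # design by contract should lower the score
}


def s_curve(x: float, midpoint: float, steepness: float) -> float:
    """Logistic function: an S-shaped curve rising from 0 toward 1."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


def complexity_score(v: dict[str, float]) -> float:
    """Weighted sum of the variables; midpoint and steepness are assumptions."""
    collaboration = s_curve(v["artifacts_per_subject_area"], midpoint=50.0, steepness=0.1)
    return (
        WEIGHTS["collaboration"] * collaboration
        + WEIGHTS["training_hours"] * v["training_hours"]
        + WEIGHTS["creation_calls"] * v["creation_calls"]
        + WEIGHTS["coupling"] * v["coupling"]
        + WEIGHTS["contract_share"] * v["contract_share"]
    )


# Made-up sample values for one hypothetical application.
sample = {
    "artifacts_per_subject_area": 15.0,
    "training_hours": 16.0,
    "creation_calls": 240.0,
    "coupling": 3.0,
    "contract_share": 0.5,
}

print(round(complexity_score(sample), 1))
```

A linear sum is almost certainly too naive, since these variables interact, but it at least gives a tool something to compute while the weights are being argued over.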