Evolution, Complexity, and Software Platforms

Webster's Dictionary defines evolution as, "a process of continuous change from a lower, simpler, or worse to a higher, more complex, or better state."

I really hate pretty much every speech or writing which starts with the dictionary definition of something. Certainly it is a starting point, but it is probably the least creative of all starting points. It insinuates that somebody does not know the dictionary definition of a particular term. Generally, the author must arbitrarily select one definition among many, inherently including bias into the selection of the specific definition used. And did I mention that it's boring and overused? How ironic that I used it myself.

Do most people agree with this particular definition? Certainly, it appears in the mass media. This is the way that I tend to hear it referred to in common speech. In my experience, this definition is similar to the one that most people would use to define the term evolution. That doesn't necessarily mean that it applies at the scientific level.

Biological evolution, after all, is the nonrandom selection of random mutations. Does nonrandom selection always occur in the direction of additional complexity? The existence of complex organisms seems to suggest that this is the case. However, the existence of highly complex organisms instead more likely indicates that the variation has increased, not that selection is directional. And, of course, with mutation we expect variation to increase.

This actually is somewhat intuitive, once you think about it. Given a particular starting point, a single random mutation will make the organism either somewhat more complex, or else somewhat less complex. The direction will be completely random. When the more complex mutation mutates again, it will again be either more complex or less complex. If one particular series of mutations that increased complexity happens to survive, then the result at the end of that chain is an organism that is much more complex than the starting organism. At the same time, if one particular series of mutations that decreased complexity happens to survive, then the result at the end of that chain is an organism that is much less complex than the starting organism. And, of course, there are a number of potential combinations of mutations that exist in the middle - where some of them have increased complexity and others have decreased. All that we can really say is that the variation has increased - not that complexity has been specifically selected.
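To make the coin-flip picture concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not a model of real biology: "complexity" is a single integer, every mutation moves it up or down by exactly one with equal probability, and every lineage survives. The function name simulate_lineages and all of the parameters are made up for this illustration.

```python
import random
import statistics

def simulate_lineages(n_lineages, n_generations, start=0, seed=0):
    """Return the final 'complexity' of each lineage.

    Assumption: each generation applies one mutation that raises or
    lowers complexity by 1 with equal probability (no selection at all).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    finals = []
    for _ in range(n_lineages):
        complexity = start
        for _ in range(n_generations):
            complexity += rng.choice((-1, 1))
        finals.append(complexity)
    return finals

# Compare the population after a few generations and after many.
early = simulate_lineages(n_lineages=1000, n_generations=10)
late = simulate_lineages(n_lineages=1000, n_generations=400)

spread_early = statistics.pstdev(early)
spread_late = statistics.pstdev(late)
mean_late = statistics.mean(late)

print(f"spread after  10 generations: {spread_early:.1f}")
print(f"spread after 400 generations: {spread_late:.1f}")
print(f"mean   after 400 generations: {mean_late:.1f}")
```

In a walk like this the average stays close to the starting point while the spread keeps growing (roughly with the square root of the number of generations), which is the "variation has increased" half of the argument without any preference for complexity.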

Where this gets complicated is when you have a boundary condition, which makes it appear as if overall complexity has been increasing. For example, if your starting point is a very simple single-celled organism, it is difficult to make that organism any less complex. Any variation that occurs as mutation takes place can only take place in the direction of additional complexity, because there is not much room to simplify a bacterium while there is an enormous amount of room to make it more complex, to the point where it can be as complex as a human being sitting at his tablet PC thinking about such things as evolution, complexity, and software platforms. So, the true effect of mutations over time is that variation increases. At the end, the branch that happened to increase complexity more often than not can eventually be extremely complicated. That does not mean that the evolution was directional - just that the variation happened to manifest itself in this way. There are still plenty of simple bacteria around (more than all other life forms combined, according to all of the literature that I have come across), but because they can't get much simpler you don't see the other side of the tail. And, in fact, massive complexity truly is the tail of the distribution. A small tail of highly complex organisms does not make evolution directional. It simply represents variation.
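The boundary effect can be sketched the same way. Again this is a toy with assumed rules rather than anything taken from the research: complexity is an integer, each mutation is an unbiased step of plus or minus one, and a lineage that tries to drop below a minimum (the hypothetical FLOOR constant) simply stays at the minimum.

```python
import random
import statistics

FLOOR = 1  # assumed minimum complexity, e.g. a very simple single-celled organism

def simulate_with_floor(n_lineages, n_generations, seed=0):
    """Unbiased +1/-1 steps, except that complexity can never drop below FLOOR."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    finals = []
    for _ in range(n_lineages):
        complexity = FLOOR  # every lineage starts at the boundary
        for _ in range(n_generations):
            step = rng.choice((-1, 1))  # still a fair coin flip
            complexity = max(FLOOR, complexity + step)
        finals.append(complexity)
    return finals

finals = simulate_with_floor(n_lineages=1000, n_generations=400)

print("simplest lineage:    ", min(finals))
print("average complexity:  ", round(statistics.mean(finals), 1))
print("most complex lineage:", max(finals))
```

No step in this sketch favors complexity, yet the average drifts upward and a long right tail appears, simply because the left side of the distribution is cut off at the floor. Plenty of lineages still finish at or near the minimum.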

D. W. McShea at Duke University (I don't know him personally) has been doing research specifically on the nature of evolutionary trends and complexity. A sample of his work on this topic that makes an interesting and reasonably approachable read can be found here. We will simply jump to the conclusion (although the article itself is worth a read):

The results here - two cases in which probability of increase was greater and two in which probability of decrease was greater - are consistent with and support the null hypothesis that increases and decreases are equally probable (or would if they had been randomly chosen).

In other words, the appearance of additional complexity cannot be assumed to be anything more than the appearance of additional variation.

This phenomenon has some interesting parallels with the evolution of software. Software itself seems to be getting more complex over time, but are we deceiving ourselves by placing too much weight on the right tail of the distribution? In fact, much software is seemingly simple, and consists of surprisingly few lines of code. Consider the hobbyist, who is just putting together relatively few lines of code to do something interesting or useful.

What does vary dramatically, however, is what those few lines of code can actually do. If you go back to Petzold-style C code for Windows, it takes quite a few lines of code just to get a window to appear. With the same number of lines of code today using Windows Forms or the Windows Presentation Foundation (formerly code-named Avalon), you can likely do a lot more than just show a window, such as some custom drawing or interesting animation. As platforms evolve, what that bit of code can do becomes increasingly sophisticated. At the same time, as people become more sophisticated with using the platform, they can squeeze even more out of those few lines of code. The variation increases as the body of code increases, and the right tail of the distribution becomes extremely interesting.

You can see some of this effect by flipping through old issues of your favorite trade journal, such as MSDN Magazine. If you look at an issue from the beta days of the .NET Framework 1.0, you will see relatively basic samples and articles by today's standards. The same magazine today, targeting the same platform, will have much more sophisticated articles, and at the same time it will continue to have relatively straightforward and introductory articles. The variation increases along with the sophistication and maturation.

Where am I going with this? Well, to some extent, I just think it's neat. But there is a point to be learned as well, even though software - following the principles of intelligent design - is not bound by the same restrictions as biological life and random variation. What point is that? Understanding when to migrate platforms in order to take advantage of the additional sophistication of the new platform.

This is a decision that most organizations face on an ongoing basis. New platforms are constantly being released, offering a huge number of features which may or may not be compelling. It is far more typical to maintain an existing application than it is to write a completely new application from scratch. How do you decide whether to port from one platform to another?

Obviously, there is no easy answer, and I cannot possibly understand all of the economics and the starting point of every application. But, considered in the abstract, it's important to understand exactly where your application sits on a scale of complexity, compared to where it ideally needs to sit. Assume that your platform begins with a certain complexity built in (which it presumably does, in order to be selected over competing platforms). An application on that platform will have a minimum complexity, and a practical maximum defined by the investment required to reach that maximum compared to the returns.

An application that sits far in the right tail, demonstrating extreme complexity on the given platform, may not necessarily benefit from replatforming right away. What drives the replatforming decision is how much further you would like to go with your application, compared to the investment you make to undertake the replatforming.

This is getting fairly abstract, so an example is probably in order. Assume that a platform begins with complexity 1. It is reasonable to expect that an application developed for this platform will eventually reach complexity 10 as developers become more sophisticated using it. Now, a new platform is released that makes development easier. The simplest application developed for this platform has a complexity of 4, and we can reasonably expect the maximum practical complexity on this platform to reach 15.

Since most applications developed are not particularly complex, they would immediately benefit from replatforming because they would jump straight to a complexity of 4 from a lower complexity. However, say that you spent a long time developing an application, and it had reached a complexity of 8. When you replatform, you immediately revert to a complexity of 4, and you have to work your way back up to a complexity of 8 just to get to where you began. The up side? You can then keep moving, and achieve a theoretical maximum complexity of 15, much higher than before.
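The arithmetic of this example can be written down as a small sketch. The Platform class and the replatform_tradeoff function are hypothetical names invented for the illustration, the numbers are the ones from the example (floors of 1 and 4, practical maximums of 10 and 15), and real replatforming decisions obviously involve far more than two integers.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Platform:
    name: str
    floor: int    # complexity of the simplest application on the platform
    ceiling: int  # practical maximum complexity on the platform

def replatform_tradeoff(current, old, new):
    """Compare staying on `old` with moving an app of complexity `current` to `new`."""
    return {
        # Complexity you get for free just by moving (simple apps only).
        "immediate_gain": max(0, new.floor - current),
        # Complexity you must rebuild just to get back to where you began.
        "rework": max(0, current - new.floor),
        # Room left to grow if you stay on the old platform.
        "headroom_if_you_stay": old.ceiling - current,
        # Room left to grow on the new platform once you are back to par.
        "headroom_if_you_move": new.ceiling - max(current, new.floor),
    }

old_platform = Platform("old", floor=1, ceiling=10)
new_platform = Platform("new", floor=4, ceiling=15)

simple_app = replatform_tradeoff(2, old_platform, new_platform)
mature_app = replatform_tradeoff(8, old_platform, new_platform)

print("simple app:", simple_app)
print("mature app:", mature_app)
```

Under these assumptions the simple application gains complexity immediately and has nothing to rebuild, while the mature application has to rebuild four units of complexity before it sees any of the extra headroom.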

This, in my opinion, is why you don't see applications such as Microsoft Word replatforming to .NET immediately - the investment to regain the complexity they already have is dramatic, and the decision to replatform will be made when the theoretical maximum of an existing framework is no longer sufficient.

Of course, there are a huge number of other factors to weigh in to the decision, such as how much additional work is going in to the application, the knowledge base of your existing employees and those available to you in the market, the investment available for a particular application, but this gives a sense as to why you might want to replatform, and why you might want to wait or have this exercise happen in the background. It's all a part of the delicate balance between completing a new project quickly and making a new project as complex as possible to meet the needs of the marketplace.

Comments (5)

  1. Jonathan Esters says:

    In biology, complexity is about higher levels of organization, not random variation. Variation in the genetic code does not make the organism more complex, because it does not make it more organized. You are mixing up variation with complexity, and using this word incorrectly in the context of organic systems. I know it's fun to simplify things like mutations to a coin flip - more complex, less complex - but this just reasserts the deliberate ignorance of evolution proponents. Nice try though.

    I would like to see you add entirely random variations to the ones and zeroes of a software program, and see it become more organized and more useful. I will even allow you millions of years.

  2. cjacks says:

    Oh, clearly random variation does not move towards complexity. Rather, random variation moves towards entropy. Evolution does not happen by mutation alone – you need a way of selecting for survival those mutations that are better suited for the environment.

    So, clearly I would not suggest that anybody who wanted to dive into evolutionary algorithms as a means to attempt to solve particularly complex problems do so simply by flipping ones and zeros randomly. Rather, you absolutely must have some form of natural selection.

    See, for example, The Blind Watchmaker application. I wish I could find the link, but Daniel Dennett put together a speech a while back with some great video demos illustrating evolutionary machine learning.

  3. Jim Thio says:

    Square water melons and genetically engineered food are samples that once in a while, life is created. Not a proof, but a plausibility.

  4. cjacks says:

    Erm … OK then … I don’t think that anybody would suggest that the only way that things can come into existence is by evolution. Some things are designed. It doesn’t follow logically that, therefore, everything is.

    But this is not the forum to debate biological evolution. Regardless of whether or not you accept this truth, the process and principles can potentially be useful in crafting more efficient ways to solve some of the most challenging problems we face today. And solving problems is what I find interesting!

  5. cquirke says:

    Sometimes when posting comments on this site, the process reloads a different page, rather than the one that was being commented.

    This is IE8b1 over IE7 on XP SP2, Standards mode, set to prompt on all active content.  I can’t repro at will (though certain articles seem to "stick" more than others) and it doesn’t seem to matter whether I OK or Cancel the active content alerts.  

    This particular article page has been affected.
