Versioning - source of all good or evil?

04/15/2004

I don’t have a real succinct point tonight, just a long hopefully interesting rant/whine.

I work on Fusion where we’re trying to make the world a better place for people who want to use software.

It’s a real uphill battle though against the people who write software.

The first problem we worked on tackling was versioning. We made a reasonable assumption that people had to follow: if you were going to release software in our world, you had to assign unique identifiers to each version of your software. Our identifier scheme is a property-bag type of approach called an “identity” and the general thing is that two identities which differ only in the “version” attribute could be assumed to be “the same” component, except for version.
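To make the "property bag" idea concrete, here is a minimal sketch in Python. The attribute names (`name`, `publicKeyToken`, `version`) and the `Contoso.*` values are made up for illustration; the only rule taken from the text is that two identities which differ only in "version" count as the same component.

```python
def same_component(a: dict, b: dict) -> bool:
    """Return True if two identities differ at most in their "version" attribute."""
    def without_version(identity: dict) -> dict:
        return {k: v for k, v in identity.items() if k != "version"}
    return without_version(a) == without_version(b)


# Hypothetical identities; the attribute names are assumptions, not Fusion's schema.
v1_0 = {"name": "Contoso.Matrix", "publicKeyToken": "abc123", "version": "1.0.0.0"}
v1_1 = {**v1_0, "version": "1.1.0.0"}
other = {**v1_0, "name": "Contoso.Database"}

print(same_component(v1_0, v1_1))   # True: only the version differs
print(same_component(v1_0, other))  # False: the name differs too
```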

That’s pretty obvious but the opposite case has proven to be too onerous for some folks to get their heads around – that it’s illegal to have two components with the same identity but which are different. Instead the model has been muddied in some cases with notions of “file versions” as well as “component versions” where the real version was component version + file version. I guess it’s just a matter of time until it’s really component version + file version + file hash, followed by component version + file version + file hash + hash of some file in a relative path. (These are the kinds of rules that Windows Update has to have for recognizing component versions and whether updates are needed.)
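The "opposite case" can be sketched as a check at the point where bits enter a store. This is an illustration only: the `ComponentStore` class is hypothetical, and the SHA-256 digest is used purely to detect a violation of the rule, not as an extra part of the version.

```python
import hashlib


class ComponentStore:
    """Toy store that refuses different bits under an identity it has already seen."""

    def __init__(self):
        self._digests = {}  # identity -> SHA-256 digest of the bits first seen

    def add(self, identity: tuple, bits: bytes) -> None:
        digest = hashlib.sha256(bits).hexdigest()
        known = self._digests.setdefault(identity, digest)
        if known != digest:
            raise ValueError(f"{identity} already exists with different bits")


store = ComponentStore()
ident = ("Contoso.Matrix", "1.0.0.0")  # hypothetical (name, version) identity
store.add(ident, b"original bits")
store.add(ident, b"original bits")     # same identity, same bits: fine

rejected = False
try:
    store.add(ident, b"patched bits")  # same identity, different bits: illegal
except ValueError:
    rejected = True
print(rejected)  # True
```

If the bits change, the fix in this model is to change the version, not to teach the store more elaborate matching rules.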

Our goal is to provide a “DVD player”-like experience for software on Windows. You can bring it onto your system and when you take it off, for the most part (99%), it’s gone. Yes, some DVD players will cache track lists, which can conceptually pollute the state of the player even after removing the disc, but really these 2nd-order and 3rd-order effects are nothing compared to the 1st-order effects that stop people from feeling warm and fuzzy about installing software on Windows today.

But man, people aren’t willing to give an inch on the bad old loose way that things are done in order to move forward. We’ve introduced concepts of self-descriptive components and applications so that you can do accurate inventory, impact analysis and repeatable deployment of software without having to burn the bits onto unchangeable media. What’s the result? It’s seen as too inflexible for the desperate command line tool programmer.

I realize that this sort of communications medium is self-selective towards software developers so maybe you don’t want to hear that your job should be a little harder but the real problem we have is that as long as software deployment on Windows is as fragile as it is, there will always be a significant value proposition offered by devices which offer only a small number of functions and which aren’t generally programmable.

Maybe that’s the answer – if you want a DVD- or game-console-like experience where the content is gone when you remove it, you should use those. But it does seem that it’s just a bunch of prima donnas, who don’t want to take responsibility for the fact that their software messes up customers’ machines, stopping us from making real progress. And in the end it just looks like Microsoft can’t figure out how to dig itself out of the quagmire it created.

Back to the main topic of versioning… If you want to have “console” like install/uninstall repeatability but still some value in centralized servicing you have to have some way to tell the bits that application “A” brought onto the system apart from the bits that application “B” brought onto the system.

A little secret for those who don’t know the real reason for Windows’ success. It’s not that Windows is necessarily by itself so wonderful and useful – that’s somewhat important when competing against other integrated devices, but the real reason is that it’s a way for other people to make money. Other people can write software for Windows and there’s a whole “virtuous” cycle where 3rd-party platforms can be layered on the Windows platform and we make money, the platform folks make money and the application folks make money and hopefully, if there’s any point to the application of technology at all, either end users are living happier, more fulfilled lives or they’re making more money themselves. (Yes, I’m a capitalist… ;-)

The problem is that all these platforms quickly become a cesspool on the target machine. If your “platform” is just a super duper matrix multiplication library that games can use to manipulate their 3d object spaces, then it’s not a big deal – the next first person shooter probably doesn’t even want to be exposed to a new numerical algorithm improvement without having time to verify that it does what they expect.

But if your “platform” is a database or something that provides shared functionality across the machine, you have a bigger problem. Maybe there can only be one version, but we don’t want to get into the traditional “DLL Hell” problem that libraries like MFC42.DLL have “enjoyed” for ages. (In case you’re not aware of it, it goes something like this. Application “A” comes on the system with MFC42.DLL version 1.0. Application “B” comes on the system with version 1.1, which is obviously “better”. Hunh… Application “A” stopped working. OK, the last thing I did was install application “B”, so I uninstall it. Wow, all the uninstallers in the world know that since MFC42.DLL version 1.1 is better than 1.0, you should just leave it there, so application “A” is still broken. The end user is left with the only supported action being to reformat their hard drive and reinstall Windows. Bletch.)

Our solution is that application “A” says that it came with 1.0 and application “B” says it came with 1.1. When application “B” is uninstalled we apply the rule of “what would be the version to use as if application ‘B’ had never been installed”. We keep the bits for 1.0 and 1.1 separate and we apply “publisher policy” to ensure that clients, by default, get 1.1. (A local administrator always has the right to lock back to 1.0, just like they had the right to not install the service pack or QFE in the first place. It may not work, and it may escape any reasonable support boundaries, but the alternatives are that either line-of-business applications stop working or enterprises delay deploying critical security fixes until their IT departments have had time to verify and fix any compat problems.)
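Here is a rough sketch of that rule in Python. Everything in it is an assumption made for illustration: the `Machine` class and its methods are hypothetical, and "publisher policy" is reduced to "clients get the highest version that some installed application brought". The point is only that the answer is derived from what is installed right now, never from history, so uninstalling “B” puts you back where you would have been without it.

```python
def version_key(version: str) -> tuple:
    """Turn "1.10" into (1, 10) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


class Machine:
    """Toy model: each app records which version of a shared component it brought."""

    def __init__(self):
        self.brought_by = {}  # app name -> component version that app came with
        self.locked = None    # optional administrator override

    def install(self, app: str, version: str) -> None:
        self.brought_by[app] = version

    def uninstall(self, app: str) -> None:
        self.brought_by.pop(app, None)

    def lock(self, version: str) -> None:
        if version not in self.brought_by.values():
            raise ValueError(f"version {version} is not on this machine")
        self.locked = version

    def effective_version(self):
        """Version clients get: the admin lock if set, else the highest one present."""
        if self.locked is not None:
            return self.locked
        if not self.brought_by:
            return None
        return max(self.brought_by.values(), key=version_key)


m = Machine()
m.install("A", "1.0")
m.install("B", "1.1")
print(m.effective_version())  # 1.1
m.uninstall("B")
print(m.effective_version())  # 1.0
```

An uninstaller built this way never has to guess whether to leave 1.1 behind; it just removes what “B” declared and recomputes.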

I hope it’s clear how this is a better model. A lot of attempts to make the current world better have been tried. None of them can really handle uninstall well because they can’t derive what the state of the system should be after uninstall.

Another way in which software deployment has always been messed up is that people don’t test their software in the way that they expect it to run on target machines. There are magic scripts, etc. that you run in the build/test environment (or you maybe trust Visual Studio’s F5 handling to magically set up the environment correctly) which bear absolutely no relationship to what is done to get the software onto an actual customer’s machine. (You bet your bippy that the dev teams have tools that do the “full uninstall” because otherwise they’d be forced to wipe/reformat their Windows installations for each test pass.)

So one of our other mantras is that if you can run the software, you can deploy it. Meaning that the descriptive information needed to deploy the software has to be there for the code to fire up and print “Hello, World!\n” in the first place.

The end result of all this is that versions need to matter, even during development and debugging. If you have to assume that versions might not change, you can’t do any aggressive validation or self-checking. Reproducibility goes down the tubes and you’re left in the current quagmire of trying to figure out not just what’s on your machine now but what may have come on and off the machine over time.

It seems that we’re lonely in these opinions. We get continual resistance from teams who don’t want to give up the looseness of how they’ve done software development in the past (even though it’s exactly this looseness that allows customers’ machines to get so messed up at runtime). They don’t even want to change version numbers.

It’s hard to get started on providing a deep and useful solution when people don’t even want to have to change version numbers.
