Exposure Control: Software Services Peep Show

Article
12/13/2010

It’s About Software Quality

Not sure what you were expecting here at Your Software Has Bugs, but ignore that picture on the left, and learn how Exposure Control lets you roll out dangerous software while minimizing risk to your users by giving them only a peep into your new service.

peep n. a quick or furtive look or glance.

Dangerous Software?

Are you talking about disk erasers or malware or something?

No, I am talking about Vnext of your service. You may have limited your BUFT (Big Up-Front Testing) so as to better leverage TiP (Testing in Production). This can be a smart way to redistribute your Test/QA dollars to optimize for quality results. But like most things that are worth it, it carries risk. Or you may have BUFT’ed the heck out of your service, but if it has user-exposed changes to behavior or UI elements, then you risk annoying your users…even driving them away. The goal here, then, is to mitigate this risk when deploying Vnext to replace Vcurr.

Exposure Control, Controlling the Exposure of your New and Dangerous Software to Your Users

There, that about says it. But some details and implementation will help drive the point home.

You are launching Vnext, and I have already told you it is dangerous. So do you just throw the switch, put all your users on the new service, and hope for the best? No, let’s control which users see the new and dangerous Vnext, while keeping most of our users on the tried and true Vcurr. With software services, you control what is deployed, and you control what bits your users end up exercising.

Exposure Control to Limit Scale

This is about how many users see Vnext. First expose it to only a few, and evaluate the results, then as you gain confidence dial it up. And keep your Get-Out-Of-Jail card in your hip pocket… the ability to roll completely back to Vcurr if all seems to be going amiss.
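To make the dial concrete, here is a minimal sketch of one common way to do it: hash each user ID into a stable bucket and compare that bucket to the current exposure percentage. The names bucket, choose_version, and the salt string are my own illustrations, not part of any particular product, and a real service would read the percentage from live configuration.

```python
import hashlib


def bucket(user_id: str, salt: str = "vnext-rollout") -> int:
    """Map a user to a stable bucket in the range 0..99 (hypothetical helper)."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_version(user_id: str, exposure_percent: int) -> str:
    """Send roughly exposure_percent of users to Vnext, everyone else to Vcurr."""
    return "vnext" if bucket(user_id) < exposure_percent else "vcurr"


# Dial it up as confidence grows: 1, then 5, then 25, then 100.
# Setting the dial to 0 is the Get-Out-Of-Jail card: everyone is back on Vcurr.
print(choose_version("user-12345", 5))
```

Because the bucket depends only on the user ID and the salt, a user who sees Vnext at 5% still sees it at 25%, so nobody flips back and forth as you ramp up.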

The caveat here is data and state. If Vnext screws up your database, then it is not as simple as rolling back the code. Protect your data, keep redundant stores, and consider read-only access on ramp-up.
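One way to picture read-only access on ramp-up is a thin wrapper that lets Vnext read from the store but refuses its writes. This is only a sketch under my own assumptions: Store, ReadOnlyStore, and store_for are hypothetical names, and a real service would more likely enforce this with database permissions than in application code.

```python
class Store:
    """Stand-in for the real data store (an assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class ReadOnlyStore:
    """Passes reads through to the backing store and refuses writes."""

    def __init__(self, backing):
        self._backing = backing

    def get(self, key, default=None):
        return self._backing.get(key, default)

    def put(self, key, value):
        raise PermissionError("Vnext is read-only during ramp-up")


def store_for(version, backing, ramping_up=True):
    """Give Vnext a read-only view while ramping up; Vcurr keeps full access."""
    if version == "vnext" and ramping_up:
        return ReadOnlyStore(backing)
    return backing
```

With this in place, a bad Vnext can still show users the wrong thing, but rolling back the code is enough because it never touched the data.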

Exposure Control to Limit Diversity

I know…I know… you’ve heard diversity is a good thing, and it can be, especially in the scope of your Test Cases. But Rapid Prototyping is a different story. When creating a new feature or entirely new service, it is expensive to ensure it works for everyone and every case before you even know if it is the right thing to do. Rapid Prototyping lets you get your new features and services out there quickly for evaluation in the real world, without paying for all that BUFT. How about if you design and test your new web app prototype to only work on IE8? That would sure save a lot of development and test time. But then you have to ensure that only IE8 users will see the Vnext. In other words, you need to limit the diversity of browsers to IE8 only.

Or say you are launching a new Geo-aware feature, but before you commit to assembling all the data you need to work everywhere, it is a lot easier to just prototype a version that only has California data. In this case you only want California users to see the new feature. So you limit the Geo-Diversity to just the Golden State.

Other ways to limit diversity are by: Time of Day, Operating System, Recent Purchases (for an eCommerce site), or anything that we can know about the user.
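Limiting diversity boils down to a set of rules about what we know of the user, all of which must pass before Vnext is shown. Here is a minimal sketch using the IE8 and California examples. RequestContext, only_browser, only_region, and is_exposed are hypothetical names, and how you actually detect browser or region (user-agent parsing, IP geolocation) is left out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RequestContext:
    """What we know about the user for this request (assumed fields)."""
    browser: str  # e.g. "IE8"
    region: str   # e.g. "CA"


Rule = Callable[[RequestContext], bool]


def only_browser(name: str) -> Rule:
    """Rule that passes only for the named browser."""
    return lambda ctx: ctx.browser == name


def only_region(code: str) -> Rule:
    """Rule that passes only for the named region."""
    return lambda ctx: ctx.region == code


def is_exposed(ctx: RequestContext, rules: List[Rule]) -> bool:
    """A user sees Vnext only if every rule passes."""
    return all(rule(ctx) for rule in rules)


rules = [only_browser("IE8"), only_region("CA")]
print(is_exposed(RequestContext(browser="IE8", region="CA"), rules))
print(is_exposed(RequestContext(browser="IE8", region="WA"), rules))
```

Time of Day, Operating System, or Recent Purchases would each be one more rule of the same shape, and the rules can be combined with a scale limit so that only some fraction of the matching users see Vnext.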

What’s in a Name?

The concepts I talk about here have been evolving for a while… I take no credit other than to stand on the shoulders of giants. Some examples:

Ronny Kohavi, et al. have been banging the drum for a long time on the value of Online Experimentation in assessing user-exposed changes. For example, see Controlled Experiments on the Web: Survey and Practical Guide from 2007.
In 2009 Timothy Fitz of IMVU shared how they roll out their Vnext in Continuous Deployment at IMVU: Doing the impossible fifty times a day.

But as for the term Exposure Control, that is something Ken Johnston and I came up with in his office sometime back in August of 2009. We used it extensively in our Sept 2009 Microsoft Thinkweek paper (internal Microsoft only… sorry), TiP-ing Services Testing – Why the most critical mind shift for Microsoft’s success in S + S is Testing in Production (TiP), that we co-wrote with Ravi Vedula (MS Word tells me the term occurs 22 times in that paper… yipes!). I then discussed it again at the Better Software Conference in June 2010 when I presented Testing with Real Users: User Interaction and Beyond, with Online Experimentation. Since then fellow Microsofties and TiP enthusiasts Nathan Dye and Eric Brechner have also picked up the term.

So while there are many names for the mechanism/implementation of what I am talking about here, such as Flighting, A/B Testing (a bad name since you can also have C, D, and E), or Online Experimentation, the term Exposure Control is intended to convey a software quality strategy that enables you to move your testing and evaluation to production (TiP) while mitigating risk to your users.