Office Online and "Content Watson"

Hmmm... it's been a while since I've been here, hasn't it? Guess I've been busy...

One of the things my team does is the Office Online site.

Now that's probably a bit arrogant, because the reality is that there are hundreds of people who "do" the Office Online site - from content writers on our User Assistance (UA) team, to site managers (who do the daily programming of the home pages for Assistance, Training, etc.), to international folks who manage the 42 (count 'em, 42) markets in which we publish online assistance content for Office. But my team is the "program management" team, which means we design features and write the specifications for how those features work. Don't like how printing works on the site? That's my team's fault. Don't like the topic about printing from Excel? That's UA. (Of course, I will take any and all complaints, problems, etc. and get them to the right place.)

One of the really cool things we did with Office Online is something called "Content Watson". To understand the term, it's helpful to have a little background...

Watson is a term used at Microsoft (it has probably attained buzzword status within the company) for what is actually an elegantly simple set of technologies. It started with what we call "crash Watson" - called "Error Reporting" externally. We know people crash in our software, but we don't always know why, and without being able to fire up a debugger on their machine, it's rather hard to find out. Sometimes it's something weird about their machine - the kind of drivers they have, some combination of software, etc. The brilliant guy behind Watson - he actually used to work for me, but before he did Watson - asked: what if we captured "mini dumps" of a crash, uploaded those to a web site and did some automatic analysis on them to identify common "buckets" - groups of reports that, based on the "crash signature" (what's on the stack, the register values, etc.), seem to be the same crash? Then you could do some triage and address the most common problems first. In fact, this has worked spectacularly well for products from Office to Internet Explorer to Windows. All now have this automatic error reporting built in, and it helps us improve stability because we get live reports on actual crashes customers have encountered - the most commonly occurring ones are top priority for service patches.

So distill this to the next level of abstraction and you have Watson = collect a bunch of information about problems, do some automatic analysis to bucketize the problem reports and address the most common ones first.
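If you like to see that kind of abstraction as code, here's a minimal sketch in Python. To be clear, this is not how Watson is actually implemented: the report format and the `signature` function are made up for illustration, and a real crash signature looks at much more (what's on the stack, the register values, etc.).

```python
from collections import Counter


def signature(report):
    # Hypothetical crash signature: the module name plus the top two
    # stack frames. This is an assumption made for the example only.
    return (report["module"], tuple(report["stack"][:2]))


def bucketize(reports):
    """Group reports by signature and return the buckets, most common first."""
    counts = Counter(signature(r) for r in reports)
    return counts.most_common()


# Made-up sample reports, not real crash data.
reports = [
    {"module": "winword.exe", "stack": ["RenderTable", "LayoutPage", "Main"]},
    {"module": "winword.exe", "stack": ["RenderTable", "LayoutPage", "Idle"]},
    {"module": "excel.exe", "stack": ["Recalc", "Main"]},
]

# Triage order: the biggest bucket comes first.
for sig, hits in bucketize(reports):
    print(hits, sig)
```

The first two reports differ further down the stack but share a signature, so they land in the same bucket and that bucket goes to the top of the list.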

Content Watson does this for our online assistance content. (Just about) every piece of content on Office Online has feedback UI - rate this template, "was this article helpful?", etc. You may wonder where this feedback goes - or you may be one of the cynical ones who suspect it just goes into the bit bucket. Well, you'd be wrong. It goes into a SQL Server database on our site backend, which tags it with an "Asset ID" - the unique identifier for the piece of content you rated (including language). Daily, we roll that data up into a summary database that aggregates the individual feedback records into an asset-level summary. The cool thing about an asset is that it has a writer - an actual human being to whom we can expose what people on the site were saying yesterday about his or her content. Want to know which of the 80 articles you've written on Word are most frequently read and have the lowest ratings? Here they are. So whadaya gonna do about it? Well, when we collect the feedback, we also collect verbatims - why was it unhelpful, what was wrong about it? We show the authors these, too. They read through those (the ones that don't say "you suck", or "Tell Billy Gates to stop making so much money" - those ones really help our writers out, thanks a lot - you know who you are) and usually can pretty quickly understand how to improve their content.
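The daily roll-up is simple enough to sketch. What follows is a rough illustration in Python, not our actual pipeline (the real thing lives in SQL Server): the record fields, the asset IDs and the sample verbatims are all invented for the example, and ranking by "share of helpful ratings, then number of responses" is just one plausible way to surface the worst offenders.

```python
from collections import defaultdict


def roll_up(feedback):
    """Aggregate individual feedback records into one summary per asset."""
    summary = defaultdict(lambda: {"responses": 0, "helpful": 0, "verbatims": []})
    for rec in feedback:
        s = summary[rec["asset_id"]]
        s["responses"] += 1
        s["helpful"] += 1 if rec["helpful"] else 0
        # Verbatims are optional free-text comments.
        if rec.get("verbatim"):
            s["verbatims"].append(rec["verbatim"])
    return dict(summary)


def worst_first(summary):
    """Order asset IDs by lowest helpful share, then by most responses."""
    return sorted(
        summary,
        key=lambda a: (
            summary[a]["helpful"] / summary[a]["responses"],
            -summary[a]["responses"],
        ),
    )


# Hypothetical records; the asset IDs (content ID plus language) are invented.
feedback = [
    {"asset_id": "HP001-en-us", "helpful": False, "verbatim": "Didn't say how."},
    {"asset_id": "HP001-en-us", "helpful": False, "verbatim": "Steps are missing."},
    {"asset_id": "HP001-en-us", "helpful": True},
    {"asset_id": "HP002-en-us", "helpful": True},
]

summary = roll_up(feedback)
for asset in worst_first(summary):
    print(asset, summary[asset]["responses"], summary[asset]["verbatims"])
```

The writer's view is then just this list filtered down to the assets he or she owns, with the verbatims attached.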

One great example: we have a topic for Outlook called "About CC and BCC". This topic has been around since Outlook 98 and the content hasn't really changed, because the feature hasn't changed. So we've shipped it again and again, unmodified since it was first written. When we first went live with online content on Office Online - boom, this one showed up as a top negative for the Outlook team. When we looked at the verbatims, it was really clear why: "But how do I show it?" "How do I show BCC???" "How do I get BCC to show???". Turns out our article said "If the BCC isn't visible when you create a message, you can show it." But we didn't tell you HOW to show it. Now it turns out we have another topic called "View the BCC field for new messages in Outlook" - so that was covered, but we didn't link to it from the overview topic that explained what BCC was. The fix: change the last few words of the "About CC and BCC" topic to "you can show it." - i.e. make "show it" a hyperlink to the other topic. Republish the content - we can do this since everything's online - and voila, the negative ratings quickly went down on this topic. Turns out that probably FOR YEARS people had been saying, "yes, but how do I show it?" - but we couldn't hear them. Once we could, it was easy to fix.

There are dozens of stories like this. We've republished or modified literally thousands of pieces of content in response to customer feedback like this. On our template site, we've added hundreds of new templates in response to user requests.

Of course, my team can take only indirect credit for this because it's really our content team that's doing the heavy lifting here. We just build the weight lifting machines.