Office Online and “Content Watson”

Hmmm... it's been a while since I've been here, hasn't it? Guess I've been busy...

One of the things my team does is the Office Online site.

Now that's probably a bit arrogant, because the reality is that there are hundreds of people who "do" the Office Online site - from content writers on our User Assistance (UA) team, to site managers (who do the daily programming of the home pages for Assistance, Training, etc.) to international folks who manage the 42 (count 'em, 42) markets in which we publish online assistance content for Office.  But my team is the "program management" team which means we design features and write the specifications for how those features work.  Don't like how print works on the site - that's my team's fault.  Don't like the topic about printing from Excel - that's UA.  (Of course, I will take any and all complaints, problems, etc. and get them to the right place).

One of the really cool things we did with Office Online is something called "Content Watson".  To understand the term, it's helpful to have a little background...

Watson is a term used at Microsoft (it has probably attained buzzword status within the company) for an actually rather elegantly simple set of technologies.  The idea started with what we call "crash watson" - called "Error Reporting" externally.  The idea was that we know people crash in our software but we don't always know why, and without being able to fire up a debugger on their machine, it's rather hard to find out.  Sometimes it's something weird about their machine - the kind of drivers they have, some combination of software, etc.  The brilliant guy behind Watson - he actually used to work for me, but before he did Watson - asked: what if we captured "mini dumps" of a crash, uploaded those to a web site and did some automatic analysis on them to identify common "buckets" - reports that, based on the "crash signature" (what's on the stack, the register values, etc.), seem to be the same crash?  Then you could do some triage and address the most common problems first.  In fact, this has worked spectacularly well for products from Office to Internet Explorer to Windows.  All now have this automatic error reporting built in and it helps us improve stability because we get live reports on actual crashes customers have encountered - the most commonly occurring ones are top priority for service patches.

So distill this to the next level of abstraction and you have Watson = collect a bunch of information about problems, do some automatic analysis to bucketize the problem reports and address the most common ones first.
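To make that abstraction concrete, here is a minimal Python sketch of the "bucketize and rank" step. It is only an illustration, not how Watson is actually implemented: the report fields (`signature`, `machine`) and the signature strings are made up for the example, and real crash signatures are derived from the mini dump rather than handed over as a string.

```python
from collections import Counter

def bucketize(reports):
    """Group problem reports by signature; return (signature, count) pairs, most common first."""
    counts = Counter(report["signature"] for report in reports)
    return counts.most_common()

# Hypothetical reports: the field names and signature strings are assumptions.
reports = [
    {"signature": "module_a!save+0x1a", "machine": "pc-1"},
    {"signature": "module_b!print+0x40", "machine": "pc-2"},
    {"signature": "module_a!save+0x1a", "machine": "pc-3"},
    {"signature": "module_a!save+0x1a", "machine": "pc-4"},
    {"signature": "module_c!open+0x08", "machine": "pc-5"},
    {"signature": "module_b!print+0x40", "machine": "pc-6"},
]

buckets = bucketize(reports)
for signature, count in buckets:
    print(f"{count:3d}  {signature}")
```

The bucket at the top of the list is the one you would triage first.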

Content Watson does this for our online assistance content.  (Just about) every piece of content on Office Online has feedback UI - "rate this template", "was this article helpful?", etc.  You may wonder where this feedback goes - or you may be one of the cynical ones who suspect it just goes into the bit bucket.  Well, you'd be wrong.  It goes into a SQL Server database on our site backend, which tags it with an "Asset ID" - the unique identifier for the piece of content you rated (including language).  Daily, we roll that data up into a summary database that aggregates the individual feedback records on assets into an asset-level summary.  The cool thing about an asset is that it has a writer - an actual human being to whom we can expose what people on the site were saying yesterday about his or her content.  Want to know which of the 80 articles you've written on Word are most frequently read and have the lowest ratings?  Here they are.  So whadaya gonna do about it?  Well, when we collect the feedback, we also collect verbatims - why was it unhelpful, what was wrong about it?  We show the authors these, too.  They read through those (the ones that don't say "you suck", or "Tell Billy Gates to stop making so much money" - those ones really help our writers out, thanks a lot - you know who you are) and usually can pretty quickly understand how to improve their content.
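The daily roll-up described above can be sketched in a few lines of Python. This is a toy, in-memory version under stated assumptions, not the real pipeline (which lives in SQL Server): the asset IDs, the `(asset_id, was_helpful, verbatim)` record shape, and the `min_ratings` cutoff are all invented for illustration, and the number of ratings stands in for how often an asset is read.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class AssetSummary:
    ratings: int = 0          # total feedback records for the asset
    helpful: int = 0          # how many of those said "helpful"
    verbatims: list = field(default_factory=list)

    @property
    def helpful_rate(self):
        return self.helpful / self.ratings if self.ratings else 0.0

def roll_up(feedback):
    """Aggregate individual feedback records into one summary per asset."""
    summary = defaultdict(AssetSummary)
    for asset_id, was_helpful, verbatim in feedback:
        entry = summary[asset_id]
        entry.ratings += 1
        entry.helpful += int(was_helpful)
        if verbatim:
            entry.verbatims.append(verbatim)
    return dict(summary)

def worst_first(summary, min_ratings=2):
    """Rank assets by lowest helpful rate; ties go to the asset with more ratings."""
    rated = [(aid, s) for aid, s in summary.items() if s.ratings >= min_ratings]
    return sorted(rated, key=lambda item: (item[1].helpful_rate, -item[1].ratings))

# Hypothetical asset IDs (content + language); the record shape is an assumption.
feedback = [
    ("outlook-about-cc-bcc/en-us", False, "But how do I show it?"),
    ("outlook-about-cc-bcc/en-us", False, "How do I show BCC???"),
    ("outlook-about-cc-bcc/en-us", True, ""),
    ("word-print-a-document/en-us", True, ""),
    ("word-print-a-document/en-us", True, ""),
]

summary = roll_up(feedback)
for asset_id, entry in worst_first(summary):
    print(asset_id, f"{entry.helpful_rate:.0%}", entry.verbatims)
```

A writer's view is then just this ranking filtered to the assets he or she owns, with the verbatims attached.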

One great example is we have a topic for Outlook called "About CC and BCC".  This topic has been around since Outlook 98 and the content hasn't really changed, because the feature hasn't changed.  So we've shipped it again and again, unmodified since it was first written.  When we first went live with online content on Office Online - boom, this one showed up as a top negative for the Outlook team.  When we looked at the verbatims, it was really clear why: "But how do I show it?"  "How do I show BCC???" "How do I get BCC to show???".  Turns out our article said "If the BCC isn't visible when you create a message, you can show it."  But we didn't tell you HOW to show it.  Now it turns out we have another topic called "View the BCC field for new messages in Outlook" - so that was covered, but we didn't link to it from the overview topic that explained what BCC was.  The fix: change the last few words of the "About CC and BCC" topic to "you can show it." - i.e. make "show it" a hyperlink to the other topic.  Republish the content - we can do this since everything's online - and voila, the negative ratings quickly went down on this topic.  Turns out that probably FOR YEARS people had been saying, "yes, but how do I show it?" - but we couldn't hear them.  Once we could, it was easy to fix.

There are dozens of stories like this.  We've republished or modified literally thousands of pieces of content in response to customer feedback like this.  On our template site, we've added hundreds of new templates in response to user requests.

Of course, my team can take only indirect credit for this because it's really our content team that's doing the heavy lifting here.  We just build the weight lifting machines.


Comments (8)

  1. Nektar says:

    When you update the content, do you update the help system in Office as well? I know that you can set your Office programs to get their help content from the web, but if you fix the web content, how do you ensure that this content is also available to customers who are not using the online content but are instead reading from the offline help system? Do they have to wait for the next version? Why don’t you autoupdate help topics?

  2. Mike Kelly says:

    We don’t currently update the offline content, only the online.

    The main reason is that we’re using HTML Help as our offline content repository and it’s not possible to do incremental updates to that store – it is a compiled binary format. We’ve organized our HTML Help files on an application level – so there’s one for Word, one for Excel, etc. So we would have to rebuild the entire CHM (content) and AW (index) file for Word if we changed just a single Word topic. We could possibly do this as part of the service packs, but we decided that since most people download the service packs and the binary HTML Help format is not patchable, we would significantly bloat the service pack downloads if we were to add updated HTML Help content to them.

    As more and more people have constant Internet connections through broadband (that number has surpassed 50% of all Internet users in the US, and is on the way there in many Western European countries), offline is less important. Not unimportant, but less. We believe the best experience is online – you get the most up to date content and can provide feedback.

  3. It comes from the Sherlock Holmes stories.
