Wait, that was MY bug? Ouch!


Over the weekend, the wires were full of reports of a speech recognition demo at Microsoft’s Financial Analysts Meeting here in Seattle that went horribly wrong.

Slashdot had it, Neowin had it,  Digg had it, Reuters had it.  It was everywhere.

And it was all my fault.

 

Well, mostly.  Rob Chambers on the speech team has already written about this; here’s the same problem from my side of the fence.

About a month ago (more-or-less), we got some reports from an IHV that sometimes when they set the volume on a capture stream the actual volume would go crazy (crazy, for those that don’t know, is a technical term).  Since volume is one of the areas in the audio subsystem that I own, the bug landed on my plate.  At the time, I was overloaded with bugs, so another of the developers on the audio team took over the investigation and root caused the bug fairly quickly.  The annoying thing about it was that the bug wasn’t reproducible – every time he stepped through the code in the debugger, it worked perfectly, but it kept failing when run without any traces.

 

If you’ve worked with analog audio, it’s pretty clear what’s happening here – there’s a timing issue that is causing a positive feedback loop, the kind that results from a signal being fed back into an amplifier.

It turns out that one of the common causes of feedback loops in software is a concurrency issue with notifications – a notification is received with new data, which updates a value, updating the value causes a new notification to be generated, which updates a value, updating the value causes a new notification, and so-on…
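To make the loop concrete, here is a minimal sketch in Python. It is not the actual audio code, and every name in it (`VolumeControl`, `NaiveClient`, `GuardedClient`, `run`, the `source` tag) is made up for illustration. The naive client writes a new value every time it is notified, so a single external change keeps echoing until a safety cap stops it. The guarded client tags its own writes and ignores the notifications they cause, which is one common way to break this kind of loop.

```python
from typing import Callable, List, Optional, Tuple

# Every name here is hypothetical; this is not the Windows audio code.
Listener = Callable[[float, Optional[object]], None]


class VolumeControl:
    """Holds a level and notifies every listener each time it is set."""

    def __init__(self) -> None:
        self.level = 0.0
        self.notifications = 0
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def set_level(self, level: float, source: Optional[object] = None) -> None:
        self.level = level
        self.notifications += 1
        for listener in self._listeners:
            listener(level, source)


class NaiveClient:
    """Reacts to every notification by writing a new value back."""

    def __init__(self, control: VolumeControl, cap: int = 50) -> None:
        self.control = control
        self.cap = cap  # safety cap so the demonstration terminates

    def on_change(self, level: float, source: Optional[object]) -> None:
        if self.control.notifications < self.cap:
            # This write triggers a new notification, which lands back here.
            self.control.set_level(min(1.0, level + 0.05), source=self)


class GuardedClient(NaiveClient):
    """Same reaction, but ignores notifications caused by its own writes."""

    def on_change(self, level: float, source: Optional[object]) -> None:
        if source is self:
            return  # our own echo: stop the loop here
        super().on_change(level, source)


def run(client_type: type) -> Tuple[int, float]:
    """Make one external change and report (notification count, final level)."""
    control = VolumeControl()
    client = client_type(control)
    control.subscribe(client.on_change)
    control.set_level(0.5)
    return control.notifications, control.level
```

Calling `run(NaiveClient)` drives the level all the way up to 1.0 and only stops when the cap is reached, while `run(GuardedClient)` settles after the external change plus a single reaction.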

The code actually handled most of the feedback cases involving notifications, but there were two lower level bugs that complicated things.  The first bug was that there was an incorrect calculation that occurred when handling one of the values in the notification, and the second was that there was a concurrency issue – a member variable that should have been protected wasn’t (I’m simplifying what actually happened, but this suffices). 
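The second kind of bug, a member variable that should have been protected but wasn't, can be sketched in a few lines. Again this is only an illustration under my own assumptions: `GainState`, `hammer`, and the counter are hypothetical and have nothing to do with the real audio code. The point is that an unprotected read-modify-write can lose updates depending on exactly when threads are scheduled, which is also why this class of bug tends to vanish when you single-step in a debugger.

```python
import threading


class GainState:
    """Hypothetical shared state touched by several threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.updates = 0

    def record_unprotected(self) -> None:
        # Read-modify-write with no lock: two threads can read the same
        # value, and one of the two updates is then lost.
        current = self.updates
        self.updates = current + 1

    def record_protected(self) -> None:
        # The lock makes the read-modify-write a single indivisible step.
        with self._lock:
            current = self.updates
            self.updates = current + 1


def hammer(protected: bool, threads: int = 4, iterations: int = 20_000) -> int:
    """Run the chosen update method from several threads; return the total."""
    state = GainState()
    method = state.record_protected if protected else state.record_unprotected

    def worker() -> None:
        for _ in range(iterations):
            method()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return state.updates
```

With the defaults, `hammer(protected=True)` always returns 80000. `hammer(protected=False)` can come up short on some runs and be exactly right on others, and that intermittency is what makes such bugs hard to reproduce.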

 

As a consequence of these two very subtle low level bugs, the speech recognition engine wasn’t able to correctly control the gain on the microphone.  When it tried, it hit the notification feedback loop, which caused the microphone to clip, which meant that the samples being received by the speech recognition engine weren’t accurate.
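For readers who haven't met clipping before, here is a rough sketch of why too much gain ruins the samples. It assumes signed 16-bit PCM samples, which is my assumption and not something stated above, and the helper names `apply_gain` and `sine` are hypothetical. Once the scaled value no longer fits in the sample range it is pinned to the limit, so the tops and bottoms of the waveform are flattened and the original shape is lost.

```python
import math
from typing import List, Tuple

# Assumed sample format: signed 16-bit PCM.
INT16_MIN, INT16_MAX = -32768, 32767


def apply_gain(samples: List[int], gain: float) -> Tuple[List[int], int]:
    """Scale samples by gain; return the result and how many samples clipped."""
    out: List[int] = []
    clipped = 0
    for sample in samples:
        scaled = round(sample * gain)
        if scaled > INT16_MAX or scaled < INT16_MIN:
            clipped += 1
            # Pin the value to the nearest representable limit.
            scaled = max(INT16_MIN, min(INT16_MAX, scaled))
        out.append(scaled)
    return out, clipped


def sine(amplitude: int, count: int = 32) -> List[int]:
    """One cycle of a sine wave as integer samples."""
    return [round(amplitude * math.sin(2 * math.pi * k / count)) for k in range(count)]
```

A tone built with `sine(10000)` passes through `apply_gain(tone, 1.0)` untouched, but `apply_gain(tone, 8.0)` reports clipped samples and the peaks come back pinned at 32767 and -32768.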

There were other contributing factors to the problem (the bug was fixed on more recent Vista builds than the one they were using for the demo, there were some issues with the way the speech recognition engine had been “trained”, etc.), but it doesn’t matter – without my bugs, the problem wouldn’t have been nearly as significant.

Mea Culpa.

Comments (98)

  1. Larry, a long time Microsoft blogger, and key member of the audio team, speaks out about the root cause…

  2. notquitsure says:

    May I repeat a never answered question I put in Larry’s blog:

    Note that the "delete" and "select all" commands are perfectly recognized and written down (but not executed as wanted). If "all the audio data that was being received by WSR was being clipped, and thus was incredibly distorted", how does it come that easy words are distorted while longer ones are perfectly understood? And what about commands not being executed but written down, is it also an audio gain issue???

  3. notquitesure: Beats me, it’s tied up in DSP and how those algorithms work.  The SPL for words like "delete" and "select all" may be higher than the ones for the other words.

  4. Bob says:

    Why did the presenter not rehearse his demo ten times before he went up to make sure everything was working?  Even if you have identified the bug now, I don’t see how a presenter could not have tested the software before he went up.

  5. Bob, see Rob Chambers’ post – they did try it multiple times.  This problem is subtle and timing related; it’s hideously unfortunate that the one time it showed up was the one time it mattered the most.

    Murphy’s law at its finest.

  6. Juha says:

    I’m surprised there’s no indication or warning that oversteering/clipping is occurring. If it has such detrimental effects, especially with analysts and TV crews present :), surely it’d be useful to notify users about it?

  7. Chris says:

    We’ve all been there.  Nothing like watching your app crash and burn on the public stage.  You seem to be taking it well though.  Keep your chin up.

    -Chris

  8. Ervin J says:

    Thanks for the good laugh.

    I wouldn’t worry too much about it. It has happened to Windows, too, and it’s still the most popular OS.

  9. Baron Davis says:

    You dog, I got some advice for you…

    Get a Mac! They got some ill voice recognition that I use at the crib in Oak-town. I walk in an say "Bitch, fix me a chicken pot pie!" and the lights come up and the stereo starts playing my Wu-Tang.

    That shit is straight baller son. Believe that!

  10. Via Scoble: Wait, that was MY bug? Ouch! Microsoft’s Larry Osterman steps up and claims responsibility for the code that botched the demo I posted on Saturday. Here’s something new and interesting from Microsoft. Someone stepping up and saying that the

  11. Erwin Alva says:

    Here’s the correct digg link (not that it matters, though)…

    http://digg.com/tech_news/When_good_demos_go_very_very_bad

  12. Let’s set Wagged PR response: “ambient noise”, so double delete the killer bug reports, select all “silent room”, double the.

  13. Norman Diamond says:

    > and root caused the bug fairly quickly.

    That’s what you get for running as root.  If you ran as a limited user then you wouldn’t cause bugs.

    > The annoying thing about it was that the bug wasn’t

    > reproducible

    So just mark it as resolved, not reproducible, and close it.  That’s what Microsoft does with lots of my Vista beta bug reports.

    Unlike ordinary Windows bugs where Microsoft requires fees to be paid before allowing bugs to be reported, with Vista betas it’s easy.

    (1) Find a secret link posted by Microsoft to get a bug reporting tool,

    http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=467250&SiteID=17

    (2) Download, install, and run the tool.

    (3) Get a reply containing some combination of:

    (3a) A request for more information, telling you to view a page on the Connect site but not allowing you to view the page because Microsoft intentionally prohibits you from viewing your own bug reports and Microsoft’s replies to them (that is, if you paid for the beta as part of an MSDN subscription),

    (3b) A resolution that Microsoft couldn’t reproduce the bug, accompanied by a request to try it again under build 5472, but not accompanied by a link from which build 5472 could be downloaded (MSDN has links for build 5472 in English and German but not Japanese),

    (3c) A bunch of question marks, for which Microsoft’s original words can’t be guessed, because Microsoft doesn’t know how to select an encoding for Microsoft’s own e-mail that can send Microsoft’s own words to the victim.

    Mr. Osterman, all you have to do is conform to your employer’s practices.  (3b) would make things easy for you.

  14. Neo1 says:

    It was a very good technical definition of what happened during the demo. But a bug is a bug and other people just see it that way. I will not be surprised if Vista will not ship by early next year. Anyway, we can all look forward to the Zune instead of Vista 🙂

  15. Everyone is talking about that Windows Vista Speech Recognition demo at the Financial Analysts Meeting….

  16. FrankSchwab says:

    Reading about this one took me back to college in, oh, say, 1982.

    We were doing a speech-recognition project for an engineering class.  We wrote code to interface to the speech recognition boards (for a PDP-11, IIRC).  We built some interfaces to control various 110V items – lights, etc. We practiced the presentation that we were to make before the entire engineering class and various industry visitors multiple times.  We were smoking.

    And it went off well.  Right up until the time the presenter said "Make Daiquiris", and the blender started up.  See, we hadn’t actually practiced with the blender plugged in – it was too loud.  While the rest of the audience roared with laughter at the gag, our presenter is walking to the back corner of the stage, bending over the mic trying to shield it from the noise, and finally gets the blender turned off.  Talking to people later, almost no one in the audience noticed the gaffe.

    Sometimes, it’s good to be lucky, and not to be a Microsoft representative in front of an audience with recorders.

  17. Bob Jones says:

    Since Microsoft is so fond of, ‘ahem’, ripping off Apple, why not steal Steve Jobs’ presentation preparation routine?

  18. Phil Wheat says:

    And people wonder why some developers try to deflect any bug from their perfect code – this is the perfect example of what happens when you stand up and say "That was my bug and I squashed it!"  (The fun in the comments, that is.)

    Good job Larry!  It’s a beta, you found an error and fixed it for RTM.  It sounds like both jobs were done – both testing in real world situations to find the bug and a solution was found.

    Oh ye who have not released code with bugs – go ahead and admit you haven’t written any code.

  19. John Walker says:

    Larry,

    Thanks for the technical explanation of this. It’s why I read your blog! Please don’t be disheartened by some of the negative comments.

  20. Keith Hill says:

    There’s a reason presenters joke nervously about "the demo gods".  No matter how many times you rehearse something the odds of something going wrong increase proportionally with the number of people watching.  I recently presented a session at a local code camp.  Rehearsed the demos dozens of times.  I get in the room, connect my laptop to the projector and go to login and XP does not recognize my password!  I tried it a couple of times slowly, made sure caps lock was off but still no joy.  Unbelievable.  So I restarted the laptop and got lucky – it allowed me to login!  Next time I will figure out the appropriate sacrifice to "the demo gods" before presenting a session like that.  🙂

  21. Bob, I think that Jobs ripped his presentation routine off from us.  There were errors all around here.

    One of these days, I’ll write about the Steve Ballmer Windows 1.0 demo at the Winter ’85 company meeting (that’s the one where he did the "how much would you pay for this" fake ad that’s on the internet).

    Stuff happens.  It sucks when it does, but it does happen.  In this case, the problem was caught by our test team (while the original report came from an IHV, the test team had test cases that explicitly caught this defect), we fixed it long before the next release cycle, and the bugs would normally never have seen the light of day.

  22. Lots of great comments from listeners, you all keep me on my A game. I think your gonna like the new intro tonight we get you in the mood for a rocking podcast. Please submit your votes at Podcast Alley….

  23. Gino Pascali says:

    The blog entry of the speech team member Rob Chambers is funny. It cites Wikipedia as source of a definition of a term of digital signal processing (clipping).

    Shouldn’t they have more scientific and accurate sources than wikipedia? I mean for a college student would be fine to cite such sources but for a member of the speech team of the biggest software company in the world?

  24. Chris Sch says:

    Dude, if this is a timing issue with your traces on, you should try doing what we do – set up a RAM drive on your test machine and send all debug output to it. It reduces the timing issues involved with tracing.

    Alternatively, just ship vista with tracing on, and have a scheduled job that cleans it up once every so often?

  25. Donald Trump says:

    You are fired!

  26. Chris, we actually use WPP (ETW based logging) for our traces, it’s essentially overhead-less (not quite, but close).

    Gino, I use the wikipedia on occasion too.  It has its issues (it can have serious accuracy problems; look at the history for "The Overlake School" for an example), but it works, and sometimes there’s no more convenient source on the web.

  27. Lon Ingram says:

    Norman "Mommy Never Loved Me" Diamond said:

    >> and root caused the bug fairly quickly.

    >That’s what you get for running as root.  If you ran as a limited user then you wouldn’t cause bugs.

    Norm, you are a douche.

  28. JoeM says:

    Speech is not that bad in Vista.  My big issue is that when typing documents in MSFT Word, the Office 2003 Speech does a better job.

    Larry thankyou for keeping us up to date.

  29. swa says:

    Your candor is refreshing and as a non-microsoftie investor I appreciate it very much. Just take extra care before you ship the product out. Good luck.

  30. Simo says:

    Larry, I think you’re allowed to hold your hand up and say that one was mine. Most people wouldn’t have the courage to do that. But it’s not your fault some apeth decides to demo it to investors.

  31. Norman Diamond says:

    Tuesday, August 01, 2006 10:52 AM by Lon Ingram

    > Norman "Mommy Never Loved Me" Diamond said:

    >> and root caused the bug fairly quickly.

    >That’s what you get for running as root.  If you ran as a

    >limited user then you wouldn’t cause bugs.

    Close.  It’s "grammar never loved me".  Sorry my pun had too many layers for you.

    Interesting that you have no comment on the way Microsoft treats more widespread bugs.  Though oddly, after I posted that comment, it seems Microsoft decided that they really will let me see some of my bug reports and their requests for further information.  I hope to try it out this weekend.  Of course even this much is only being allowed because Vista’s bugs aren’t yet being sold as a product.

  32. Doug says:

    I was present at the meeting and I didn’t really think the ‘glitch’ was as bad as the media have led everyone to believe. I made a blog post about it here: http://blogs.3sharp.com/Blog/dougv/archive/2006/08/01/1672.aspx

  33. Larry and Rob’s blog postings about the Vista Speech Recognition bug are a great indicator of how far Microsoft has come from the ‘Borg’ days.

    As a partner dealing with Microsoft, it gives me a lot of …

  34. TV says:

    This is a joke. Microsoft speech team lost all of its stars.

  35. enguillem says:

    The typical excuse is to say that the bug was solved in an early version but not in the demo version. To solve this there is a very good free product, SUBVERSION, and it works very well, simply by tagging one revision.

  36. A Friend says:

    I’m glad to see individuals at MSFT own up to problems and suggest the approp. measures to address them.  Larry, thanks for your candid commentary on the issue and best of luck during the next demo!

  37. Thank you larry for your explanation. You are clear and sincere as ever. Is your blog that made me dreaming of working on OSes, and even post like this one are a push forward to achieve my goal in the future, and study and work even more hard. Errors and bugs happens even to the greatest ones, the important thing is to not be disheartened and go forward!

  38. Reuben Harris says:

    > The annoying thing about it was that the

    > bug wasn’t reproducible – every time he

    > stepped through the code in the debugger,

    > it worked perfectly, but it kept failing when

    > run without any traces.

    In the unlikely event you didn’t know, the technical term for this is a ‘Heisenbug’. See http://en.wikipedia.org/wiki/Heisenbug#Heisenbugs.

  39. Peksi says:

    Don’t you think you should retire or something? I think there are many skilled 20 years old programmers who could make that work in 5 minutes?

    Man you are getting too old..

  40. Jasmo says:

    When next public demonstration takes place, I hope the presenter starts "Dear aunt.."

  41. furroy says:

    someone forgot some unittest cases at a lower level 🙂

  42. furroy, no, the unit tests didn’t catch this one, because that’s not what unit tests are supposed to catch.

    It WAS caught by the FVT (feature verification tests), but it was intermittent even for them.

  43. Well, sometime it also sounds like it was done intentionally? just to catch more customer attention. Microsoft was sure it will be hosted on videos.google.com.! Viral Marketing.!

    Wait – till they do a demo again – with everything – up and running. ! It will have more positive affect!

    Ah! BTW – I am not Microsoft fan, But I am indeed a marketing fan!

  44. Jasmo Chair says:

    I know that bug since 3 day but I did want say.

    If you cong that error really stupid, protent any conference.

    I do’nt love Microsoft but I use really Microsoft Productions.

    We are wait big antle Vista, if we’ll wait a long time, dont working at the Vista.

    If you are ask my cong, you are very houldying stupid.

    I’s speech properties, wait vista at the properties.

    Microsoft is wur00z.

    I kiss you everybody.

  45. Martin Bennedik says:

    And now it is on the UserFriendly comic strip:

    http://ars.userfriendly.org/cartoons/?id=20060805&mode=classic

    Keep going Larry, and don’t get intimidated by some of the negative comments. I am sure those guys haven’t worked on any big projects.

  46. Will Parker says:

    >> The annoying thing about it was that the bug wasn’t

    >> reproducible

    > So just mark it as resolved, not reproducible, and close it.  

    > That’s what Microsoft does with lots of my Vista beta bug reports.

    When I was a tester at Microsoft (Yay, MacBU!), a bug of this severity would have been called a ‘Pri 2’ – the app didn’t actually crash, but the user couldn’t proceed with their planned tasks. In most cases, Pri 2 bugs are accorded ‘Must Fix’ status unless the repro steps are deemed quite unlikely.

    I can accept that the test and/or dev teams might not have had time to actually chase down a known bug prior to the demo, but I do have to wonder why the demo team went ahead with that bug lurking in the background.

    Better to forego a demo and risk talk of slow development than to have a demo fail in an embarrassing manner and create discussion of quality issues.

    For those who’ve suggested that Microsoft adopt the ‘Apple Way’ for demos, here are the two main reasons one rarely sees a glitch in Apple demos:

    1) Since Apple never reveals its development plans ahead of time, there’s no pressure to demo half-baked software (except to Steve Jobs himself).

    2) Conversely, product teams are under intense pressure to clear up high-priority bugs as early as possible in the development cycle so that their app gets OK’d to be publicly demo’d.

    Microsoft could do with a bit *less* talk about its future plans and a good deal more focus on planning for excellence at Version 1 instead of at Version 3.

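  47. I have never seen a demo this disastrous. This is what happened when a Microsoft employee demonstrated Windows Vista’s speech recognition. It gets everything wrong; not a single thing is recognized correctly. The more he tries, the worse things get, like a boxer taking a one-sided beating. The site of the engineer who produced this terrible demo is reportedly being flooded with information from well-meaning people, and it has turned into quite a scene. Fun with Vista’s speech recognition [Guardian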

  48. Norman Diamond says:

    Saturday, August 05, 2006 7:09 PM by Will Parker

    > When I was a tester at Microsoft (Yay, MacBU!), a bug of

    > this severity would have been call a ‘Pri 2’ – the app didn’t

    > actually crash, but the user couldn’t precede with their

    > planned tasks. In most cases, Pri 2 bugs are accorded ‘Must

    > Fix’ status unless the repro steps are deemed quite unlikely.

    Wow, Microsoft’s ratings of "must fix" vs. "won’t fix" are still astounding.  A bug in the DDK compiler’s code generation is a "won’t release a hotfix", a different and older bug of a severity that destroys entire hard disk partitions is a "won’t fix", but a bug in speech recognition is a "must fix".  Even my previous level of cynicism was inadequate.

    When the DDK’s compiler causes a BSOD every app crashes and the user can’t proceed with their tasks.  When an entire hard disk partition gets corrupted the user can proceed with their tasks a day later by restoring all files from another backup, unless the backup was on a partition created by the same FDISK program.  Therefore these rank far below such a critical app as speech recognition.

    Thank you very much for this information though.  Sad as the facts are, it’s better to know them than not to know them.

  49. You’ve no doubt heard about the Microsoft Financial Analyst Meeting two weeks ago where a demonstration…

  50. Fduch says:

    @Norman Diamond

    Yes! They are like that.

    What would you say about perfectly reproducible bug in IE7 that causes all other apps and even Windows itself crash/hung/become totally unusable? (When doing things as simple as opening new tab).

    Microsoft thinks that it’s "By Design". Now I finally have proof that MS products are designed to crash everything.

    P.S.

    And why you people ignore Norman Diamond? Does his truth feel too painful?

  51. Dr Obvious says:

    The bottom line is MS has 5 years for this kind of stuff and they cram it in at the last minute.

    Poor planning.

    Vista is buggy and no where near ready.

    This problem is a symptom of a MUCH larger problem.

    Vista ETA: Summer 2007.

    Vista Stable release: 2010.

    Without WinFS Vista is nothing anyway.

    it’s like a big security upgrade (10 years overdue).

  52. Don’t be disheartened, keep going. You can’t actually judge what you have done until you are finished and you stand back and watch, but then again, technology is never "finished"; it is ever expanding…

  53. Dustin says:

    I’m not all too sure how this became a Microsoft bashing blog, but I must say that Mac voice recognition is not nearly as capable as Vista’s is. Mac’s are not required to function with a vast array of software and still easily perform in unknown platform areas.

    I must also defend Microsoft on the "charges" of stealing Apple ideas and so forth, the reality is that you really can’t avoid doing the same thing others are doing in the software business, when your consumers want something you have to deliver. And I have to say, when your consumer base is between 60-80% of all persons living, then you can never say no. Simple as that.

    Mac users may also benefit from doing some research, as many items that they claim MS stole from Apple are actually a microsoft first.

    Windows Sidebar (dock) – Primitive version on Windows 95 (or was it 98?)

    Fast app switching – Windows 95

    Voice recognition – PC Compatible software (non-mac)

    And generally, if you want to fight over who had begun to make their OS sleek first, it’s a definite win by MS. (windows 95 was one hell of a makeover)

  54. Ron says:

    Microsoft doesn’t know how real people experience these kinds of bugs several times a day.  Bill should make a few less billion and pay to fix the problems.  Mac commercials can’t be answered because they are true.  MS stopped trying to be better because they are the "most popular OS in the world."  Sounds a lot like Big Blue twenty years earlier…

    Google-type web based programs seem like the future so Bill can lose a couple of billion now or lose it all in ten years.  My computer reports the same bugs to MS each week but MS never fix them.

  55. From my son, Joseph, speech recognition the Vista way… And for those who think I take delight in bashing Microsoft, here’s a follow-up video that shows the demo wasn’t really as bad as it seemed. It was worse… Seriously, though,…

  56. ndiamond says:

    Here is some really good news.

    Yesterday Vista build 5472 proved itself more powerful than Windows 95 and more powerful than some observations in Windows 98 and Windows 2000.  Instead of destroying the contents of one or two partitions on external hard drives (or in the case of Windows 98 deleting the wrong logical drive in the extended partition), Vista destroyed the entire contents of an internal hard drive.  Right down to the MBR no longer having its signature, and not having any partitions on it after being re-signed.  On a different physical drive internal to the same machine I have Windows 2000 SP4 with the registry key to recognize 48-bit LBA so it could also show the partitions when they used to exist, it agreed with the Vista DVD’s repair console about the fact that the partition table was gone, and it could re-sign the MBR but still not bring back the partition table.

    Now, here is why that is good news.

    Windows Media Centre crashed three times.  Two of those times it yielded Vista’s equivalent of BSODs, though the BSOD text wasn’t visible.  After the last time, that is when Vista wouldn’t reboot, the booter told me to try to do a repair, and the repair tool in the Vista DVD failed.

    Therefore this problem results in the user not being able to use Windows Media Centre for its intended purpose.  Therefore Microsoft might rate it as a must-fix.  Maybe.  Let’s hope.

    If it were just deleting the entire hard drive including deleting the Vista installation without Windows Media Centre having been installed, it would have rated its usual zero.

  57. ndiamond says:

    Saturday, August 05, 2006 7:09 PM by Will Parker

    > When I was a tester at Microsoft (Yay, MacBU!), a bug of

    > this severity would have been call a ‘Pri 2’ – the app didn’t

    > actually crash, but the user couldn’t precede with their

    > planned tasks. In most cases, Pri 2 bugs are accorded ‘Must

    > Fix’ status unless the repro steps are deemed quite unlikely.

    Well, you’ll be glad to know that Pri 2 is still Pri 2.

    When Vista build 5472 deletes the entire contents of an internal SATA hard disk including the Vista installation itself, that must be a Pri 1.  It takes priority over Pri 2.  So it no longer matters that the user of Windows Media Centre can’t proceed with their planned tasks.  Even this Pri 2 collateral damage isn’t enough to gain a "must fix" or "might fix" or "find permission to consider fixing".  Deletion of an entire hard drive is a "must not fix", Pri 1.

    https://connect.microsoft.com/windows/feedback/ViewFeedback.aspx?FeedbackID=181905

    Closed without explanation.  But who needs an explanation.  Microsoft is still Microsoft.  The only question is, why, after all these years, do I still find it so hard to believe.  I’m still not cynical enough.

  58. a good speech recognition tool for you!

    Real-time conversion of speech to text! Dictation 2005 brings you the combined power of several top-quality speech-recognition tools.

    http://www.sharewarecheap.com/business-finance-word-processing/dictation3821-1.htm

  59. Larry Osterman (ya, the same Larry Osterman that got bit by the Audio bug) has a great post on why XGL…

  60. Larry Osterman (ya, the same Larry Osterman that got bit by the Audio bug ) has a great post on why XGL

  61. You’ve no doubt heard about the Microsoft Financial Analyst Meeting two weeks ago where a demonstration

  62. When our new speech recognition for Windows Vista was demonstrated at the Microsoft Financial Analyst

  63. Lots of great comments from listeners, you all keep me on my A game. I think your gonna like the new intro tonight we get you in the mood for a rocking podcast. Please submit your votes at Podcast Alley….