Happy 2010

Happy 2010, folks!

The end of last year brought some very interesting developments in the speech space.

Bing - We saw the introduction of Bing for mobile, a killer mobile search application that allows you to use your voice to search for what you want. This feature is powered by Microsoft’s speech recognition engine, and I’m happy to see folks using Bing for mobile across all types of phones: Windows Phones, BlackBerrys, and the iPhone.

Exchange 2010 - We also saw the introduction of the latest and greatest version of Exchange Server. Exchange 2010 sports a brand new feature for people on the go: Voice Mail Preview. Essentially, the feature records your voice mail in your unified inbox, and provides you with a text preview of what the caller likely said. That way, you don't even have to listen to the voice message.

Windows 7 - In addition to all the coolness of Windows 7 you've probably already heard about, like Touch and Aero Peek, Windows 7 also has a ton of speech features built in, many of which are new: better accuracy, faster performance, support for array microphones out of the box, and better integration with even more applications.

Kia UVO - Capping off a crazy week of all things speech at CES was the announcement of Kia’s new UVO (“Your Voice”), which takes advantage of the latest speech technologies from Microsoft, allowing you to get directions and control your music using the power of your voice. An overview of UVO can be found here.

I think these new products and services from Microsoft are just the start of a new wave of experiences that will be speech-enabled, from mobile devices, to server products, to desktop and embedded use cases. And ... it won’t just be speech. In many cases, these technologies will take advantage of more natural ways to interface with technology. These interfaces will include speech and things like touch and gesture. We call these new interfaces the natural user interface, or NUI.

Judging by recent developments in the industry, including those showcased at the 2010 International Consumer Electronics Show (CES) happening right now in Las Vegas, 2010 is going to be a year in which speech really goes mainstream. At CES, we have seen a good deal of focus on NUI, and our executives are talking about speech constantly:

  • Steve Ballmer talked about NUI quite a bit in his keynote at CES this week (check it out here).
  • Zig Serafin sat down with Microsoft’s PressPass this week and discussed what role speech plays in Microsoft’s NUI strategy (check it out here).
  • Robbie Bach, in addition to his CES keynote presentation, also talked about Project Natal and what NUI means to Microsoft and to the consumer (check it out here).

So ... What’s next for speech and for NUI? Well, you’re going to have to wait and see... 🙂

But if the last year is any indication, 2010 is going to be a breakthrough year for speech.

Comments (2)
  1. Wreck says:

    Thanks for the update Rob.

    For the last two years I’ve been using Vista speech recognition constantly, so it’s been interesting to move to Windows 7 speech recognition. I use the UK version for both. The dictation mode does seem a little bit more accurate. However, the typing mode (which I switch to by saying "start typing") definitely seems to be less accurate than in Vista. I use the NATO phonetic alphabet for programming, and there seems to be more confusion now between the words. For example, to type the letter ‘h’ I say ‘hotel’, but now this is sometimes misrecognized as ‘capital’, meaning the next letter is capitalized. Another confusion is between ‘dot dot’ and ‘stop typing’. Obviously this is only anecdotal and may or may not be reproducible in the US version, but I’d expect a quantitative evaluation exclusively on the NATO typing mode would show reduced accuracy. Is there any way you guys can get the performance back up to Vista standards?

  2. Wreck says:

    That seems to be fixed now, I guess by a Windows Update. Thank you for the fix.

    One more issue, which I think is more to do with WSR (rather than IE), is that Windows 7 speech recognition hangs on long webpages; see social.answers.microsoft.com/…/10fcc50e-7f8e-4684-891d-3a12b3a53f43. It would be great if you could have a look at replicating this to work out what the problem is.

    Thanks again and Happy 2011!

