Comparing Speech Recognition in Vista vs Apple OS X


I was reading news the other day on my Windows Mobile phone while I was waiting in line for lunch, and I read a review comparing Microsoft’s Vista OS and Apple’s latest incarnation of OS X, Leopard. It wasn’t focused on any one specific area, just on the two OSs in general. It’s always nice to see how Microsoft is stacking up against Apple on the OS front, so I kept reading with excitement and anticipation.

But … When I got to the line item comparing Leopard’s speech recognition with Vista’s speech recognition, I was a little surprised to see that the reviewer said they were essentially comparable. No real differences to speak of.

I was surprised by that because all other comparisons I’ve seen (including this recent one) have always said that the speech recognition capabilities in Vista are much better than what’s available by default in OS X.

I didn’t think much of it, and just kept on reading news until my lunch was ready.

But … This morning I saw this post. Apparently, there’s a widespread myth that OS X has had all kinds of great speech recognition capabilities since 1994, including speech-to-text (converting the words you say into text and inserting them into the application that’s currently running). But that apparently isn’t true.

The issue, as this author points out, is that people believe the speech recognition in Apple’s OS X can do far more than it actually can. In practice, it handles only a limited set of command-and-control scenarios. Vista can do so much more.

So … I’m curious. What do you think? Do any of you out there actually have any experience with the built-in speech recognition capabilities of Apple’s OS X? If so, let me know what you like about it, and what you don’t like. Can you do speech-to-text? Can you do all the same kinds of things that you can in Vista?

I’d like to know the real story…


Comments (4)

  1. Rob Chambers, the new GPM for my old group, is blogging about Macintosh speech recognition, asking

  2. Rob Chambers, the new GPM for my old group, is blogging about Macintosh speech recognition, asking

  3. Rhys says:

    Not being a Mac user/developer myself, I’m hardly qualified to comment on Mac vs Windows speech recognition, but this sentence stopped me in my tracks as I was reading:

    "…text-to-speech (converting the words you say into text and inserting them into the application that’s currently running)…"

    Unless I’m missing something big, that’s not what most people understand by text-to-speech (TTS). Rather, TTS is the other way round – the computer uses a synthetic voice to ‘read’ out text. As far as I can see, what you’ve described is speech-to-text, not TTS. Examples of TTS in your OSes include Microsoft Anna for Vista, or Microsoft Sam previously.

    I may have misinterpreted this massively, but that’s the way I’ve always seen the terminology used.

  4. robch says:

    Silly me. Sometimes my brain works faster than either my fingers or my voice. I’ve corrected the post.

    Certainly, “text-to-speech” is the conversion from text in documents and applications to the sounds that humans can understand. And “speech-to-text” is the conversion of words you say into text…

    Thanks for catching the mistake, Rhys.