Commanding and Dictation – One mode or two in Windows Vista?

A few followers of my blog have recently asked me:

In Office XP, Office 2003, and in Windows XP Tablet PC Edition, there are two separate modes for Commanding and for Dictation. Will this still be true in Windows Vista Speech Recognition?

I’m very happy to say that we’ve “fixed” that in Vista. Windows Vista Speech Recognition does away with the hard separation of Commanding and Dictation modes. In our UI, there’s just one button to turn on the microphone. When the microphone is on, Windows Speech Recognition will be listening for both commands and for dictation at the same time.

Over the past several years (since Office XP) the speech core technology team has continued to make the SR engine more and more accurate. One side effect of this is that it’s now accurate enough to easily distinguish between commands and dictation, when both are active at the same time.

For those of you who don’t know what I’m talking about …

Commands are speech utterances which cause an action to take place on your computer. For example, in Vista, you can say “File” and the File menu will drop down. Then you can say “Open”, and the File Open dialog will come up. Those two utterances, “File” and “Open”, are both commands.

Dictation, on the other hand, consists of speech utterances which are converted to text and inserted into the document you’re editing. For example, in Vista, you can say “Hello period My name is Rob Chambers period” while Notepad is open, and the text “Hello. My name is Rob Chambers.” will be inserted into Notepad’s document at the insertion point.

In Office XP, Office 2003, and in Windows XP Tablet PC Edition, the user couldn’t just leave the microphone on and say commands and dictation back to back. They’d have to switch modes. For example, they’d have to say:

“Command”, “File”, “New”

“Dictate”, “Hello period My name is Rob Chambers period”

“Command”, “File”, “Save”

In Vista, however, since there’s only one mode, the user can simply say:

“File”, “Open”

“Hello period My name is Rob Chambers period”

“File”, “Save”

It might not seem like a big deal at first, but if you use the system for any length of time at all, you’ll see that having to tell the computer each time whether what you’re about to say is a command or dictation gets very tiresome.
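
To make the difference concrete, here is a toy Python sketch of the two approaches. It is only an illustration: the names (COMMANDS, modal_session, single_mode_session) are made up for this example, and it is not how the SR engine works internally. In particular, it assumes a command can be spotted by an exact match against a small word list, whereas the real engine has to tell commands and dictation apart from the recognized speech itself.

```python
# Toy illustration only. All names here are hypothetical, and exact string
# matching stands in for whatever the real speech engine does.
COMMANDS = {"file", "open", "new", "save"}  # assumed command vocabulary


def modal_session(utterances):
    """Two-mode style: the speaker must say "Command" or "Dictate" to switch."""
    mode = "command"
    actions = []
    for spoken in utterances:
        key = spoken.lower()
        if key == "command":
            mode = "command"
            actions.append(("switch", mode))
        elif key == "dictate":
            mode = "dictation"
            actions.append(("switch", mode))
        elif mode == "command":
            actions.append(("command", spoken))
        else:
            actions.append(("text", spoken))
    return actions


def single_mode_session(utterances):
    """One-mode style: each utterance is classified on its own, no switching."""
    actions = []
    for spoken in utterances:
        if spoken.lower() in COMMANDS:
            actions.append(("command", spoken))
        else:
            actions.append(("text", spoken))
    return actions


sentence = "Hello period My name is Rob Chambers period"

# The two example sessions from the post.
two_modes = ["Command", "File", "New", "Dictate", sentence,
             "Command", "File", "Save"]
one_mode = ["File", "Open", sentence, "File", "Save"]

for label, actions in (("Two modes", modal_session(two_modes)),
                       ("One mode", single_mode_session(one_mode))):
    print(label)
    for kind, spoken in actions:
        print("  ", kind, "->", spoken)
```

One limitation of this sketch is that it would treat a dictated sentence consisting only of the word “Open” as a command. Telling those cases apart is exactly the part that depends on the accuracy of the real engine.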