Using Speech Recognition from Sho

One of the great things about Sho is that a broad variety of powerful libraries is available at your fingertips. Today I’ll give an introduction to using Windows’ speech recognition engine, which is surprisingly easy and fun to use.

To get started, you’ll need to have some kind of microphone. Headsets work best, but webcam mics, built-in laptop mics, or analog mics will work as well. Once you’ve got that plugged in, you need to do a little bit of setup:

>>> import clr
>>> clr.AddReference("System.Speech")
>>> from System.Speech.Recognition import *

Now we’ll instantiate a speech recognition engine and set it up with the default dictation grammar. You can also design a custom grammar, but the default works pretty well for general use.

>>> sre = SpeechRecognitionEngine()
>>> sre.SetInputToDefaultAudioDevice()
>>> sre.LoadGrammar(DictationGrammar())

Now we can tell the recognizer to listen for speech. Type in the following line and say something into the mic, like “a family of ducks can create a traffic jam”:

>>> res = sre.Recognize()

This call will block until it gets an utterance from the microphone. Once it returns, you can print the resulting text:

>>> print res.Text
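
One thing to watch for: as far as I know, Recognize() hands back a null result (None in Sho) when it could not recognize anything, in which case res.Text would raise an error. The sketch below is a small helper for that case. Note that describe_result, FakeResult, and the 0.5 threshold are names and values made up for this example, not part of System.Speech; the only things taken from the real RecognitionResult are its Text and Confidence properties.

```python
def describe_result(res, min_confidence=0.5):
    """Return a printable summary of a recognition result.

    `res` is expected to look like a RecognitionResult: it needs a Text
    and a Confidence attribute. None means nothing was recognized.
    The 0.5 threshold is an arbitrary value chosen for illustration.
    """
    if res is None:
        return "(nothing recognized)"
    if res.Confidence < min_confidence:
        return "(low confidence) " + res.Text
    return res.Text


class FakeResult(object):
    """Hypothetical stand-in for RecognitionResult, so the helper can be tried without a mic."""
    def __init__(self, text, confidence):
        self.Text = text
        self.Confidence = confidence


print(describe_result(None))
print(describe_result(FakeResult("a family of ducks", 0.92)))
print(describe_result(FakeResult("a family of dux", 0.31)))
```

With a real recognizer you would pass the result straight in: print describe_result(sre.Recognize()).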

You can also use the recognizer in an asynchronous mode, where each speech event will trigger a callback. To do this, first define a function and hook it to the recognizer’s SpeechRecognized event:

>>> def printspeech(sender, e):
...     txt = e.Result.Text
...     print txt
...
>>> sre.SpeechRecognized += printspeech

Then start up the recognizer in asynchronous mode; the call below will return immediately:

>>> sre.RecognizeAsync(RecognizeMode.Multiple)

The printspeech() function will be called after each speech utterance. If you leave out the RecognizeMode.Multiple argument, the recognizer will stop listening after the first utterance. Note that the callback can be a method in an object, which will allow you to keep some state. In this way, you can create all kinds of speech-enabled applications: a voice-controlled chess game, a drawing program that lets you change pens by voice, etc. Give it a shot and let us know what you build with it!
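
To make the "method in an object" idea concrete, here is a minimal sketch of the pen-changing example. PenSwitcher, FakeResult, and FakeEvent are hypothetical names invented for this illustration; the only assumption about the real API is the same one printspeech() makes, namely that the handler receives (sender, e) and that e.Result.Text holds the recognized words. The fake event classes are there only so you can try the logic without a microphone.

```python
class PenSwitcher(object):
    """Keeps track of the current pen and switches it by voice."""

    def __init__(self, pens, start):
        self.pens = set(pens)
        self.current = start
        self.heard = []  # every utterance so far, in order

    def on_speech(self, sender, e):
        # Same (sender, e) signature as printspeech();
        # e.Result.Text holds the recognized words.
        text = e.Result.Text.strip().lower()
        self.heard.append(text)
        for word in text.split():
            if word in self.pens:
                self.current = word  # the last pen name mentioned wins


# Hypothetical stand-ins for the event argument, for trying this without a mic.
class FakeResult(object):
    def __init__(self, text):
        self.Text = text


class FakeEvent(object):
    def __init__(self, text):
        self.Result = FakeResult(text)


pens = PenSwitcher(["red", "green", "blue"], start="red")
pens.on_speech(None, FakeEvent("Switch to blue"))
pens.on_speech(None, FakeEvent("a family of ducks"))
print(pens.current)
```

To hook it up to the live recognizer, attach the bound method the same way as the plain function: sre.SpeechRecognized += pens.on_speech. The object then remembers the current pen and everything it has heard between callbacks. The matching here is deliberately naive (whole words, no punctuation handling), so treat it as a starting point.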