Adding Speech Functionality to your Windows Phone 8 App

The new speech API is one of the add-on features of Windows Phone 8. Adding speech to your application makes it far more interactive and engaging for the user, and it is easy to do in code.

Launching apps with speech was already available in previous Windows Phone releases, but Windows Phone 8 significantly extends the speech functionality available to developers by adding customizable voice commands, which let you deep link to specific pages or actions in your app (a sample command definition is sketched below). Windows Phone 8 also comes with new speech recognition and text-to-speech APIs that let users interact with your app using speech with very little code.
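For illustration only: voice commands are declared in a Voice Command Definition (VCD) XML file that the app registers at runtime. The prefix, command name and target page below are made-up examples, not from any real app.

<?xml version="1.0" encoding="utf-8"?>
<VoiceCommands xmlns="http://schemas.microsoft.com/voicecommands/1.0">
  <CommandSet xml:lang="en-US">
    <CommandPrefix>My Notes</CommandPrefix>
    <Command Name="ShowNotes">
      <Example>show my notes</Example>
      <ListenFor>show [my] notes</ListenFor>
      <Feedback>Showing your notes...</Feedback>
      <Navigate Target="/NotesPage.xaml" />
    </Command>
  </CommandSet>
</VoiceCommands>

The file is registered once, for example on first launch, with VoiceCommandService.InstallCommandSetsFromFileAsync(new Uri("ms-appx:///VoiceCommandDefinition.xml")); after that, saying "My Notes, show my notes" deep links straight to the page named in the Navigate element.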


Here is a sneak peek into the new speech APIs:

Speech Recognition

To use speech recognition in the application, we have to make sure the required capabilities are enabled in the WMAppManifest.xml file. We need to add two capabilities that aren't checked by default (see the snippet after the list):

  • ID_CAP_MICROPHONE
  • ID_CAP_SPEECH_RECOGNITION
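For reference, these entries end up inside the Capabilities element of WMAppManifest.xml, roughly like this (your manifest will list other capabilities as well):

<Capabilities>
  <Capability Name="ID_CAP_MICROPHONE" />
  <Capability Name="ID_CAP_SPEECH_RECOGNITION" />
</Capabilities>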

A basic speech recognition example will look something like this:

// requires: using Windows.Phone.Speech.Recognition;
private async void btnSpeak_Click(object sender, RoutedEventArgs e)
{
    SpeechRecognizerUI speechRecognizer = new SpeechRecognizerUI();

    // start recognition with the default dictation grammar
    SpeechRecognitionUIResult recognitionResult = await speechRecognizer.RecognizeWithUIAsync();

    // display the speech recognition result
    txtRecognitionResult.Text = string.Format("You said: \"{0}\"", recognitionResult.RecognitionResult.Text);
}


Let's go over what these lines do:

  • First, I marked the entire button click method as async.
  • I initialized a new variable of type SpeechRecognizerUI, which is the class responsible for opening a new window with the default UI and handling the speech recognition itself (capturing audio, matching it against a grammar, etc.).
  • I called the method RecognizeWithUIAsync() to perform the actual action of displaying the default speech recognition window, and captured the result of the recognition in a SpeechRecognitionUIResult.
  • I did whatever I wanted with the result; in this case I simply displayed it in a TextBlock. (A slightly more defensive version that checks the result status first is sketched right after this list.)
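RecognitionResult.Text is only meaningful when the recognition session actually succeeded; if the user cancels the dialog it is better not to touch it. Here is a minimal sketch of the same handler with a status check (same hypothetical control names as above):

private async void btnSpeak_Click(object sender, RoutedEventArgs e)
{
    SpeechRecognizerUI speechRecognizer = new SpeechRecognizerUI();
    SpeechRecognitionUIResult recognitionResult = await speechRecognizer.RecognizeWithUIAsync();

    // only read the recognized text if the session completed successfully
    if (recognitionResult.ResultStatus == SpeechRecognitionUIStatus.Succeeded)
    {
        txtRecognitionResult.Text = string.Format("You said: \"{0}\"", recognitionResult.RecognitionResult.Text);
    }
}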

P.S. By default, dictation is processed by a remote speech service that works out the recognized words, so the device needs network access and an active connection for this to work.
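If you would rather not depend on the network, the recognizer also accepts custom grammars that are matched on the device. Here is a minimal sketch using a list grammar; the grammar key "commands" and the phrases are just example values:

// (inside an async method, e.g. a button click handler)
SpeechRecognizerUI speechRecognizer = new SpeechRecognizerUI();

// replace the default dictation grammar with a local list of phrases
speechRecognizer.Recognizer.Grammars.AddGrammarFromList(
    "commands", new[] { "play", "pause", "next", "previous" });

SpeechRecognitionUIResult result = await speechRecognizer.RecognizeWithUIAsync();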


Text-to-Speech

To use text-to-speech in the application, we again have to make sure the right capability is enabled in the WMAppManifest.xml file. We need to add one capability that isn't checked by default:

  • ID_CAP_SPEECH_RECOGNITION

A basic text-to-speech example will look something like this:

// requires: using Windows.Phone.Speech.Synthesis;
private async void btnSpeakText_Click(object sender, RoutedEventArgs e)
{
    SpeechSynthesizer synthesizer = new SpeechSynthesizer();

    // speak the text currently entered in the text box
    await synthesizer.SpeakTextAsync(tbTextToSpeak.Text);
}

Let's go over what these lines mean:

  • First, I marked the entire button click method as async.
  • I initialized a new variable of type SpeechSynthesizer, which is the class responsible for generating synthesized speech. For example, your app could read the content of a message out loud.
  • I called the method SpeakTextAsync to perform the actual action of speaking the content of the assigned text box (tbTextToSpeak in the snippet above).
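By default, SpeakTextAsync uses the phone's current speech voice. If you want a specific language or gender, you can point the synthesizer at one of the installed voices. Here is a rough sketch; the btnSpeakBritish_Click handler name and the en-GB filter are made-up examples, and it also needs using System.Linq;:

private async void btnSpeakBritish_Click(object sender, RoutedEventArgs e)
{
    SpeechSynthesizer synthesizer = new SpeechSynthesizer();

    // pick one of the installed voices, e.g. a female en-GB voice (example filter)
    VoiceInformation voice = InstalledVoices.All
        .FirstOrDefault(v => v.Language == "en-GB" && v.Gender == VoiceGender.Female);

    if (voice != null)
    {
        synthesizer.SetVoice(voice);
    }

    await synthesizer.SpeakTextAsync("Hello from Windows Phone 8");
}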