New managed Speech API


I'm delighted to announce that our new managed Speech API is in the Avalon & Indigo Beta 1 RC!


 


With the System.Speech namespace you can incorporate both speech recognition and speech synthesis in your applications.


 


Recognition:


 


The main classes for speech recognition are:




  • DesktopRecognizer: abstracts the recognizer shared by apps on the desktop.


  • SpeechRecognizer: abstracts a recognition engine for exclusive use by your app.


  • RecognitionResult: lets you examine the text and semantics returned by a recognizer.


  • SrgsDocument: used to build recognition grammars (the rules for which phrases a recognizer should listen for in your app).

 


For example, to load a grammar containing your app’s commands into the shared desktop recognizer:


 



DesktopRecognizer desktopRecognizer = new DesktopRecognizer();
desktopRecognizer.LoadGrammar(new Grammar(new Uri(grammarPath)));
desktopRecognizer.SpeechRecognized += delegate(object sender, RecognitionEventArgs e)
    {
        // Do appropriate handling when we get a recognition
        // Console.WriteLine("User said {0}", e.Result.Text);
    };
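The file at grammarPath holds the recognition grammar. As a rough sketch, assuming the grammar is written in the W3C SRGS XML format, a minimal command grammar might look like the following. The rule name and the phrases are made up for illustration and are not taken from any shipped sample:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0" xml:lang="en-US" mode="voice" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">
  <!-- Hypothetical root rule: matches "open file", "close file" or "save file" -->
  <rule id="command" scope="public">
    <one-of>
      <item>open</item>
      <item>close</item>
      <item>save</item>
    </one-of>
    <item>file</item>
  </rule>
</grammar>
```

With a grammar like this loaded, the recognizer listens only for those three phrases, which keeps recognition accurate for a small command set.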


 


You’ll also need a speech recognition (SR) engine installed, and there are several ways to get one. Tablets already have an engine, and so does any recent version of Office. You can also download an engine from the SAPI web site: http://www.microsoft.com/speech/download/sdk51/


 


Synthesis:


 


The main classes for speech synthesis are:




  • SpeechSynthesizer: abstracts a synthesis engine.


  • PromptBuilder: builds a prompt containing text, emphasis, loudness, pre-recorded sounds, and other characteristics.

 


For example, if you want your app to say “hello world”, just write:


 



SpeechSynthesizer synth = new SpeechSynthesizer();
synth.Speak("Hello world!");


 


You can easily splice this with a “ding” wave file by using the PromptBuilder:


 



PromptBuilder builder = new PromptBuilder();
builder.AddAudio(new Uri(@"file://\windows\media\ding.wav"));
builder.AddText("Hello world!");

SpeechSynthesizer synth = new SpeechSynthesizer();
synth.Speak(builder);


 


Windows comes with a synthesis engine.


 


The API uses the W3C standard formats for recognition grammars (SRGS) and synthesis (SSML).
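To give a feel for the synthesis side of those formats, here is a rough sketch of how a prompt like the "ding" plus "Hello world!" example could be written directly in SSML. The audio path and the emphasis are illustrative assumptions, and this is not necessarily the markup the API produces internally:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" xml:lang="en-US"
       xmlns="http://www.w3.org/2001/10/synthesis">
  <!-- Play the pre-recorded sound first; the path is illustrative -->
  <audio src="file:///C:/windows/media/ding.wav"/>
  <!-- Then speak the text, stressing one word -->
  Hello <emphasis level="strong">world</emphasis>!
</speak>
```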

