Microsoft Anna – The new TTS voice in Vista


Since I posted yesterday, I've gotten some questions about the TTS engines that Microsoft includes in the OS itself. Well, here's the deal...

Microsoft has shipped TTS engines in the OS before. In the Windows XP time frame, for example, we shipped three different voices that were all based on the same underlying technology: Microsoft Mary, Microsoft Mike, and Microsoft Sam. They were OK, but certainly not something you'd want to listen to all day long (IMO). You can hear what they sound like here: Microsoft Mary, Microsoft Mike, Microsoft Sam.

In Windows Vista, though, we've completely revamped the technology. As a result we have a "new" voice built right into the OS named Microsoft Anna. Here's what she sounds like: Microsoft Anna.

As you can tell, Anna sounds more human than Mary, Mike, and Sam. As time goes by, the voices will become more and more natural. Someday, you won't even be able to tell the difference between a synthesized voice and a real human voice.

By the way ... I created these .WAV files by simply using one of the sample applications included with the SAPI 5.1 SDK, called TTSApp.
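
If you'd rather script this than click through TTSApp, here is a minimal sketch in Python. It assumes Windows with SAPI 5 installed and the third-party pywin32 package, and it drives the SAPI automation objects (SpVoice and SpFileStream). It is not how TTSApp itself is written, and the function names find_voice_index and speak_to_wav are made up for illustration.

```python
# Sketch only: assumes Windows, SAPI 5, and the third-party pywin32 package.
# The function names here are hypothetical, not part of any SDK.

SSFM_CREATE_FOR_WRITE = 3  # SAPI SpeechStreamFileMode.SSFMCreateForWrite


def find_voice_index(descriptions, wanted):
    """Return the index of the first description containing `wanted`, or -1."""
    wanted = wanted.lower()
    for i, description in enumerate(descriptions):
        if wanted in description.lower():
            return i
    return -1


def speak_to_wav(text, wav_path, voice_name="Microsoft Anna"):
    """Render `text` into a .WAV file using the named SAPI voice (Windows only)."""
    import win32com.client  # imported lazily so the helper above works anywhere

    voice = win32com.client.Dispatch("SAPI.SpVoice")

    # Look the voice up by its description, which includes the company name.
    tokens = voice.GetVoices()
    descriptions = [tokens.Item(i).GetDescription() for i in range(tokens.Count)]
    index = find_voice_index(descriptions, voice_name)
    if index < 0:
        raise ValueError(f"voice not installed: {voice_name}")
    voice.Voice = tokens.Item(index)

    # Send the audio to a file stream instead of the default audio device.
    stream = win32com.client.Dispatch("SAPI.SpFileStream")
    stream.Open(wav_path, SSFM_CREATE_FOR_WRITE)
    try:
        voice.AudioOutputStream = stream
        voice.Speak(text)
    finally:
        stream.Close()
```

On a Vista machine, calling speak_to_wav with some text and an output path of your choosing should produce a file comparable to the samples above; on XP you would pass "Microsoft Sam" (or Mary, or Mike) as the voice name instead.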


Comments (7)

  1. Rosyna says:

    Curious, why is the word "Microsoft" prepended to all the voice names? Does Microsoft really need to brand everything? It reminds me of "Microsoft Windows Vista Home Premium Microsoft License Pack Additional License" (http://www.amazon.com/Microsoft-Windows-Vista/dp/B000HCZ9B6/)

  2. Rob Chambers says:

    It’s fairly common for TTS voices in Windows to have the company name prepended to the "voice talent" name. The main reason is to help the user distinguish which voices come from which company when choosing a different voice in the Speech Control Panel. That way, if two different companies end up choosing the same "voice talent" name (like Anna), users will be able to tell which voice came from which company.

  3. anony.muos says:

    lol that SAM=Foreground window

  4. anony.muos says:

    Check this out:

    http://www.istartedsomething.com/20060812/what-did-you-say/

    Overall pronunciation and naturalness rock!!

    But Anna:

    1.  Has a tendency to say the next word even before the first one finishes. Some words are pronounced as if they’re at the beginning of a sentence.

    2.  Can’t say ‘Vista’ properly.

  5. eddwo says:

    This seems like a great comparison of various different TTS systems.

    http://www.student.oulu.fi/~vtatila/reviews_of_speech_synths.html

    I have to say that Microsoft SAM sounds no better than the system available on the Amiga two decades earlier.

    Anna seems slightly better, and the one demoed in the WWDC keynote better still.

    The Neospeech sample from that page sounds absolutely amazing. The phrasing is slightly off, possibly because the text is ambiguous, but the sound is completely natural.

    I understand that you can’t go shipping 600 MB of samples for each language with the OS, but the difference between the built-in one and the state of the art is night and day.

  6. Check out the "Mojave Experiment" , where Microsoft brought in people to show them a un-released
