The Blizzard Challenge: Rate text-to-speech samples

What's the Blizzard Challenge? In a nutshell, it's a competition that gives researchers and speech labs the same set of originally recorded waves (i.e., 5 hours of a male speaker) and then challenges each team to create the "best" (i.e., most natural and intelligible) text-to-speech (TTS) voice from them. If it were an architecture challenge, it would be like giving many different architects the same set of building materials and asking each to come up with the most functional and beautiful building.

Rating the results is now open to the public here:

It took me about 20 minutes to complete the study. There are five tasks: the first three ask you to score waves, and the last two ask for a transcription. It's probably a good indication of how good a TTS system you can get with 5 hours of recordings. I can't tell yet who all of the participating groups are. Festvox has a little more information on its site.

