Text-to-speech overview in Scientific American


Scientific American recently posted an article overview of text-to-speech. I know that other bloggers have been mentioning it, but since I’m a TTS-centric blogger, I thought that I should put in a plug for it here.


One of the interesting questions raised by the authors is the following: “Should machine speech be indistinguishable from a human speaker, as in the well-known Turing test for artificial intelligence?” The authors conclude, “probably not.” They say that a “better goal” (rather than a voice that could ‘trick’ a human listener) is a voice that is “pleasing [and expressive] to which people feel comfortable listening.” Do the authors truly believe this? Or do they mean a more realistic goal rather than a better goal? I find it hard to believe that every TTS engineer isn’t trying to make TTS voices that sound as realistic as possible. Just because you can make a voice that ‘tricks’ a user doesn’t mean that it has to be implemented as such. That is, if you can create a voice that would trick a user, you could just as easily tweak it so that it’s less realistic and perhaps better suited for a warning system or “video games” (I’m not sure the authors are gamers, or else they wouldn’t have suggested that natural human speech is not the most appropriate for video games). You can bet that if the folks at AT&T Labs could make a voice to trick a user, they would be writing a much different end to their article.


Comments (2)

  1. Kevin Daly says:

    I think there may be a psychological twist to this, however: if you *know* that a voice is coming from a machine, then if it is too humanlike it may actually sound *more* phoney.

    I liked the voice of HAL in 2001: A Space Odyssey – well modulated, perfectly understandable, but not gushy.

    I suppose that it’s a mistake, for example, to give a machine a voice that expresses enthusiasm that we know it cannot feel (it makes it sound like advertising, where everyone looks/sounds improbably excited about crappy products).

  2. jaywaltm says:

    It’s been a while since I’ve seen "2001"; however, it seemed to me like that voice was in fact a real person’s voice. Anybody out there know for sure?

    Now that’s another interesting psychological twist… if you think the voice is fake, but in fact it is REAL. Mmmm….