Google’s Latest Acquisition Brings Text-to-Speech Luminaries into its Fold

In this blog post, Mike Cohen, who manages the Speech Technology efforts at Google, invokes images of Captain Kirk and the crew of the Starship Enterprise to provide a vision of the ideal way to interact with computers. In the post, Mike announces Google’s acquisition of Phonetic Arts, a company started in 2006 by Rhetorical Systems co-founder Paul Taylor. From its inception, Phonetic Arts was chartered to provide “expressive” renderings of text, initially for computer games.

As part of Google’s vast array of search, navigation, entertainment and electronic publishing services, one can picture the Phonetic Arts platform as foundational for all sorts of spoken output, from driving directions to full-on eReaders. As Mike Cohen points out, Google Translate already provides spoken output in a multiplicity of languages. He linked to this page that provides a history of the service, starting with English and Haitian Creole, then adding French, Italian, German, Hindi and Spanish. With an assist from an open source speech synthesizer called eSpeak, Google Translate added Afrikaans, Albanian, Catalan, Chinese (Mandarin), Croatian, Czech, Danish, Dutch, Finnish, Greek, Hungarian, Icelandic, Indonesian, Latvian, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Swahili, Swedish, Turkish, Vietnamese and Welsh.

The management team at Phonetic Arts looks like an alumni gathering of both Rhetorical Systems and Entropic (a company acquired by Microsoft in 1999). They were pioneers in the development of TTS platforms that did a better job of rendering human-like output. Google’s purchase is likely to trigger a higher level of interest in the companies that can support better ways for all sorts of devices to provide spoken output. Last March we noted that CereProc (founded by a former Rhetorical Systems CTO) had made great strides by providing a spoken persona for Roger Ebert.

Meanwhile, Nuance Communications, which purchased Rhetorical and a few other TTS assets about six years ago, has been steadily enhancing its line of TTS resources under the Vocalizer brand. With Phonetic Arts firmly in Google’s fold, it’s clear that the domain of “speechable moments” is rapidly expanding to include more lifelike spoken output.



