On June 16, Nuance completed its acquisition of SVOX AG, the Switzerland-based provider of a full range of speech processing software. According to a Form 8-K filed with the Securities and Exchange Commission the same day, Nuance paid former stockholders of SVOX €87 million (approximately $125 million): €57 million was paid in cash at the closing, €8.3 million is payable in cash or shares of Nuance common stock on the first anniversary of the closing, and another €21.7 million is payable in cash or shares of Nuance common stock on or before December 31, 2012.
This is a signal event in the global battle for supremacy taking shape among Apple, Google, Microsoft, IBM and AT&T, among other technology giants. All of them recognize that the future hinges on mating speech technologies, “artificial intelligence,” embedded technologies and “cloud-based,” dynamic information and resources into a consistent, highly personalized, “predictive” multimodal user interface.
SVOX was founded in 2000 as a two-person company specializing in text-to-speech rendering. Over the past eleven years it has shown great creativity as it broadened its product offerings, adding ASR (automated speech recognition), acoustic processing (to isolate speech from background noise), dialogue management (bordering on artificial intelligence) and voice biometrics. Its only peer in product range (other than Nuance) is Loquendo, which is the speech-processing subsidiary of Telecom Italia. The company is profitable largely because of successes in licensing its multi-lingual TTS to a multiplicity of solutions providers, mostly “embedded” implementations, but also including Google (for Google Translate).
Nuance will be well-advised to take stock of the full range of SVOX’s technology solutions and their “fit” with all of its mobile and enterprise offerings. The rap on Nuance of late – in the Speech Gospel According to Vlingo – is that the company (as the largest, diversified provider of speech processing technologies) would rather acquire its competition than take it on in the marketplace. When all else fails, it resorts to the courts (where the intellectual property suits regarding speech processing and the mobile user interface are too numerous to be worth enumerating). But the truth of the matter, which many industry pundits fail to register, is that automated speech processing – be it text-to-speech rendering, speech recognition or speaker identification – is, almost always, merely part of a solution rather than a solution in and of itself.
The cold reality is that the market for the speech processing technologies developed by SVOX is driven by forces that are much larger than the sum total of all speech processing providers. Pundits have already rushed to point out that the SVOX acquisition comes in the wake of Apple’s non-announcement of its Siri-supported, iOS-based personal assistant (or in advance of the release of iOS 5, as reported here). That pits Apple squarely against Google Voice Search and related multimodal user interfaces on Android. The other major competitor is Microsoft, which is tightly coupling Windows Phone OS-based services with its own flavor of speech processing and the dialogue management and AI (artificial intelligence) resulting from cooperation between its Tellme subsidiary and Bing, its search engine business unit.
The other major players, of course, are AT&T and IBM. AT&T has invested in Vlingo and provides its core speech processing resources. Vlingo has demonstrated industry-leading (showcase) hands-free applications with device makers, like Samsung, and carriers, like T-Mobile. IBM, on the other hand, has put its stock in Nuance by licensing its speech processing technology and forming a developmental joint venture to bring new technologies to market. You can now add SVOX’s intellectual property to the portfolio. SVOX, by itself, had to choose its battles and opted to focus on embedded TTS. The combination of IBM, Nuance and SVOX has a better chance of bringing a formidable portfolio of solutions (ASR, TTS, voice biometrics, acoustic processing, and accompanying application logic) to market.
Collectively, this group of competitors is destined to define the next generation of virtual assistants.