As early adopters and users of Siri or Google Now well know, it has been a challenge for Intelligent Assistants to add an area of expertise, build deep domain knowledge or add support for the newest shiny device that a manufacturer has brought to market. With the introduction of the Alexa Voice Service (AVS) and an SDK called the Alexa Skills Kit (ASK), Amazon, the giant internet retailer, aims to change all that by creating simple tools and APIs that app developers can use to add a human-like spoken interface to mobile, consumer electronics and perhaps automotive devices.
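To make that concrete, here is a minimal sketch of what an ASK skill handler might look like: a function (deployable, for example, as an AWS Lambda handler or behind an HTTPS endpoint) that receives Alexa's JSON request and returns spoken text in the standard response envelope. The skill logic and the intent name GetOrderStatusIntent are hypothetical; only the request/response shape follows the ASK interface.

```python
# Hypothetical sketch of an Alexa Skills Kit (ASK) request handler.
# The intent name "GetOrderStatusIntent" and the replies are illustrative only;
# the JSON request/response envelope follows the ASK interface (version 1.0).

def handler(event, context=None):
    """Entry point: route an incoming ASK request to a simple spoken reply."""
    request_type = event.get("request", {}).get("type")

    if request_type == "LaunchRequest":
        return speak("Welcome. Ask me about your order status.", end_session=False)

    if request_type == "IntentRequest":
        intent = event["request"]["intent"]["name"]
        if intent == "GetOrderStatusIntent":  # hypothetical custom intent
            return speak("Your order shipped this morning.")
        return speak("Sorry, I didn't understand that request.")

    return speak("Goodbye.")


def speak(text, end_session=True):
    """Wrap plain text in the ASK response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


if __name__ == "__main__":
    # Simulate an IntentRequest as Alexa's cloud service would deliver it.
    sample = {"request": {"type": "IntentRequest",
                          "intent": {"name": "GetOrderStatusIntent", "slots": {}}}}
    print(handler(sample))
```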
As if that weren’t enough, the company also established a $100 million Alexa Fund, designated “to fuel voice technology innovation.” This approach stands in stark contrast to other major players in the speech-enabled virtual assistance arena. Siri, for example, follows Apple’s long-standing commitment to internal development. Its acquisition of Novauris in April 2014 brought in a team of technologists and managers with very strong backgrounds in speech recognition and experience with one of the longest-standing code bases (Dragon) in speech recognition and text-to-speech rendering.
Google, meanwhile, continues to leverage its ability to apply brute force to recognizing and rendering speech in multiple languages and across multiple knowledge domains, thanks to its sheer size and dominance in Web- and smartphone-based search. Many claim that Google’s speech recognition and understanding are superior to its rivals’, but that is a constantly moving target. Google also has a reputation for providing Web-based access to speech recognition or translation for free, so one may be tempted to ask why Amazon’s Alexa Voice Service is any different. The short answer is that Google also has a reputation for using the input (and responses) from its free services to build a large corpus of known answers. It will either bake its learnings into a revenue-generating service or set a price for individuals or companies to gain access to its APIs.
Automated directory assistance was the prototype for this approach. As I noted in this post in December 2010, GOOG-411 had served its purpose. It disrupted a cash cow for incumbent telephone companies while offering a service that “provided a good mechanism for collecting a steady stream of utterances to help tune and refine its speech recognition engine (street names and business names are notoriously hard to ‘get right’).” In a very tangible way, automated directory assistance was a precursor of today’s initiatives in Intelligent Assistance. It showed that people could be very happy with highly automated, high-volume responses to voice queries. It also established that live agents, or DA operators, had an ongoing role to play in providing the most accurate and efficacious responses to those queries.
With the introduction of AVS and ASK, Amazon is actively soliciting a large community of hardware makers and application developers to embed voice-based control in their products and services. With the money from the Alexa Fund behind the effort, Opus Research expects Amazon to find success in this endeavor. We also expect several other solutions providers to follow suit.
IBM is arguably adding another three zeroes to the development of Intelligent Assistants with its $2 billion investment in Watson and its developer support. Watson’s specialty is “Cognitive Computing” and deep understanding of queries. It is not always speech-enabled, but IBM has close ties with speech-processing leader Nuance to ensure that Watson will have accurate speech recognition and very lifelike text-to-speech rendering.
Amazon has used its speech-enabled speaker/controller combo, Echo, to establish a beachhead for Alexa inside homes. Echo had been in controlled release, and the introduction of AVS and ASK coincides with Echo’s general availability to the public. We expect Amazon to expand the list of devices that support Alexa well beyond Echo, as discussed in this article. All solutions providers will benefit from the high profile Amazon is taking as it introduces Echo and adds to Alexa’s vocabulary and the repertoire of tasks it can undertake at the spoken command of its owners.