“Watson, Come Here!” Now There’s a Conversational Speech API from AT&T

Congratulations to AT&T for making some waves among the growing pool of application developers looking for ways to add conversational speech to their offerings. With relatively little fanfare (this being the closest thing to a press release), AT&T let it be known that it would release new APIs (application programming interfaces) in June to make it easier for developers to integrate the goodness of AT&T Watson(SM) speech processing technology, rather than turning to some of the alternatives that are emerging in the marketplace. As AT&T likes to explain it, “This technology reflects an investment of more than one million hours of research and development in speech technologies, leading to more than 600 U.S. patents and patent applications.”

AT&T’s John Donovan provides more detail on the future offering here. In essence, when the APIs are released in June, developers can look forward to making it easier for the people who use their apps to use their voice (and probably their mobile phones) to carry out general web searches, find local businesses, engage in question-and-answer discussions, and originate messages, including voice mail and text messages. Anticipating tighter coupling with Web-enabled TVs, there will also be a pre-built interface to AT&T’s U-verse® electronic programming guide.
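AT&T had not yet published the interface itself at the time of writing, but cloud speech APIs of this kind typically accept an uploaded audio clip over HTTP and return a transcript. Purely as illustration, here is a minimal sketch of what a client request might look like; the endpoint URL, API key, and header conventions are all hypothetical placeholders, not AT&T’s actual interface:

```python
import urllib.request

# Hypothetical endpoint and key -- placeholders invented for illustration;
# AT&T's real API details were not public when this was written.
API_URL = "https://api.example.com/speech/v1/recognize"
API_KEY = "YOUR_API_KEY"

def build_recognize_request(audio_bytes, content_type="audio/wav"):
    """Construct (but do not send) an HTTP POST that uploads an audio
    clip for transcription, in the style of typical cloud speech APIs."""
    return urllib.request.Request(
        API_URL,
        data=audio_bytes,
        headers={
            "Authorization": "Bearer " + API_KEY,
            "Content-Type": content_type,
            "Accept": "application/json",
        },
        method="POST",
    )

# Stand-in bytes where a real recorded utterance would go.
req = build_recognize_request(b"\x00" * 16)
print(req.get_method())                  # POST
print(req.get_header("Content-type"))    # audio/wav
```

The point of the sketch is the shape of the exchange, not the specifics: the app captures speech, ships the audio to the network service, and gets structured text back to drive search, Q&A, or messaging.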

Here’s AT&T’s YouTube video providing background on Watson and “how it will work” both for application developers and users:

In short, AT&T will be encouraging the developer community to take advantage of the market conditioning that Apple has undertaken by promoting the Siri Virtual Assistant through mass media advertising. But Apple, with Siri, is promoting the “closed garden” approach that has made the iPhone so successful. AT&T is making it clear that its engine, and the one million person-hours of R&D behind it, will be there for the taking, though the mention of 600 patents looks like a veiled warning that some metering of its use is inevitable.

Watson is already in use for voice search on iPhones and Android-based devices. An app called Speak4It is a mashup that includes AT&T Watson, and has been available since mid-2010. In addition, the AT&T Translator App, described here, has been a showcase for Watson technologies’ ability to support 7 different languages.

While many in the business and tech press think AT&T’s new offer portends a “smack-down with Siri,” that is not necessarily true. What we are destined to see is geometric growth in the energy put forth by developers to take advantage of voice as the “natural interface” for phones and other mobile/personal devices. Apple will continue to use its experience with Siri to refine the service offerings that are Siri-enabled. Brisk sales of the iPhone 4S (which is almost identical to the iPhone 4, save for the inclusion of Siri) attest to its economic value. Apple’s implementation of Siri as a “beta” feature of the phone gives it sole access to the “Home” button as the mechanism to summon Siri (as well as the native accelerometer as a mechanism to support the “Raise to Speak” function).

Meanwhile, developers can look forward to more tools for using natural language in specific domains. AT&T has deemed general search, Q&A, business search and control of TVs to be worthy of early attention. Our belief is that positive experiences in these domains will lead to the development of more “speechable moments,” as I started calling them back in mid-2010. Today there are more than a dozen firms – coming out of a diverse development community that, in addition to speech processing, spans customer care analytics, automotive electronics, telematics and pure academic research – that have formidable software platforms optimized for understanding the semantic structure, logical content and practical context of spoken phrases.

At this stage in the market’s maturity, we’re happy to see AT&T entering the market with tools and resources to support the developer community.



Categories: Articles
