Nuance and Expect Labs Combine to Showcase Power of Understanding Speech

By Dan Miller on December 14, 2012

Evidence that 2013 is going to be a watershed year for the conversational user interface continues to mount. The recently announced joint effort between technology startup Expect Labs and Nuance Communications is a case in point. Expect Labs was founded a little more than two years ago and has invested in the development of an “Anticipatory Computing Engine” that constantly monitors multi-person conversations in order to derive meaning.

Soon, Expect Labs will formally introduce its first product, MindMeld for iPad, to showcase how its core technology, embedded in a tablet, will support a new kind of interpersonal communication. By monitoring and deriving meaning from the contents of a phone conversation, the app can serve up related search results, such as news articles or Web pages of interest. It then facilitates sharing with other people on the call, as well as archiving and indexing records from the phone call.

Expect Labs’ product development efforts dovetail nicely with R&D efforts that Nuance has dedicated to various aspects of natural language understanding over the past few years. At Nuance’s Customer Experience Summit held in Orlando in early December, Doug Sharp, VP of Enterprise Engineering, noted that the company has 135 engineers “dedicated to some aspect of language understanding.” His point is that the statistical language models (SLMs) that are necessary for accurate automated speech recognition are only the beginning of what makes up a natural, multi-modal, mixed-initiative user interface. Much of the heavy lifting now is taking place along a continuum that spans machine learning, data-driven discovery in new or evolving domains, and mining the intranet and social networks to bring continuous improvement to an engine’s ability to understand meaning and provide new kinds of assistance.

Significant investment is being made in “mixed initiative” and “agenda-based” task specification. That means that a person might provide direct instructions to a device or application, or the application may recognize the purpose of a call based on its context and prior utterances. Another key area of investment is “advanced dialog” management, which supports orderly interactions or turn-taking between person and machine. This heralds the arrival of “collaborative assistance.” As Sharp put it, we are witnessing the evolution from a “personal assistant” to a “personal advisor.”

When formally released, Expect Labs’ MindMeld for iPad will leverage Nuance’s Dragon SDK (software development kit) for more accurate speech recognition. The Dragon SDK is provided as part of Nuance’s mobile app developer program called NDEV, designed to make it easier for mobile application developers to speech-enable their applications. But, as Doug Sharp pointed out in Orlando, the natural language understanding and dialogue management taking place in the background are just as important as accurate speech recognition and rendering when it comes to anticipating a person’s intent and providing relevant responses or taking proper action.

The Dragon SDK can closely link automated speech recognition to the tools and resources that comprise a natural language portal. That includes audio clustering, auto-transcription, grouping and tagging of content, and quick grammar generation. It will feel as if the application is learning along with its user. The prospect of MindMeld, and other post-Siri applications, being quick learners is what makes me quite bullish on the prospects for better personal virtual assistants (or advisors) in 2013.
