Voice Actions for Android: Speechable Moments From Google Spell New Market Dynamics

While I was in Canada on vacation, Google stocked the Android App store with a set of new “Voice Actions” Applications. From a functional point of view, it is the superset of speech-enabled mobile services. On new handsets (running the so-called “Froyo” — the Android 2.2 operating system), users will be able to initiate voice dialing, voice search (which equates to a Yellow Pages search based on Google Maps), messaging capabilities, music search and selection, and even map search and directions at the push of a single button, as depicted in this video demo:

This being a demo, your own experience may be different. As head engineer for the Voice Actions project at Google, Mike LeBeau is quite adept at using the services in ways that are designed to impress and amaze. But this is not like Google Wave, where some of the most creative minds in collaborative computing and messaging invented and launched a platform to show the virtues of sharing on-screen information in real time with little attention to the actual user experience. This is a combination (I’d say “recombination”) of Google’s formidable speech recognition and dictation capabilities with Google Maps and various flavors of Google Search which, unlike Wave, puts a major focus on the user experience, especially for mobile phones.

The set of services has been seen as a direct competitive foray against the native, speech-enabled features on Apple’s iPhone (including the services that may spring from Apple’s acquisition of Siri), as well as the myriad of multi-platform applications from Nuance (Dragon), Vlingo, Promptu and even AT&T. Perhaps more ominously, Google seems to be making the statement that it plans to compete with a crop of fledgling speech-enabled service providers, like PhoneTell, a company that developed some nifty mashups of voice search and call handling on Android phones, in part because there has been less friction involved in invoking and gaining access to the speech processing and call processing features in the Android SDK.

It can be argued that the Colossus of Redmond beat Google to the punch a couple of weeks ago at SpeechTEK when Zig Serafin, general manager of the Speech Group at Microsoft, showcased a set of speech-enabled features for the Windows Phone 7 operating system. But Microsoft’s marketing efforts will be hampered by two major issues. One is the overall lack of traction around Windows Phone 7, which is one of several candidates for third place behind iPhone and Android in the race for smartphone market share (with the largely non-voice-aware BlackBerry in the same boat).

The other major impediment is Microsoft’s mixed message surrounding the “Natural User Interface.” Its attempt to leapfrog the pack involves adding “gestures,” exemplified by the full-body involvement of game-players using a feature called Kinect on the Xbox. It seems like a leap of faith to think that gestures will make a difference with small screens and mobile devices. Apple’s multitouch, Nuance’s predictive texting and input services like Swype seem to make a lot more sense.

As for Nuance, like Promptu and Vlingo, it has offered voice input for Android for several years now. As noted above, its differentiator is destined to be accuracy (which is the feet of clay of all applications in the real world, where background noise and microphone quality have greater impact than core recognition software), ease of use, and an existing installed base of happy users. From my perspective, Nuance’s potential trump card in this game (as noted above) is support of multiple modalities through applying several of the principles that support predictive texting across multiple means of input. We also believe that Nuance has something of a “most favored voice technology provider” status with both Apple and Siri, which could be an important factor in the battle for primacy among the top-tier smartphone providers (Apple versus a broad range of Android manufacturers).

When we look back on the summer of 2010, the launch of Voice Actions for Android will be seen as a signal event. It goes a long way toward re-establishing the spoken word as the natural input for a phone (duh!). That’s the benign part. On the darker side, Google once again shows that it is not neutral when it comes to claiming pre-emptive market share where it sees potential for growth. The result will be accelerated innovation in the name of competition.

Game on!



