Taking a page from the “man vs. machine” folk legend of John Henry, Nuance sponsored a contest at its Conversations customer and partner event last week in Orlando, Florida, in which Nuance speech applications went head-to-head with Ben Cook, the world’s fastest text messenger. Nuance won.
Textmaster Cook has prevailed in numerous past contests versus contenders using QWERTY keypads, T9 text input or classic three-tap formats for text entry. Nuance’s Mike Thompson did the honors of dictating words into the Nuance Mobile software running on a wireless handset.
For the first message (“I’m on my way. I’ll be there in 30 minutes.”), the predictive T9 input took over a minute. QWERTY on a BlackBerry took 29 seconds, and Cook completed the message in 16 seconds. Thompson, using dictation, completed his entry in less than 8 seconds.
But this first, simple statement only served to whet the audience’s appetite. The next message duplicated the phrase that was the basis of Cook’s world-record-winning tap-off: “The razor-toothed piranhas of the genera Serrasalmus and Pygocentrus are the most ferocious freshwater fish in the world. In reality they seldom attack a human.”
This mouthful-of-a-message took Cook 48 seconds, which was six seconds longer than his record performance. The dictation-based input took just 16 seconds. Thompson admitted that the software would not ordinarily be programmed to render words in Latin. But all’s fair in love and texting. The point was made.
Unlike John Henry, who “died with a hammer in his hand,” Ben Cook will live to text another day. In the meantime, Nuance used a clever PR ploy to make sure that speech technologies cross over into the general news stream. But the larger question is when will speech become a mainstream interface on mobile devices?
Consumer Demand for SEMS
SEMS stands for “speech-enabled mobile search.” And while speech has been hyped for years, this may be the moment when it finally starts to live up to the promise of that hype. The irony is that while the Internet churns out a continuing parade of so-called “Web 2.0” companies whose applications have little or no associated consumer demand, the exact opposite is true in mobile: the applications lag consumer demand.
There are few companies currently delivering against the very real, pent-up demand for mobile content applications that work, are easy to use and offer a range of information for consumers “on the go.” Speech-enabled mobile search exists at the intersection of traditional directory assistance and mobile data. And Opus Research believes that the speech-enabled “mobile opportunity” is huge.
Here are some mobile numbers reflecting the size of the broader market:
• There are more than 200 million wireless phones in the U.S., and mobile phone penetration is even higher in European markets. China alone has more than 400 million mobile subscribers, according to the Chinese government.
• Worldwide directory assistance (DA) revenues (both wireless and wireline) are roughly $13 billion (Opus Research, 2006)
• Mobile ad revenues are forecast to reach $11.3 billion by 2011 on a global basis, up from $871 million this year (Informa, 2006)
• More than 50 percent of U.S. mobile phone users have sent or received a text message (Ipsos, 2006)
• In excess of 2 billion text messages were sent globally in 2005 (Frost & Sullivan)
Until comparatively recently, consumers in the U.S. could only reliably consult DA when looking for local information on mobile phones. A study released this summer, conducted by Harris Interactive for Tellme, hints that DA usage is really a surrogate for mobile local search and that users often want greater flexibility or more cost-effective alternatives to traditional DA (“what city, what listing?”). This validates the notion of ad-supported models that are free to consumers.
The Tellme/Harris study confirmed that a majority of DA usage (55%) now occurs in a mobile context. That number was even higher for 18- to 28-year-olds (63%). In the aggregate, however, there were common categories of interest:
• Restaurants & Bars: 43%
• Retail Stores: 36%
• Hotels/Lodging: 24%
• Movie Theaters, Amusement & Recreation: 20%
• Transportation (Taxis & Airlines): 10%
The study also revealed a number of behaviors that substituted for 411:
• Called a family member: 58%
• Called a friend: 46%
• Stopped at a phone booth: 29%
• Called a colleague: 27%
• Torn page from phone book: 7%
• Booted up computer in the car: 7%
• Driven to wireless “Hot Spot”: 5%
Speech Addresses Usability Issues
Notwithstanding a range of problems in the quality of the data and the service, DA is still the most usable and accessible form of mobile local search. This is especially true for non-smartphone users. And, notwithstanding the dramatic rise of SMS, the wireless industry must address some of the more challenging usability issues before mobile data can become truly mainstream in the U.S. Imperfect though it still is, voice is one of the potential responses to some of those pervasive usability questions.
As the Ben Cook “text off” arguably showed, speech is likely the most efficient and “intuitive” interface for wireless phones. Thus, speech is key in the inevitable transition from consumer-supported “directory assistance” to ad-supported “mobile local search.” That transition is already in full swing, with several ad-supported, DA-like services now in the market:
• 1-800-Free-411 (from Nuance partner Jingle Networks)
• 1-877-520-Find (reportedly from Google)
• 1-800-San-Diego (other cities as well)
• Hello Yellow (offered by Yellow Pages Group and Call Genie in Canada)
• 1-800-411-Metro (briefly a Jingle competitor, but ran out of funding)
There is also a range of other companies offering 411-type services via text messaging (e.g., 4Info, AskMeNow). All of these services assume that consumers want a more flexible and free version of DA while on the go.
Mobile is a critical competitive battleground for the Internet incumbents (Google, Yahoo, Microsoft, AOL) and represents the next “disruptive” opportunity. But the wireless carriers obviously have billions at stake as well and are doing their best to avoid the “disintermediation” that most ISPs suffered online. Ultimately voice may be so pervasive in mobile that it becomes a kind of commodity, but there are opportunities for first movers to differentiate with speech and, assuming a good user experience, develop consumer loyalty and momentum.
However, in a recent interview, Deep Nishar, Director of Wireless Products for Google, expressed skepticism about speech as the “magic bullet” or single key to mainstreaming mobile data usage. He cited the challenges of background noise, accents and a range of other familiar factors in arguing that speech (at least at Google) wasn’t quite “ready for prime time.” Instead, Nishar focused on the basic “affordability” of mobile data services and network speeds as the immediate factors that would drive adoption and usage.
Despite the seemingly bearish outlook on voice, Google owns speech patents and is reportedly now testing a DA-like service (1-877-520-Find) that promotes category search in addition to traditional business lookups (“Say the type of business or business name”). The service, though rough around the edges, offers users the ability to navigate back and forth between listing details and “search results,” as well as the option to send the number via text to a user’s phone.
Mobilizing the Mobile Market
Beyond Google, Yahoo! and, to varying degrees, AOL, MSN and Ask are all working on several wireless fronts at once: mobile Web, SMS and voice/DA. Meanwhile, carriers are doing deals with JumpTap, Tellme, Action Engine and Medio Systems to develop their own mobile search and monetization offerings before the search engines and portals can turn them into the proverbial “dumb pipe.” Indeed, something of an early mobile arms race is in progress.
And once true voice-enabled mobile search hits the market (See “On The Road to Speech-Enabled Mobile Search,” Oct. 16, 2006) it might very well be the mainstream usage winner in mobile. But voice is not an island. A truly useful application will need to be multi-modal, allowing for “voice-in” with a text or other output. PocketThis is an example in the current “enhanced DA” arena.
The need for an automated speech interface for a number of mobile applications is now all but self-evident to users, content providers and service providers. That represents a major shift in the marketplace, as the sold-out crowd in attendance at Conversations seemed to attest. We should now see an acceleration of go-to-market strategies with voice as the interface or a key component of a mobile-search offering. The only question now is who gets there first with the best product.