The Latin phrase “deus ex machina” refers to an unlikely force or unexpected character that intervenes to make things better for the main character in a book, film or play. It is the plot device at work in H.G. Wells’ War of the Worlds, where “the germs in our atmosphere” kill off the Martians, and in the 1950s horror flick The Blob, where “ordinary tap water” destroys the creature.
The literal translation is “god from a machine.” Its modern mobile equivalents include GM OnStar, Dial “DIR-ECT-IONS”, GOOG-411 and all other voices (automated or otherwise) that respond to requests for guidance or other commands from lost, hapless travelers or mobile subscribers.
2008 promises to be a year in which vox ex machina (meaning “voice from a machine”) will intervene on behalf of ordinary earthlings (or should that be “earth links”) as spoken words better integrate into mobile applications over the phone, in your hand and in your car.
Experience Raises Expectations
Here’s another word that traces its origin to Latin: experience. Its root, the infinitive experiri, means “to try or to put to the test.” We, modern earthlings, are putting more applications to the test on desktops, laptops, handsets and automobile dashboards. This phenomenon is driving change on two fronts. One is inside the confines of business enterprises; the other is free-wheelin’ in new cars. (Can you say “Sync”?)
Web browsing, instant messaging, social networking and screen-sharing used to be forbidden on corporate networks – even more so when real-time interaction called for collaboration with customers, partners or employees “outside the firewall.” This is not the case anymore. Businesses are finding that it is in their best interest to use the Internet, telephone and wireless Web to keep communications channels open 24/7, and they must offer a level and variety of services that are at least at parity with what individuals can avail themselves of at home or over their wireless phones.
Inside the enterprise, IBM’s Lotus SameTime and WebSphere Portal are architected to promote highly personalized collaboration and real-time communications. We will watch closely to learn how automated speech and conferencing, as well as the new tools and services – designed to make it easy for enterprise employees to take best advantage of real-time chats, bookmarks, conferencing and collaboration – are integrated into the experience.
The latest revision of Lotus SameTime is in direct competition with Microsoft’s Unified Communication Suite as both give a growing number of enterprises new ways to initiate conference calls, share documents, screens and online conference resources. If they are successful, there will be a continued crescendo of such activity both in and out of the office. And with classical Web-based search evolving into document management, database management and data mining, this is fertile ground for competition as Google, Oracle, Yahoo! and others define and refine real-time and mobile Web services for a broader set of customers.
Is it Live or “Who Gives a Damn?”
As for the automobile-based experience, map-based services are near ubiquitous. They include the gamut of personal navigation devices, Web-based driving directions and geo-positioning systems from household names like Google, Yahoo!, Garmin, and TomTom. More importantly, a slew of applications from mobile service providers and customer care specialists are making the choice between live resources and automated support much more dynamic.
For the first time in the history of automated speech services, applications providers have put “positive results” ahead of such obscure measurements as “automation rates” and “engine accuracy.” Firms like SimulScribe and SpinVox, which capture dictated messages and send them to countries with low labor costs for “near real-time” transcription, have attracted investment capital and are rolling out services more broadly. From the user point of view, the services feel like Nuance’s Dragon Dictate, IBM’s ViaVoice or Microsoft’s embedded speech technology. You press a button or give a command in order to dictate into a “speech-to-text” transcription resource.
While we question whether there is a business model for a service that involves live transcribers in 100% of situations, we acknowledge that successful deployment of such services goes a long way to overcome natural skepticism from the general public. Beneficiaries are bound to be the long-time providers of automated transcription, as well as a slew of new service providers like V-Enable or Kirusa, that offer automated front-ends to popular messaging services, especially SMS.
Collectively, it amounts to the rebirth of the speech-enabled portal, with Microsoft’s TellMe, as well as its Live Search business unit, joining the others to propel speech-enabled search into the public spotlight on millions of mobile handsets.
The iPhone Factor
In the coming year, enterprise employees and mobile subscribers have important roles to play by trying new services and putting them to the test. Past experience shows that success breeds more success. But as Apple demonstrated so vividly in 2006-2007, intense media coverage and major advertising spending are bound to broaden experience. (Just ask Canadian recording artist Feist, who has been endlessly featured in iPod commercials.)
Once again, using Apple as an example, we’ve learned that an improved user experience spawns higher usage. As we reported on LocalMobileSearch.net, providers of mobile Web services are finding that visits from iPhone owners occur with a frequency that is out of proportion with other smartphones. One provider, NetApplications, reported that iPhone browsers outnumbered all flavors of Microsoft’s WindowsMobile combined.
iPhone provides “on-the-glass” presentation of icons or widgets that are shortcuts to popular functions, Web sites and information sources. It should be no surprise that the existence of such enablers correlates with higher usage. Ultimately, Apple will bring along more of the sponsored links and advertiser support required to keep presenting a broader array of free services over the wireless Web.
Apple’s exclusive relationship with AT&T Mobility (in the U.S.) forces other device makers and mobile service providers to counter with strategies of their own. Verizon Wireless offers the LG Voyager, which matches iPhone’s multiple widgets “on-the-glass,” and adds a QWERTY keyboard that is exposed by physically opening the phone, like a clamshell. The idea that a better keyboard will equate to more use for text messaging and e-mail has yet to be tested.
Open Mobile: More Applications and Advertisements
Displaying widgets for multiple services on the faceplate of mobile devices says one thing: “Your phone can do more.” The current roster of service providers has another message: “Advertising will keep the cost down.” Together, they have done nothing less than invent a new mobile medium. Expanding user experience, faster data links and improved methodologies for buying, targeting and displaying promotional messages are the major accelerants. In 2008, all parties will determine whether their collective momentum can overcome barriers to expand availability and use.
Providers of branded services and applications have long complained about a logjam created by both device makers and mobile operators when it comes to governing real estate on the phone’s front porch. They question whether on-the-glass widgets have merely replaced the long-hated “WAP deck†in the battle for location on meaningful mobile real estate.
New opportunities come in the form of “open” initiatives among both carriers and device makers. In the U.S., the power of openness is baked into the rules that have been developed for auctioning a swathe of wireless spectrum in the 700MHz range in early 2008. According to rules established by the Federal Communications Commission (FCC), a portion of that spectrum will be sold under the condition that it be made available to third-party content and software providers that support a broader variety of devices.
The impact has been immediate and far-reaching. Among U.S. carriers, both Verizon Wireless and AT&T Mobility have declared their networks to be “open” for almost all intents and purposes (a claim that remains to be seen, especially over Verizon’s CDMA-based infrastructure). More importantly, 266 companies applied to the FCC to participate in the auction. By the end of December 2007, 96 were immediately accepted. The other applications were deemed incomplete, but the bulk of them should be accepted before the auction is conducted. There are a number of relatively small carriers, cable operators and entrepreneurs, but most eyes are on Verizon Wireless, AT&T Mobility, Google and a group of companies, like TowerStream, that have made investments in WiMax (large footprint, broadband wireless) infrastructure.
Mobile Devices Promise to Open Up as Well
Meanwhile, about 35 companies have joined the Open Handset Alliance (OHA). The alliance unites wireless carriers from around the world with handset manufacturers and designers, semiconductor manufacturers and a formidable mix of software companies spanning search (led by Google), application acceleration and integration (Aplix, Esmertec and Wind River) and multimodal commerce (eBay, PacketVideo, Sonivox). Of most interest to the conversational access technologies (CAT) community is the participation of Nuance, whose new Mobile Division, under Steve Chambers, is positioning its growth plan to support speech-enabled, multimodal, mobile applications.
OHA has made a beta version of Android, the proposed “software stack” for the open handset, available to a broad community of developers. The results have been mixed, with a highly vocal contingent claiming that it is poorly documented and full of bugs. Such complaints demonstrate the power of the open-source approach and community. It is a process that turns constant complainers into the source of continuous improvement. If all goes well, new products and services based on OHA will hit the market in time for the shopping holidays of 2008.
Even if Android doesn’t roll out as expected, hype alone will propel a number of potential new applications into the spotlight. Immediate access to popular functions, visual access to messaging, and spoken commands for navigation, payments and authentication will be made available to the general public much more quickly because the circle of users, developers and mobile operators who have experience introducing new services and processes is ever-expanding.
Time for Role-Based Business Models
The business plan for mobile applications delivery has changed irreversibly. When Google first threatened to bid on the 700 MHz spectrum, North American mobile operators indicated that they would launch a lobbying battle and court fight, insisting that Google was undermining their long-standing business plan. Mobile operators make long-term investments, they observed, and have relied on regulatory frameworks that protect that investment for the lifetime of all hardware and software components. Therefore, competition should be prohibited by law. They withdrew this argument almost immediately, apparently because top brass recognized that we’ve entered a new era. “Android is an enabler of what we do,” the CEO of Verizon Wireless told BusinessWeek reporters as a testimonial to the complete turnaround in attitude his company experienced in the space of a month.
Executive views may have changed, but the marketplace remains tremendously immature as we enter 2008. The push for “open access” to networks and “open development” of devices calls out for a framework that will bring stable products and services to the marketplace. Carriers, device makers and service providers don’t just have dueling technologies, they have competing business models. Users have benefited from the little bit of cooperation that has taken place in the past. For example, North American mobile operators have long subsidized the true price of expensive smartphones. The iPhone is an exception, but it is not yet the one that proves the rule.
Role-Based Biz Dev
For the past several years, Opus Research has predicted a general downward trend in pricing for the core speech processing and call processing technologies. Lowering the cost for basic solutions expands the market, attracts more application developers and makes room for more innovation. These trends will continue to shape the conversational access technologies market in 2008. But each major participant – meaning mobile operator, handset maker, media platform supplier, application developer, content provider, etc. – must know and fulfill its role.
The software development community has long benefited from “role-based” solutions. The term refers to a framework, followed by the likes of IBM, Microsoft and Oracle, whereby users access and navigate through business logic according to their “role” (job title, department, etc.) in the company. In the world of open access to wireless spectrum and handsets, role-based development schemes take on increasing importance. Carriers must offer higher-speed transport, better coverage and well-defined access points to their networks and back-office (billing) capabilities. Handset makers must bring their products to market at affordable prices with popular capabilities, preferably without the need for subsidies (other than advertising, that is).
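For readers who have not worked with role-based software, the idea can be sketched in a few lines of Python. This is an illustration only: the role names, the action names and the `can_perform` helper are all hypothetical and are not drawn from any IBM, Microsoft or Oracle product.

```python
# Minimal sketch of role-based access to business logic.
# All role and action names below are hypothetical examples.
ROLE_PERMISSIONS = {
    "sales": {"view_accounts", "create_quote"},
    "finance": {"view_accounts", "approve_invoice"},
}


def allowed_actions(role: str) -> frozenset:
    """Return the set of actions a role may perform (empty if the role is unknown)."""
    return frozenset(ROLE_PERMISSIONS.get(role, ()))


def can_perform(role: str, action: str) -> bool:
    """True when the given role is permitted to carry out the given action."""
    return action in allowed_actions(role)
```

The point of the pattern is that each participant sees only the functions that belong to its role, which is the same discipline the paragraph above asks of carriers and handset makers.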
January 2nd, 2008
Dan Miller
A year ago, RSA, the security subsidiary of storage giant EMC, gave both prospective buyers and technology providers reason to believe that adoption of voice biometric-based user authentication was entering a new phase. By offering Adaptive Authentication for Phone, the company prepared to pave the way for seamless integration of voice biometrics into its fabric of hardware, software and business policy governing access control.
Finally! Clarity in the form of a packaged offer that includes voice biometrics.
Alas, this simple starting point for business enterprises was not to be. In August, RSA and EMC were conspicuously absent from SpeechTEK in New York City. No promotional campaign ever took shape; no more shoes were dropped. The implication is that a company with a $50 billion market capitalization and $11 billion in top-line revenue could not build a compelling business case for speech-enabling its authentication infrastructure.
The parent company’s mantra is to “store, manage and protect” enterprise data. However, like its brethren in IT and network security systems, it has a well-established hierarchy of priorities. At EMC, the order of things includes: risk assessment, access control for information and infrastructure, protection of confidentiality and the integrity of information, security management and compliance. User authentication is definitely part of the mix, but it is not a first-order concern, which makes voice-based caller authentication almost tertiary.
Succeeding In Spite of the Cynics
Last May, at the Voice Biometrics Conference in Washington, DC, Opus Research highlighted several large-scale implementations of caller authentication in customer-facing contact centers including a roster of banking customers of ABN-AMRO, communications services customers at Bell Canada and “clients” for government largesse in New Zealand and Australia.
As it stands today - and as illustrated in coverage on voicebiocon.com - we’re seeing continuing growth in implementations. By September, Bell Canada had already enrolled 275,000 customers into its voice authentication system. Healthcare giant WellPoint has enrolled well over 150,000 affiliated employees. And, likewise, voice biometrics is being used for phone-based authentication of the 128,000 members of Australian Health Management (ahm).
Solving Real-World Problems
The steady increase in enrollments is a good measure of the maturity of the emerging market. It’s a triumph for the tenacious group of technology providers that have successfully provided solutions to both security mavens and customer care specialists. They have boldly moved into the market where EMC and its cohort of security “pure plays” are unprepared to tread. Like EMC’s product planners, enterprise security officers have a well-defined hierarchy of concerns. First and foremost is to prevent wrong-doers from compromising important data. They fight the most common kinds of attacks: denial of service, constant spam, worms and other forms of malware that can bring down the corporate WAN and make all data inaccessible.
Mobility and Social Networking Will Accelerate Adoption
The growth of e-commerce, online banking and mobile access accentuates the need for multifactor protection of customer data for financial services, healthcare and insurance companies. Thus, securing “the phone channel” was the theme of Voice Biometrics Conference in Washington, DC in May. In the meantime, the advent of unified communications (UC) has redefined the term “phone” and with it, both security officers and infrastructure providers have to take a fresh look at network security.
There is growing evidence that mobility and customer convenience are poised to accelerate adoption. As an example of the first phenomenon, simply look at the roll-out of VoicePay as a simple, voice-authenticated means for mobile phone subscribers to make electronic payments. IBM has made dual advancements in the latter area by making speaker verification a feature pack that is baked into its flagship middleware and application server WebSphere, while at the same time introducing highly reliable, text-independent speaker authentication, which has the potential to greatly simplify the user enrollment process. (Both companies will be featured at Voice Biometrics Conference London November 28-29, 2007.)
Network infrastructure providers have three primary areas of concern. One is to maximize uptime, which puts emphasis on intrusion detection, firewalls, session border control and all that fun stuff. Another first-order concern is protecting the privacy of a conversation. This is accomplished through encryption of the actual “talk-path.” In addition, the system can maintain “whitelists” or “blacklists” regarding devices that reside at the endpoints of various talk paths.
At this point, the tension between the mainstays of “secure” networking and the values underlying UC becomes obvious. An emphasis on real-time communications and collaboration dictates implementation of constant “presence indicators,” push-to-talk initiation of phone calls and the simplification of spontaneous conferencing. This is antithetical to prevention of unsafe network entrance.
“Who’s on First?”
This question, first asked by Abbott and Costello, isn’t funny in the context of spontaneous voice teleconferencing or other IP-based real-time communications. How many of us have been on the company’s conference bridge when an extraneous tone indicates that an unknown person has joined the call? Blacklisting a rogue device or softphone does not prevent a malicious interloper from joining the call. As the tools for enterprise-wide collaboration and real-time communications take hold, enterprises are bound to attach a premium to detecting who’s calling, not just what they are using to initiate the call.
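The gap between checking the device and checking the caller can be made concrete with a small Python sketch. Everything in it is an assumption made for illustration: the `JoinRequest` fields, the device and user names, the `voice_score` (a stand-in for whatever similarity score a speaker-verification engine might return) and the 0.8 threshold do not describe any vendor's product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinRequest:
    device_id: str       # endpoint placing the call
    claimed_user: str    # identity the caller claims
    voice_score: float   # hypothetical speaker-verification score in [0, 1]


# Hypothetical policy data for the sketch.
BLACKLISTED_DEVICES = {"softphone-rogue-01"}
INVITED_USERS = {"alice", "bob"}
VOICE_THRESHOLD = 0.8  # assumed cut-off, not a recommended value


def admit_by_device(req: JoinRequest) -> bool:
    """Device-only policy: admit anything that is not on the blacklist."""
    return req.device_id not in BLACKLISTED_DEVICES


def admit_by_caller(req: JoinRequest) -> bool:
    """Device check plus a check on who is actually speaking."""
    return (
        admit_by_device(req)
        and req.claimed_user in INVITED_USERS
        and req.voice_score >= VOICE_THRESHOLD
    )
```

Under these assumptions, an interloper who borrows a clean desk phone and claims to be "alice" passes `admit_by_device` but fails `admit_by_caller` whenever the voice score falls below the threshold, which is the distinction the paragraph above draws.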
Our hypothesis is that voice biometric-based authentication is the most natural and cost-effective way to authenticate callers in real time. At Voice Biometrics Conference London, we’ll have solutions providers and their customers describe how and why our hypothesis is true.
October 15th, 2007
Dan Miller