One Week to Go: Webcast On Best Practices for Voice Biometric Implementations

2012 January 25

The Opus Research report on Voice Biometric Best Practices is raising the profile of multifactor authentication for secure customer care, proof-of-life and password management. Join Pat Carroll, CEO of Validsoft, and Opus Research’s Dan Miller as they provide more details about increased interest from mobile subscribers, financial institutions and government agencies and share what they expect to see in the coming years.

Live Webcast: Voice Biometrics: Overcoming Barriers to Adoption

Thursday, February 2, 2012 — 10 a.m. EST / 7 a.m. PST

Research Report – Voice Biometric Authentication Best Practices: Overcoming Obstacles to Adoption

2012 January 17


Featured Research
As technology providers and system integrators around the world successfully bring their solutions to market, we’re identifying the product attributes, architectures and deployment strategies that define the best practices in layered, multi-factor and risk-based deployments of voice biometrics.

This Report is made available courtesy of Validsoft. Contact Pete Headrick (pheadrick@opusresearch.net) to receive a copy.

For more information on becoming an Opus Research client, please contact Pete Headrick (pheadrick@opusresearch.net).

Click Here to View the Report Summary

Voice Control’s Excellent Adventure at CES in Las Vegas

2012 January 12

At this year’s International Consumer Electronics Show (CES), amid new smartphones and tablets, connected TVs, automotive entertainment systems and super-thin computers, speech processing providers were able to make the point that almost anything you can do with these new gadgets can be improved by adding voice to the user interface. Nuance Communications set the stage with a series of announcements, including Dragon TV; Dragon Go! for Android-based mobile devices; and voice control of UltraBooks(TM), the slim (think MacBook Air) laptops developed by the likes of ASUS, LG, Samsung and now Dell in conjunction with Intel. For media access anywhere and everywhere, Nuance also formed a 10-year alliance with GraceNote, Sony Corporation’s digital media management operation formerly known as the CDDB (for Compact Disk Data Base).

Dragon TV has an uncanny resemblance to Vlingo’s “Virtual Assistant for Smarter TV” described by TJ Leonard in this blog post. The post includes an embedded video demonstration that was first released in December 2011. TJ also notes that the acquisition of Vlingo by Nuance, which was announced in late December, is not likely to close until later in 2012. In the meantime, both Vlingo and Nuance will continue to promote their products. It’s all the more entertaining for CES attendees and all folks following the speech-enabled electronics world.

Neither Nuance nor Vlingo is destined to be alone in its efforts to replace TV remotes with simple spoken instructions. As we noted in an earlier post, Microsoft has a number of initiatives around Kinect that are designed to bring all of the goodness enabled by Xbox (games, streamed entertainment, Web navigation) under the control of a combination of voice and gestures.

Japanese consumer electronics giant Panasonic has a long-standing relationship with Novauris and will reportedly be rolling out voice-controlled TVs later this year. The two companies reportedly co-developed embedded speech recognition software under the NovaLite brand for use by Panasonic directly, or under license to other TV and consumer electronics manufacturers.

Meanwhile, even in the absence of any efforts by Apple to provide tools or APIs for third-party developers, the Web is awash in demonstrations of “Siri hacks” designed to take control of household appliances and electronic devices. The video below (courtesy of Vimeo) is especially amusing because it shows the developer actually bolting a black box onto the back of a TV.

Siri Universal Remote from Todd Treece on Vimeo.

It doesn’t end with the TV and game console. As this article by Emma Woollacott in TG Daily points out, LG Electronics has packaged a whole suite of technology under its ThinQ brand to enable appliances to listen to spoken instructions and also provide verbal feedback, like when a washing machine is in need of a new bearing. It is by no means a new idea, either. This patent, issued to Panasonic (then called Matsushita) in 2006, describes a voice-controlled “Home Agent Server” for taking command of household appliances. It references prior filings from Nokia, LG and ultimately AT&T, dating back to 2003.

So the pattern is in place. Embedded processors, high-speed wireless data links and server farms “in the cloud” are delivering on the long-promised vision of “Voice Control of Your Connected Life” in ways that are accurate and reliable and less subject to ridicule (although maintaining a sense of humor has been an important part of ongoing marketing efforts).

Enterprises in Denial: Dealing with the Personal Data Deluge (Global Survey Results)

2012 January 6


Featured Research
A remarkably high percentage of C-level executives indicate that their companies lack a defined strategy to deal with all the “personal data” provided by customers and prospects through a multitude of channels. Yet they also tell us of their plans to incorporate that data into “understanding intent” and forging better communications links that promote loyalty, profitability and product refinement.

This Report is made available courtesy of Empirix. Contact Pete Headrick (pheadrick@opusresearch.net) to receive a copy.

For more information on becoming an Opus Research client, please contact Pete Headrick (pheadrick@opusresearch.net).

Click Here to View the Report Summary

Lithium gets a $53 Million Vote of Confidence From VCs

2012 January 5

The investment community put some serious weight behind the underpinnings of Conversational Commerce when New Enterprise Associates (NEA) and SAP Ventures chipped in with original investors to double the outside capitalization of Lithium Technologies. Those past investors include Benchmark Capital, DAG Ventures, Emergence Capital, Greenspring Associates, Shasta Ventures and Tenaya Capital. The company is already cash-flow positive, according to this report by Ryan Lawler at GigaOm. That means that investors see great (and global) potential for Lithium to achieve its strategic objective of helping its client companies to engage more effectively with their customers and prospects.

Lithium was a major presence at our Conversational Commerce Conference (C3) last February. Its executives and one of its customers (FICO) described the power of promoting conversations between (or among) employees in marketing, customer care or other departments and individuals with specific questions or issues that they need to resolve. At the time, we saw Lithium tackling issues that were largely internal for companies that had to sort out tensions between the Marketing Department (with “brand” and “message” top of mind) and Customer Care or Contact Center personnel who often saw themselves playing the role of customer advocate.

Lithium bridges the gaps by creating a social communications platform where customers, prospects and enterprise employees can join in forums to resolve the issues that originate directly from customers or prospects, rather than alpha bloggers or angry Twitterers. The additional investment (which brings Lithium’s total capitalization from outsiders to $101 million) will fuel new hiring and investment to enable global expansion. According to Lawler’s report, the company plans to double its current headcount of 200 and expand its global reach beyond North America and Western Europe, targeting Asia and the Pacific Rim.

As the saying goes, “Money talks!” and Lithium’s message in the social business domain has been that “customers don’t want to ‘Friend’ you, they want to get something done.” Opus Research would add that it starts with a genuine conversation, not someone railing or whining on Twitter or Facebook. Lithium recognized this fact from the beginning and it is being rewarded by investors who see the concept’s global potential.

PT Bank Negara Indonesia (BNI) Implements Voice Biometrics for Password Reset

2012 January 3
by Dan Miller

We’ve received news from Indriya Innovations in Singapore that PT Bank Negara Indonesia (Persero) Tbk (BNI) in Jakarta, Indonesia has completed a pilot and moved to full implementation of a multifactor authentication system that employs voice biometrics for automated password reset.

The move is a “first” in the Asian financial industry and marks a major step forward for voice biometrics, given that BNI is so well respected and is one of the largest banks in Indonesia. Its deployment validates our observation (reinforced by the likes of IBM Research) that deployments of voice biometrics globally are accelerating.

BNI, which was founded in 1946, underwent a major rebranding and repositioning in 2004. It now employs 20,000 people and has offices in Singapore, Hong Kong, Tokyo and London, as well as an agency in New York. In the press release, company spokesperson M. Harsono (IT Helpdesk & Command Center Group Head) explained that “resetting passwords have often posed challenges for our IT help desk.” He noted that company policy allows only “strong passwords” which consist of “a combination of alphabets, numbers and special characters.” The result is that “BNI users have often forgotten their passwords.”

Thousands of employees from around the country were calling the Help Desk to reset their passwords. This labor-intensive process has been alleviated by the new, Web-based, multifactor password reset application from Indriya Innovations. Requests from the Web portal give rise to an outbound call to one of the phones or IP-based devices that have been registered with the Help Desk. The multifactor solution combines “something you have” (the phone device) with “something you know” (challenge questions and a series of numbers) and “something you are” (the caller’s voice, verified by the Nuance Vocal Password biometric engine).
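The way the three factors combine in a reset decision can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Indriya or Nuance implementation: the record layouts, the function names, the 0.0 to 1.0 score scale and the 0.8 threshold are all assumptions made for the example.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Enrollment:
    """What the help desk holds for one user (hypothetical record layout)."""
    registered_numbers: frozenset  # phones or IP devices registered with the help desk
    challenge_answer: str          # expected answer to a challenge question


@dataclass(frozen=True)
class ResetAttempt:
    """What was collected during the outbound call (hypothetical)."""
    answered_number: str   # device that picked up the outbound call
    challenge_answer: str  # answer the caller gave
    voice_score: float     # similarity score from a voice engine, assumed 0.0 to 1.0


def _same_answer(given: str, expected: str) -> bool:
    # Normalize case and surrounding whitespace, then compare in constant time.
    a = given.strip().lower().encode("utf-8")
    b = expected.strip().lower().encode("utf-8")
    return hmac.compare_digest(a, b)


def authorize_reset(attempt: ResetAttempt, enrollment: Enrollment,
                    threshold: float = 0.8) -> bool:
    """Allow the reset only when all three factors check out."""
    has_device = attempt.answered_number in enrollment.registered_numbers   # something you have
    knows_answer = _same_answer(attempt.challenge_answer,
                                enrollment.challenge_answer)                # something you know
    voice_matches = attempt.voice_score >= threshold                        # something you are
    return has_device and knows_answer and voice_matches
```

Requiring all three checks to pass is the simplest policy; a risk-based deployment might instead weigh the factors or step up to a human agent when only one of them fails.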

The pilot and initial implementation support Indonesia’s national language, Bahasa, for the convenience of domestic employees, thus affirming that voice-based authentication can support multiple languages. If adoption runs true to form (as described by IBM and others), password reset apps will serve as the introduction to multifactor authentication that ultimately replaces passwords in a number of use cases in banking, healthcare, government transfer payments and mobile commerce.
(Updated: Wednesday Jan 4, 2012 8:40 AM Pacific)

Finally, Vlingo and Nuance Settle Their Differences Out of Court; Turn Full Attention to the Conversational User Interface

2011 December 20

One step’s done and another’s begun as Nuance announces plans to acquire rival virtual assistant provider, Vlingo.

Last October, when Apple elevated Siri (beta) to a “showcase” position on the iPhone 4S, it accelerated speech processing’s momentum into the mainstream of the mobile user interface. It also attracted serious investment among the software, search, e-commerce and mobile giants as they battle for share of an inchoate Conversational Commerce marketscape, where the $5 billion annual market cited by Nuance’s Mike Thompson in this press release is just table stakes.

When it comes to aggregating and assimilating companies that have developed important elements of the mobile user interface, Nuance has been prolific. As a result, the conversational user interface that spans automated speech and text entry drives more than half of the $975 million in revenue Nuance generated in the first 9 months of its current fiscal year. Its “Mobile and Consumer” division accounted for over $270 million, while its “Enterprise” group, whose growth is increasingly driven by mobile and cloud-based services, accounted for $212 million.

While duking it out both in and out of court, both Vlingo and Nuance introduced impressive mobile assistants. As we chronicled last February, Vlingo’s roadmap for its virtual assistant includes voice dialing; originating and receiving email, text messages or tweets; and taking control of search and navigation, all done with great accuracy and ease-of-use. Its path to growth involved geographic expansion and the addition of new languages and features. However, from our point of view, the real differentiator has been truly hands-free operation, as demonstrated by Vlingo’s T.J. Leonard in this video.

Meanwhile, Nuance has proceeded to build momentum around its Mobile Advantage services and the “Prodigy” development initiative. Dragon Dictate has a growing corpus of utterances and supports multiple languages. Dragon Go! demonstrates the power of adding natural language understanding while integrating a limited number of destination sites on the Web to support e-commerce and better outcomes for general search. Nuance’s challenge now will be to integrate the best of Vlingo into the Prodigy project. Right now, supporting both brands and services will be the least disruptive path for current Vlingo users.

Regarding Nuance’s acquisition of Vlingo, antitrust should not be much of a concern. Efforts to build the conversational mobile user interface involve direct competition among Google, Microsoft, Apple and Amazon.com today, and will add every major mobile device maker, mobile carrier and “cloud” service provider in the very near term. Collectively, these companies and a number of smaller, earlier stage technology firms will continue to compete and constantly improve the overall user experience as spoken input and command of the features and functions of phones, tablets, TVs and autos becomes routine (when appropriate).

That Didn’t Take Long! Siri-based Comparison Shopping Adds Best Buy Catalog

2011 December 16
by Dan Miller

It’s quickly becoming apparent that Conversational Commerce and Recombinant Communications are inextricably intertwined. (Try saying that five times fast.) We’re in month 2 of Siri’s beta release on Apple’s iPhone 4S and we’re already witnessing how the service will improve as a product of natural selection, gradual upgrades, augmentation and evolution. As a case in point, spoken queries to Siri regarding electronic gadgets, appliances, games and computers will result in a display of responses that include the SKUs (stock keeping units) in Best Buy’s catalog.

You can find coverage of the phenomenon in dozens of tech publications today, but they seem to trace back to this post on the Apple-centric tech blog called RazorianFly.com. According to the post, the enhancement is very much the result of Wolfram Alpha (an answer-oriented search engine that is integrated into Siri’s search results) integrating with Best Buy’s product database through BestBuy.com’s API. Or, as more than one tech blog put it, Siri now returns the same errors as a search on Wolfram Alpha.

As Shaylin Clark at WebProNews explains in the post above, as a “computational engine” oriented toward asking questions, Wolfram Alpha can be quirky (he calls it “finicky”). But adding speech-based access to comparison shopping that includes Best Buy’s inventory marks progress, even if the results are not always optimal. The point is that end-users are gaining experience with the service. They are learning what it is good at and where it fails.

My empirical observation is that people are being much more patient with Siri than they had been with prior renditions of voice-based “assistants” (like Wildfire, HeyAnita or Webley). One reason is that the service is faster, better, more robust and capable of doing more things than its predecessors. There’s more knowledge in the databases it draws on (heck, it defaults to a search on Google, but it has maps, online music and Wolfram Alpha to bring to bear). It’s very early days and Siri is bound to get better. And it will inspire competing services from Google, Microsoft/Tellme, Amazon, Nuance, Vlingo and a handful of others. Each will add new features, functions, information and APIs to differentiate its services and deliver a better customer experience.

At this point Apple has taken a leadership position by coming to market with a service that’s instantiated as an embedded application that recognizes utterances accurately; determines context and meaning; and then has meaningful integrations with a broad range of knowledge bases so that it starts by recognizing intent and finishes by delivering relevant results. The truly exciting aspect to this is that the services from Apple and its competitors will continue to evolve and get better.

Conversational Commerce in 2012: Emphasizing the “Self” in Self Service

2011 December 15

In 2011, the idea of “self-service” has been transforming from a derogatory term about automated handling of calls into an IVR or contact center to the preferred point of arrival for users of the mobile, multimodal Internet. In 2015 we will look back to this year as one in which several emerging technologies formed the basis of products and services that define how individuals carry out everyday commerce. These are:

Accurate speech recognition combined with natural language processing: This gets to the heart of Conversational Commerce. Credit Apple’s Siri with bringing the speech enabled mobile assistant into prime time, but expect category leaders Google (just Google the word “Majel”) and Microsoft/Tellme to use their investment in speech processing technologies to leverage themselves into the mobile assistant realm. Collectively, they are making it more comfortable for people to carry out conversations with their smartphones (or tablets or TVs or cars).

Nuance will be a formidable competitor in this realm, working closely with IP and researchers from IBM. Nuance’s speech processing technology is deeply embedded in iOS-based devices (though the licensing terms and details on the integration are closely guarded). Therefore, Nuance is a direct beneficiary of Siri’s success. In the meantime, the company has effectively marketed its own platform for mobile dictation and speech input under the Dragon brand and has launched Dragon Go!, which demonstrates the value of deep integration with popular mobile destinations, including Yelp, OpenTable, Google, Bing, YouTube and a couple hundred others, based on context.

Vlingo is also formidable in this category. Aside from launching an all-out patent war in the U.S. courts, it has effectively differentiated itself as offering capabilities that neither Siri nor Nuance presently has. One of the most important is “hands-free” operation. Using the wake-up words “Hey Vlingo,” mobile subscribers can then enter commands and content to hear or originate text messages, conduct searches or get driving directions. These are compelling use cases and provide the mechanism for users to put their devices (running all their personal apps) under the control of their voice. Given its size, relative to the cohort of Google, Microsoft, Apple and even IBM (which is working directly with Nuance), it is unlikely that Vlingo will be acting alone. Regardless of who emerges as its benefactor or owner (device maker, mobile carrier, cloud computing provide…), Vlingo’s presence will be felt in the 2015 timeframe.

The Smartphone+Cloud paradigm: This is closely related to Apple and Siri because Siri is an app running “natively” on the iPhone 4S, but relying heavily on speech processing and computing resources in Apple’s cloud. As the retail price of smartphones continues to decline – especially with subsidies from wireless carriers – the adoption curve continues to get steeper and the population of wireless smartphone users gets more attractive. That’s why so many service providers and content providers are comfortable targeting smartphone users as a key customer base.

Common wisdom has it that, by 2015, platform fragmentation issues vis-a-vis smartphones will be largely behind us. Apple’s iOS and Google’s Android will share leadership. Android will have the edge in terms of devices in service and Apple will have the more coherent strategy for monetization of content and service delivery. They will be joined by one or more companies that, today, are considered also-rans, most likely Microsoft’s Windows Phone (with a big assist from Nokia) and perhaps RIM Blackberry. In a perfect world, the “open source” version of HP’s WebOS will become the basis for innovative application development and delivery, but that is unlikely unless there’s a cloud-based entity with its eyes on the smartphone prize.

Incidentally, Amazon.com’s acquisition of Yap shows that it has its eye on speech-enabling the mobile phone (not just the smartphone) crowd. This means that Salesforce.com, which watches the operations of Amazon Web Services (AWS) quite closely, will emerge as an important player in the smartphone+cloud domain by 2015.

Spoken words recognized as information assets: Once you have people comfortable talking to their smartphones, you have a rich new set of utterances to go into a corpus of data to support better understanding. In the U.S., compliance with federal laws like Sarbanes-Oxley and HIPAA requires companies to capture and store the content of phone conversations between and among employees, customers and prospects. To make the best of the situation, companies have been able to analyze, index and tag the content of these conversations to support business goals, often as part of WFO (work force optimization) programs in contact centers or to facilitate collaboration among geographically dispersed workgroups on a collaboration platform.

Customer care analytics specialists, like Nexidia, CallMiner, Verint and others have developed proprietary approaches to detect patterns, tag and analyze conversations. More recently a firm called HarQen was chartered specifically to treat spoken words as information assets. Its core product line, Symposia, captures and stores the audio from telephone calls and conference calls and allows participants or other listeners to tag or annotate conversations and share them with others. They have developed use cases for human resources to support interviews, performance reviews and the like. But the broader applications for company-wide and global deployments span a wide variety of collaboration efforts in sales, marketing, customer support or product development.

Today, speech analytics can be a complex and expensive proposition. In some cases it involves capture, transcription, tagging, analytics and reporting. In others it is pure pattern recognition, where the core technologies detect recurring utterances or find a set of predefined phrases (like detecting the hashtag “#FAIL” in a Tweet). By 2015, it will be routine to treat spoken words as just another set of unstructured data which can be put under an analytic lens in order to support specified objectives.
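The simpler, pattern-matching end of that spectrum can be illustrated with a short Python sketch that counts predefined phrases across call transcripts. It assumes the audio has already been transcribed to text; the function name and the sample phrases are made up for the example and do not represent any vendor’s product.

```python
import re
from collections import Counter


def spot_phrases(transcripts, phrases):
    """Count case-insensitive, whole-phrase matches across all transcripts."""
    counts = Counter()
    for phrase in phrases:
        # Lookarounds keep "cancel" from matching inside "cancellation".
        pattern = re.compile(r"(?<!\w)" + re.escape(phrase) + r"(?!\w)",
                             re.IGNORECASE)
        for text in transcripts:
            counts[phrase] += len(pattern.findall(text))
    return counts


calls = [
    "I want to cancel my account",
    "Please cancel. I said CANCEL my account now",
    "What is your cancellation policy?",
]
hits = spot_phrases(calls, ["cancel my account", "cancel"])
```

A production system would work on audio or phonetic indexes rather than clean text, but the reporting step often reduces to this kind of tally of recurring phrases.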

Advent of true “self” service: When you put the above-mentioned technologies together, you have the foundation for smartphone-based services that are highly responsive to individual end-users. Ideally they can distinguish between background noise and spoken words, activate programs when a “wake-up word” is uttered, and distinguish between the voice of their owner and others, then bring pre-loaded preferences, account numbers, historical activities, loyalty programs and other personal data or PII (personally identifiable information) to bear on the task at hand.

Modern CRM and “social CRM” systems give the appearance of understanding intent, but it is largely the product of well-informed guesswork, relying on data and metadata provided by customers or third parties. By contrast, services that adhere to the “Smartphone+cloud” paradigm can offer true “self-service.” For example, a smartphone app from French auto insurer Groupama (called “Groupama toujours là” or Groupama Always There) uses the iPhone’s screen as a visual display of agent queues and enables policyholders to indicate the purpose of the call and elect to stay on hold or schedule a call-back.

During the past few years, individual customers have been provided with tools to shorten the time it takes to get to a human when calling the companies with which they want to carry out business. Fonolo, Lucyphone and, more recently, Hold Free are each taking different approaches to empowering phone-based customers. By 2015, we can foresee more self-service use cases and deployments that enable mobile subscribers to use their smartphones to take greater control of what personal data they would like to share, with whom they want to share it and their terms and conditions for how friends or the companies they are doing business with can make their info available to others.

Add speech recognition and natural language understanding and you can see how an individual might say “I’m hungry” and have that two-word utterance interpreted properly, and Siri-like results returned. Something like “The next available reservation at your favorite restaurant is at 6:30 PM. Should I make a reservation for you? Or would you like to invite someone else to join you?”

The technologies that are destined to survive and thrive are those that support highly personalized, conversational interactions that culminate in a transaction or other tangible result. This should be the prevailing definition of “self service.” In the near term, enterprises are spending billions of dollars on “Big Data,” business intelligence and analytics resources. Ironically, “Enterprise Mobility” is a close second based on research conducted by the likes of IBM. Our own research, to be published in January, shows that a majority of executives in large enterprises don’t have a defined strategy for managing all the data and metadata generated by mobile customers. When they do, they will also do a much better job of hearing and responding to their true wants, needs and preferences, as well as intent.

The customer care pendulum will swing away from the enterprise’s CRM system as a “customer interaction hub” to a more distributed system where individuals are at the center of their own self-service system.

Teletech Reselling Salesforce.com’s Service Cloud

2011 December 14

In a move that gets to the heart of Opus Research’s original concept of “Recombinant Communications” (RC), business process outsourcing (BPO) specialist Teletech has reached an agreement whereby it will resell features, capabilities and components hosted in Salesforce.com’s Service Cloud. Peeking inside the cloud (and under the hood) of Service Cloud, one finds the resources to support a multi-modal, multichannel contact center. Agents can engage in presence-based chat via Salesforce Chatter and will also find hooks into ways to monitor and communicate over Twitter and Facebook.

As illustrated in the thumbnail sketch that leads into this post, Service Cloud is a cloud-based instantiation of all of Salesforce.com’s resources optimized to support a company’s customer service reps. Chatter acts as one of the points of ingress, but there are others that can be custom built as a customer portal, a vehicle for displaying dynamic customer profiles, or a resource for carrying out business intelligence and analytic functions as the foundation of customer care activities. In the spirit of RC, Teletech clients will be able to retain and leverage elements of their existing customer care infrastructure while contracting with Teletech to make sure that they can integrate activity from mobile devices or via social networks.

Working with Salesforce.com is not unexplored territory for Teletech. In August 2010 the company hosted software that comprised the infrastructure for a service called the Customer Interaction Cloud, which was jointly offered by Cisco and Salesforce.com. The BPO specialist has experience with Sales Cloud, Service Cloud, Jigsaw, Force.com and Database.com and is prepared to integrate these capabilities into its clients’ customer care fabric.