Reflections on SpeechTEK 2012: Co-Location Lends New Contexts to Automated Speech
This year the transformation of SpeechTEK continued as conference organizer Information Today, Inc. (ITI) co-located three closely-related-but-loosely-coupled events at the Marriott Marquis Hotel in New York City. By design, SpeechTEK shared the exhibition space and conference rooms with “CRM evolution” and a newly launched Customer Service experience (CXe) conference and vendor showcase.
While there were some multi-conference attendees, each event appealed to a specific set of interests. For example, SpeechTEK had multiple tracks to enable specialists in speech processing to delve into the intricacies of dialogue design, Text-to-Speech “sculpting” or voice biometric authentication. “CRM evolution” showcased “leaders” in deploying the technologies that, for example, integrate activity streams and metadata from social networking platforms or incorporate community-building tools and business process optimization (BPO) into the customer care mix. CXe attracted an emerging and, with hope, fast-growing set of business executives who recognize that the best use of today’s self-service, contact center and CRM technologies should focus on creating the optimal customer experience (regardless of channel) and that making life easier and effortless for customers will ultimately lead to greater profits for vendors and brands.
ITI’s “Big Tent” approach gave attendees (including me) a great opportunity to gauge where the various flavors of automated speech processing “fit” in a continuum of conversational commerce that, by necessity, must include Web sites, mobile apps, contact center resources, “the Cloud,” and combinations or permutations of all social media and mobile devices. As a reminder, the roster of automated speech technologies includes speech detection (sorting spoken words from background noise), speaker identification (distinguishing individual speakers from a crowd of talkers), voice activation (using a “trigger word” to awaken a device or application), speaker authentication (using voice to affirm a claimed identity), recognition, understanding, transcription, translation and then text-to-speech rendering.
Without a doubt, Apple’s high-profile investment in marketing Siri, its mobile virtual assistant, has ignited interest in speech-based services that can carry out tasks or resolve a user’s query. On the eve of the conferences, Angel’s introduction of Lexee and Nuance’s launch of Nina illustrated how tighter integration of automated speech recognition and text-to-speech conversion with cloud-based resources for natural language processing, semantic web searches and “artificial intelligence” can be applied in enterprise customer care settings.
Such “virtual agents” could be seen as the latest incarnation of speech-based personas. However, they are, more accurately, a conversational “front-end” for customer care resources. Both services aim to make customer care more conversational and, thus, more convenient for callers. As Dave Rennyson, President at Angel, explained to me at SpeechTEK, we’ve entered a “post-telephone society” where the interactions between companies and their customers span time, medium and modality. Rennyson added that, in many cases, the best practice is to support “sparse dialogue design,” where customers use their voice, keypads, keyboards or touch screens to provide as little input as possible for maximum effect, which resembles the way most people use the dialog box on Google’s landing page or on Microsoft’s Bing.
Nuance’s Nina added another dimension to the personalized customer care equation. By design, it can include voice-based user authentication in order to simplify and shorten the time it takes for an individual user to assert his or her identity. This provides evidence that the speech processing world well recognizes that stronger security measures are called for as individuals carry out personal chores through interactive platforms. The introduction of Voxeo’s Security Suite, which melds verification of automatic number identification (ANI), Toll, Voice, Location, and payment vehicles in a hosted services platform that conforms to Level 1 PCI strictures, affirms the value of adding a dash of voice biometrics to the quality customer experience recipe.
Personalized, multi-channel e-commerce will also be bolstered by advancements in real-time and “near real-time” analytics. As a venue that brings together CRM, speech processing and customer experience experts, the co-located expos exposed where the customer care world aspires to go. They also gave us a chance to gauge how well enterprises are moving toward the vendors’ goals. I had a very interesting discussion with Jon Ezrine, Nexidia’s senior vice president and COO. Over the past five years, his company has refined and expanded its product and service offerings to accommodate “on-demand” access to resources that help companies get a “360 degree view” of their customers and prospects.
Nexidia started with speech analytics, to detect patterns in spoken words, phrases and emotional triggers that fit a narrative of what makes people “happy,” “sad,” or (most often) “angry.” Now the monitoring and analytics must cross all channels. The analytic lens has much more raw material to work with, and there are stronger links to be drawn between observed customer activity and defined business objectives. The speech analytic specialist has fanned out into other modalities and departments to help companies ensure that they offer a consistent image and brand representation across multiple channels (a prime example is the effort by diversified telecommunications companies to maintain consistent dialogue across their triple- or quadruple-play offerings: phone, wireless, internet, cable TV).
The era of “Big Data” suggests that companies can better serve their customers by aggregating and analyzing data from a number of sources and applying that analysis in real time to help foster a successful outcome for customers. In Ezrine’s view, we’re still in very early days for “customer interaction analytics,” which he sees as moving into the “early majority” along a maturity or customer adoption curve. By contrast, speech analytics, referring to systems that link agent monitoring to workforce optimization systems in order to meet specific KPIs (key performance indicators), is already well understood and widely adopted in large contact centers.
The objective of all the technologies under discussion here should be to empower customers. Automated speech has a special place in the customer interaction hierarchy. As mobile phones or other devices assume primacy as tools for individuals to take control of their lives, spoken words are often the most natural way for people to activate and take control of personal information and resources required to carry out everyday activities. Cross-talk between the leaders (or laggards) in related disciplines should be encouraged. At this point, co-location of related conferences is a start.