“Voice Intelligence” and “Affective Communications”

Here are answers to several questions that are top-of-mind, or should be, among the people who hold the key to the future success of voicebots as they progress from “Personal Assistants” to “Personal Agents”:

  • Yes, Amazon’s Alexa and Google Assistant are learning from every conversation they have with you.
  • Yes, when a company says “this call may be recorded for training and quality purposes” they are often referring to training a “voicebot”.
  • Yes, the articles and video reports that appear almost daily in newspapers and on cable TV news warning readers and viewers that their smart speakers are “eavesdropping” on them are substantially true.
  • Yes, there is cause for concern, because nearly every aspect of our online or “on mic” activity is treated as fair game for marketers seeking to identify the customers who are most likely to buy their products *and* fit the profile of a desirable customer.

To summarize: As they mature, ‘bots are becoming more “Affective” (meaning sensitive to moods, feelings and attitudes). The jury is out as to whether this affectiveness will also make them more “Effective” (meaning “capable of producing results”). And if it does, are they producing results for the individuals they serve, or are they simply carrying out the objectives of advertisers and marketers?

Dr. Joseph Turow spells out the horns of the voicebot’s dilemma in his recently published book, “The Voice Catchers: How Marketers Listen In to Exploit Your Feelings, Your Privacy, and Your Wallet.” I recommend the book to everyone endeavoring to build a compelling Alexa skill or an action for Google Assistant, as well as to the Customer Experience (CX) and Contact Center professionals who have introduced voicebots to brands and enterprises.

The Voicebot’s Dilemma

Turow presents a well-documented and articulate cautionary tale about how a short list of e-commerce and search providers are “harvesting” the rich information about intent, attitude and sentiment that our utterances embody. In his words, the people who designed and refined Siri, Alexa, Google Assistant and the also-rans set out to make “seductive assistants” in order to make people comfortable enough to use them to conduct searches, play their favorite media and, eventually, buy goods and services. Amazon, by developing a broad line of hardware as well as services, has the clearest stake in charming its customers into providing clear statements of their purchase intent before they visit its website or mobile app. But Google is a close second in compiling the kind of insights that help its advertisers with targeting.

The book also raises awareness, and the associated red flags, regarding privacy. Amazon and Google are purposely vague about how they treat personal information, hiding terms and conditions in their End User License Agreements (EULAs). Developers of voicebots and intelligent assistants in contact centers (whose users are, in turn, customers, prospects or “consumers”) don’t have the same business incentives that motivate Amazon or Google. Instead, they have an incentive to establish trustworthy links between what Turow terms “Humanoids and Humans”. That means following the practice of pursuing “privacy by design”.
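To make “privacy by design” concrete for contact-center voicebot builders, here is a minimal sketch in Python of one of its core practices, data minimization: redacting obvious personal identifiers from a transcript before it is stored for training. All names and patterns here are hypothetical and illustrative; a production system would rely on a dedicated PII-detection service, not a handful of regexes.

```python
import re
from dataclasses import dataclass

# Hypothetical, illustrative patterns. Order matters: the card pattern
# runs first so the phone pattern never fires inside a card number.
PII_PATTERNS = {
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "phone": re.compile(r"\b\d{3}[ .-]?\d{3}[ .-]?\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

@dataclass
class Utterance:
    caller_id: str
    text: str

def minimize(utterance: Utterance) -> Utterance:
    """Apply data minimization before an utterance is logged for training:
    strip direct identifiers and keep only what the model actually needs."""
    text = utterance.text
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    # Drop the caller's identity entirely; training data does not need it.
    return Utterance(caller_id="<redacted>", text=text)

# Example: the stored transcript keeps the intent, not the identity.
print(minimize(Utterance("+1 415 555 0100",
                         "My card number is 4111 1111 1111 1111")).text)
```

The design choice worth noting is that minimization happens before storage, not after: the identifying data never enters the training corpus, so there is nothing to leak or subpoena later.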

Yet there’s much more to building trustworthiness than compliance with GDPR, CCPA and other privacy laws. Voice assistants are trying to succeed against a headwind of general privacy concerns, amplified by well-reasoned critiques of the practices and approaches of the leading providers. Google and Amazon are joined by a slew of third-party aggregators with designs on using the characteristics of an individual’s utterances to gauge the speaker’s true identity, creditworthiness, veracity or general health. “Harvesting” voice without the speaker’s “informed consent” has taken on dire implications.

Make End-Users Aware of the Trade-Off

As every one of Facebook’s billions of users knows, the personal information that we knowingly “publish” about ourselves is voluminous. Facebook uses it to build profiles that are, in turn, used to target advertisements and determine the next post or photo that one sees when casually browsing through Facebook, Instagram or its other properties. Turow terms this phenomenon “the spiral of personalization”: marketers hoover up as much data about an individual as possible in order to provide that individual with personalized messages, advertisements or offers.

Almost every human (as opposed to humanoid) is well aware of the shortcomings of the targeted-advertising approach. At best, it can lead to some serendipitous suggestions. At worst, it results in poorly targeted and untimely promotional messages. Almost everyone has seen ads for running shoes or a similar “searched-for” item appear in their feed long after the item has been purchased.

“Informed consent” is the legalistic term for the permission that brands should obtain from individuals who call in to talk to their agents or interact with their Alexa Skill or Google Action. Too often it is gained implicitly or “by adhesion” when people click “I agree” to the terms and conditions on a Web site or in a smart speaker’s owner’s manual. This approach is not going to withstand scrutiny as the general public becomes aware of the breadth and depth of the attributes that brands are capable of collecting on individuals (even if they are anonymized). In the course of interactions, brands are able to add emotional state and general health conditions to such attributes as location, device in use, credit history and the like.

What’s missing is an honest and forthright description of all the information that a voice assistant’s service provider is able to glean about an individual. Ideally, it would be accompanied by easy-to-use, voice-enabled tools that give those individuals control over what they reveal about themselves in the course of a commercial conversation. Privacy-by-design and trustworthiness-by-design are just buzzwords unless end-users are given better tools for controlling how and when their voice and other personal attributes are shared.
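What such voice-enabled controls might look like is an open design question. The sketch below is one possibility, not anyone’s shipping feature: a per-user, per-category consent record that the assistant consults before collecting or sharing anything, plus a spoken command that lets the user revoke a category on the spot. Every name and category here is a hypothetical for illustration.

```python
from dataclasses import dataclass, field

# Attribute categories a voicebot might infer; the list is illustrative.
CATEGORIES = {"location", "emotional_state", "health_signals", "purchase_history"}

@dataclass
class ConsentLedger:
    """Per-user, per-category consent that defaults to 'off'."""
    granted: set[str] = field(default_factory=set)

    def allow(self, category: str) -> bool:
        # The bot checks this before collecting or sharing a category.
        return category in self.granted

    def handle_voice_command(self, utterance: str) -> str:
        """A toy voice-enabled control: 'stop sharing my location', etc."""
        spoken_request = utterance.lower()
        for category in CATEGORIES:
            spoken = category.replace("_", " ")
            if spoken in spoken_request:
                if spoken_request.startswith("stop sharing"):
                    self.granted.discard(category)
                    return f"Okay, I will no longer use your {spoken}."
                if spoken_request.startswith("you may use"):
                    self.granted.add(category)
                    return f"Thanks, I will use your {spoken} to help you."
        # Disclosure on demand: say exactly what is currently collected.
        return "Here is everything I currently collect: " + (
            ", ".join(sorted(self.granted)) or "nothing")

ledger = ConsentLedger()
print(ledger.handle_voice_command("you may use my location"))
print(ledger.handle_voice_command("stop sharing my location"))
```

The point of the sketch is that consent is granular, revocable in the same channel where the data is gathered, and disclosable on demand, which is precisely what click-through terms and conditions are not.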

From Personal Assistants to Advisors to Agents

Opus Research follows a model for the evolution of voice-first resources from “Assistant” to “Advisor” to “Agent”. An Assistant is like today’s Alexa or Siri: it provides answers to questions or enables individuals to use their voice to take command of popular apps (like Spotify) or appliances. An Advisor has knowledge in specific verticals and can provide informed responses while a person is, say, traveling or investigating healthcare options. An Agent is much more aware of personal preferences, such as payment cards and airline choices, and is empowered to carry out transactions on an individual’s behalf.
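One way to read the model is as a ladder of delegated authority, where each rung presupposes more disclosed data and therefore more explicit consent. The Python sketch below makes that reading concrete; the three tiers come from the Opus Research model, while the consent categories and function names are hypothetical illustrations, not anyone’s specification.

```python
from enum import Enum

class Tier(Enum):
    ASSISTANT = 1   # answers questions, controls apps and appliances
    ADVISOR = 2     # informed recommendations in specific verticals
    AGENT = 3       # empowered to transact on the individual's behalf

# Hypothetical mapping from each tier to the consent it presupposes.
REQUIRED_CONSENT = {
    Tier.ASSISTANT: set(),
    Tier.ADVISOR: {"preferences"},
    Tier.AGENT: {"preferences", "payment_credentials", "act_on_my_behalf"},
}

def authorized_tier(granted: set[str]) -> Tier:
    """Return the highest tier the user's explicit consent supports."""
    tier = Tier.ASSISTANT
    for candidate in (Tier.ADVISOR, Tier.AGENT):
        if REQUIRED_CONSENT[candidate] <= granted:  # subset check
            tier = candidate
    return tier

# A user who never granted payment access stays an Advisor at most.
print(authorized_tier({"preferences"}))                   # Tier.ADVISOR
print(authorized_tier({"preferences", "payment_credentials",
                       "act_on_my_behalf"}))              # Tier.AGENT
```

In other words, an Agent is not a smarter Assistant so much as an Assistant the user has visibly and deliberately authorized, which is why transparency is the gating factor.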

Without transparency, progressing from Assistant to Agent is not possible. And there are headwinds mounted by privacy advocates and scholars, like Dr. Turow, who rightly observe that marketers absorb all the data they can (including voiceprints and “voice data”) to gain competitive advantage. Pro-forma “informed consent” won’t cut it. Privacy-by-Design and Trustworthiness-by-Design are the path forward.



Categories: Intelligent Assistants
