Guest Post: Voice in 2021: From Hype to Value?

2021 promises to be a year of transition for voice assistance and audio services: from the natural hype and the irrational exuberance of a brand new space in 2015-2016, with inflated expectations peaking in 2017-2018, through a trough of disillusionment in 2019-2020, and the stirrings of some sort of enlightenment towards a plateau of productivity and rewarding maturity in 2021 and beyond.

Here are some lessons we have learned at Witlingo:

Marketers and customers want a solid ROI for Amazon Alexa skills, Google Assistant actions, and Samsung Bixby capsules: It remains a struggle to provide a good answer to the simple question: what is the return on my money and how long until I get that return? Sure, there are companies that are thinking strategically, that buy into the proposition that voice is here to stay and that it’s wiser to adopt sooner rather than later. But such partners, these early adopters, are the exception, not the rule. The rest, understandably, feel they don’t have the time or the budget for the long game. They need to justify the spend to their boss: not just in terms of paying you dollars for your offer and service, but also in terms of having someone on their team spend precious cycles on this new channel. We have yet to crack this puzzle.

Discovery of skills/actions/capsules remains a significant challenge: Marketers remain reluctant to do even the most elementary things to get the word out about their skills and actions. For instance: leveraging their current social surfaces (Twitter, Facebook, LinkedIn, etc.) to alert their client base that a skill/action exists. And by leveraging their current social surfaces, I mean something as basic and as nearly effortless as tweaking a graphic or two on those surfaces to mention the skill/action.

Another simple, low-hanging fruit: mentioning their voice presence in their podcast (which more and more companies are, wisely, launching) or in podcast appearances as guests (“You can also find us on Amazon Alexa and Google Assistant”). Why not? Partially because the imperative to take an Alexa skill or a Google Assistant action seriously is simply not there, so the overhead of tweaking a couple of graphics here and there or mentioning the skill/action in a podcast is thought not to be worth the while. In other words: CMOs have yet to send out that stern memo to all, instructing them to actively and consistently promote their voice first presence.

The technology is still not there: Alexa is now 5 years old and Google Assistant 4, and yet their basic capabilities have at best improved incrementally. Speech recognition has improved, but not dramatically. Text to Speech (TTS) has improved, but not by an order of magnitude. Conversational sophistication has barely improved at all: for instance, you still have to say “Alexa” or “Hey Google” to interrupt the assistant when it’s speaking. That’s a pain, and things get tiresome and tedious in a hurry. Notifications remain rudimentary: with Alexa, you still can’t play audio when delivering the content of a notification (whatever you play needs to be TTS), and all you can do is have Alexa say something and then stop (you can’t even launch a skill!). As far as Google Assistant is concerned, those two problems don’t exist as such, because a third party action can’t even trigger a notification on a smart speaker! In other words, we are very, very far from the unimaginably powerful conversational assistants we thought we would have 5 years into the voice first revolution. (Sure, Google dazzled us with Duplex, but that turned out to be a partial parlor trick.)

Neither Amazon nor Google seems to care much about nurturing a healthy voice first ecosystem (the jury is still out on Samsung): It pains me to say it, but it’s the truth. The Amazon and Google folks handling the partners do care very much; there is no denying that. I have experienced this firsthand, and consistently, and my team and I appreciate our partner handlers. But at the strategic level, with the higher-ups and the big heads, it’s clear that they really don’t care. What they do care about are the features that they can deliver to their respective first-party domains (i.e., the out-of-the-box capabilities), even if doing so means cutting their partners off at the knees. In other words, the product teams at Amazon and Google seem to view the developer community not as an extension of their team but as straight-up competitors to beat and as a source of ideas to pick up features from. Partners are not even seen as customers to empower and enable (so much for customer obsession). Example: they will launch, and have launched, a Voice User Interface (VUI) design tool, even though far richer VUI tools exist in the market. Is that a bad thing? Yes, it is. For how can one compete with free stuff from Amazon and Google (even if mediocre)? They will also augment, and have augmented, their analytics capabilities, even if doing so means killing off (and they have killed off) budding startups. They will also launch courses on design, they will set up training programs, they will write articles, etc., as if no one in the ecosystem is busting their behind doing that (and doing it with greater verve and depth than they are). In other words, either they really don’t care or it has not occurred to them to care.

How will we move from here to a better place? I suggest two action tracks: (1) We, those who evangelize for conversational voice, need to solve the hard problems of identifying real value for our customers and delivering that value in a way that is impactful to their top line or their bottom line. If we manage to do that, I have no doubt we will start seeing CMOs dispatching those stern memos, and tough problems such as discovery will begin to disappear. And (2) Amazon, Google, Samsung, and other platforms need to adopt a coherent, serious strategy towards their partners and need to care about the ecosystem, if they want to be the leaders that they claim to be. They need to shift their focus away from adding new features to the core assistant, dazzling as those features may be (and can you really get more dazzling than having Samuel L. Jackson be a voice of Alexa?), and instead double down on building up the plumbing, some of it really basic, that partners need in order to build those compelling, ROI-delivering experiences for their customers. That is, they need to view Alexa/Assistant/Bixby not as a mere assistant but as a platform on which to build value.

I will stop here, even though I have more to say, and give the floor to you. What are your thoughts on this stage of voice and audio? Do you think that voice and audio will really move to the beginnings of a value delivery phase in 2021? If so, what is your evidence? If not, why not? And what do you think about what Amazon and Google are doing? Do you have opinions? And if so, are you willing to share them or are you afraid to upset some people? (This is 2021, after all: the year to get real!)

If you would like to share your thoughts on where voice is going in 2021, using (what else, of course?) your very own rich and nuanced voice, please leave a 1-2 minute #voicesnippet here: https://www.witlingo.com/voicefirst/

We will be happy to publish your audio contribution on The Voice First Flash Briefing, the corresponding Spotify Microcast, and in the always-running YouTube Voice First Live Stream.


Ahmed Bouzid, previously Head of Product at Amazon Alexa, is Founder and CEO of Witlingo, Inc., a McLean, VA-based B2B SaaS company that helps brands launch voice first solutions and experiences on platforms such as Amazon Alexa, Google Assistant, Samsung Bixby, and beyond.

 


