Voice UI Is The Future. But When?

by Jean-Louis Gassée

Propelled by what we can legitimately call The Alexa Movement, voice is now perceived as the future of the User Interface. But we need numbers: How many of us continue to use Alexa (or Siri, or Google Assistant) after the novelty has worn off? What do we use it for? The answers will help us understand the likely actual future of Voice UI.

For decades, we’ve been promised voice interactions with our machines, promises that have been iconized in science fiction, from the obliging “Computer!” in the Star Trek franchise to the malevolent HAL in 2001: A Space Odyssey (“I’m sorry Dave, I’m afraid I can’t do that”). And, until recently, science fiction is where real-world attempts at giving our machines the gift of gab have remained.

For decades we’ve had almost yearly ‘this time we’ve got it’ announcements followed by dead silence (does anyone remember Ubi?). A cheeky reporter once toured the offices of product reviewers who had touted the hands-free benefits of voice-recognition advances. We know what the reporter found: None of the visionary reviewers used the new Voice UI.

Nonetheless, the desire has remained intact. The ability to speak a command is a natural and obvious evolution in the way we interact with our machines. Just as the graphical UI created at Xerox PARC freed early computers from the unforgiving Command Line Interface (CLI), the promise of a Voice UI frees us from manual touch: it’s a third hand when two are busy, letting us control thermostats, navigation systems, music selection…

For people with neuromotor challenges, Voice UI isn’t just a convenience; it restores power and dignity. Also, lack of literacy becomes less of a roadblock. You no longer need to be able to read and write to get things done, a liberating influence in parts of the world where education is still developing.

Today, Voice UI is no longer just a promise. Amazon rightly gets credit for both its vision and impeccable implementation of the technology.

In late 2014, Amazon introduced Echo, a line of voice-activated “smart speakers”. Prices range from less than $40, on a good day, to just under $230 for the screen-equipped version.

But prices tell a less important part of the story. Bezos & Co. made an early bet on Artificial Intelligence (AI), endowing Alexa, Echo’s intelligent voice responder in the Cloud, with an ever-increasing range of Skills, i.e. response modules proposed to users. With the enthusiastic support of third-party developers, Alexa counted more than 15,000 Skills by last July.

To top it off, Amazon’s support for new users is considerate and actually helpful. The company provides a well-designed Skills Guide that orients users in this new world without menus, mice, and keyboards, and also helps existing Echo owners expand their enjoyment of the Alexa universe.

And to top this off, Amazon publishes a weekly newsletter to Echo owners that demonstrates both new and road-tested ways to speak with Alexa. Keep in mind this isn’t your typical email peddling a new robot floor cleaner, or urging you to re-order the laundry detergent you bought last month. No, every single week, Amazon tells Echo users how to make better use of a purchase they made months or years ago. Amazon’s practice indirectly reveals Apple execs haven’t bought Echo devices. Otherwise, getting hebdomadal emails from Jeff Bezos would have shamed them into writing weekly mash notes to their beloved Mac, iPhone, and iPad users. But I digress.

Voice UI is great progress, even if the technology feels a bit stilted and is occasionally infuriating. One challenge is that the smart device just sits there awaiting our commands, doing little or nothing to let us know what commands it understands and how precisely they ought to be formulated. Also, voice assistants generally don’t pass the Turing Test, meaning they can’t really fool us into believing we’re conversing with a human.

A bigger frustration for those of us who are interested in the future of the technology is that Amazon (like Google and Apple) is playing it close to the chest when it comes to numbers. How many Echo devices have been sold? How often are they used on average: Ten times a day? Five times a week? Almost never? Who (age, occupation) uses them the most and for what?

Amazon knows all this but keeps this fascinating knowledge to itself — and so we turn to the “market analysts”. A survey from late 2016 (eons ago in tech time) found that “the top feature tried by Echo users is the very simple act of setting a timer”. A more recent study says only that Echo users “buy more stuff”.

(As an aside: Serious investigation is, of course, complicated and expensive. You need a large sample, say 1,000 people, to achieve a reasonably tight confidence interval for the results, and the participants need to reliably represent the user population at large. I seriously doubt that most research “reports” caroming in the online echo chamber meet the above criteria. For example, the “top feature is setting a timer” conclusion was based on a survey of 180 Echo owners.)
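
To put rough numbers on the sample-size point, here is a minimal sketch of the textbook margin-of-error formula for a survey proportion. It assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case proportion of 0.5; none of these assumptions come from the surveys cited above, and the function name is mine.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p observed in a
    simple random sample of size n (z = 1.96 assumes 95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

# Compare the 180-owner survey size with the 1,000-person sample.
for n in (180, 1000):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} points")
```

Under those assumptions, 1,000 respondents give about plus or minus 3 points, while 180 give about plus or minus 7, and that is before asking whether the 180 resemble Echo owners at large.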

The same lack of numbers applies to Google Assistant. Android smartphones are very successful, and there are hints that Google Assistant does wonders in countries where literacy isn’t yet universal — India comes to mind — but we’re left in the dark when it comes to quantifying uses, successes and failures…

For Apple, we have a decent idea of the number of Siri-capable devices: it’s the number of iOS devices that are less than four years old (give or take). Much as Amazon does with its Skills Guide, Apple provides a neat Siri onboarding site, a clean and well-lighted place where users new and old can navigate the sea of possibilities… But how is Siri being used? When, where, by whom, how often? We’re not told.

Why do we need to see these numbers? Just out of curiosity, of course, but also to get a feeling for how the Voice UI wars will play out. Today, Amazon is on top, but will it remain there? The three big contenders, Amazon, Google, and Apple, have significantly different business strategies:

Because of its current domination of Voice UI and e-commerce — a synergy no one saw coming — Amazon is uniquely positioned for further episodes of conquest. But in order to win the installed base race against Google Assistant and Siri, Amazon needs to keep installing more Echo or Alexa-capable devices.

Google doesn’t sell goods to “civilians”; it makes its money by selling advertising tools to merchants. While Google Assistant has the reach of Android devices, it doesn’t have Amazon’s unique integration of advertising and e-commerce.

Apple doesn’t sell laundry detergent or advertising — it makes its money by selling devices. Like all other components of Apple’s ecosystem — iTunes, the App Store, the Services business with its rapidly-growing revenue — Siri’s only purpose is to help sell more iPhones, Macs, iPads, and Watches.

Three contenders, three different areas of domination. Which one will win? Ask again later.

Regardless of what happens in the future, one can’t help being impressed with Amazon’s Voice UI performance and wondering what it means for Google’s advertising business and its forays into connected devices.

In the meantime, absent numbers, we have to pause before making lofty claims for the future of Voice UI. Is it, to quote a respected expert, “THE […] critical piece of the puzzle for the next era of computing”? Perhaps. But we ought to remember the small matter of timing, best captured by Horace Dediu’s apothegm: Those who predict the future, we call futurists. Those who know when the future will happen, we call billionaires.

We will not be bored.


PS: No mention of Apple’s HomePod because I haven’t tried one for long enough yet.

— JLG@mondaynote.com
