Nobody used keyboards in the sci-fi of our childhoods. Whether it was the control system of starships or the hub of an idealistic world, every interaction was based on human speech.
Now we’re closing in on that reality. Siri was the first seismic shift in the field, but companies like Google and Amazon have gone much further, and Apple has since opened Siri up to third-party apps in the App Store. With heavy hitters like Uber, Runkeeper, and Skype all embracing voice recognition, this tech is no longer a niche development; it is swiftly becoming a necessity for app developers hoping to keep up with the competition.
Alexa Upped Our Speech Technology Game
With the Amazon Echo, a device with no screen or keyboard, we need only say “Alexa,” and our wish is her command.
Alexa started out similarly to Siri. She could accomplish a limited set of small tasks when prompted by a human user. But Amazon opened the platform to developers around the world, and the Echo device’s capabilities grew.
It wasn’t just one company building a product from the ground up; instead, Amazon tapped into the wider technology community, and outside developers built a fast-growing catalog of voice commands, or skills, on top of the platform. And it isn’t just Amazon that’s making waves with this technology. Google is helping redefine voice recognition, too.
Google’s Voice Access Breakthrough
By analyzing dialects, accents, sentence structure, and vocal inflection, Google is working toward a more precise understanding of human commands. This research will allow programs to tell whether a user is asking a question or making a statement.
This is a huge step in the right direction. Commercial speech recognition has improved by 30 percent over the past few years, but breaking the accent barrier will unleash a new wave of improvements.
The current incarnation of Google’s Voice Access already gives users the ability to control their phones with words instead of actions. But once Google’s research comes to fruition, the real work will begin: tying that deeper understanding into voice and intent recognition.
How Designers Can Incorporate Speech Technology Into UX
UX designers need to start considering the consequences of these developments. On-screen displays today function side by side with limited voice recognition, but as networking begins to grow and integrate across multiple IoT devices, users will need a simple, speech-based UX.

So how can developers ensure their apps’ UX takes full advantage of this technology’s potential? There are three key ideas to adopt:
1. Consider the total experience.

UX today is a primarily visual experience, but with the incorporation of speech technology, it will become an aural experience, too. Developers need to adjust their approaches accordingly. They can’t simply focus on laying out links and buttons; they need to think about the entire journey someone takes when interacting with the software.
2. Provide audible cues.

There’s nothing more frustrating for a user than confusion, and voice-based systems often leave users wondering whether their voice commands were recognized in the first place. Don’t fall into this trap. Provide audible cues so users know their commands were registered and understood.
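One way to do this in a web app is to pair speech recognition with speech synthesis so every command gets a spoken acknowledgment. The snippet below is a minimal sketch using the browser’s Web Speech API; the handleCommand function and the confirmation wording are hypothetical placeholders, not any particular product’s code.

```typescript
// Sketch: recognize one spoken command, then confirm it out loud.
// Assumes a browser exposing the Web Speech API (often prefixed as
// webkitSpeechRecognition); typings vary, hence the `any` casts.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionImpl();
recognition.lang = "en-US";
recognition.interimResults = false;

// Hypothetical app-specific handler; stands in for your real command logic.
function handleCommand(text: string): void {
  console.log("Received command:", text);
}

recognition.onresult = (event: any) => {
  const transcript: string = event.results[0][0].transcript;
  handleCommand(transcript);

  // Audible cue: confirm that the command was registered and understood.
  window.speechSynthesis.speak(
    new SpeechSynthesisUtterance(`Okay, ${transcript}.`)
  );
};

recognition.onerror = () => {
  // Audible cue for the failure case, so the user isn't left guessing.
  window.speechSynthesis.speak(
    new SpeechSynthesisUtterance("Sorry, I didn't catch that.")
  );
};

recognition.start();
```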
3. Provide visual cues.

Sometimes users won’t want, or won’t catch, an audible cue: they might be shouting into their phones in a busy bar, or whispering in a library. When working with visuals as well as audio, visual cues of understanding are very important, especially when there’s a series of questions to be answered. Users need to know that the first entry has been understood and that the system is basing subsequent questions on that first interaction.
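A simple version of this is to tie a status element to the recognizer’s lifecycle events and keep the user’s confirmed answer on screen while the next question appears. The sketch below, again using the Web Speech API, assumes hypothetical status and confirmed elements in the page; the element IDs and the follow-up question are illustrative only.

```typescript
// Sketch: mirror the recognizer's state on screen so users can see that
// their first answer was captured before the next question is asked.
// Assumes <div id="status"> and <div id="confirmed"> exist in the page.
const statusEl = document.getElementById("status")!;
const confirmedEl = document.getElementById("confirmed")!;

const Recognizer =
  (window as any).SpeechRecognition || (window as any).webkitSpeechRecognition;
const recognizer = new Recognizer();

recognizer.onstart = () => {
  statusEl.textContent = "Listening…"; // visual cue: the mic is open
};

recognizer.onresult = (event: any) => {
  const answer: string = event.results[0][0].transcript;
  confirmedEl.textContent = `Got it: "${answer}"`; // echo the understood entry
  statusEl.textContent = "Next: what time works for you?"; // illustrative follow-up
};

recognizer.onerror = () => {
  statusEl.textContent = "Sorry, that wasn't caught. Tap the mic to try again.";
};

recognizer.start();
```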
The latest breakthroughs in speech technology have the potential to make our sci-fi childhoods a reality. And developers have a big role to play in unleashing our inner geeks. We can make people’s lives easier and their day-to-day tasks faster.