Amazon’s Alexa is ready to broaden its one-track mind.
The virtual assistant can deliver weather forecasts and traffic updates, and tap more than 90,000 additional functions, or “skills,” contributed by outside developers. The catch is that you can generally do only one of those bounteous things at a time.
That limitation—also found in competitors such as Siri and Google Assistant—has hampered the ideal of virtual helpers serving as omni-capable butlers. A user who wants to combine a virtual assistant’s varied skills to perform a multi-part task typically has to make several requests, one after the other.
On Wednesday, Amazon demonstrated a new model in which Alexa completes multiple tasks in a single conversation that combines services previously isolated in separate skills. That new power is promised in coming months, and initially will be limited to a single use case: dinner and a movie.
Alexa offers multiple skills for purchasing movie tickets and reserving restaurant tables, but each had to be used in isolation. In a demonstration of the new experience, after a person had bought two movie tickets through a skill called Atom Tickets, Alexa asked “Will you be eating out?”
When the answer was “Yes, find me a Chinese restaurant,” Alexa transitioned into a discussion about nearby options, and reserved a table for two. The assistant then offered to arrange a ride to the restaurant, and scheduled an Uber.
Previously, achieving all that would have required a user to speak to Alexa at least 40 times, says Rohit Prasad, the vice president who leads work on the artificial intelligence behind Alexa. The new multitasking conversation system can get the same result in 13 utterances or fewer, in part because the user doesn’t have to repeat the time and location over and over.
“This shifts the cognitive load from the customer onto the assistant,” Prasad says. He announced Alexa’s new conversational abilities at Amazon’s re:MARS conference in Las Vegas Wednesday. The experience doesn’t depend on Alexa making suggestions: Users can also proactively ask Alexa for a ride or dinner reservation to go with their movie tickets. Google’s assistant can handle some follow-up questions that refer to earlier commands, but it cannot proactively remix outside services as Amazon showed Wednesday.
Changing how people plan trips to the movies is a nice, but small, upgrade to Alexa’s capabilities. It’s also a small step in tackling one of the trickiest challenges in computing—how to make machines fluent enough with language to properly converse with people.
One reason Alexa and other assistants have generally been limited to single-shot queries is that software struggles with the variety of language. Even a simple question, such as asking someone to share a meal, can take many forms: You could refer to food, dinner, a bite, or eating out. The answers bring more complexity: not just “yes” or “no,” but all shades of indecision, and opinions about what kind of food or restaurant. Limiting users’ options cuts down the uncertainty. Conversations, where every utterance entangles new meaning with the prior context, are particularly challenging for machines.
Prasad says that Alexa’s new multi-tasking mode is built on improvements in Amazon’s ability to use a conversation’s context to puzzle out ambiguous sentences. That makes Alexa more likely to choose the correct response in conversations not limited to a single function, he says.
Alexa’s upgrade also depends on software that guesses when to suggest switching to a different function, and what data, such as times and locations, needs to be transferred to make the functions work together smoothly.
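To make that idea concrete, here is a minimal sketch in Python of how details such as time and location might carry over from one function to the next. It is an illustration only, not Amazon's implementation: the slot names, the `REQUIRED_SLOTS` and `NEXT_SUGGESTION` tables, and both helper functions are hypothetical, and a fixed lookup table stands in for the software that, by Amazon's description, guesses when to suggest a switch.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Details gathered so far in one conversation (hypothetical structure)."""
    slots: dict = field(default_factory=dict)


# Hypothetical: the details each function needs before it can act.
REQUIRED_SLOTS = {
    "movie_tickets": ["time", "location", "party_size"],
    "restaurant": ["time", "location", "party_size", "cuisine"],
    "ride": ["time", "location"],
}

# Hypothetical: a fixed table standing in for the software that guesses
# which function to suggest next.
NEXT_SUGGESTION = {"movie_tickets": "restaurant", "restaurant": "ride"}


def suggest_next(function):
    """Return the function to offer after `function`, or None if there is none."""
    return NEXT_SUGGESTION.get(function)


def missing_slots(function, conversation):
    """List the details `function` still needs that the conversation lacks."""
    return [s for s in REQUIRED_SLOTS[function] if s not in conversation.slots]


# After the movie tickets are bought, these details are already known.
convo = Conversation()
convo.slots.update({"time": "7pm", "location": "Seattle", "party_size": 2})

next_fn = suggest_next("movie_tickets")
to_ask = missing_slots(next_fn, convo)
print(next_fn, to_ask)
```

In this sketch the restaurant step only has to ask about cuisine, because the time, location, and party size were already supplied during the ticket purchase. That is the kind of repetition the shorter conversation avoids.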
Amazon says the underlying technology can be applied in many different scenarios and languages. The movie-night planning service will be limited to the US, and to English, but Prasad says Amazon will soon give Alexa the power to multitask in other, unspecified, ways.
Travel would be an obvious use for the new capability, says Werner Goertz, a research director at Gartner. Juggling hotel reservations, car rentals, flight times, and other logistics is tricky; Alexa could tap multiple services to help manage them in a single conversation.
Making Alexa work like a broker for other companies’ services could also bring new challenges. Goertz says Amazon will have to be careful not to ask users too aggressively about additional services such as rides or restaurants, because it could be annoying, or make users wonder if Alexa is working on behalf of other companies. Amazon’s research into having Alexa detect frustration in a user’s voice may help with that.
Broadening Alexa’s conversations could also add to business challenges raised by hosting skills from outsiders on the assistant, Goertz says. The new multitasking model sees Alexa play a more active matchmaking role between users and other companies. Not all of them may like how Amazon deploys its brand, or chooses which get presented, says Goertz.
Prasad says Amazon has already been thinking about such challenges, and is talking with developers to get their views on the new conversational approach. Since 2017, the assistant has responded to a user saying “Alexa, I need a ride” for the first time by suggesting both Uber and Lyft. After that it defaults to the one you used previously.
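The ride behavior described above amounts to a simple rule, sketched here in Python. The function name and its parameters are hypothetical and chosen for illustration; this is not how Alexa is actually built.

```python
def ride_options(previous_provider=None, providers=("Uber", "Lyft")):
    """Return the ride services to offer for "Alexa, I need a ride".

    Hypothetical rule: with no ride history, suggest every provider;
    otherwise default to the one the user chose last time.
    """
    if previous_provider is None:
        return list(providers)
    return [previous_provider]


first_time = ride_options()
returning = ride_options(previous_provider="Lyft")
print(first_time, returning)
```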
Longer term, Amazon intends to take a less active role in designing specific use cases like the movie night planning system. Tools will be offered to developers to let them make experiences combining multiple services in a single conversation. Ultimately, Alexa should be flexible enough to recombine any skills as a conversation requires, Prasad says.