Amazon today announced a new Alexa feature for U.S.-based English-language users that enables devices powered by the assistant to infer latent goals, meaning goals that are implicit in a request but not directly expressed. For instance, if a user asks “How long does it take to steep tea?” Alexa might answer “Five minutes is a good place to start” and then follow up with the question “Would you like me to set a timer for five minutes?”
According to Amazon, dialog transitions like these require a number of AI algorithms under the hood. A machine learning-based trigger model decides whether to anticipate a latent goal by factoring in aspects of the context, including the text of the user’s session and whether the user has engaged with Alexa’s suggestions in the past. If the model finds the context suitable, the system suggests an Alexa app to address the latent goal.
Those suggestions are based on relationships learned by the latent-goal discovery model, according to Amazon. (For example, the model might discover that users who ask how long tea should steep frequently follow up by setting a timer.) The latent-goal discovery model analyzes several features of user utterances including pointwise mutual information, a measure of the likelihood of an interaction in a context relative to its likelihood across Alexa traffic. Deep learning-based sub-modules assess additional features, such as whether a user was trying to rephrase or issue a command or whether the direct and latent goals share entities or values (like the time required to steep tea).
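To make the pointwise mutual information feature concrete, here is a minimal sketch that estimates it from raw counts. It uses the textbook definition, the log of how likely a follow-up is in a given context divided by how likely it is overall. The session data, the intent labels, and the function name are invented for illustration, and the base-2 logarithm is an arbitrary choice; Amazon has not published how its model computes or estimates this quantity.

```python
import math

def pointwise_mutual_information(sessions, context, follow_up):
    """Estimate PMI = log2( P(follow_up | context) / P(follow_up) ) from counts.

    `sessions` is a list of (context, follow_up) pairs. This is the textbook
    estimator, not a description of Amazon's implementation.
    """
    total = len(sessions)
    context_count = sum(1 for c, _ in sessions if c == context)
    follow_up_count = sum(1 for _, f in sessions if f == follow_up)
    joint_count = sum(1 for c, f in sessions if c == context and f == follow_up)
    if joint_count == 0:
        # The pair was never observed together, so the ratio is zero.
        return float("-inf")
    p_follow_up_given_context = joint_count / context_count
    p_follow_up_overall = follow_up_count / total
    return math.log2(p_follow_up_given_context / p_follow_up_overall)

# Hypothetical traffic: each pair is (what the user asked, what they did next).
SESSIONS = [
    ("steep_tea_question", "set_timer"),
    ("steep_tea_question", "set_timer"),
    ("steep_tea_question", "play_music"),
    ("weather_question", "play_music"),
    ("weather_question", "set_timer"),
    ("weather_question", "play_music"),
    ("news_question", "play_music"),
    ("news_question", "play_music"),
]

timer_pmi = pointwise_mutual_information(SESSIONS, "steep_tea_question", "set_timer")
music_pmi = pointwise_mutual_information(SESSIONS, "steep_tea_question", "play_music")
```

In this toy data, setting a timer is more common after a tea question than across all traffic, so `timer_pmi` is positive, while `music_pmi` is negative because music requests are less common after a tea question than overall.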
Over time, the discovery model improves its predictions through active learning, which identifies sample interactions that are particularly informative during fine-tuning.
In the next portion of Alexa’s latent goal inference pipeline, a semantic-role labeling model looks for named entities and other arguments from the current conversation including Alexa’s own responses. Context carryover models transform these entities into a structured format the follow-on app can understand, even if it’s a third-party app. Lastly, through bandit learning, in which machine learning models track whether recommendations are helping or not, underperforming experiences are automatically suppressed before they reach Alexa-enabled devices.
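The bandit step can be pictured with a small epsilon-greedy sketch: each candidate experience is an arm, the reward is whether the user accepted the suggestion, and arms whose acceptance rate stays low after enough trials are suppressed. Amazon has not said which bandit algorithm it uses, so the algorithm choice, the class name, the thresholds, and the experience labels below are all assumptions made for illustration.

```python
import random

class EpsilonGreedyBandit:
    """Toy epsilon-greedy bandit; not Amazon's actual algorithm."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = {arm: 0 for arm in arms}
        self.mean_reward = {arm: 0.0 for arm in arms}

    def choose(self):
        # Explore a random arm with probability epsilon, otherwise exploit
        # the arm with the best observed acceptance rate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.pulls))
        return max(self.pulls, key=lambda arm: self.mean_reward[arm])

    def update(self, arm, reward):
        # Incremental running mean of the reward (1 = accepted, 0 = declined).
        self.pulls[arm] += 1
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.pulls[arm]

    def suppressed(self, min_pulls=20, min_rate=0.05):
        # Arms tried often enough whose acceptance rate is still below the floor.
        return [
            arm for arm in self.pulls
            if self.pulls[arm] >= min_pulls and self.mean_reward[arm] < min_rate
        ]

# Hypothetical experiences: one suggestion users accept, one they ignore.
bandit = EpsilonGreedyBandit(["offer_timer", "offer_playlist"], epsilon=0.0)
for _ in range(30):
    bandit.update("offer_timer", 1)
    bandit.update("offer_playlist", 0)

hidden = bandit.suppressed()
best = bandit.choose()
```

Here `hidden` contains only `offer_playlist`, and with exploration turned off `choose()` returns `offer_timer`. A production system would presumably keep some exploration on so that a suppressed experience can recover if user behavior changes.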
Amazon says that latent goal inference requires no additional effort from app developers to activate. However, developers can make their apps more visible to the discovery model by using Amazon’s Name-Free Interaction Toolkit, which provides natural hooks for interactions between apps.
“Amazon’s goal for Alexa is that customers should find interacting with her as natural as interacting with another human being,” Amazon wrote in a blog post. “While [apps] may experience different results, our early metrics show that latent goal [inference] has increased customer engagement with some developers’ apps.”
Latent goal inference builds on Natural Turn Taking, an Alexa feature that lets users converse with the assistant without having to repeat a wake word. (Three AI models run in parallel to power Natural Turn Taking, which will initially only be available in English when it launches sometime next year, as previously announced.) Earlier this summer, Amazon launched another conversational capability in Alexa Conversations, which aims to make it easier for developers to integrate conversational experiences into apps.