AI researchers developing reinforcement learning agents could learn a lot from animals. That’s according to a recent analysis by researchers at Google’s DeepMind, Imperial College London, and the University of Cambridge assessing AI and non-human animals.
In a decades-long venture to advance machine intelligence, the AI research community has often looked to neuroscience and behavioral science for inspiration and to better understand how intelligence is formed. But this effort has focused primarily on human intelligence, specifically that of babies and children.
“This is especially true in a reinforcement learning context, where, thanks to progress in deep learning, it is now possible to bring the methods of comparative cognition directly to bear,” the researchers’ paper reads. “Animal cognition supplies a compendium of well-understood, nonlinguistic, intelligent behavior; it suggests experimental methods for evaluation and benchmarking; and it can guide environment and task design.”
DeepMind introduced some of the first forms of AI that combine deep learning and reinforcement learning, like the deep Q-network (DQN) algorithm, a system that played numerous Atari games at superhuman levels. AlphaGo and AlphaZero also used deep learning and reinforcement learning to train AI to beat a human Go champion and achieve other feats. More recently, DeepMind produced AI that automatically generates reinforcement learning algorithms.
On the human cognition side, at a Stanford HAI conference earlier this month, DeepMind neuroscience research director Matthew Botvinick urged machine learning practitioners to engage in more interdisciplinary work with neuroscientists and psychologists.
Unlike other methods of training AI, deep reinforcement learning gives an agent an objective and reward, an approach similar to training animals using food rewards. Previous animal cognition studies have looked at a number of species, including dogs and bears. Cognitive behavioral scientists have discovered higher levels of intelligence in animals than previously assumed, including dolphins’ self-awareness, and crows’ capability for revenge.
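To make the reward analogy concrete, here is a minimal sketch of reward-driven learning. It is not taken from the paper: it uses plain tabular Q-learning rather than deep reinforcement learning, and the one-dimensional corridor, the function names, and all parameter values are assumptions chosen for illustration. The agent starts at one end and receives a "food" reward of 1.0 only when it reaches the other end.

```python
import random


def train_corridor_agent(length=6, episodes=300, alpha=0.5,
                         gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    The agent starts in cell 0; the only reward (1.0) is given for
    stepping into the last cell, much like a food reward at the end
    of a simple maze. All names and defaults here are illustrative.
    """
    rng = random.Random(seed)      # seeded so runs are repeatable
    moves = (-1, +1)               # action 0 = step left, action 1 = step right
    goal = length - 1
    q = [[0.0, 0.0] for _ in range(length)]  # q[state][action]

    for _ in range(episodes):
        state = 0
        for _ in range(100):       # cap on steps per episode
            # Epsilon-greedy: mostly exploit, sometimes explore.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Walls clamp the agent inside the corridor.
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update; no bootstrapping from the terminal cell.
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])

            state = next_state
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Preferred move in every non-terminal cell."""
    return ["right" if right > left else "left" for left, right in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_corridor_agent()))
```

The agent is never told where the reward is. It only sees the reward signal, and the preference for moving toward the food emerges from repeated trials, which is the parallel with food-reward training that the article draws.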
Studies on animals’ cognitive abilities may also inspire AI researchers to look at problems in a different way, especially in deep reinforcement learning. As researchers draw parallels between animals in testing scenarios and reinforcement learning agents, the idea of testing AI systems’ cognitive abilities has evolved. Other forms of AI, such as the assistants Alexa and Siri, cannot search a maze for a box containing a reward or food.
Published in CellPress Reviews, the team’s paper — “Artificial Intelligence and the Common Sense of Animals” — cites cognition experiments with birds and primates.
“Ideally, we would like to build AI technology that can grasp these interrelated principles and concepts as a systematic whole and that manifests this grasp in a human-level ability to generalize and innovate,” the paper reads. “How to build such AI technology remains an open question. But we advocate an approach wherein RL agents, perhaps with as-yet-undeveloped architectures, acquire what is needed through extended interaction with rich virtual environments.”
When it comes to building systems like those mentioned in the paper, challenges include helping agents sense that they exist within an independent world. Training agents to grasp the concept of common sense is another hurdle, along with identifying the kinds of environments and tasks best suited to that training.
A prerequisite for training agents to use common sense will be 3D simulated worlds with realistic physics. These can simulate objects, like shells that can be cracked apart, lids that can be unscrewed, and packets that can be torn open.
“This is within the technological capabilities of today’s physics engines, but such rich and realistic environments have yet to be deployed at scale for the training of RL agents,” the paper reads.
The researchers argue that common sense is not a uniquely human trait, but it depends on some basic concepts, like understanding what an object is, how the object occupies space, and the relationship between cause and effect. Among these principles is the ability to perceive an object as a semi-permanent thing that can remain fairly persistent over time.
Forms of cognition exhibited by animals include understanding the permanence of objects and understanding that a reward may lie inside a container, for example that a shell may contain a seed. The challenge of endowing agents with such common sense principles can be cast as the problem of finding tasks and curricula that, given the right architecture, will result in trained agents that can pass suitably designed transfer tasks.
“Although contemporary deep RL agents can learn to solve multiple tasks very effectively, and some architectures show rudimentary forms of transfer, it is far from clear that any current RL architecture is capable of acquiring such an abstract concept. But suppose we had a candidate agent, how would we test whether it had acquired the concept of a container?”
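One common way to approach a question like this is to hold some task variants out of training and compare performance on them with performance on the training variants. The sketch below illustrates that general idea only; it is not the paper's protocol. The container names, the `transfer_report` helper, and the toy "memorizer" agent are all hypothetical.

```python
from typing import Callable, Dict, Sequence


def transfer_report(solves: Callable[[str], bool],
                    train_tasks: Sequence[str],
                    held_out_tasks: Sequence[str]) -> Dict[str, float]:
    """Compare success on training tasks with success on unseen tasks.

    `solves(task)` is a hypothetical hook that runs the agent on one
    task and returns True if it retrieved the reward.
    """
    train = sum(solves(t) for t in train_tasks) / len(train_tasks)
    held_out = sum(solves(t) for t in held_out_tasks) / len(held_out_tasks)
    # A large gap suggests the agent memorized specific containers
    # instead of acquiring a general concept of "container".
    return {"train": train, "held_out": held_out, "gap": train - held_out}


if __name__ == "__main__":
    # Toy agent (an assumption): succeeds only on containers it was trained on.
    seen = {"shell", "box"}
    memorizer = lambda task: task in seen

    print(transfer_report(memorizer, ["shell", "box"], ["cup", "jar"]))
```

An agent that had really acquired the concept would be expected to keep its success rate on the held-out containers, so the gap would stay close to zero.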
The researchers believe agents trained to learn common sense should be able to develop an understanding without needing to see many examples, a capability known as few-shot or zero-shot learning.
The paper’s evaluation of common sense focuses on one part of common sense physics. It does not account for other forms of common sense, such as psychological concepts, the ability to recognize different forms of objects, like liquids or gases, or an understanding of objects that can be manipulated, like paper or a sponge.
In other recent reinforcement learning developments, UC Berkeley professor Ion Stoica spoke at VentureBeat’s recent Transform conference about why supervised learning is far more commonly used than reinforcement learning. Stanford University researchers introduced LILAC to improve reinforcement learning in dynamic environments, and Georgia Tech researchers combined NLP and reinforcement learning to create AI that excels in text adventure games.