• The world’s first psychopath AI (Artificial Intelligence) serves as a reminder that although newer and increasingly complex algorithms continue to be developed, the need for good, unbiased data will always be paramount.
The Rorschach test, named after its creator Hermann Rorschach, is a psychological test that uses a subject’s interpretation of a series of ambiguous inkblots to assess her personality, emotional tendencies, and mental biases. Depending on a person’s perception of the different, seemingly abstract patterns, psychologists are said to gain insights into her thinking process. Although different people perceive the ten images that make up the test in different ways, people with mental disorders, such as schizophrenia and psychopathy, come up with extraordinarily skewed and sometimes morbid interpretations of these inkblot patterns. For instance, for an image that would normally evoke responses such as ‘a moth’ or ‘a bat’, a psychopathic response would be something like ‘pregnant woman falls at construction story’. Or an image that is normally interpreted as ‘two people standing next to each other’, a psychopath would interpret as, say, ‘man jumps from a window’. Surprised? Well, these disturbing answers were exactly what Norman, the world’s first psychopath AI, gave when subjected to the Rorschach test.
The psychopath AI
Norman is an image captioning AI, similar to the ones used by Facebook to identify people and objects in images posted on the platform by users. It is meant to scan images and make sense of their contents by describing the picture in words. But, as you just read, what Norman sees isn’t exactly what normal image recognition AI programs see. Where a normal image captioning AI sees a black and white image of a baseball glove, Norman sees ‘a man murdered with a machine gun in broad daylight’; where standard AI sees a group of birds sitting on top of a tree branch, Norman sees a man electrocuted to death. If you gave such deviant responses in an actual Rorschach test, you’d almost certainly be labeled a psychopath and sent on your way to an asylum. Now, before you become too worried or give in to your paranoia of AI and robots spontaneously going rogue, let me clarify that Norman, the psychopath AI, was intentionally created by a group of MIT scientists to respond the way it does. But how was Norman made this way?
The making of an AI
As you might know, creating an AI involves two major facets: 1. designing the algorithm and 2. training that algorithm with data. The algorithm determines how the data is processed and how outputs are derived from the inputs. Algorithms are usually created by attempting to replicate what little we know about human cognition and thought processes; the more closely an AI algorithm can emulate human mental processes, the better. AI researchers constantly take inspiration from brain research and studies by neuroscientists into how our brain performs specific functions to design algorithms and neural networks.
Take, for instance, how scientists at Google’s DeepMind lab developed vector-based navigation capability in AI by studying how human and other animal brains navigate using a hexagonal grid-based coordinate system. Hence, the functioning of AI systems can be considered analogous to the functioning of a human brain. Just as the brain develops its decision-making capabilities through learning and experience, AI algorithms are trained using large datasets containing data similar to what the system would process in its full-fledged implementation. Thus, an AI that is supposed to understand written text is programmed with natural language processing (NLP) capability and is trained using large sets of textual data.
This enables the AI program to teach itself how to process language better. Similarly, an image captioning AI gets more proficient when it is trained with the captions of a large number of images. However, just as bad experiences shape a person’s mind differently than good experiences do, the good or bad data an AI ‘experiences’ during training determines the way it responds when used in real-world applications. Hence, it may come as no surprise that Norman was trained to respond in a ‘psychopathic’ manner by feeding it data exclusively from a Reddit page where users posted gruesome and morbid pictures. Since the AI was trained on captions relating to violence and death, it learned these as the natural response to even the most serene imagery. It’s like teaching a child, from the day it is born, to call an apple an orange: the child will naturally grow up to always refer to apples as oranges and will have a hard time learning the right name.
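To make this ‘same algorithm, different data’ idea concrete, here is a minimal, hypothetical sketch in Python. It uses a toy bigram caption generator, which is far simpler than the deep captioning network behind Norman, and both caption corpora are invented for illustration; but it shows how identical code can produce benign or grim text depending purely on what it was trained on.

```python
# A minimal sketch of how identical algorithms diverge with different training
# data. The caption corpora below are invented for illustration; the real
# Norman experiment used a deep image-captioning network, not this toy
# bigram model, but the principle is the same.
import random
from collections import defaultdict

def train_bigram_model(captions):
    """Learn, for each word, which words tend to follow it in the corpus."""
    model = defaultdict(list)
    for caption in captions:
        words = caption.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            model[current_word].append(next_word)
    return model

def generate(model, start_word, max_words=8, seed=0):
    """Walk the learned word-to-word transitions to produce a caption."""
    rng = random.Random(seed)
    words = [start_word]
    for _ in range(max_words - 1):
        followers = model.get(words[-1])
        if not followers:
            break
        words.append(rng.choice(followers))
    return " ".join(words)

# Same algorithm, two very different (invented) training sets.
neutral_captions = [
    "a bird sits on a branch",
    "a bird flies over a field",
    "a man sits on a bench in the park",
]
morbid_captions = [
    "a bird falls dead from a branch",
    "a man falls from a window",
    "a man sits on a bench before the accident",
]

neutral_model = train_bigram_model(neutral_captions)
morbid_model = train_bigram_model(morbid_captions)

# The same prompt yields a benign caption from one model and a grim one
# from the other -- the data, not the code, made the difference.
print(generate(neutral_model, "a"))
print(generate(morbid_model, "a"))
```

Neither model is ‘wrong’ by its own lights: each can only reproduce the word associations its corpus contained, which is essentially what happened to Norman at a much larger scale.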
Norman, the psychopath AI, serves as a case study in what happens when an AI is fed highly biased data. It points to a truth that has been regularly emphasized by experts in the fields of data science and AI research.
The devil is in the data
Regardless of how good an AI algorithm may be, the data it processes must be clean and unbiased for the program to function as intended. To train AI algorithms, researchers gather data from different sources online, which more often than not happen to be public repositories of different types of data. This might sometimes include information that is biased in numerous ways. The sheer volume of data can make these biases hard to recognize, even after careful data cleaning and organization. Understanding how different types of data affect the behavior of AI algorithms can enable researchers to further refine the data and improve the algorithms to eliminate bias. The most effective way to minimize bias and an AI’s tendency to skew toward extremes is to ensure that the dataset it is trained on is both vast and varied. Adequate representation of data from different sources should be ensured from the early stages of AI development to keep the AI robust. It’s like ensuring a child gains knowledge and experience in different areas, giving them the broader perspective needed for all-around, balanced development and growth. Gathering data from a wider, possibly shared pool can provide AI researchers with both the quantity and the variety of data needed to train future AIs to be less biased.
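As a small illustration of what such a check might look like in practice, here is a hypothetical Python sketch that audits where a dataset’s examples come from before training starts. The field names, sources, and 50% threshold are all invented for the example; the point is simply that a dominant source can be flagged early, before it quietly shapes the model.

```python
# A minimal sketch of a pre-training balance check, assuming each training
# example is tagged with its source; the field names and threshold are
# hypothetical.
from collections import Counter

def audit_source_balance(examples, max_share=0.5):
    """Flag any single source that dominates the training set."""
    counts = Counter(example["source"] for example in examples)
    total = sum(counts.values())
    dominant = []
    for source, count in counts.most_common():
        share = count / total
        print(f"{source}: {count} examples ({share:.0%})")
        if share > max_share:
            dominant.append(source)
    return dominant

# Invented toy data: a corpus drawn overwhelmingly from one forum
# is flagged before any training begins.
dataset = (
    [{"source": "forum_a", "caption": "..."}] * 80
    + [{"source": "news_photos", "caption": "..."}] * 15
    + [{"source": "stock_images", "caption": "..."}] * 5
)
flagged = audit_source_balance(dataset)
if flagged:
    print("Warning: dataset dominated by:", ", ".join(flagged))
```

Real pipelines would audit far more dimensions than source alone, but even a crude check like this would have flagged a training set drawn entirely from a single morbid Reddit page.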
The challenges to creating unbiased AI
In addition to the sheer volume of data, which requires highly sophisticated programs and systems to sift through and clean, the creation of unbiased AI faces a few other challenges. For instance, using a shared source of data to ensure the representation of a wide variety of sources can be hampered by confidentiality issues: not all groups and organizations will be willing to openly share their data with external parties, which may include competitors. Another barrier to training unbiased AI is the humans working on the data. Their biases, even if subconscious, can translate into the way data is collected, cleaned, and organized, and cause the resulting AI to be biased. Fixing these biases is among the top priorities for AI developers, as broad-application AIs will be impossible to build if biases in processing persist.
Although the degree of bias demonstrated by Norman, the psychopath AI, seems impossible to replicate under natural conditions, it still makes a case for the significance of data in determining the success of AI applications. With the increasing penetration of AI into our everyday lives, which will only grow with time, the need to ensure the unbiased operation of this technology will gain importance. While no definitive, practical solution to bias exists today, it is only a matter of time before one is discovered.