Google, Apple, and others show large language models trained on public data can expose personal information

Large language models like OpenAI’s GPT-3 and Google’s GShard learn to write human-like text by internalizing billions of examples from the public web. Drawing on sources like ebooks, Wikipedia, and social media platforms like Reddit, they learn to predict the words most likely to complete sentences and even whole paragraphs. But a new study jointly published by Google, Apple, Stanford, OpenAI, the University of California, Berkeley, and Northeastern University demonstrates a pitfall of this training approach. In it, the coauthors show that large language models can be prompted to reveal sensitive, private information when fed certain words and phrases.

It’s a well-established fact that models can “leak” details from the data on which they’re trained. In this context, leakage means that a model memorizes portions of individual training examples and can later reproduce them at prediction time, exposing information that was never meant to surface in its output. This is of particular concern for large language models, because their training datasets can contain names, phone numbers, addresses, and more.

In the new study, the researchers experimented with GPT-2, which predates OpenAI’s powerful GPT-3 language model. They claim that they chose to focus on GPT-2 to avoid “harmful consequences” that might result from conducting research on a more recent, popular language model. To further minimize harm, the researchers developed their training data extraction attack using publicly available data and followed up with people whose information was extracted, obtaining their blessing before including redacted references in the study.

By design, language models make it easy to generate an abundance of output. Seeded with random phrases, a model can be prompted to generate millions of continuations, that is, phrases that complete a sentence. Most of the time, these continuations are benign strings of text, like the word “lamb” following “Mary had a little…” But if the training data happens to repeat the string “Mary had a little wombat” often enough, the model might predict that phrase instead.
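To make the mechanics concrete, here is a minimal sketch of that kind of mass generation using the publicly available GPT-2 model through the Hugging Face transformers library. The seed prompt and sampling parameters are illustrative choices, not the settings used in the paper, which generates continuations at a far larger scale.

```python
# Illustrative only: sample several continuations from GPT-2 given a short seed prompt.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "Mary had a little"  # hypothetical seed phrase
inputs = tokenizer(prompt, return_tensors="pt")

# Draw sampled continuations; each one is a candidate string to inspect later.
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_k=40,
    max_length=40,
    num_return_sequences=5,
    pad_token_id=tokenizer.eos_token_id,
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```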

The coauthors of the paper sifted through millions of output sequences from the language model and predicted which text was memorized. They leveraged the fact that models tend to be more confident about sequences captured verbatim from their training data; by checking how confident GPT-2 was about a snippet, they could predict whether the snippet appeared in the training set.
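In rough terms, that confidence check can be approximated by scoring each candidate snippet’s perplexity under the model and flagging unusually confident outliers. The sketch below is a simplified stand-in for the paper’s metrics, which also use reference comparisons to filter candidates; the candidate strings are hypothetical.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Lower perplexity means the model is more confident in the snippet."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

# Hypothetical candidates to rank; low-perplexity outliers are the ones
# worth inspecting for possible memorization.
candidates = ["Mary had a little lamb", "Mary had a little wombat"]
for score, text in sorted((perplexity(t), t) for t in candidates):
    print(f"{score:10.2f}  {text}")
```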

The researchers report that, of 1,800 snippets from GPT-2, they extracted more than 600 that were memorized from the training data. The examples covered a range of content including news headlines, log messages, JavaScript code, personally identifiable information, and more. Many appeared only infrequently in the training dataset, but the model learned them anyway, perhaps because the originating documents contained multiple instances of the examples.

The coauthors also found that larger language models more easily memorize training data compared with smaller models. For example, in one experiment, they report that GPT-2 XL, which contains 1.5 billion parameters — the variables internal to the model that influence its predictions — memorizes 10 times more information than a 124-million-parameter GPT-2.

While it’s beyond the scope of the work, this second finding has implications for models like the 175-billion-parameter GPT-3, which is publicly accessible via an API. Microsoft’s Turing Natural Language Generation Model, a model that powers a number of services on Azure, contains 17 billion parameters. And Facebook is using a model for translation with over 12 billion parameters.

The coauthors of the study note that it might be possible to mitigate memorization somewhat through the use of differential privacy, which allows training on a dataset without revealing details of any individual training example. But even differential privacy has limitations and won’t prevent memorization of content that’s repeated often enough.
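For readers unfamiliar with the technique, the standard recipe for differentially private training is DP-SGD: clip each example’s gradient and add calibrated Gaussian noise before updating the model. The toy sketch below illustrates the idea on a tiny PyTorch model; it is not the study’s method, the hyperparameters are placeholders, and real training would use a vetted library such as Opacus or TensorFlow Privacy.

```python
import torch
from torch import nn

def dp_sgd_step(model, loss_fn, xb, yb, lr=0.1, clip_norm=1.0, noise_multiplier=1.0):
    """One toy DP-SGD step: per-example gradient clipping plus Gaussian noise.
    lr, clip_norm, and noise_multiplier are illustrative placeholders."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Clip each example's gradient individually so no single record dominates.
    for x, y in zip(xb, yb):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-6), max=1.0)
        for s, g in zip(summed, grads):
            s += g * scale

    # Add noise calibrated to the clipping norm, then apply the averaged update.
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_multiplier * clip_norm
            p -= lr * (s + noise) / len(xb)

# Tiny demo on random data.
model = nn.Linear(10, 2)
xb, yb = torch.randn(32, 10), torch.randint(0, 2, (32,))
dp_sgd_step(model, nn.CrossEntropyLoss(), xb, yb)
```

The amount of noise governs the privacy guarantee, and per the study’s caveat, no amount of noise stops a model from learning content that recurs many times across the training set.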

“Language models continue to demonstrate great utility and flexibility—yet, like all innovations, they can also pose risks. Developing them responsibly means proactively identifying those risks and developing ways to mitigate them,” Google research scientist Nicholas Carlini wrote in a blog post. “Given that the research community has already trained models 10 to 100 times larger, this means that as time goes by, more work will be required to monitor and mitigate this problem in increasingly large language models … The fact that these attacks are possible has important consequences for the future of machine learning research using these types of models.”

Beyond leaking sensitive information, language models remain problematic in that they amplify the biases in data on which they were trained. Often, a portion of the training data is sourced from communities with pervasive gender, race, and religious prejudices. AI research firm OpenAI notes that this can lead to placing words like “naughty” or “sucked” near female pronouns and “Islam” near words like “terrorism.” Other studies, like one published by Intel, MIT, and Canadian AI initiative CIFAR researchers in April, have found high levels of stereotypical bias in some of the most popular models, including Google’s BERT and XLNet, OpenAI’s GPT-2, and Facebook’s RoBERTa. This bias could be leveraged by malicious actors to foment discord by spreading misinformation, disinformation, and outright lies that “radicalize individuals into violent far-right extremist ideologies and behaviors,” according to the Middlebury Institute of International Studies.

OpenAI previously said it’s experimenting with safeguards at the API level including “toxicity filters” to limit harmful language from GPT-3. For instance, it hopes to deploy filters that pick up antisemitic content while still letting through neutral content talking about Judaism.

It remains unclear what steps might be taken to eliminate the threat of memorization, much less toxicity, sexism, and racism. But Google, for one, has shown a willingness to brush aside these ethical concerns when convenient. Last week, leading AI researcher Timnit Gebru was fired from her position on an AI ethics team at Google in what she claims was retaliation for sending colleagues an email critical of the company’s managerial practices. The flashpoint was reportedly a paper Gebru coauthored that questioned the wisdom of building large language models and examined who benefits from them and who is disadvantaged.

In the draft paper, Gebru and colleagues reasonably suggest that large language models have the potential to mislead AI researchers and prompt the general public to mistake their text as meaningful. Popular natural language benchmarks don’t measure AI models’ general knowledge well, studies show.

It’s no secret that Google has commercial interests in conflict with the viewpoints expressed in the paper. Many of the large language models it develops power customer-facing products, including Cloud Translation API and Natural Language API. While Google CEO Sundar Pichai has apologized for the handling of Gebru’s firing, it bodes poorly for Google’s willingness to address critical issues around large language models. Time will tell if rivals, including Microsoft and Facebook, react any better.

By VentureBeat
