Auto-generated fixes to algorithms don’t completely eliminate bias

December 29, 2020

As predictive models are deployed to make decisions ranging from employee hiring to loan approvals, there’s a growing emphasis on designing algorithms that explain their decision-making and provide recourse to affected individuals. (For example, when a person is denied a loan by a model, they should be informed of the reasons and what can be done to address them.) Several recourse generation algorithms have been proposed in academic research papers, but it remains an open question whether these algorithms are reliable in the sense that they consistently improve outcomes.

A study from Harvard- and Microsoft-affiliated researchers finds strong evidence that they aren’t. One reason is that algorithmically generated recourses tend to become invalid as decision makers such as banks and other financial institutions retrain and update their models to adapt to new patterns in the data. Another is that the data used to train these decision-making models is subject to temporal, geospatial, and other kinds of shifts caused by data corrections, recourse interventions, and more.
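To make the invalidation mechanism concrete, here is a minimal sketch in Python. It is not the researchers’ method or data: the two linear models, their weights, the feature meanings, and the applicants are all hypothetical, and the recourse rule (the smallest change that just crosses the decision boundary) is a simple stand-in for the recourse generation techniques studied. It shows how recourses issued against one model can stop working once the model is updated.

```python
# Hypothetical linear scoring models: approve when w . x + b >= 0.
# Features are (income in $10k, years of credit history); all numbers are made up.

def score(model, x):
    w, b = model
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def approved(model, x):
    return score(model, x) >= 0.0

def minimal_recourse(model, x, margin=0.05):
    """Smallest L2 change that moves x just past the model's decision boundary."""
    w, _ = model
    norm_sq = sum(wi * wi for wi in w)
    # Step along w far enough that the new score equals `margin`.
    step = (margin - score(model, x)) / norm_sq
    return tuple(xi + step * wi for xi, wi in zip(x, w))

def recourse_validity(old_model, new_model, applicants):
    """Fraction of recourses issued under old_model that new_model approves."""
    rejected = [x for x in applicants if not approved(old_model, x)]
    if not rejected:
        return 1.0  # nobody needed a recourse
    recourses = [minimal_recourse(old_model, x) for x in rejected]
    still_valid = sum(approved(new_model, r) for r in recourses)
    return still_valid / len(recourses)

# Assumed models: the "retrained" one has shifted weights and a stricter offset.
original_model = ((1.0, 0.5), -5.0)
retrained_model = ((0.8, 0.9), -6.0)

# Assumed applicants, all rejected by the original model.
applicants = [(2.0, 1.0), (3.0, 2.0), (4.0, 1.0), (1.0, 6.0), (2.5, 3.0)]

if __name__ == "__main__":
    print("valid under original model: ",
          recourse_validity(original_model, original_model, applicants))
    print("valid under retrained model:",
          recourse_validity(original_model, retrained_model, applicants))
```

In this toy setup every recourse satisfies the model it was generated for, but only one of the five still earns an approval after the update. Minimal-change recourses land just past the old boundary, so even a modest shift in the model can leave them on the wrong side.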

Inspired by current events, the coauthors considered the problem of predicting grades using an AI classifier model. They worked with a dataset of schools spread across Jordan and Kuwait, training the classifier on examples collected from schools in Jordan and deploying it to schools in Kuwait. In one hypothetical scenario, they assumed that students in Kuwait were provided recourses to improve their predicted grades but that, by the time the students reapplied for grade prediction, the training dataset had been updated to include Kuwait school data. In a second scenario, the researchers swapped the roles of the two countries so that the initial training data came from Kuwait instead of Jordan.

Applying a state-of-the-art recourse generation technique in the first scenario would provide explanations to 116 students in Kuwait who received failing grades from the classifier trained on the Jordan dataset, the coauthors found. However, were the students to follow the recommendations and reapply for grade prediction, the classifier would yield favorable predictions for only 28.3% of them after being updated with the Kuwait dataset. In the second scenario, the same recourse generation technique would provide recommendations to 66 students, but these recommendations would result in better grades for only 60.6% of students.

In another experiment, the researchers trained a classifier on an error-prone German credit dataset to determine the creditworthiness of loan applicants. After applying the same recourse generation technique used in the grade prediction problem, they found that 900 of 1,000 applicants would have been provided recourses. However, if the classifier were retrained on a corrected version of the dataset with minor changes, only 22% would be accepted, even after implementing the recommended recourses.

In a final experiment, the coauthors benchmarked a classifier that predicted whether a candidate would repay a loan based on income, age, and method of application. Trained on a synthetic dataset, the classifier would give unfavorable predictions to 261 (if age were considered) or 522 (without the age variable) of 1,024 applicants, the researchers report. But recourse generation wouldn’t substantially improve the candidates’ chances. The recourse generation technique would have told them to increase their income, but even with increased incomes, the classifier would predict that only 0% to 8% of them would repay their loans.

The researchers claim that their work, taken as a whole, shows that distribution shifts can cause “significant invalidation” of generated recourses, endangering trust in decision makers. “The problem of distribution shifts invalidating recourses and counterfactual explanations seems to be a direct result of current recourse finding technologies, rather than of the properties of the initial model,” they wrote. “It would be interesting to develop novel recourse finding strategies that do not suffer from the drawbacks of existing techniques and are robust to distribution shifts.”


By VentureBeat
