As companies rely increasingly on machine learning models to run their businesses, it’s imperative to include anti-bias measures to ensure these models are not making false or misleading assumptions. Today at AWS re:Invent, AWS introduced Amazon SageMaker Clarify to help reduce bias in machine learning models.
“We are launching Amazon SageMaker Clarify. And what that does is it allows you to have insight into your data and models throughout your machine learning lifecycle,” Bratin Saha, Amazon VP and general manager of machine learning, told TechCrunch.
He says that it is designed to analyze the data for bias before you start data prep, so you can find these kinds of problems before you even start building your model.
“Once I have my training data set, I can [look at things like if I have] an equal number of various classes, like do I have equal numbers of males and females or do I have equal numbers of other kinds of classes, and we have a set of several metrics that you can use for the statistical analysis so you get real insight into [your] data set balance,” Saha explained.
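To make the idea of a balance check concrete, here is a minimal sketch in plain Python of one simple statistic of this kind: the normalized difference in size between two groups. It does not use the SageMaker Clarify API; the function name `class_imbalance` and the sample `gender` column are hypothetical and chosen only for illustration.

```python
from collections import Counter


def class_imbalance(values, group_a, group_d):
    """Return (n_a - n_d) / (n_a + n_d) for two groups in a column.

    The result lies in [-1, 1]; 0 means the two groups are the same size.
    """
    counts = Counter(values)
    n_a, n_d = counts[group_a], counts[group_d]
    total = n_a + n_d
    if total == 0:
        raise ValueError("neither group appears in the data")
    return (n_a - n_d) / total


# Hypothetical sensitive-attribute column from a training set.
gender = ["male"] * 70 + ["female"] * 30

ci = class_imbalance(gender, "male", "female")
print(f"class imbalance: {ci:.2f}")
```

A value near zero suggests the two groups are represented about equally, while a value near 1 or -1 suggests one group dominates the data set.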
After you build your model, you can run SageMaker Clarify again to look for similar factors that might have crept into your model as you built it. “So you start off by doing statistical bias analysis on your data, and then post training you can again do analysis on the model,” he said.
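As a rough illustration of what a post-training check might look like, the sketch below compares how often a model predicts the positive outcome for two groups. It is plain Python under stated assumptions, not SageMaker Clarify code: the helper names, the group labels and the predictions are all hypothetical.

```python
def positive_rate(predictions, groups, group):
    """Share of positive (1) predictions among rows belonging to `group`."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    if not selected:
        raise ValueError(f"no rows found for group {group!r}")
    return sum(selected) / len(selected)


def positive_rate_gap(predictions, groups, group_a, group_d):
    """Difference in positive prediction rates between two groups."""
    return (positive_rate(predictions, groups, group_a)
            - positive_rate(predictions, groups, group_d))


# Hypothetical model output: 1 = approved, 0 = rejected.
groups = ["a", "a", "a", "a", "d", "d", "d", "d"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

gap = positive_rate_gap(predictions, groups, "a", "d")
print(f"positive rate gap: {gap:.2f}")
```

A gap close to zero means the model hands out positive predictions at similar rates for both groups; a large gap is a signal to investigate further, not proof of unfairness on its own.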
There are multiple types of bias that can enter a model due to the background of the data scientists building the model, the nature of the data and how the data scientists interpret that data through the model they built. While this can be problematic in general, it can also lead to racial stereotypes being extended to algorithms. As an example, facial recognition systems have proven quite accurate at identifying white faces, but much less so when it comes to recognizing people of color.
It may be difficult to identify these kinds of biases with software, because they often stem from team makeup and other factors outside the purview of a software analysis tool, but Saha says they are trying to make that software approach as comprehensive as possible.
“If you look at SageMaker Clarify, it gives you data bias analysis, it gives you model bias analysis, it gives you model explainability, it gives you per-inference explainability, it gives you a global explainability,” Saha said.
Saha says Amazon created the tool because it is aware of the bias problem, but he recognizes that the tool alone won’t eliminate all of the bias issues that can crop up in machine learning models, so the company offers other ways to help too.
“We are also working with our customers in various ways. So we have documentation, best practices, and we point our customers to how to be able to architect their systems and work with the system so they get the desired results,” he said.
SageMaker Clarify is available starting today in multiple regions.