Unlearning Racism and Sexism in Learning Machines
What if a learning machine learns the wrong lessons?
In a world where software and machine intelligence are increasingly placed in decision-making roles – such as helping banks choose who gets a loan and who doesn’t – detecting pernicious bias is critical. But what if the bank isn’t aware its software has, over time, “taught itself” to discriminate based on race or gender? Who’s watching the machine?
Bias in software, first and foremost, poses a major moral problem. In addition, for companies found to be using software that discriminates in its lending, hiring or marketing practices, it can be a matter of legal liability and reputation damage. The problem extends to software used by HR departments to filter resumes; to law enforcement organizations and courts using software to make decisions about bail, sentencing and parole; and to online advertising.
Researchers at the University of Massachusetts Amherst are developing bias-monitoring software that they say goes further than previous attempts to tackle the problem. Alexandra Meliou and Yuriy Brun of the College of Information and Computer Sciences have developed “Themis” (named for the ancient Greek goddess of justice and order), a tool that measures causality in discrimination. Brun said the research team has applied software testing techniques to perform hypothesis testing, asking such questions as whether changing a person’s race affects whether the software recommends giving that person a loan.
“Our approach measures discrimination more accurately than prior work that focused on identifying differences in software output distributions, correlations or mutual information between inputs and outputs,” Brun said. “Themis can identify bias in software, whether that bias is intentional or unintentional, and can be applied to software that relies on machine learning, which can inject biases from data without the developers’ knowledge.”
Meliou told EnterpriseTech their research has been recognized with an Association for Computing Machinery SIGSOFT Distinguished Paper Award. They will present their findings next month at the joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering.
Their work also relates to the futuristic concerns about AI oversight as machine learning systems grow in intelligence and skill. For AI dystopians, such as Elon Musk, Stephen Hawking and the Oxford philosopher Nick Bostrom, the fear is neural networks taking on intelligence and capabilities beyond human control. A possible glimpse of this was the inability of Google (Alphabet) DeepMind programmers to explain how their AlphaGo system defeated the no. 1 ranked player of Go, said to be the world’s most complex board game.
Asked about this issue two months ago at the ISC conference in Germany, supercomputing luminary Dr. Eng Lim Goh of HPE told EnterpriseTech that the challenge is bringing transparency, and the ability to intervene, to the points in neural networks where decision making happens.
He also discussed the distinction between an AI-based system delivering a “correct” answer that also may not be “right” – i.e., ethical and in accordance with social mores – something only humans can decide (see At ISC – Goh on Go: Humans Can’t Scale, the Data-Centric Learning Machine Can).
In its own way, this is what Themis does – with a twist. Instead of human monitoring, it automates oversight of automated decision making.
Meliou told EnterpriseTech it’s possible for software that acquires a race or gender bias to become self-perpetuating.
“It’s definitely possible,” she said. “When you learn from biased data, you are producing biased data, you are producing biased decisions. These types of decisions produce eventually more data that you will likely feed to your system again. So you have this feedback loop that propagates and worsens these biases.”
While Themis has not been used to test software used by financial institutions or law enforcement organizations, Meliou said it has evaluated public software systems from GitHub and found that discrimination can “sneak in” even when the software is explicitly designed to be fair. State-of-the-art techniques for removing discrimination from algorithms fail in many situations, she said, in part because prior definitions of discrimination failed to capture causality.
For example, Themis found that a decision tree-based machine learning approach specifically designed not to discriminate on gender was actually discriminating more than 11 percent of the time. That is, for more than 11 percent of individuals, the software’s output changed when nothing but their gender was altered.
Themis also found that designing the software to avoid discrimination against one attribute may increase discrimination against others, Meliou said. For example, the same decision tree-based software trained not to discriminate on gender discriminated against race 38 percent of the time.
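The percentages above come from asking how often the output changes when a single attribute is altered and everything else is held fixed. The Python sketch below illustrates that idea on a toy example. It is not Themis's actual implementation: the function names, the loan rules and the applicant generator are all hypothetical, invented here for illustration.

```python
import random

GENDERS = ["F", "M"]  # hypothetical attribute values for this toy example


def causal_discrimination_rate(decide, attribute, values, make_input,
                               trials=1000, seed=0):
    """Estimate the fraction of random inputs whose decision changes
    when only `attribute` is altered and all other fields stay fixed."""
    rng = random.Random(seed)
    changed = 0
    for _ in range(trials):
        person = make_input(rng)
        baseline = decide(person)
        for value in values:
            if value == person[attribute]:
                continue
            # Copy the input, changing nothing but the tested attribute.
            variant = dict(person, **{attribute: value})
            if decide(variant) != baseline:
                changed += 1
                break  # count each individual at most once
    return changed / trials


def random_applicant(rng):
    """Hypothetical applicant generator: income in thousands, random gender."""
    return {"income": rng.randint(0, 99), "gender": rng.choice(GENDERS)}


def fair_loan(person):
    """Toy rule that ignores gender entirely."""
    return person["income"] >= 50


def biased_loan(person):
    """Toy rule with a higher income threshold for one gender."""
    threshold = 60 if person["gender"] == "F" else 50
    return person["income"] >= threshold


if __name__ == "__main__":
    for name, rule in [("fair", fair_loan), ("biased", biased_loan)]:
        rate = causal_discrimination_rate(rule, "gender", GENDERS,
                                          random_applicant)
        print(f"{name} rule: output changed for {rate:.1%} of applicants")
```

In this toy setup the biased rule only treats applicants differently in a narrow income band, so the measured rate is a small fraction rather than 100 percent, while the rule that ignores gender scores zero. Passing a different attribute name and value list would probe race or any other input in the same way.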
“These systems learn discrimination from biased data, but without careful control for potential bias, software can magnify that bias even further,” said Ph.D. student and research team member Sainyam Galhotra.
Biased algorithms are a problem that has embarrassed major technology-driven companies in recent years. For example, last year it was revealed that decisions on where to offer same-day delivery for Amazon Prime customers in major American cities seemed to routinely, if unintentionally, exclude zip codes for majority black neighborhoods. Amazon explained that same-day delivery offers were based on many factors, including distance to fulfillment centers, local demand and the number of Amazon Prime customers in the area.
But in the end, the data taught an algorithm to make, in effect, discriminatory decisions based on race. Amazon has since addressed the problem.
Google has run into issues in which searches for what its algorithms consider to be “typically” black names have produced ads for services that look up arrest records. By all accounts, Google itself was unaware of this behavior and did not program its algorithms to make these decisions.
To be sure, Themis does not correct biased code; it is designed to identify instances of what may be race- or gender-based bias.
“Unchecked, biases in data and software run the risk of perpetuating biases in society,” said Brun. “There is software that may discriminate and businesses may not know it’s happening. If they knew, they would actually make the correct decision because, obviously, that bad press was not good for Amazon. That’s why they need algorithms like ours to determine whether their software is discriminatory.”