Debugging data: Microsoft researchers look at ways to train AI systems to reflect the real world

Hanna Wallach is a senior researcher in Microsoft’s New York City research lab. Photo by John Brecher.

Artificial intelligence is already helping people do things like type texts faster and take better pictures, and it’s increasingly being used to make even bigger decisions, such as who gets a new job and who goes to jail. That’s prompting researchers across Microsoft and throughout the machine learning community to ensure that the data used to develop AI systems reflect the real world, are safeguarded against unintended bias and are handled in ways that are transparent and respectful of privacy and security.

Data is the food that fuels machine learning. It’s the representation of the world that is used to train machine learning models, explained Hanna Wallach, a senior researcher in Microsoft’s New York research lab. Wallach is a program co-chair of the Annual Conference on Neural Information Processing Systems, which runs from Dec. 4 to Dec. 9 in Long Beach, California. The conference, better known as “NIPS,” is expected to draw thousands of computer scientists from industry and academia to discuss machine learning, the branch of AI that focuses on systems that learn from data.

“We often talk about datasets as if they are these well-defined things with clear boundaries, but the reality is that as machine learning becomes more prevalent in society, datasets are increasingly taken from real-world scenarios, such as social processes, that don’t have clear boundaries,” said Wallach, who together with the other program co-chairs introduced a new subject area at NIPS on fairness, accountability and transparency. “When you are constructing or choosing a dataset, you have to ask, ‘Is this dataset representative of the population that I am trying to model?’”
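One rough way to make Wallach’s question concrete is to compare how often each group appears in a dataset with how often it appears in the population being modeled. The sketch below is only an illustration of that idea, not a method described by the researchers; the function name `representation_gaps`, the group labels and the population shares are all hypothetical.

```python
from collections import Counter


def representation_gaps(samples, population_shares):
    """Return (dataset share - population share) for every group.

    samples: list of group labels, one per record in the dataset.
    population_shares: dict mapping group label to its share (0..1)
        of the population the model is meant to describe.
    A positive gap means the group is over-represented in the dataset;
    a negative gap means it is under-represented.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    counts = Counter(samples)
    total = len(samples)
    groups = set(population_shares) | set(counts)
    return {
        group: counts.get(group, 0) / total - population_shares.get(group, 0.0)
        for group in groups
    }


# Hypothetical data: 80% of records come from group "a",
# while the target population is split evenly.
dataset_groups = ["a"] * 80 + ["b"] * 20
population = {"a": 0.5, "b": 0.5}
gaps = representation_gaps(dataset_groups, population)

for group in sorted(gaps):
    print(f"{group}: {gaps[group]:+.2f}")
```

A check like this only covers attributes someone thought to record, so it can flag obvious gaps but cannot show that a dataset is representative.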

Kate Crawford, a principal researcher at Microsoft’s New York research lab, calls it “the trouble with bias,” and it’s the central focus of an invited talk she will be giving at NIPS.

“The people who are collecting the datasets decide that, ‘Oh this represents what men and women do, or this represents all human actions or human faces.’ These are types of decisions that are made when we create what are called datasets,” she said. “What is interesting about training datasets is that they will always bear the marks of history, that history will be human, and it will always have the same kind of frailties and biases that humans have.”

Researchers are also looking at the separate but related issue of whether there is enough diversity among AI researchers. Research has shown that more diverse teams choose more diverse problems to work on and produce more innovative solutions. Two events co-located with NIPS will address this issue: The 12th Women in Machine Learning Workshop, where Wallach, who co-founded Women in Machine Learning, will give an invited talk on the merger of machine learning with the social sciences, and the Black in AI workshop, which was co-founded by Timnit Gebru, a post-doctoral researcher at Microsoft’s New York lab.

“In some types of scientific disciplines, it doesn’t matter who finds the truth, there is just a particular truth to be found. AI is not exactly like that,” said Gebru. “We define what kinds of problems we want to solve as researchers. If we don’t have diversity in our set of researchers, we are at risk of solving a narrow set of problems that a few homogeneous groups of people think are important, and we are at risk of not addressing the problems that are faced by many people in the world.”

Timnit Gebru is a post-doctoral researcher at Microsoft’s New York City research lab. Photo by Peter DaSilva.

Machine learning core

At its core, NIPS is an academic conference with hundreds of papers that describe the development of machine learning models and the data used to train them.

Microsoft researchers authored or co-authored 43 accepted conference papers. They describe everything from the latest advances in retrieving data stored in synthetic DNA to a method for repeatedly collecting telemetry data from user devices without compromising user privacy.

Nearly every paper presented at NIPS over the past three decades considers data in some way, noted Wallach. “The difference in recent years, though,” she added, “is that machine learning no longer exists in a purely academic context, where people use synthetic or standard datasets. Rather, it’s something that affects all kinds of aspects of our lives.”

The application of machine-learning models to real-world problems and challenges is, in turn, bringing into focus issues of fairness, accountability and transparency.

“People are becoming more aware of the influence that algorithms have on their lives, determining everything from what news they read to what products they buy to whether or not they get a loan. It’s natural that as people become more aware, they grow more concerned about what these algorithms are actually doing and where they get their data,” said Jenn Wortman Vaughan, a senior researcher at Microsoft’s New York lab.

The trouble with bias

Data is not something that exists in the world as an object that everyone can see and recognize, explained Crawford. Rather, data is made. When scientists first began to catalog the history of the natural world, they recognized types of information as data, she noted. Today, scientists also see data as a construct of human history.

Crawford’s invited talk at NIPS will highlight examples of machine learning bias, such as the investigation by news organization ProPublica that exposed bias against African-Americans in an algorithm used by courts and law enforcement to predict the tendency of convicted criminals to reoffend. She will then discuss how to address such bias.

“We can’t simply boost a signal or tweak a convolutional neural network to resolve this issue,” she said. “We need to have a deeper sense of what is the history of structural inequity and bias in these systems.”

One method to address bias, according to Crawford, is to take what she calls a social system analysis approach to the conception, design, deployment and regulation of AI systems, thinking through all of their possible effects. She recently described the approach in a commentary for the journal Nature.

Crawford noted that this isn’t a challenge that computer scientists will solve alone. She is also a co-founder of the AI Now Institute, a first-of-its-kind interdisciplinary research institute based at New York University that was launched in November to bring together social scientists, computer scientists, lawyers, economists and engineers to study the social implications of AI, machine learning and algorithmic decision making.

Jenn Wortman Vaughan is a senior researcher at Microsoft’s New York City research lab. Photo by John Brecher.

Interpretable machine learning

One way to address concerns about AI and machine learning is to prioritize transparency by making AI systems easier for humans to interpret. At NIPS, Vaughan, one of the New York lab’s researchers, will give a talk describing a large-scale experiment that she and colleagues are running to learn what factors make machine learning models interpretable and understandable for non-machine learning experts.

“The idea here is to add more transparency to algorithmic predictions so that decision makers understand why a particular prediction is made,” said Vaughan.

For example, does the number of features or inputs to a model impact a person’s ability to catch instances where the model makes a mistake? Do people trust a model more when they can see how a model makes its prediction as opposed to when the model is a black box?

The research, said Vaughan, is a first step toward the development of “tools aimed at helping decision makers understand the data used to train their models and the inherent uncertainty in their models’ predictions.”

Patrice Simard, a distinguished engineer at Microsoft’s Redmond, Washington, research lab who is a co-organizer of the symposium, said the field of interpretable machine learning should take a cue from computer programming, where the art of decomposing problems into smaller problems with simple, understandable steps has been learned. “But in machine learning, we are completely behind. We don’t have the infrastructure,” he said.

To catch up, Simard advocates a shift to what he calls machine teaching – giving machines features to look for when solving a problem, rather than looking for patterns in mountains of data. Instead of training a machine learning model for car buying with millions of images of cars labeled as good or bad, teach a model about features such as fuel economy and crash-test safety, he explained.

The teaching strategy is deliberate, he added, and results in an interpretable hierarchy of concepts used to train machine learning models.
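To illustrate the contrast Simard draws, the sketch below scores a car from a handful of named features instead of learning from labeled images. It is a toy example and not Simard’s machine teaching system; the feature names, weights and the `score_car` function are assumptions chosen only to show how a feature-based model can report which concept drove its output.

```python
def score_car(car, weights):
    """Score a car from named features and explain the result.

    car: dict mapping feature name to a numeric value.
    weights: dict mapping feature name to a hand-chosen weight
        (hypothetical values, supplied by a human "teacher").
    Returns the total score and each feature's contribution to it.
    """
    contributions = {name: weights[name] * car[name] for name in weights}
    return sum(contributions.values()), contributions


# Hypothetical features and weights for illustration only.
weights = {"fuel_economy_mpg": 0.02, "crash_test_stars": 0.1}
car = {"fuel_economy_mpg": 35, "crash_test_stars": 5}

total, contributions = score_car(car, weights)
for name, value in contributions.items():
    print(f"{name}: {value:.2f}")
print(f"total: {total:.2f}")
```

Because every contribution is tied to a named concept, a person reviewing the score can see which feature mattered most, which is harder to do with a model trained directly on raw images.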

Researcher diversity

One step to safeguard against unintended bias creeping into AI systems is to encourage diversity in the field, noted Gebru, the co-organizer of the Black in AI workshop co-located with NIPS. “You want to make sure that the knowledge that people have of AI training is distributed around the world and across genders and ethnicities,” she said.

The importance of researcher diversity struck Wallach, the NIPS program co-chair, at her fourth NIPS conference in 2005. For the first time, she was sharing a hotel room with three roommates, all of them women. One of them was Vaughan, and the two of them, along with one of their roommates, co-founded the Women in Machine Learning group, which is now in its 12th year and has held a workshop co-located with NIPS since 2008. This year, more than 650 women are expected to attend.

Wallach will give an invited talk at the Women in Machine Learning Workshop about how she applies machine learning in the context of social science to measure unobservable theoretical constructs such as community membership or topics of discussion.

“Whenever you are working with data that is situated within societal contexts,” she said, “necessarily it is important to think about questions of ethics, fairness, accountability, transparency and privacy.”

John Roach writes about Microsoft research and innovation.