Dr. Cathy O’Neil Presents Opportunity and Challenge to Fix and Create Ethical, Fair and Effective Predictive Models

Last week, Microsoft Research New England and Harvard University’s Berkman Klein Center for Internet and Society welcomed to NERD Dr. Cathy O’Neil, the New York Times bestselling author of Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Dr. O’Neil spoke about predictive models whose effects are often harmful, and whose workings are often a mystery, to the large segments of society they impact. Dr. O’Neil’s talk is available to view on WGBH.

Data science uses analyses of historical data to predict what may happen in the future. These predictive formulas may seem unbiased because they produce numbers, percentages, and scores that the average person assumes are objective math. As Dr. O’Neil pointed out, these algorithms and calculations can be significantly flawed in several ways: in the data they collect, the assumptions they make, and the conclusions they draw. Decision-makers in hiring, credit rating, educational assessment, law enforcement, and the justice system may not only be relying heavily on biased data, but also perpetuating the historical inequities embedded in the data sets and formulas they use.

One vivid example that Dr. O’Neil presented is the common use of student test scores to determine which teachers are effective and which are not, a judgment that can lead to dismissal. How does a school district decide which teachers should be dismissed? One school district in New York assessed teachers’ performance by comparing their students’ test scores in the current year to an anticipated test score (drawn primarily from each student’s score in the previous academic year). If a student performed better than the anticipated score, the teacher was awarded points and given a higher rating. If the student performed worse, the teacher was penalized for the negative difference and given a lower rating. Reliance on the difference between anticipated and current scores led to a system in which teachers of poorer students, who may face greater challenges and obstacles, were penalized and fired.

While news media could obtain and publish the ratings under the Freedom of Information Act, Dr. O’Neil was blocked from obtaining the algorithm that calculated them, because the ratings came from proprietary models created by private companies that sold them to school districts. Without access to the algorithm, a Stuyvesant High School math teacher plotted the published ratings of teachers who taught multiple classes. The resulting scatterplot showed a correlation of only about 24% between the students’ scores and the teachers’ ratings.

In one instance, a teacher who was fired for “low performance” in the public school system was hired within days by a private school. “Did the algorithm create the desired outcome for the school district and its kids?” Dr. O’Neil asked.
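To make the arithmetic concrete, here is a minimal Python sketch of the kind of rating described above. The actual models were proprietary and were never disclosed, so this is only an illustration under stated assumptions: the function names, the use of a simple average of score differences, and all the numbers are hypothetical. The second function shows one way a reader could check how consistent published ratings are, assuming each teacher with multiple classes has one rating per class.

```python
from statistics import mean


def teacher_rating(anticipated, actual):
    """Hypothetical rating: the average of (actual - anticipated) over a
    teacher's students. The real proprietary formula is unknown."""
    return mean(a - e for e, a in zip(anticipated, actual))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists.
    Assumes neither list is constant (otherwise the denominator is zero)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Made-up scores for one class: one student beats the anticipated score,
# one falls short, and one matches it.
anticipated_scores = [80, 70, 90]
actual_scores = [85, 65, 90]
print("rating:", teacher_rating(anticipated_scores, actual_scores))

# Made-up ratings for five teachers who each taught two classes.
# If the rating measured something stable about the teacher, the two
# lists should be strongly correlated (a value close to 1).
ratings_class_a = [12.0, -3.5, 7.0, -8.0, 1.5]
ratings_class_b = [-2.0, 4.0, 9.5, -6.0, -10.0]
print("correlation:", pearson(ratings_class_a, ratings_class_b))
```

Even in this toy form, the rating depends entirely on how the anticipated score is set, which is the part of the real models that could not be inspected.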

The audience was called on to consider their own ethical responsibility when creating or using predictive models, and to raise awareness of how these models are often biased against the least powerful in society.