Friends Don’t Let Friends Deploy Black-Box Models: The Importance of Intelligibility in Machine Learning for Bias Detection and Prevention
In machine learning, a trade-off must often be made between accuracy and intelligibility: the most accurate models usually are not very intelligible (e.g., deep nets and random forests), and the most intelligible models usually are less accurate (e.g., linear or logistic regression). This trade-off limits the accuracy of models that can be safely deployed in mission-critical applications such as healthcare, where being able to understand, validate, edit, and ultimately trust a learned model is important. We have been working on a learning method based on generalized additive models (GAMs) that is often as accurate as full-complexity models, yet even more intelligible than linear models. This not only makes it easy to understand what a model has learned, but also makes it easier to edit the model when it learns something inappropriate because of unexpected landmines in the data. Making it possible for experts to understand a model and interactively repair it is critical because most data has these landmines. In the talk I'll present two healthcare case studies where these high-accuracy GAMs discovered surprising patterns in the data that would have made deploying a black-box model risky, and discuss the importance of model intelligibility for dealing with fairness and trust issues such as race and gender bias.
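A GAM is intelligible because it predicts with a sum of one-dimensional shape functions, f(x) = b0 + Σᵢ fᵢ(xᵢ), so each feature's learned effect is a curve that can be plotted, inspected, and edited on its own. As a minimal sketch of that idea (not the speaker's actual method, and with all names, bin counts, and the synthetic data chosen purely for illustration), here is classic backfitting with binned-mean shape functions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_bins, n_rounds = 2000, 32, 20

# Synthetic data whose target is a sum of one-dimensional effects --
# exactly the structure a GAM f(x) = b0 + sum_i f_i(x_i) recovers.
X = rng.uniform(-3, 3, size=(n, 2))
y = np.sin(X[:, 0]) + 0.3 * X[:, 1] ** 2 + rng.normal(0, 0.1, n)

b0 = y.mean()
edges = [np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)) for j in range(2)]
bins = [np.clip(np.searchsorted(edges[j][1:-1], X[:, j]), 0, n_bins - 1)
        for j in range(2)]
shape = [np.zeros(n_bins) for _ in range(2)]  # per-feature shape functions

# Backfitting: repeatedly refit each shape function to the residual
# left over by the intercept and the other features' shape functions.
for _ in range(n_rounds):
    for j in range(2):
        others = b0 + sum(shape[k][bins[k]] for k in range(2) if k != j)
        resid = y - others
        for b in range(n_bins):
            mask = bins[j] == b
            if mask.any():
                shape[j][b] = resid[mask].mean()
        shape[j] -= shape[j].mean()  # center so the intercept absorbs the mean
        b0 = (y - sum(shape[k][bins[k]] for k in range(2))).mean()

pred = b0 + sum(shape[k][bins[k]] for k in range(2))
rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
print("train RMSE:", rmse)
```

Each `shape[j]` is a one-dimensional lookup table: plotting it against the feature's bin centers shows exactly what the model learned for that feature, which is what makes it possible to spot a surprising pattern and repair it before deployment.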
Rich Caruana is a senior researcher at Microsoft Research. Before joining Microsoft, Rich was on the faculty in the computer science department at Cornell University and at UCLA’s medical school. Rich’s Ph.D. is from Carnegie Mellon University, where he worked with Tom Mitchell and Herb Simon. His thesis on multi-task learning helped create interest in a new subfield of machine learning called transfer learning. Rich received an NSF CAREER Award in 2004 (for Meta Clustering), best paper awards in 2005 (with Alex Niculescu-Mizil), 2007 (with Daria Sorokina), and 2014 (with Todd Kulesza, Saleema Amershi, Danyel Fisher, and Denis Charles), co-chaired KDD in 2007 (with Xindong Wu), and serves as area chair for NIPS, ICML, and KDD. His current research focus is on learning for medical decision making, intelligible machine learning, deep learning, and computational ecology.