MIDS Capstone Project Spring 2023

Egaleco

Problem & Motivation

Machine learning algorithms increasingly dictate opportunities and outcomes for individuals and groups across economic, social, political, and medical contexts. As acknowledgement of these impacts grows, technology companies are expected to evaluate their own products and surface harmful biases before models go to market. Accordingly, numerous algorithmic fairness tools have emerged to make it easier for data scientists to evaluate their models and proactively reduce algorithmic harms, but a recent study (Deng et al., 2022) revealed several shortcomings of these existing tools.

The Value of Egaleco

Egaleco assists data scientists in surfacing bias in their datasets and models and lowers the barrier to entry to fairness metrics. Given the increasing use of algorithms in the healthcare sector, where AI models play a growing role in resource allocation and clinical care, the benefit of catching these biases before models go to production is substantial.

Fairness is highly sector- and use-case-specific. A fairness metric that is applicable in hiring may not be as applicable in healthcare, and even within healthcare, fairness definitions depend on the nature of the intervention (such as whether it penalizes or assists people). Given this, Egaleco focuses exclusively on the healthcare sector, providing guidance on relevant legislation and helping users select the metrics most relevant to their use case. Although several nondiscrimination laws govern the healthcare sector and there is an increasing focus on ensuring that algorithms adhere to them, many companies have not yet integrated bias detection and mitigation into their data science workflows.

Egaleco currently supports users in calculating group fairness: the notion that when individuals are grouped by a protected attribute, such as race, the resulting groups should be treated similarly. Out of the nearly 100 fairness metrics that exist, Egaleco has curated a list of 12 that fairness experts and ML practitioners found most relevant to the ways machine learning models are commonly used in healthcare. Egaleco's full repository of fairness metrics can be found here.
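To make the idea of group fairness concrete, here is a minimal sketch (not Egaleco's actual API) of one commonly used group fairness metric, statistical parity difference: the gap in positive-prediction rates between groups defined by a protected attribute. The function name, encoding, and example data below are illustrative assumptions.

```python
# Illustrative sketch of a group fairness metric (statistical parity difference).
# This is not Egaleco's implementation; names and data are hypothetical.
import numpy as np

def statistical_parity_difference(y_pred, protected):
    """Difference in positive-prediction rates between the unprivileged group
    (protected == 0) and the privileged group (protected == 1).
    A value of 0.0 indicates parity; large magnitudes indicate disparity."""
    y_pred = np.asarray(y_pred)
    protected = np.asarray(protected)
    rate_privileged = y_pred[protected == 1].mean()
    rate_unprivileged = y_pred[protected == 0].mean()
    return rate_unprivileged - rate_privileged

# Example: a model flags patients for a care-management program.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]      # model's binary predictions
protected = [1, 1, 1, 1, 0, 0, 0, 0]   # e.g., a binarized protected attribute
print(statistical_parity_difference(y_pred, protected))  # -0.5: a large disparity
```

In practice, a data scientist would compute several such metrics across the protected attributes relevant to their use case and compare the results against thresholds appropriate to the intervention.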

Acknowledgments

The Egaleco team thanks the dozens of data scientists, fairness experts, and ethicists who participated in our key informant interviews and practitioner interviews. The learnings from this foundational research played a key role in shaping Egaleco.

We also thank all of the I School Faculty Advisors who provided us with guidance and support: Joyce Shen, Fred Nugen, Daniel Aranki, Jared Maslin, Morgan Ames, and Marti Hearst.

Last updated: June 16, 2023