SPH Develops New Metrics to Measure Fairness of Risk Prediction Models in Healthcare - School of Public Health

The increasing availability of health data has led to a corresponding rise in models that interpret this data to inform public health decision making and policy formulation. These data-driven models can benefit policymakers, patients, and healthcare providers, but they can also exacerbate health inequities when their biases go undetected, particularly biases in predicted outcomes that fall along sensitive attributes such as race, gender, or socioeconomic status.

To address these shortcomings, University of Minnesota School of Public Health (SPH) researchers developed a framework and new metrics to measure the fairness of risk prediction models. Led by SPH doctoral student Solvejg Wastvedt, the researchers sought to address key ways that existing “algorithmic fairness” methods fall short of what is needed to develop fair and equitable health policy.

In the study, which appears in Biostatistics, SPH researchers present novel metrics that address the challenges of assessing fairness in current risk prediction models; develop a complete framework of estimation and inference tools for the metrics; and demonstrate the metrics by applying them to a COVID-19 risk prediction model deployed in a major Midwestern health system.

“While measurement of model inequities is just the first step toward addressing discrimination, it is an important step,” says Wastvedt. “Without accurate assessments, biased risk prediction models that are intended to personalize treatment can actually make health inequities worse.”

The researchers also noted that, while there are existing methods for measuring the fairness of data algorithms in general, many of these methods are hard to apply to healthcare. The existing methods often focus on one way of grouping people, such as by race or gender. This fails to account for the ways different groupings intersect and shape the discrimination a person experiences both within and outside the healthcare delivery system. Another drawback is that, in clinical applications, risk predictions are typically used to guide treatment, which creates distinct statistical issues that most existing fairness measures are not designed to handle.

The SPH project is the first known work to develop new fairness metrics that account for both of these issues. The research team came up with three metrics for measuring fairness, as well as a framework of estimation and inference tools for the metrics. The main approach they used is counterfactual fairness, which uses techniques from causal inference — the term used to describe the process of determining whether an observed association reflects a cause-and-effect relationship.

Instead of basing their fairness metrics on observed data, in which some patients are treated and some are not, the researchers used these causal inference techniques to simulate hypothetical outcomes for patients receiving no treatment. Using this method, the researchers were able to measure fairness with respect to the same no-treatment baseline for all patients, ensuring the algorithm accurately predicts patients’ needs regardless of current treatment assignment patterns. The methods are implemented in a publicly available R package.
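
To make the idea of a common no-treatment baseline concrete, here is a minimal Python sketch on simulated data. It is not the researchers' R package and does not reproduce their estimators. The simulated population, the two binary attributes, the known treatment probabilities, the choice of false negative rate as the metric, and every variable and function name are assumptions made purely for illustration. The sketch uses inverse probability weighting, a standard causal inference technique, to estimate from untreated patients alone how often a risk flag would miss patients who would have had the outcome without treatment, separately for each intersection of the two attributes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def counterfactual_fnr_ipw(flagged, outcome, treated, propensity, group_mask):
    """Estimate the no-treatment false negative rate in one group.

    Only untreated patients are used, because their observed outcome is
    their outcome under no treatment. Each is weighted by 1 / P(untreated)
    so that the untreated sample stands in for the whole group.
    """
    keep = group_mask & (~treated)
    weights = 1.0 / (1.0 - propensity[keep])
    positives = outcome[keep]              # had the outcome without treatment
    missed = positives & (~flagged[keep])  # ...and the model did not flag them
    return np.sum(weights * missed) / np.sum(weights * positives)

# Hypothetical simulated population (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 200_000
attr_a = rng.integers(0, 2, n)   # first binary sensitive attribute
attr_b = rng.integers(0, 2, n)   # second binary sensitive attribute
x = rng.normal(size=n)           # clinical covariate

# Outcome each patient would have if left untreated.
y0 = rng.random(n) < sigmoid(-1.0 + 1.5 * x + 0.5 * attr_a * attr_b)

# Sicker patients are more likely to be treated; probabilities assumed known.
propensity = 0.2 + 0.6 * sigmoid(2.0 * x)
treated = rng.random(n) < propensity

# The risk model under audit flags patients with a high covariate value.
flagged = x > 0.5

results = {}
for a in (0, 1):
    for b in (0, 1):
        mask = (attr_a == a) & (attr_b == b)
        estimate = counterfactual_fnr_ipw(flagged, y0, treated, propensity, mask)
        # Available only in simulation: the rate computed from everyone's y0.
        truth = np.mean(~flagged[mask & y0])
        results[(a, b)] = (estimate, truth)
        print(f"group {(a, b)}: IPW estimate {estimate:.3f}, simulated truth {truth:.3f}")
```

In this simulation, treatment depends only on the measured covariate, so the weighted estimate from untreated patients should track the rate computed from every patient's no-treatment outcome, which a real dataset never reveals. In practice the treatment probabilities would have to be estimated, and the result holds only if the factors that drive treatment decisions are measured.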

SPH Associate Professor and Innovative Methods & Data Science (IMDS) co-Director Julian Wolfson, and IMDS member and SPH Assistant Professor Jared Huling contributed to the research.

Read more about the tools Wastvedt developed here.