Research
Completed
A Causal Approach to Fair Predictive Modeling via Penalized Maximum Likelihood Estimation
Brown University
Instructor: Prof. Alice Paul
Keywords: Causal Inference, Optimization

Automated decision-making systems are increasingly prevalent across various domains, including hiring, loan approval, and criminal justice. However, the widespread use of these systems has raised significant concerns about the fairness of the models they employ. These systems are often trained on historical data that may contain inherent biases, leading to the amplification or perpetuation of these biases and resulting in unfair treatment of certain groups.

Various approaches have been developed to mitigate unfairness in automated decision-making systems. They are typically categorized into three groups: pre-processing, in-processing, and post-processing methods. A common challenge across these methods is the trade-off between fairness and accuracy. Many of them require decision-makers to define what constitutes an optimal balance, which can introduce subjective biases into the process. They also call for continuous monitoring and human intervention to ensure that models remain fair over time.

To address this problem, a causal, in-processing approach was proposed. It leverages path-specific effects (PSEs) and penalized maximum likelihood estimation. Incorporating PSEs into the model makes it possible to quantify the degree of discrimination in the data and to generate new predictions that meet fairness constraints through an optimized semi-parametric likelihood function. This method not only simplifies the optimization problem but also produces more interpretable results and holds decision-makers accountable for the consequences of their predictions.
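
As a rough illustration of the idea, one generic way to write a penalized likelihood objective with a PSE-based fairness term is shown below. The symbols are assumptions introduced here for illustration and are not taken from the project itself: \ell is the log-likelihood of the semi-parametric model with parameters \theta on data \mathcal{D}, \mathrm{PSE}(\theta) is the path-specific effect of the sensitive attribute on the outcome along the paths deemed unfair, and \lambda \ge 0 is a penalty weight. The exact penalty and estimator used in the project may differ.

```latex
\hat{\theta}_{\lambda}
  = \arg\max_{\theta} \;
    \ell(\theta; \mathcal{D}) \;-\; \lambda \,\bigl|\mathrm{PSE}(\theta)\bigr|
```

Under this form, \lambda = 0 recovers ordinary maximum likelihood, and larger values of \lambda push the estimated PSE toward zero at some cost in likelihood.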

[Github]

Completed
Emulators for cohort-based, Markov state, simulation models: an application to the RESPOND model for Opioid Use Disorder (OUD)
Brown University, Syndemics Lab at Boston Medical Center
Project: HEAL Data2Action (D2A) Program
Instructor: Prof. Stavroula Chrysanthopoulou
Keywords: GEE, GLME, MERF, LSTM, Model Tuning

Purpose: In this study we present an innovative idea and unique example of building emulators for cohort-based Markov simulation models.

Methods: We have developed the Researching Effective Strategies to Prevent Opioid Death (RESPOND) simulation model to characterize Opioid Use Disorder (OUD) dynamics, make projections, evaluate interventions, and inform decision making in this area. Due to its complex structure, running RESPOND can be computationally intensive. The objective of this study is to build an emulator (metamodel), namely a simulation model with a simpler structure, that efficiently maps model inputs to outputs, based on a version of the RESPOND model calibrated to observed OUD trends in Massachusetts. We explore three statistical approaches for an emulator of the RESPOND model: 1) a generalized linear model (GLM) for longitudinal data, 2) a Mixed-Effects Random Forest (MERF), and 3) a Long Short-Term Memory (LSTM) recurrent neural network model. We describe model selection procedures for determining, separately within each model category, the best-fitting model (set of parameter values) for the available simulated RESPOND data. We compare the three approaches in terms of their accuracy (weighted Root Mean Squared Error (wRMSE) and Mean Absolute Error (MAE)) and efficiency (running time) using simulated data (overall and fatal overdose counts) from the calibrated RESPOND model.
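
The two accuracy metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names are hypothetical, and the weighting scheme for wRMSE is an assumption (weights normalized by their sum), since the weights actually used in the study are not described here.

```python
import math


def weighted_rmse(y_true, y_pred, weights):
    """Weighted RMSE: sqrt(sum(w * (y - y_hat)^2) / sum(w)).

    The normalization by sum(w) is an assumption made for this sketch.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sq_error = sum(
        w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights)
    )
    return math.sqrt(weighted_sq_error / total_weight)


def mean_absolute_error(y_true, y_pred):
    """MAE: the average of |y - y_hat| over all observations."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up counts standing in for simulator output and emulator predictions.
simulated = [120.0, 135.0, 150.0]
emulated = [118.0, 140.0, 149.0]
example_weights = [1.0, 1.0, 2.0]  # hypothetical weights

print(weighted_rmse(simulated, emulated, example_weights))
print(mean_absolute_error(simulated, emulated))
```

Each candidate emulator would be scored this way against held-out simulator output, with running time recorded separately for the efficiency comparison.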

Results/Conclusions: Our findings show that the LSTM model provides more accurate predictions, while GLMs can provide insightful details about observed trends and associations of key factors over time. We also discuss how emulators, owing to the efficiency of the underlying algorithms, can be used to create online, user-friendly applications that depict the evaluation of health care alternatives in real time.

[Demo]