TY - JOUR
T1 - Understanding and mitigating the impact of race with adversarial autoencoders
AU - Sarullo, Kathryn
AU - Swamidass, S. Joshua
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - Background: Artificial intelligence carries the risk of exacerbating some of our most challenging societal problems, but it also has the potential to mitigate and address these problems. The confounding effects of race on machine learning are an ongoing subject of research. This study aims to mitigate the impact of race on data-derived models using an adversarial variational autoencoder (AVAE). In this study, race is a self-reported feature. Race is often excluded as an input variable; however, because race correlates highly with several other variables, it remains implicitly encoded in the data. Methods: We propose building a model that (1) learns a low-dimensional latent space, (2) employs an adversarial training procedure that ensures its latent space does not encode race, and (3) retains the information necessary for reconstructing the data. We train the autoencoder to ensure the latent space does not indirectly encode race. Results: In this study, the AVAE successfully removes information about race from the latent space (AUC ROC = 0.5). In contrast, latent spaces constructed using other approaches still allow the reconstruction of race with high fidelity. The AVAE’s latent space does not encode race but conveys important information required to reconstruct the dataset. Furthermore, the AVAE’s latent space does not predict variables related to race (R² = 0.003), while a model that includes race does (R² = 0.08). Conclusions: Though we constructed a race-independent latent space, any variable could be similarly controlled. We expect AVAEs to be one of many approaches that will be required to effectively manage and understand bias in ML.
UR - https://www.scopus.com/pages/publications/85207529361
U2 - 10.1038/s43856-024-00627-3
DO - 10.1038/s43856-024-00627-3
M3 - Article
C2 - 39438707
AN - SCOPUS:85207529361
SN - 2730-664X
VL - 4
JO - Communications Medicine
JF - Communications Medicine
IS - 1
M1 - 210
ER -