Bayesian Network Models for Predicting Health Risks of Arsenic in Drinking Water
Presented at the 5th Annual BayesiaLab Conference in Paris, September 25–October 4, 2017.
Jacqueline MacDonald Gibson, PhD
Associate Professor, Environmental Sciences and Engineering
University of North Carolina at Chapel Hill
RTI University Scholar, RTI International, USA
Abstract
Background
Population health risk models currently used to support drinking water quality regulations in the United States have limited ability to capture inter-individual variability or to represent uncertainty. Toward improving such models, we compared a Bayesian-network model against current methods for predicting pre-diabetes and diabetes risk from arsenic in drinking water. We also assessed the implications of using this model for decision-making about arsenic control.
Methods
Using data from a 1,050-member cohort in an arsenic-endemic region of Mexico, we fitted Bayesian-network and logistic regression models to predict pre-diabetes and diabetes risk from arsenic exposure via drinking water. Predictive performance was examined by training each model on 75% of the dataset and testing on the remaining 25%.
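The hold-out evaluation described above can be illustrated with a minimal sketch. It assumes a simple random 75%/25% split; the abstract does not say whether the split was random, stratified, or repeated. The function name `train_test_split`, the seed, and the placeholder participant IDs are hypothetical and are not taken from the study.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle records and return (train, test) lists.

    Assumption: a plain random split; the study's actual
    partitioning procedure is not described in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder IDs standing in for the 1,050 cohort members (not real data).
cohort = list(range(1050))
train, test = train_test_split(cohort)
print(len(train), len(test))
```

Both models would then be fitted on `train` only and scored on `test`, so that the reported accuracy reflects records neither model saw during fitting.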
Results
The Bayesian-network model was slightly more accurate than the regression model in predicting pre-diabetes and diabetes (sensitivity of 75% versus 73% at a specificity of 63%). In addition, the Bayesian-network model revealed a gender-mediated interacting effect of body-mass index and arsenic metabolism on risk that was not evident from the regression model. The Bayesian network estimated that reducing arsenic below the current Mexican regulatory limit of 25 µg/liter would prevent 18,000 diabetes cases among the 1.3 million residents of arsenic-endemic regions, an order of magnitude higher than the estimated 1,460 preventable cancer cases.
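For readers less familiar with the metrics reported above, the sketch below shows how sensitivity and specificity are computed from binary predictions, and it checks the arithmetic behind the comparison of preventable cases. The labels and predictions are made-up toy values, not study data, and the function name `sensitivity_specificity` is hypothetical. The abstract does not state how the classification threshold giving 63% specificity was chosen.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (invented values): 5 cases and 5 non-cases.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

# Figures quoted in the abstract.
diabetes_prevented = 18_000
cancer_prevented = 1_460
population = 1_300_000

ratio = diabetes_prevented / cancer_prevented        # about 12.3
share = diabetes_prevented / population              # about 1.4% of residents
print(sens, spec, round(ratio, 1), round(100 * share, 1))
```

The ratio of roughly 12 is what supports the "order of magnitude" comparison between preventable diabetes and cancer cases.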
Conclusions
Bayesian networks provide a platform that could improve the accuracy of population health risk assessment models for supporting the development of environmental policies.