The Small Data Problem:
Using Bayesian Networks in Endangered Species Policy Development
Steven F. Wilson, Ph.D., Standpoint Decision Support Inc.
302-99 Chapel Street, Nanaimo, BC V9R 5H3, Canada, firstname.lastname@example.org
Bayesian networks are commonly used to address "big data" problems and can also model expert knowledge in the absence of any data. Between these extremes lies a broad class of small data problems, which I define as those in which causal explanations are sought from observational datasets with small sample sizes relative to the number of dimensions. Many of these problems are central to ongoing, important policy debates, but machine learning techniques and standard statistical analyses are generally unhelpful. Using examples from endangered species policy development, I present an analysis workflow based on causal identification, model instantiation with informed priors, and Bayesian updating to generate models that blend existing knowledge and available data. Such models can serve an important role in decision-making where policy alternatives cannot be tested experimentally or where datasets are constrained.
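The last two steps of this workflow (an informed prior followed by Bayesian updating) can be illustrated with a minimal sketch for a single probability in a network, for example one entry of a node's conditional probability table. This is not the author's implementation: the Beta-Binomial model, the names `BetaBelief` and `informed_prior`, the expert estimate of 0.8, the equivalent sample size of 10, and the five observations are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about one probability (hypothetical helper)."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Expected value of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, failures: int) -> "BetaBelief":
        # Conjugate Beta-Binomial update: observed counts add to the pseudo-counts.
        return BetaBelief(self.alpha + successes, self.beta + failures)


def informed_prior(expert_mean: float, equivalent_sample_size: float) -> BetaBelief:
    """Encode an expert estimate as pseudo-counts.

    equivalent_sample_size expresses how many observations the expert
    judgement is assumed to be worth (an illustrative assumption).
    """
    return BetaBelief(
        expert_mean * equivalent_sample_size,
        (1.0 - expert_mean) * equivalent_sample_size,
    )


# Hypothetical numbers: the expert believes the probability is about 0.8,
# and that judgement is weighted like 10 observations.
prior = informed_prior(expert_mean=0.8, equivalent_sample_size=10)

# Hypothetical small dataset: 3 successes and 2 failures in 5 observations.
posterior = prior.update(successes=3, failures=2)

print(f"prior mean:     {prior.mean:.3f}")
print(f"posterior mean: {posterior.mean:.3f}")
```

With only five observations, the posterior mean falls between the expert estimate and the raw observed proportion, which is the blending of existing knowledge and available data that the abstract describes. A larger equivalent sample size would keep the result closer to the expert estimate.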
Steve Wilson has more than 25 years’ experience working at technical and professional levels in strategic and operational planning for public and private-sector clients. He specializes in quantitative approaches to decision support and policy analysis. Steve holds a Ph.D. in wildlife ecology from the University of British Columbia in Vancouver.