Presentation on October 7, 2015, at the 3rd Annual BayesiaLab Conference:
Sky Mining - Photomorphic Redshift Estimation using Bayesian Networks
Pragyansmita Nayak, Ph.D.
George Mason University, Fairfax, VA
Kirk Borne, Ph.D.
Booz Allen Hamilton
This presentation describes the application of Bayesian Network-based learning methods to the estimation of galaxy redshifts and distances. The study investigates redshift estimation using photometric parameters of galaxies, such as color and magnitude (a logarithmic measure of observed flux), in combination with morphological attributes such as shape, size, orientation, and concentration in different wavebands.
We used attributes from the five photometric bands (u, g, r, i, z) of the Sloan Digital Sky Survey (SDSS). We investigated machine learning techniques based on Naïve Bayes (NB), Bayesian Network (BN), and Generalized Linear Model (GLM) to better understand their applicability, their advantages, and their predictive performance in terms of efficiency and accuracy. We ran the NB, BN, and GLM experiments on a total of 700,777 SDSS galaxies, using 45 photomorphic attributes for each.
Among our study's major findings, we have demonstrated that certain combinations of magnitude and morphology attributes do resolve redshift degeneracies and yield better redshift estimates than color attributes alone. The BN methodology that we used for redshift estimation can be applied more generally to missing-value imputation, data quality analysis in large data sets (Big Data), and modeling the distribution of mass in the universe. The generated BN models and likelihood distributions support a wide range of use cases, a few of which will be covered in the presentation.
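As a rough illustration of the kind of comparison described above, the following minimal sketch fits a Gaussian Naïve Bayes classifier to binned redshifts, once with a color attribute alone and once with color plus a morphology attribute. Everything in it is an assumption made for illustration: the data are synthetic, the color and size trends in `make_galaxy` are invented, the Gaussian likelihood and the six equal-width redshift bins are arbitrary choices, and none of it reproduces the study's SDSS attributes, its models, or its results.

```python
import math
import random

def make_galaxy(rng):
    """Return ((color, log_size), redshift) for one synthetic toy galaxy."""
    z = rng.uniform(0.0, 0.6)
    color = 0.4 + 1.5 * z + rng.gauss(0.0, 0.08)     # invented color trend
    log_size = 1.0 - 1.2 * z + rng.gauss(0.0, 0.10)  # invented size trend
    return (color, log_size), z

def z_bin(z, width=0.1, n_bins=6):
    """Discretize redshift into equal-width bins used as class labels."""
    return min(int(z / width), n_bins - 1)

def fit(rows, labels):
    """Estimate a prior plus per-attribute mean and variance for each class."""
    by_class = {}
    for x, y in zip(rows, labels):
        by_class.setdefault(y, []).append(x)
    model = {}
    for y, members in by_class.items():
        n = len(members)
        cols = list(zip(*members))
        means = [sum(col) / n for col in cols]
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(cols, means)]
        model[y] = (n / len(rows), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log posterior under independence."""
    best, best_lp = None, -math.inf
    for y, (prior, means, variances) in model.items():
        lp = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        if lp > best_lp:
            best, best_lp = y, lp
    return best

def evaluate(columns, train, test):
    """Fit on the chosen attribute columns and return test-set accuracy."""
    def pick(x):
        return [x[c] for c in columns]
    model = fit([pick(x) for x, _ in train], [y for _, y in train])
    hits = sum(predict(model, pick(x)) == y for x, y in test)
    return hits / len(test)

rng = random.Random(0)  # fixed seed so the toy run is repeatable
data = [make_galaxy(rng) for _ in range(5000)]
data = [(x, z_bin(z)) for x, z in data]
train, test = data[:4000], data[4000:]

acc_color = evaluate([0], train, test)    # color attribute only
acc_both = evaluate([0, 1], train, test)  # color plus morphology attribute
print(f"color only:   {acc_color:.3f}")
print(f"color + size: {acc_both:.3f}")
```

Comparing the two printed accuracies shows how an added morphology attribute can change a redshift-bin classifier on toy data; it says nothing about the 45 photomorphic attributes or the 700,777 galaxies used in the study.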