Missing Values Processing
Context
- Missing values occur in virtually all real-world data collection processes. They can result from survey non-response, poor record-keeping, server outages, attrition in longitudinal surveys, or faulty sensors in a measuring device.
- Missing values processing, beyond the naive ad hoc approaches, can be a demanding task, both methodologically and computationally.
- Traditionally, the process of specifying an imputation model has been a scientific modeling effort on its own, and few non-statisticians dared to venture into this specialized field (van Buuren, 2007).
- With Bayesian networks and BayesiaLab, handling missing values properly is practically feasible for researchers who might otherwise not attempt to deal with missing values beyond the ad hoc approaches.
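
To make "ad hoc approaches" concrete, here is a minimal Python sketch of two commonly cited ones, listwise deletion and mean imputation. The text above does not name which approaches it has in mind, so this choice is an assumption, and the toy records and function names are hypothetical. It does not represent how BayesiaLab processes missing values.

```python
from statistics import mean, pvariance

# Hypothetical toy data: (age, income) records; None marks a missing value.
records = [(25, 40.0), (32, None), (47, 82.0), (51, None), (38, 61.0)]


def listwise_deletion(rows):
    """Drop every record that has at least one missing value."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_imputation(rows, col):
    """Replace missing values in column `col` with the observed column mean."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    return [
        tuple(fill if (i == col and v is None) else v for i, v in enumerate(r))
        for r in rows
    ]


complete = listwise_deletion(records)
imputed = mean_imputation(records, col=1)

observed_incomes = [r[1] for r in complete]
imputed_incomes = [r[1] for r in imputed]

# Listwise deletion discards records; mean imputation keeps them
# but shrinks the variance of the imputed column.
print(len(records), "records ->", len(complete), "after listwise deletion")
print("income variance, observed only:", pvariance(observed_incomes))
print("income variance, mean-imputed: ", pvariance(imputed_incomes))
```

Both shortcuts are simple to apply, but each distorts the data in its own way: deletion throws away partially observed records, and mean imputation understates variability.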
Usage
- In BayesiaLab, you can select the Missing Values Processing algorithm in more than one context.
- In Step 3 of the Data Import Wizard, you can specify the type of Missing Values Processing as you bring your dataset into BayesiaLab. However, that selection only applies to the one-time import process.
- For the ongoing Missing Values Processing in the context of machine learning, you also need to select an algorithm, which is what we discuss here.
- Select Menu > Learning > Missing Values Processing >...
Missing Values Processing in Detail
Default Missing Values Processing Algorithm
- You can specify the default algorithm under Menu > Window > Preferences > Data > Import & Associate > Missing & Filtered Values.