Machine Learning with BayesiaLab
BayesiaLab provides a broad set of optimized algorithms for learning Bayesian networks from data, covering both the network structure and its parameters.
Machine learning in BayesiaLab is not limited to producing a predictive score. A learned Bayesian network can also support interpretation, simulation, diagnosis, causal review, variable selection, and downstream probabilistic inference.
- Many learning criteria are information-theoretic, such as Minimum Description Length.
- The workflows avoid many of the parametric functional-form assumptions, such as linearity, that are common in traditional modeling.
- The approach scales from small studies to high-dimensional domains.
- Learned structures can be combined with expert knowledge and refined interactively.
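To make the Minimum Description Length idea concrete, here is a minimal sketch of a textbook MDL score for a discrete network: the bits needed to encode the parameters plus the bits needed to encode the data given the model. This is a common simplified form, not BayesiaLab's own scoring code; the function name `mdl_score`, the toy records, and the variable names `X` and `Y` are all assumptions made for illustration.

```python
import math
from collections import Counter


def mdl_score(records, parents, arity):
    """Textbook MDL score in bits (lower is better) for a discrete network.

    records: list of dicts mapping variable name -> observed state
    parents: dict mapping variable name -> tuple of parent names
    arity:   dict mapping variable name -> number of states
    """
    n = len(records)
    model_bits = 0.0
    data_bits = 0.0
    for var, pa in parents.items():
        # Free parameters: (states - 1) for each parent configuration.
        n_configs = math.prod(arity[p] for p in pa)
        model_bits += 0.5 * math.log2(n) * (arity[var] - 1) * n_configs
        # Data cost: negative log-likelihood under maximum-likelihood estimates.
        joint = Counter((tuple(r[p] for p in pa), r[var]) for r in records)
        marginal = Counter(tuple(r[p] for p in pa) for r in records)
        for (config, _state), count in joint.items():
            data_bits -= count * math.log2(count / marginal[config])
    return model_bits + data_bits


# Toy data (illustrative only): X and Y agree in 80 of 100 records.
dependent = (
    [{"X": 0, "Y": 0}] * 40 + [{"X": 1, "Y": 1}] * 40
    + [{"X": 0, "Y": 1}] * 10 + [{"X": 1, "Y": 0}] * 10
)
# Toy data in which X and Y are exactly independent.
independent = [{"X": x, "Y": y} for x in (0, 1) for y in (0, 1)] * 25

arity = {"X": 2, "Y": 2}
no_arc = {"X": (), "Y": ()}
with_arc = {"X": (), "Y": ("X",)}

score_dep_no_arc = mdl_score(dependent, no_arc, arity)
score_dep_with_arc = mdl_score(dependent, with_arc, arity)
score_ind_no_arc = mdl_score(independent, no_arc, arity)
score_ind_with_arc = mdl_score(independent, with_arc, arity)
```

On the dependent data the arc pays for itself because the shorter data encoding outweighs the extra parameters. On the independent data the arc only adds cost, so the simpler structure scores better.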
Unsupervised Structural Learning
Unsupervised Structural Learning discovers probabilistic structure without predefining input and output roles.
This supports data-driven knowledge discovery in unfamiliar domains. Rather than asking only, “Which variables predict this target?”, unsupervised learning asks a broader question: “What dependency structure appears to organize this domain?”
The resulting network can help analysts identify relationships, clusters of related variables, potential explanatory pathways, and hypotheses for further investigation.
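As a rough illustration of structure discovery without a designated target, the sketch below applies the classic Chow-Liu idea: score every pair of variables by mutual information and keep the maximum-weight spanning tree. It is a stand-in chosen for brevity, not a description of BayesiaLab's algorithms; the function names, the variables `A`, `B`, `C`, and the toy counts are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(records, a, b):
    """Empirical mutual information (in bits) between two discrete variables."""
    n = len(records)
    joint = Counter((r[a], r[b]) for r in records)
    count_a = Counter(r[a] for r in records)
    count_b = Counter(r[b] for r in records)
    return sum(
        (c / n) * math.log2(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def max_spanning_tree(records, variables):
    """Kruskal's algorithm over pairs sorted by decreasing mutual information."""
    edges = sorted(
        ((mutual_information(records, a, b), a, b)
         for a, b in combinations(variables, 2)),
        reverse=True,
    )
    component = {v: v for v in variables}

    def find(v):
        while component[v] != v:
            v = component[v]
        return v

    tree = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding this edge does not create a cycle
            component[root_a] = root_b
            tree.append((a, b, weight))
    return tree


# Toy counts (illustrative only): B mostly follows A, and C mostly follows B.
counts = {
    (0, 0, 0): 32, (0, 0, 1): 8, (0, 1, 0): 2, (0, 1, 1): 8,
    (1, 1, 1): 32, (1, 1, 0): 8, (1, 0, 1): 2, (1, 0, 0): 8,
}
records = [
    {"A": a, "B": b, "C": c}
    for (a, b, c), k in counts.items()
    for _ in range(k)
]
tree = max_spanning_tree(records, ["A", "B", "C"])
tree_edges = {frozenset(edge[:2]) for edge in tree}
```

No variable is marked as an input or output here. The tree keeps the two strongest links and drops the weaker A-C association, which the other two links already account for.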
Supervised Learning
Supervised workflows optimize predictive performance for a chosen target variable.
Typical supervised use cases include diagnosis, risk scoring, customer-outcome prediction, decision support, and classification. BayesiaLab is not limited to a single network class, such as Naive Bayes. Instead, algorithms can search for high-performing models while controlling structural complexity.
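Markov Blanket methods are especially useful for fast variable selection in high-dimensional settings. The Markov Blanket of a target consists of its parents, its children, and the other parents of its children. Given these variables, the target is conditionally independent of every other variable in the network, so the blanket identifies the variables that remain relevant once the rest of the network is taken into account.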
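The following sketch shows what a Markov Blanket contains once a structure is known: the target's parents, its children, and the other parents of those children. It reads the blanket off a given graph rather than learning it from data, so it illustrates the concept, not BayesiaLab's learning algorithm. The network and its variable names are hypothetical.

```python
def markov_blanket(parents, target):
    """Return the Markov blanket of `target` in a DAG.

    parents: dict mapping each variable to a tuple of its parent variables.
    """
    children = [v for v, pa in parents.items() if target in pa]
    blanket = set(parents.get(target, ()))   # parents of the target
    blanket.update(children)                 # children of the target
    for child in children:
        blanket.update(parents[child])       # co-parents ("spouses")
    blanket.discard(target)
    return blanket


# Hypothetical diagnostic network, for illustration only.
network = {
    "Age": (),
    "Pollution": (),
    "Smoking": ("Age",),
    "Bronchitis": ("Smoking",),
    "Cancer": ("Smoking", "Pollution"),
    "XRay": ("Cancer",),
    "Dyspnea": ("Cancer", "Bronchitis"),
}

blanket = markov_blanket(network, "Cancer")
print(sorted(blanket))
# ['Bronchitis', 'Dyspnea', 'Pollution', 'Smoking', 'XRay']
```

In this toy network, `Age` is excluded because it influences `Cancer` only through `Smoking`, which is already in the blanket. `Bronchitis` is included even though it has no arc to `Cancer`, because the two share the child `Dyspnea`.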
Clustering Workflows
BayesiaLab supports both Data Clustering and Variable Clustering.
- Data Clustering creates latent variables whose states represent groups of records.
- Variable Clustering groups variables by relationship strength.
- Multiple Clustering combines both approaches in BayesiaLab’s Probabilistic Structural Equation Model workflow.
These workflows are not merely visualization tools. They create a structure that can become part of the Bayesian network itself, supporting interpretation, segmentation, latent-factor discovery, and key-driver analysis.
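Grouping variables by relationship strength can be sketched as follows: link every pair whose strength meets a threshold, then treat each connected group as a cluster. This is a deliberately simple stand-in, not BayesiaLab's Variable Clustering procedure. The function name, the threshold, and the pairwise strength values are made up for illustration.

```python
def cluster_variables(strength, threshold):
    """Group variables linked by any pairwise strength >= threshold.

    strength: dict mapping (variable_a, variable_b) -> relationship strength
    """
    variables = sorted({v for pair in strength for v in pair})
    root = {v: v for v in variables}

    def find(v):
        while root[v] != v:
            v = root[v]
        return v

    for (a, b), value in strength.items():
        if value >= threshold:
            root[find(a)] = find(b)  # merge the two groups

    clusters = {}
    for v in variables:
        clusters.setdefault(find(v), []).append(v)
    return sorted(clusters.values())


# Hypothetical pairwise strengths between survey items (illustrative values).
strength = {
    ("Price", "Value"): 0.42,
    ("Taste", "Aroma"): 0.38,
    ("Price", "Taste"): 0.03,
    ("Value", "Taste"): 0.05,
    ("Price", "Aroma"): 0.02,
    ("Value", "Aroma"): 0.04,
}

clusters = cluster_variables(strength, threshold=0.2)
print(clusters)
# [['Aroma', 'Taste'], ['Price', 'Value']]
```

In a workflow like the one described above, each resulting group would be a candidate for its own latent factor.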
Combining Learning and Knowledge Modeling
BayesiaLab supports hybrid modeling workflows in which data and domain expertise inform each other.
Experts may define parts of a network before learning from data. Alternatively, analysts may learn a preliminary structure and then revise it based on expert review, causal constraints, temporal order, or business logic.
This combination is particularly important when the available data contain associations but cannot, by themselves, establish causal direction.
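One way expert knowledge can constrain learning is by restricting which arcs a search may consider. The sketch below filters candidate arcs using a temporal ordering and a list of expert-forbidden arcs. It assumes a generic structure search that consumes the filtered list. The function `allowed_arcs`, the tier numbers, and the variable names are hypothetical and do not correspond to BayesiaLab features or settings.

```python
from itertools import permutations


def allowed_arcs(candidates, tier, forbidden=frozenset()):
    """Keep arcs that respect temporal order and are not forbidden by experts.

    candidates: iterable of (source, destination) pairs
    tier:       dict mapping variable -> temporal tier (lower means earlier)
    forbidden:  set of (source, destination) pairs ruled out by experts
    """
    return [
        (src, dst)
        for src, dst in candidates
        if tier[src] <= tier[dst] and (src, dst) not in forbidden
    ]


# Hypothetical study: age precedes treatment, which precedes outcome.
tier = {"Age": 0, "Treatment": 1, "Outcome": 2}
candidates = list(permutations(tier, 2))

# Expert judgment (assumed): age affects outcome only through treatment.
forbidden = {("Age", "Outcome")}

arcs = allowed_arcs(candidates, tier, forbidden)
print(arcs)
# [('Age', 'Treatment'), ('Treatment', 'Outcome')]
```

The data alone would treat an association between `Treatment` and `Outcome` as symmetric. The temporal tiers are what settle the direction of the arc.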
Examples & Learn More
- Chapter 6: Supervised Learning
- Chapter 7: Unsupervised Learning
- Chapter 8: Probabilistic Structural Equation Models
- Knowledge Modeling
- Inference, Diagnosis, Prediction, and Simulation
- Webinar: Diagnostic Decision Support
- Webinar: Analyzing Capital Flows of Exchange-Traded Funds
- Webinar: Factor Analysis Reinvented — Probabilistic Latent Factor Induction