Key Driver Analysis

Definition

Key Driver Analysis identifies the variables, attributes, factors, or potential actions that are most relevant to a target outcome. Typical targets include customer satisfaction, purchase intent, brand loyalty, employee engagement, recommendation intent, churn, or risk.

At a basic level, Key Driver Analysis asks: Which variables matter most for the target? In a decision-support context, the more useful question is: Which realistic changes should be prioritized to improve the target?

The term “driver” is often used as shorthand for “cause,” but this interpretation requires care. When Key Driver Analysis is based on observational data, such as survey responses or transaction records, the results describe probabilistic relationships and plausible priorities for action. Causal conclusions require additional assumptions, domain knowledge, or experimental evidence.

Key Driver Analysis with Bayesian Networks

Bayesian networks are well suited to Key Driver Analysis because they represent the joint probability distribution of all variables in a domain. Instead of evaluating each candidate driver in isolation, a Bayesian network models the dependencies among all variables simultaneously.

This is especially useful when variables are correlated, as is common in customer experience, market research, HR analytics, and product studies. Rather than treating collinearity as a problem to suppress, a Bayesian network can represent it explicitly. The model can reveal clusters of related variables, indirect relationships, nonlinear effects, and context-specific dependencies.
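
To make the idea of a joint probability distribution concrete, here is a minimal sketch of a three-node network evaluated by brute-force enumeration. The variable names (Quality, Service, Satisfied), the network structure, and all probabilities are hypothetical and chosen only for illustration; this is not BayesiaLab's inference engine.

```python
from itertools import product

# Hypothetical toy network (all numbers invented for illustration):
#   Quality -> Service, Quality -> Satisfied, Service -> Satisfied
P_QUALITY = {1: 0.6, 0: 0.4}                 # P(Quality)
P_SERVICE_GIVEN_QUALITY = {1: 0.8, 0: 0.3}   # P(Service=1 | Quality)
# P(Satisfied=1 | Quality, Service), keyed by (quality, service)
P_SATISFIED_GIVEN = {(1, 1): 0.9, (1, 0): 0.6, (0, 1): 0.5, (0, 0): 0.2}


def bernoulli(p_one, value):
    """Probability of a binary value, given the probability of value 1."""
    return p_one if value == 1 else 1.0 - p_one


def joint(quality, service, satisfied):
    """Joint probability from the chain rule implied by the network."""
    return (P_QUALITY[quality]
            * bernoulli(P_SERVICE_GIVEN_QUALITY[quality], service)
            * bernoulli(P_SATISFIED_GIVEN[(quality, service)], satisfied))


def prob_satisfied(**evidence):
    """P(Satisfied=1 | evidence) for evidence on 'quality' and/or 'service'."""
    numerator = denominator = 0.0
    for quality, service, satisfied in product((0, 1), repeat=3):
        state = {"quality": quality, "service": service}
        if any(state[name] != value for name, value in evidence.items()):
            continue  # skip states that contradict the evidence
        p = joint(quality, service, satisfied)
        denominator += p
        numerator += p * satisfied
    return numerator / denominator


print(round(prob_satisfied(service=1), 2))  # 0.82
print(round(prob_satisfied(service=0), 2))  # 0.32
```

In this toy model, Service and Quality are correlated, so observing good service also raises the probability of high quality. The observed gap in satisfaction between good and poor service (0.50) is therefore larger than the gap within either quality level (0.30), which is exactly the kind of dependency a joint model keeps track of.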

In BayesiaLab, Key Driver Analysis can be performed on both manifest variables and latent variables. Manifest variables are directly observed, such as survey items or product ratings. Latent variables, also called factors, summarize patterns among manifest variables and can make complex models easier to interpret. See also Latent Variables, Factors, Hidden Nodes, and Manifest Variables.

Relationship to PSEM

A Probabilistic Structural Equation Model, or PSEM, is a Bayesian network-based generalization of traditional Structural Equation Models. It represents relationships among manifest variables, latent variables, and target variables in a directed acyclic graph.

PSEM is particularly useful for Key Driver Analysis because it can reduce a large set of observed variables into a smaller number of interpretable factors. These factors can then be connected to a target node for probabilistic inference, target analysis, simulation, and optimization.

A typical PSEM-based Key Driver Analysis workflow includes:

Learning relationships among manifest variables.

Identifying clusters of strongly related variables.

Creating latent variables with Multiple Clustering.

Learning relationships among latent variables and the target node.

Analyzing and optimizing the target.

For a full example, see Chapter 8: Probabilistic Structural Equation Models for Key Driver Analysis.
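
The clustering step in the workflow above can be illustrated with a rough sketch. It scores pairs of discrete survey items by mutual information and groups items whose score exceeds a threshold. This is a simplified stand-in, not BayesiaLab's variable clustering or Multiple Clustering algorithm; the item names, the data, and the threshold are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    count_xy = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in count_xy.items()
    )


def cluster_variables(data, threshold):
    """Group variables linked by pairwise mutual information >= threshold."""
    names = list(data)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links up to the representative of the group.
        while parent[name] != name:
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if mutual_information(data[a], data[b]) >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Hypothetical binary ratings from eight respondents.
ratings = {
    "fresh":  [0, 0, 0, 0, 1, 1, 1, 1],
    "clean":  [0, 0, 0, 1, 1, 1, 1, 1],
    "bold":   [0, 1, 0, 1, 0, 1, 0, 1],
    "strong": [0, 1, 0, 1, 0, 1, 1, 1],
}

print(cluster_variables(ratings, threshold=0.3))
# [['bold', 'strong'], ['fresh', 'clean']]
```

Each resulting group would be a candidate for a latent factor. With real survey data, sample sizes are far larger and the threshold would need to be chosen with care, since mutual information estimated from small samples is biased upward.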

The Target Node

A Key Driver Analysis begins with a Target Node, which represents the outcome of interest. Examples include purchase intent, repurchase loyalty, employee satisfaction, recommendation intent, or probability of churn.

Once the target is defined, BayesiaLab can evaluate how other variables in the network relate to it. In a PSEM workflow, this analysis is often useful at two levels:

Factor-level drivers summarize broad concepts and support strategic interpretation.

Manifest-level drivers identify specific observed variables that may suggest concrete actions.

This distinction matters because latent factors are usually not directly actionable. They help explain the structure of the domain, while manifest variables often point to specific product, service, or policy improvements.

Target Analysis and Total Effects

BayesiaLab provides several tools for evaluating candidate drivers.

Target Mean Analysis displays response curves of the target as a function of driver values. This can reveal strong, weak, nonlinear, or “just-about-right” relationships, in which the target peaks at an intermediate driver level, that linear methods may miss.
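
As a rough empirical analogue of a response curve, the sketch below computes the mean of a target at each level of a driver directly from records. BayesiaLab derives such curves from the network by inference; this version simply averages hypothetical survey rows, and the field names and values are invented for illustration.

```python
from collections import defaultdict


def target_mean_curve(records, driver, target):
    """Mean of `target` at each observed level of `driver`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in records:
        sums[row[driver]] += row[target]
        counts[row[driver]] += 1
    return {level: sums[level] / counts[level] for level in sorted(sums)}


# Hypothetical survey rows: fragrance intensity (1-5) and purchase intent (1-5).
survey = [
    {"intensity": 1, "purchase_intent": 2}, {"intensity": 1, "purchase_intent": 3},
    {"intensity": 2, "purchase_intent": 3}, {"intensity": 2, "purchase_intent": 4},
    {"intensity": 3, "purchase_intent": 5}, {"intensity": 3, "purchase_intent": 4},
    {"intensity": 4, "purchase_intent": 4}, {"intensity": 4, "purchase_intent": 3},
    {"intensity": 5, "purchase_intent": 2}, {"intensity": 5, "purchase_intent": 2},
]

curve = target_mean_curve(survey, "intensity", "purchase_intent")
print(curve)  # {1: 2.5, 2: 3.5, 3: 4.5, 4: 3.5, 5: 2.0}
```

The curve rises and then falls, with its peak at the middle intensity level. A single linear coefficient fitted to these data would be close to zero and would hide that pattern.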

The Target Analysis Report provides tabular measures such as Total Effects on Target. Total Effect describes the change in the mean of the target node associated with a small change in the mean of a driver node.

Standardized Total Effect adjusts this measure by the standard deviations of the driver and target, making it easier to compare variables measured on different scales.

For observational data, these effects should be interpreted as associations unless the model has been extended with causal assumptions.
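
As a reading aid, the two measures described above could be written as follows. The notation is an assumption made here for illustration: X is a driver node, Y is the target node, and σ denotes a standard deviation. The exact computation in BayesiaLab may differ in detail.

```latex
\mathrm{TE}(X \to Y) \approx \frac{\Delta\,\mathbb{E}[Y]}{\Delta\,\mathbb{E}[X]},
\qquad
\mathrm{STE}(X \to Y) = \mathrm{TE}(X \to Y) \cdot \frac{\sigma_X}{\sigma_Y}
```

Under this reading, a Total Effect of 0.5 with σ_X = 1.2 and σ_Y = 0.8 would give a Standardized Total Effect of 0.5 × 1.2 / 0.8 = 0.75.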

Observational and Causal Driver Analysis

Many Key Driver Analysis projects are based on observational data. In these cases, Bayesian networks support rich probabilistic analysis, but they do not automatically establish causality.

The webinar Key Driver Analysis with Bayesian Networks — From Observational Data to Causal Inference illustrates this distinction with an HR analytics case study using the 2023 Federal Employee Viewpoint Survey. The analysis first identifies observational drivers of the target node Q43, “I recommend my organization as a good place to work.”

The webinar then shows how the model can be extended toward causal inference by incorporating external causal knowledge and applying the Disjunctive Cause Criterion. This enables policy simulations under explicit causal assumptions.

In practice, Key Driver Analysis can support three levels of analysis:

Observational analysis, which identifies probabilistic relationships with the target.
Causal analysis, which estimates intervention effects under causal assumptions.
Decision analysis, which incorporates utilities, costs, and trade-offs to evaluate policy or business options.
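
The difference between the first two levels can be shown with a small numerical sketch. It assumes a single, fully observed confounder Z and uses the standard adjustment formula to estimate an intervention effect. The variables and probabilities are hypothetical, and the sketch does not reproduce the webinar's model or BayesiaLab's implementation.

```python
# Hypothetical binary variables (all numbers invented for illustration):
#   Z = supportive manager (confounder), X = flexible schedule (driver),
#   Y = recommends the organization (target).
# Assumed structure: Z -> X, Z -> Y, X -> Y.
P_Z = {1: 0.5, 0: 0.5}
P_X1_GIVEN_Z = {1: 0.8, 0: 0.2}  # P(X=1 | Z)
# P(Y=1 | X, Z), keyed by (x, z)
P_Y1_GIVEN_XZ = {(1, 1): 0.9, (0, 1): 0.7, (1, 0): 0.5, (0, 0): 0.3}


def p_x_given_z(x, z):
    return P_X1_GIVEN_Z[z] if x == 1 else 1.0 - P_X1_GIVEN_Z[z]


def observational(x):
    """P(Y=1 | X=x): what we see among respondents who happen to have X=x."""
    numerator = sum(P_Z[z] * p_x_given_z(x, z) * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))
    denominator = sum(P_Z[z] * p_x_given_z(x, z) for z in (0, 1))
    return numerator / denominator


def interventional(x):
    """P(Y=1 | do(X=x)), adjusting for Z: sum over z of P(z) * P(Y=1 | x, z)."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


print(round(observational(1) - observational(0), 2))    # 0.44
print(round(interventional(1) - interventional(0), 2))  # 0.2
```

In this toy model the observed association (0.44) is more than twice the effect of actually setting X (0.20), because Z raises both X and Y. The adjustment is only valid under the stated assumption that Z captures all confounding.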

From Ranking to Optimization

A ranked list of drivers is useful, but it is rarely enough. A strong driver may already be near its practical maximum. Another variable may have a smaller effect but much more room for improvement. Some potential changes may be unrealistic, costly, or unsupported by the data.

BayesiaLab’s Target Dynamic Profile extends Key Driver Analysis from ranking to optimization. It searches for combinations of changes that improve the target while respecting constraints such as costs, variation domains, and joint probability.

The use of Joint Probability is especially important. It favors scenarios that are plausible in the data, avoiding recommendations based on unrealistic combinations of evidence. This shifts the analysis from “What matters?” to “What should we do first, given what is feasible?”
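
The role of the joint probability constraint can be sketched with a brute-force search over scenarios. This is not the Target Dynamic Profile algorithm; it is a toy illustration in which two hypothetical drivers, their joint distribution, the target means, the step costs, and the budget are all invented.

```python
from itertools import product

LEVELS = (1, 2, 3)
CURRENT = (1, 1)         # current levels of hypothetical drivers A and B
STEP_COST = (2.0, 1.0)   # assumed cost of raising A or B by one level

# Hypothetical joint probability of each (A, B) combination in the data.
JOINT_PROB = {
    (1, 1): 0.20, (1, 2): 0.10, (1, 3): 0.01,
    (2, 1): 0.10, (2, 2): 0.25, (2, 3): 0.09,
    (3, 1): 0.01, (3, 2): 0.09, (3, 3): 0.15,
}

# Hypothetical mean of the target for each (A, B) combination.
TARGET_MEAN = {
    (1, 1): 2.0, (1, 2): 2.4, (1, 3): 2.8,
    (2, 1): 2.9, (2, 2): 3.3, (2, 3): 3.7,
    (3, 1): 3.8, (3, 2): 4.2, (3, 3): 4.6,
}


def scenario_cost(scenario):
    """Cost of moving each driver up from its current level."""
    return sum(cost * max(new - old, 0)
               for cost, new, old in zip(STEP_COST, scenario, CURRENT))


def best_scenario(budget, min_joint_prob):
    """Highest-target scenario that is affordable and plausible in the data."""
    feasible = [s for s in product(LEVELS, repeat=2)
                if scenario_cost(s) <= budget and JOINT_PROB[s] >= min_joint_prob]
    return max(feasible, key=lambda s: TARGET_MEAN[s])


print(best_scenario(budget=4, min_joint_prob=0.0))   # (3, 1)
print(best_scenario(budget=4, min_joint_prob=0.05))  # (2, 3)
```

Without the plausibility constraint, the search picks a combination that almost never occurs in the data (probability 0.01). Requiring a minimum joint probability moves the recommendation to a slightly lower-scoring scenario that is actually observed, which is the trade-off described above.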

Examples

HR Analytics: Employee Satisfaction and Policy Simulation

The webinar Key Driver Analysis with Bayesian Networks — From Observational Data to Causal Inference demonstrates a workflow using the 2023 Federal Employee Viewpoint Survey.

The case study uses Bayesian network learning, latent factors, and observational Key Driver Analysis to identify opportunities for improving employee satisfaction. It then extends the analysis toward causal policy simulation, including a hypothetical return-to-office policy.

Automotive Market Research: Loyalty Driver Analysis

The tutorial Optimizing Customer Loyalty applies Bayesian networks to auto buyer survey data with more than 100 product-attribute ratings.

Because these ratings are highly correlated, traditional methods that are sensitive to collinearity can struggle to produce stable driver rankings. The Bayesian network approach models these dependencies directly and identifies loyalty drivers at the market, segment, and vehicle-model levels.

The tutorial also shows how optimization can generate model-specific recommendations while taking joint probability into account to avoid unrealistic improvement scenarios.

Consumer Product Optimization: Purchase Intent

Chapter 8: Probabilistic Structural Equation Models for Key Driver Analysis presents a consumer survey example involving 1,320 women evaluating 11 fragrances.

The PSEM uses latent variables to summarize fragrance attributes, imagery attributes, and intensity. The example shows how Target Mean Analysis can reveal nonlinear relationships, such as an optimal “just-about-right” level of intensity for maximizing purchase intent.