
Consistency

Context

The Consistency (also called Conflict) metric compares two joint probabilities of an n-dimensional evidence set E:

  • the joint probability of the reference model, i.e. the current Bayesian network that represents the dependencies between the variables,
  • the joint probability of the "straw" model, i.e. the fully unconnected network, in which all the variables are marginally independent.

Consistency(E) = LLS(E) - LLB(E)

where LLS(E) and LLB(E) are the log-losses associated with the evidence E by using the straw model (S) and the reference model (B), respectively.

The log-loss reflects the cost (in bits) per instance of using a model, i.e. LL(E) = -log2(P(E)) is the number of bits needed to encode the piece of evidence E given its probability under that model. The lower the probability, the higher the log-loss.

The Consistency is positive if the joint probability computed with the reference model is higher than the one computed with the straw model. Conversely, a negative Consistency indicates a so-called Conflict: the joint probability of the evidence is higher under the fully independent straw model than under the model that includes the dependencies.
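As a minimal sketch (plain Python, not BayesiaLab functionality), these definitions can be expressed directly in terms of the joint probabilities returned by the two models; the probabilities used below are those of the example in the next section:

```python
from math import log2

def log_loss(p: float) -> float:
    """Log-loss in bits of a piece of evidence with probability p."""
    return -log2(p)

def consistency(p_reference: float, p_straw: float) -> float:
    """Consistency/Conflict of an evidence set E: LLS(E) - LLB(E)."""
    return log_loss(p_straw) - log_loss(p_reference)

# Evidence that is more probable under the reference model is consistent (> 0);
# evidence that is less probable under the reference model is conflicting (< 0).
print(round(consistency(0.45, 0.25), 3))  #  0.848
print(round(consistency(0.05, 0.25), 3))  # -2.322
```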

Note that conflicting evidence does not necessarily mean that the reference model is wrong. Rather, it usually indicates that this evidence belongs to the tail of the distribution represented by the reference model. However, if the evidence sets are drawn from the reference model's distribution, the probability of observing conflicting evidence should be lower than the probability of observing consistent evidence. Thus, the mean of the Consistencies has to be positive; otherwise, the reference model does not fit the joint probability distribution sampled by the evidence.

The stronger the dependencies represented in the reference model, the larger the difference between the probabilities of conflicting and consistent evidence, and the larger the Consistency/Conflict values. The Consistency Mean is therefore a good way to measure the overall strength of a model.

Example

Let's suppose we have a reference structure with two Boolean nodes, N1 and N2.


Four different 2-dimensional evidence sets can be generated from this model. With a strong dependency between N1 and N2, the joint probabilities and Consistency values are:

| Evidence | Joint Probability - Reference | Joint Probability - Straw | Consistency/Conflict |
|---|---|---|---|
| E1 | 0.45 | 0.25 | 0.848 |
| E2 | 0.05 | 0.25 | -2.322 |
| E3 | 0.05 | 0.25 | -2.322 |
| E4 | 0.45 | 0.25 | 0.848 |
| Mean | | | 0.531 |
With a much weaker dependency between N1 and N2, the same four evidence sets yield:

| Evidence | Joint Probability - Reference | Joint Probability - Straw | Consistency/Conflict |
|---|---|---|---|
| E1 | 0.275 | 0.25 | 0.138 |
| E2 | 0.225 | 0.25 | -0.152 |
| E3 | 0.225 | 0.25 | -0.152 |
| E4 | 0.275 | 0.25 | 0.138 |
| Mean | | | 0.007 |
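The means in these tables are the expected Consistency values under each reference model, obtained by weighting each evidence set by its joint probability in that model. A short sketch (plain Python, assuming exactly that weighting) reproduces them:

```python
from math import log2

def consistency(p_ref: float, p_straw: float) -> float:
    # log2(P_B(E) / P_S(E)) is the same quantity as LLS(E) - LLB(E)
    return log2(p_ref / p_straw)

def consistency_mean(joint_ref, joint_straw):
    """Expected Consistency, weighting each evidence set by its
    joint probability under the reference model."""
    return sum(p * consistency(p, q) for p, q in zip(joint_ref, joint_straw))

strong = [0.45, 0.05, 0.05, 0.45]      # strongly dependent N1 and N2
weak = [0.275, 0.225, 0.225, 0.275]    # weakly dependent N1 and N2
straw = [0.25, 0.25, 0.25, 0.25]       # fully unconnected model

print(round(consistency_mean(strong, straw), 3))  # 0.531
print(round(consistency_mean(weak, straw), 3))    # 0.007
```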

Prior to version 6.0, the Consistency/Conflict metric was only available in:

New Feature: Consistency as a Network Performance Metric


This feature compares two density graphs, with the x-axis representing the log-loss of the n-dimensional evidence sets stored in the dataset(s) associated with the current network:

  • the blue bars correspond to the reference model,
  • the red bars correspond to the straw model.

If the reference model is a good representation of the joint probability distribution sampled by the evidence contained in the associated dataset, the blue bars should be to the left of, and clearly separated from, the red bars. Bars further to the left on the x-axis indicate lower log-loss values, i.e., higher joint probabilities.
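The chart itself is generated by BayesiaLab; the following sketch (plain Python with NumPy and Matplotlib, reusing the two-node example above) only illustrates the idea behind it, i.e. comparing the distributions of log-losses under the reference and straw models for evidence sampled from the reference model:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Joint probabilities of the four evidence sets under the reference model
# (strongly dependent N1, N2) and under the straw model (independent nodes).
p_ref = np.array([0.45, 0.05, 0.05, 0.45])
p_straw = np.array([0.25, 0.25, 0.25, 0.25])

# Sample 5,000 evidence sets from the reference model's joint distribution.
rows = rng.choice(4, size=5000, p=p_ref)

# Log-loss (in bits) of each sampled evidence set under both models.
ll_ref = -np.log2(p_ref[rows])
ll_straw = -np.log2(p_straw[rows])

plt.hist(ll_ref, bins=20, alpha=0.6, color="blue", label="reference model")
plt.hist(ll_straw, bins=20, alpha=0.6, color="red", label="straw model")
plt.xlabel("log-loss (bits)")
plt.ylabel("number of evidence sets")
plt.legend()
plt.show()
```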

As before, the straw model is the fully unconnected network. However, if the network contains discretized variables, we have a choice regarding the discretization. We can either keep the same discretization for both networks, or choose an alternative discretization algorithm for the straw model, with the requested number of bins being the same as the number of bins in the reference model.


Beyond the graphical comparison, the following Consistency Gain can be used to quantify the discrepancy between the reference model (B) and the straw model (S):

C(E) = LLS(E) - LLB(E)

for the evidence E, and

C = (1/N) × Σi [LLS(Ei) - LLB(Ei)]

for the entire data set, where N is the number of rows.
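A minimal sketch of these two formulas (plain Python; the dataset below is hypothetical and only serves as an illustration):

```python
from math import log2

def consistency_gain(p_ref: float, p_straw: float) -> float:
    """C(E) = LLS(E) - LLB(E) for a single n-dimensional evidence set E."""
    return -log2(p_straw) - (-log2(p_ref))

def dataset_consistency_gain(rows) -> float:
    """Mean Consistency Gain over the N rows of the associated dataset,
    where each row is a pair (P_B(E), P_S(E)) of joint probabilities."""
    return sum(consistency_gain(p_b, p_s) for p_b, p_s in rows) / len(rows)

# Hypothetical dataset with N = 4 rows.
rows = [(0.45, 0.25), (0.45, 0.25), (0.05, 0.25), (0.45, 0.25)]
print(round(dataset_consistency_gain(rows), 3))  # 0.056
```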

Example

Again, we use the simple reference structure with two Boolean nodes, N1 and N2.

Here, the associated dataset contains the four different weighted pieces of evidence that can be generated from the joint probability distribution represented by the strongly connected version of the reference model.

| Evidence | Joint Probability - Reference | Joint Probability - Straw | Log-Loss - Reference | Log-Loss - Straw |
|---|---|---|---|---|
| E1 | 0.45 | 0.25 | 1.152 | 2 |
| E2 | 0.05 | 0.25 | 4.322 | 2 |
| E3 | 0.05 | 0.25 | 4.322 | 2 |
| E4 | 0.45 | 0.25 | 1.152 | 2 |
| Consistency Mean | | | | 0.531 |

Now, the associated dataset contains the four different weighted pieces of evidence that can be generated from the joint probability distribution represented by the weakly connected version of the reference model.

| Evidence | Joint Probability - Reference | Joint Probability - Straw | Log-Loss - Reference | Log-Loss - Straw |
|---|---|---|---|---|
| E1 | 0.275 | 0.25 | 1.862 | 2 |
| E2 | 0.225 | 0.25 | 2.152 | 2 |
| E3 | 0.225 | 0.25 | 2.152 | 2 |
| E4 | 0.275 | 0.25 | 1.862 | 2 |
| Consistency Mean | | | | 0.007 |

New: Relationship Analysis Report


As shown above, the Consistency Mean offers a good way to express the strength of the dependencies represented by the reference model. However, this metric can only be computed if data is available to sample the joint probability distribution.

If no data is associated with the model, the Consistency Mean can be efficiently approximated by computing the average of two quantities: the sum of all Arc Forces (Kullback-Leibler Divergences) and the sum of the Mutual Information between all pairs of nodes that are connected by an arc. This Consistency Estimate is available in the first part of the Relationship Analysis Report:
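For the two-node example above, this estimate can be checked by hand: with a single arc, the Arc Force (the Kullback-Leibler divergence incurred by removing the arc) coincides with the Mutual Information between N1 and N2, and both equal the Consistency Mean of 0.531. A minimal sketch (plain Python, not the Relationship Analysis Report itself):

```python
from math import log2

# Joint distribution of the strongly connected two-node example.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}

# Marginal distributions of N1 and N2; their product defines the straw model.
p_n1 = {v: sum(p for (a, _), p in joint.items() if a == v) for v in (0, 1)}
p_n2 = {v: sum(p for (_, b), p in joint.items() if b == v) for v in (0, 1)}

# Mutual Information I(N1; N2) = KL(joint || product of the marginals),
# i.e. the Arc Force of the single arc connecting N1 and N2.
mi = sum(p * log2(p / (p_n1[a] * p_n2[b])) for (a, b), p in joint.items())

print(round(mi, 3))  # 0.531, matching the Consistency Mean of the example
```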


Copyright © 2025 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.