
Structural Coefficient

Context

  • BayesiaLab utilizes proprietary score-based learning algorithms.
  • As opposed to constraint-based algorithms, which use independence tests to add or remove arcs between nodes, BayesiaLab employs the Minimum Description Length Score (MDL Score) to measure the quality of candidate networks with respect to the available data.

Structural Coefficient

  • In BayesiaLab, the computation of the MDL Score also includes the so-called Structural Coefficient α as a weighting factor for the structural component DL(B).

  • With that, the MDL Score is calculated using the following formula:

    MDL(B, D) = \alpha \times DL(B) + DL(D|B)

  • As a result, the choice of value for the Structural Coefficient α affects the relative weighting of the two components DL(B) and DL(D|B).

  • You can set the Structural Coefficient α to any value within the range of 0 to 150.

  • α = 1, the default value, means that the components DL(B) and DL(D|B) are weighted equally.

  • α < 1 reduces the contribution of DL(B) in the MDL Score formula and, thus, allows for more "structural complexity."

  • α > 1 increases the contribution of DL(B) in the MDL Score formula, i.e., it penalizes "structural complexity," forcing a simpler model (see the sketch below).
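
To make the weighting concrete, the following minimal Python sketch compares two candidate networks under different values of α. The description lengths are made-up numbers chosen for illustration; they are not values computed by BayesiaLab.

```python
# Minimal sketch of the weighted MDL score, MDL(B, D) = alpha * DL(B) + DL(D|B).
# The description lengths below are hypothetical, purely for illustration.

def mdl_score(dl_structure: float, dl_data_given_structure: float, alpha: float) -> float:
    """Weighted MDL score: alpha * DL(B) + DL(D|B)."""
    return alpha * dl_structure + dl_data_given_structure

# Two hypothetical candidate networks (description lengths in bits):
simple_net = {"DL(B)": 40.0, "DL(D|B)": 980.0}    # fewer arcs, poorer fit to the data
complex_net = {"DL(B)": 120.0, "DL(D|B)": 930.0}  # more arcs, better fit to the data

for alpha in (0.5, 1.0, 2.0):
    score_simple = mdl_score(simple_net["DL(B)"], simple_net["DL(D|B)"], alpha)
    score_complex = mdl_score(complex_net["DL(B)"], complex_net["DL(D|B)"], alpha)
    preferred = "complex" if score_complex < score_simple else "simple"
    print(f"alpha={alpha}: MDL(simple)={score_simple:.1f}, "
          f"MDL(complex)={score_complex:.1f} -> prefer the {preferred} network")
```

With these particular numbers, the more complex network is preferred only at α = 0.5; at α = 1 and above, the preference shifts to the simpler structure, mirroring the behavior described in the bullets above.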

  • There is another way to interpret the Structural Coefficient α, which can help understand its role in learning a Bayesian network.

  • Weighting DL(B) with a factor α is equivalent to changing the original number of observations N in a dataset to a new number of observations N′ (an intuition sketch follows the formula):

    N' = \frac{N}{\alpha}
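
    One informal way to see this equivalence (a sketch added here for intuition, not a formal derivation): the data component DL(D|B) typically grows in proportion to the number of observations, say DL(D|B) ≈ N × h(B), where h(B) is the average encoding cost per observation under network B. Dividing the entire score by α does not change which network minimizes it, and

    \frac{MDL(B, D)}{\alpha} = DL(B) + \frac{N}{\alpha} \times h(B) = DL(B) + N' \times h(B)

    So, up to the comparatively weak (logarithmic) dependence of DL(B) on the sample size, minimizing the weighted score amounts to minimizing the unweighted MDL Score on N′ observations.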

  • An α value of 0 would be the same as having an infinite number of observations N′. As a result, the MDL Score would be based only on the fit component of the score, i.e., DL(D|B), and BayesiaLab's structural learning algorithms would produce a fully connected network.

  • At the other extreme, an α value of 150 would massively favor the simplest possible network structures, as the new equivalent number of observations N′ would be only 1/150th of N.

  • It is perhaps more intuitive to consider the new number of observations N′ as a weighted count of the actual observations N. For instance, α = 0.5 is equivalent to counting all observations twice.

  • From a practical perspective, the Structural Coefficient α can be considered a kind of "significance" threshold for structural learning.

    • The higher you set the α value, the higher the threshold for discovering probabilistic relationships. Conversely, the lower you set the α value, the lower the discovery threshold, so even weaker probabilistic relationships will still be found and represented by arcs (a generic illustration follows this list).
    • Reducing α can be helpful if you have a small dataset from which you want to learn a model. At the default value, α = 1, the learning algorithm might not find any arcs at all.
    • However, choosing too low a value might result in "overfitting", i.e., learning "insignificant" relationships, in other words, discovering patterns in what turns out to be mere noise.
    • BayesiaLab can help reduce the risk of overfitting with the Structural Coefficient Analysis feature.
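
The Python sketch below illustrates this threshold behavior in a generic way. It uses synthetic data for two binary variables, a negative log-likelihood as a stand-in for DL(D|B), and a BIC-style parameter-count penalty as a stand-in for DL(B). It is not BayesiaLab's learning algorithm or its exact scoring terms; it only shows how increasing α raises the bar for adding an arc.

```python
import numpy as np

rng = np.random.default_rng(0)

def data_dl(counts):
    """Description length of data under a multinomial model (negative log-likelihood, in nats)."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    probs = np.where(counts > 0, counts / total, 1.0)  # zero counts contribute nothing
    return -np.sum(counts * np.log(probs))

def dl_data_empty(x, y):
    """Stand-in for DL(D|B) under the empty graph: X and Y modeled independently."""
    return data_dl(np.bincount(x, minlength=2)) + data_dl(np.bincount(y, minlength=2))

def dl_data_arc(x, y):
    """Stand-in for DL(D|B) under the graph X -> Y: P(X) plus P(Y | X)."""
    dl = data_dl(np.bincount(x, minlength=2))
    for xv in (0, 1):
        dl += data_dl(np.bincount(y[x == xv], minlength=2))
    return dl

n = 500
# BIC-style stand-in for DL(B): 0.5 * log(n) per free parameter
dl_b_empty = 2 * 0.5 * np.log(n)  # parameters: P(X=1), P(Y=1)
dl_b_arc = 3 * 0.5 * np.log(n)    # parameters: P(X=1), P(Y=1|X=0), P(Y=1|X=1)

for strength in (0.0, 0.1, 0.3):  # 0.0 means X and Y are independent
    x = rng.integers(0, 2, n)
    y = np.where(rng.random(n) < 0.5 + strength, x, 1 - x)  # P(Y = X) = 0.5 + strength
    for alpha in (0.5, 1.0, 2.0, 10.0):
        mdl_empty = alpha * dl_b_empty + dl_data_empty(x, y)
        mdl_arc = alpha * dl_b_arc + dl_data_arc(x, y)
        decision = "add arc X -> Y" if mdl_arc < mdl_empty else "keep nodes unconnected"
        print(f"strength={strength}, alpha={alpha:>4}: {decision}")
```

In a toy score like this, only sufficiently strong dependencies between X and Y survive larger values of α, which mirrors the "significance threshold" interpretation above; pushing α toward very small values would eventually admit arcs even between independent variables, which is the overfitting risk mentioned in the last bullet.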

Copyright © 2024 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.