Links

Structural Coefficient

Context

  • BayesiaLab utilizes proprietary score-based learning algorithms.
  • As opposed to the constraint-based algorithms that use independence tests for adding or removing arcs between nodes, BayesiaLab employs the Minimum Description Length Score (MDL Score) to measure the quality of candidate networks with respect to the available data.

Structural Coefficient

  • In BayesiaLab, the computation of the MDL Score also includes the so-called Structural Coefficient
    α\alpha
    as a weighting factor for the structural component
    DL(B)DL(B)
    .
  • With that, the MDL Score is calculated using the following formula:
MDL(B,D)=α×DL(B)+DL(DB)MDL(B,D) = \alpha \times DL(B) + DL(D|B)
  • As a result, the choice of value for the Structural Coefficient
    α\alpha
    affects the relative weighting of the two components
    DL(B)DL(B)
    and
    DL(DB)DL(D|B)
    .
  • You can arbitrarily modify the Structural Coefficient
    α\alpha
    within the range of 0 to 150.
  • α=1\alpha = 1
    , the default value means the components
    DL(B)DL(B)
    and
    DL(DB)DL(D|B)
    are weighted equally.
  • α<1\alpha < 1
    reduces the contribution of
    DL(B)DL(B)
    in the MDL Score formula and, thus, allows for more "structural complexity."
  • α>1\alpha > 1
    increases the contribution of
    DL(B)DL(B)
    in the MDL Score formula, i.e., it penalizes "structural complexity", forcing a simpler model.
  • There is another way to interpret the Structural Coefficient
    α\alpha
    , which can help understand its role in learning a Bayesian network.
    • Weighting
      DL(B)DL(B)
      with a factor
      α\alpha
      is equivalent to changing the original number of observations N in a dataset to a new number of observations N′:
N=NαN' = \frac{N}{\alpha }
  • An
    α\alpha
    value of 0 would be the same as having an infinite number of observations
    NN'
    . As a result, the MDL Score would only be based on the fit component of the score, i.e.,
    DL(DB)DL(D|B)
    , and BayesiaLab's structural learning algorithms would produce a fully connected network.
  • At the other extreme, an
    α\alpha
    value of 150 would massively favor the simplest possible network structures as the new equivalent number of observations
    NN'
    would only 1/150th of
    N N
    .
  • It is perhaps more intuitive to consider the new number of observations N′ as weighted counts of the actual observations
    N N
    . For instance,
    α=0.5\alpha = 0.5
    is equivalent to counting all observations twice.
  • From a practical perspective, the Structural Coefficient
    α\alpha
    can be considered a kind of "significance" threshold for structural learning.
    • The higher you set the
      α\alpha
      value, the higher the threshold for discovering probabilistic relationships. Conversely, the lower you set the
      α\alpha
      value, the lower the discovery threshold and the weaker probabilistic relationship would still be found and represented by an arc.
    • Reducing α can be helpful if you have a small dataset from which you want to learn a model. Perhaps at the default value,
      α=1\alpha = 1
      , the learning algorithm would not find any arcs.
    • However, choosing too low a value might result in "overfitting", i.e., learning "insignificant" relationships, in other words, discovering patterns in what turns out to be mere noise.
    • BayesiaLab can help reduce the risk of overfitting with the Structural Coefficient Analysis feature.