Skip to Content

Calculating Complexity: DL(B)

Description Length of the Bayesian Network — DL(B)

DL(B)DL(B) is the number of bits required to represent a Bayesian network. We can break down this value into the sum of two components:

DL(G)DL(G), which stands for the number of bits required to represent the graph GG of the Bayesian network,

DL(PG)DL(P|G) represents the number of bits required to represent the set of probability tables PP.

DL(B)=DL(G)+DL(PG)DL(B) = DL(G) + DL(P|G)

Calculating DL(G)

To calculate DL(G)DL(G), we need to determine the number of nodes and the number of their parent nodes.

DL(G)=i=1n(log2(n)+log2(nPai))DL(G) = \sum_{i=1}^n \left( \log_2(n) + \log_2 \binom{n}{\|P_{a_i}\|} \right)

where

nn is the number of random variables (nodes): X1,...,Xn{X_1},...,{X_n}

PaiP{a_i} is the set of the random variables that are parents of Xi{X_i} in graph GG

and PaiP{a_i} is the number of parents of the random variable Xi{X_i}.

Calculating DL(P|G)

Computing DL(PG)DL(P|G) is straightforward as it is proportional to the number of cells in all probability tables.

DL(PG)=in(jPaiSj×(Si1)×DL(p))DL(P|G) = \sum\limits_i^n {\left( {\prod\limits_j^{\left\| {P{a_i}} \right\|} {{S_j} \times ({S_i} - 1) \times DL(p)} } \right)}

where

Si{S_i} is the number of states of the random variable Xi{X_i}

pp is the probability associated with the cell.

As the probability p cannot be known prior to learning the network, we use the following classical heuristic in BayesiaLab:

DL(p)=log2(N)2DL(p) = \frac{{{{\log }_2}(N)}}{2}