# Information Gain

### Definition

The Information Gain regarding evidence $E$ is the difference between the:

• Log-Loss $L{L_U}\left( E \right)$, given an unconnected network $U$, i.e., a so-called straw model, in which all nodes are marginally independent;

• Log-Loss $L{L_B}\left( E \right)$ given a reference network $B$.

$IG_B(E) = {\log _2}\left( {{{P({e_1},...,{e_n})} \over {\prod\limits_{i = 1}^n {P({e_i})} }}} \right) = L{L_U}(E) - L{L_B}(E)$

In earlier versions of BayesiaLab, Information Gain was named Consistency.

### Interpretation

The Log-Loss reflects the "cost" in bits of applying the network $B$ to evidence $E$, i.e., the number of bits that are needed to encode evidence $E$. The lower the probability of evidence $E$, the higher the Log-Loss.

As a result, a positive value of Information Gain would reflect a "cost-saving" for encoding evidence $E$ by virtue of having network $B$. In other words, encoding $E$ with network $B$ is less "costly" than encoding it with the straw model $U$. Therefore, evidence $E$ would be consistent with network $B$.

Conversely, a negative Information Gain indicates a so-called conflict, Log-Loss of evidence $E$ is higher with the straw model $U$ compared to the reference network $B$. Note that conflicting evidence does not necessarily mean that the reference network is wrong. Rather, it probably indicates that such a set of evidence belongs to the tail of the distribution that is represented by the reference network $B$.

However, if evidence $E$ is drawn from the original data on which the reference network $B$ was originally learned, the probability of observing conflicting evidence should be smaller than the probability of observing consistent evidence.

So, for a network model to be useful, there should generally be more sets of evidence with a positive Information Gain, i.e., consistent observations, than sets of evidence with a negative Information Gain, i.e., conflicting observations.

Therefore, the mean value of the Information Gain of a reference network $B$ compared to a straw model $U$ is a useful performance indicator of the reference network $B$.

### Related BayesiaLab Functions

• Information Gain and Evidence Analysis

• Network Performance

Last updated

Bayesia USA

info@bayesia.us

Bayesia S.A.S.

info@bayesia.com

Bayesia Singapore

info@bayesia.com.sg