# Information Gain

### Definition

The Information Gain regarding evidence $E$ is the difference between:
• the Log-Loss $LL_U(E)$ given an unconnected network $U$, i.e., a so-called straw model, in which all nodes are marginally independent;
• the Log-Loss $LL_B(E)$ given a reference network $B$.

$IG_B(E) = \log_2\left(\frac{P(e_1, \ldots, e_n)}{\prod_{i=1}^{n} P(e_i)}\right) = LL_U(E) - LL_B(E)$

In earlier versions of BayesiaLab, Information Gain was named Consistency.
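The definition above can be sketched in a few lines of Python. This is only an illustration, not BayesiaLab's implementation: the function names `log_loss` and `information_gain` are hypothetical, the probabilities are made-up numbers, and the Log-Loss is assumed to be the negative base-2 logarithm of the probability of the evidence.

```python
import math


def log_loss(probability):
    """Log-Loss in bits: the negative base-2 logarithm of a probability."""
    return -math.log2(probability)


def information_gain(joint_probability, marginal_probabilities):
    """Information Gain in bits, computed as LL_U(E) - LL_B(E)."""
    # Straw model U: all nodes are independent, so the Log-Losses of the
    # individual findings e_1, ..., e_n simply add up.
    ll_u = sum(log_loss(p) for p in marginal_probabilities)
    # Reference network B: Log-Loss of the joint probability of the evidence.
    ll_b = log_loss(joint_probability)
    return ll_u - ll_b


# Hypothetical numbers: P(e1, e2) = 0.4 under B, and P(e1) = P(e2) = 0.5.
ig = information_gain(0.4, [0.5, 0.5])
print(f"Information Gain: {ig:.3f} bits")  # prints "Information Gain: 0.678 bits"
```

With these numbers, the straw model needs 2 bits to encode the evidence, whereas the reference network needs about 1.32 bits, so the Information Gain is positive.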

### Interpretation

The Log-Loss reflects the "cost" in bits of applying the network $B$ to evidence $E$, i.e., the number of bits that are needed to encode evidence $E$. The lower the probability of evidence $E$, the higher the Log-Loss.
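As a brief worked illustration, assume the Log-Loss is the negative base-2 logarithm of the probability that the network assigns to the evidence, which is what the formula in the Definition implies. The notation $P_B(E)$ for that probability is introduced here only for illustration:

$LL_B(E) = -\log_2 P_B(E)$

With hypothetical values, $P_B(E) = 0.5$ gives $LL_B(E) = 1$ bit, whereas the less probable $P_B(E) = 0.125$ gives $LL_B(E) = 3$ bits.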

As a result, a positive value of Information Gain reflects a "cost-saving" for encoding evidence $E$ by virtue of having network $B$. In other words, encoding $E$ with network $B$ is less "costly" than encoding it with the straw model $U$. In that case, evidence $E$ is said to be consistent with network $B$.

Conversely, a negative Information Gain indicates a so-called conflict: the Log-Loss of evidence $E$ is higher with the reference network $B$ than with the straw model $U$. Note that conflicting evidence does not necessarily mean that the reference network is wrong. Rather, it probably indicates that such a set of evidence belongs to the tail of the distribution that is represented by the reference network $B$.
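To make the distinction between consistent and conflicting evidence concrete, the following sketch evaluates every possible set of evidence for a hypothetical two-node network. The joint distribution is invented for illustration, the helper names are not BayesiaLab functions, and the straw model is assumed to use the marginal probabilities implied by that joint distribution.

```python
import math

# Hypothetical joint distribution of two binary nodes under network B.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def marginal(joint, node, state):
    """Marginal probability that `node` is in `state`."""
    return sum(p for states, p in joint.items() if states[node] == state)


def information_gain(joint, evidence):
    """log2 of the joint probability over the product of the marginals."""
    independent = math.prod(marginal(joint, i, s) for i, s in enumerate(evidence))
    return math.log2(joint[evidence] / independent)


results = {e: information_gain(joint, e) for e in joint}
for evidence, ig in results.items():
    if ig > 0:
        label = "consistent"
    elif ig < 0:
        label = "conflicting"
    else:
        label = "neutral"
    print(evidence, f"{ig:+.3f} bits", label)
```

In this example, the two probable combinations, `(0, 0)` and `(1, 1)`, have a positive Information Gain, while the two improbable combinations, `(0, 1)` and `(1, 0)`, have a negative one. The conflicting combinations are still possible under the network; they simply lie in the tail of its distribution.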

However, if evidence $E$ is drawn from the data on which the reference network $B$ was learned, the probability of observing conflicting evidence should be smaller than the probability of observing consistent evidence.

So, for a network model to be useful, there should generally be more sets of evidence with a positive Information Gain, i.e., consistent observations, than sets of evidence with a negative Information Gain, i.e., conflicting observations.

Therefore, the mean value of the Information Gain of a reference network $B$ compared to a straw model $U$ is a useful performance indicator of the reference network $B$.
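The mean Information Gain over a data set can be sketched as follows. This is a minimal illustration under stated assumptions, not BayesiaLab's Network Performance computation: the ten observations are invented, the reference network is assumed to reproduce the empirical joint frequencies exactly, and the straw model is assumed to use the empirical marginal frequencies.

```python
import math
from collections import Counter

# Hypothetical data set of ten observations over two binary nodes.
observations = [(0, 0)] * 4 + [(1, 1)] * 4 + [(0, 1)] + [(1, 0)]

# Assumption: B reproduces the empirical joint frequencies exactly, and
# U uses the empirical marginal frequency of each node state.
n = len(observations)
joint = {e: c / n for e, c in Counter(observations).items()}


def marginal(node, state):
    """Marginal probability that `node` is in `state`."""
    return sum(p for states, p in joint.items() if states[node] == state)


def information_gain(evidence):
    """log2 of the joint probability over the product of the marginals."""
    independent = math.prod(marginal(i, s) for i, s in enumerate(evidence))
    return math.log2(joint[evidence] / independent)


gains = [information_gain(e) for e in observations]
mean_ig = sum(gains) / n
n_consistent = sum(g > 0 for g in gains)
n_conflicting = sum(g < 0 for g in gains)

print(f"mean Information Gain: {mean_ig:.3f} bits")
print(f"consistent: {n_consistent}, conflicting: {n_conflicting}")
```

Here, eight of the ten observations are consistent and two are conflicting, and the mean Information Gain is positive (about 0.278 bits), which is the pattern expected when the evidence comes from the data on which the network was learned.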

### Related BayesiaLab Functions

• Information Gain and Evidence Analysis
• Network Performance  