
Compression

Context

  • The fit of a model to a dataset and the efficiency of encoding the dataset with a model are closely related concepts.

  • In this context, the "compression" achieved with a model can be used as a performance measure.

  • Under Information Gain and Evidence Analysis, we discussed the Information Gain of a network with regard to a single set of evidence $E$:

    • The Information Gain regarding evidence $E$ is the difference between the:

      • Log-Loss $LL_U(E)$, given an unconnected network $U$, i.e., a so-called straw model, in which all nodes are marginally independent;
      • Log-Loss $LL_B(E)$, given the current network $B$.
    • As a result, a positive value of Information Gain reflects a "cost-saving" for encoding the evidence $E$ by virtue of having the network $B$. In other words, encoding $E$ with network $B$ is less "costly" than without it (see the sketch following this list).
  • In Validation Mode (F5), select Menu > Analysis > Network Performance > Compression.

  • A new report window opens up featuring a graph plus a range of metrics.
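
BayesiaLab computes these quantities internally; purely for illustration, the following minimal Python sketch shows the arithmetic for a single evidence set $E$, with hypothetical probabilities `p_B` and `p_U` standing in for the joint probabilities that networks $B$ and $U$ assign to $E$.

```python
import math

# Hypothetical joint probabilities that each network assigns to one evidence set E.
p_B = 1e-4   # P_B(E), assumed value for illustration
p_U = 5e-7   # P_U(E), assumed value for illustration

# Log-Loss: the number of bits needed to encode E under each model.
ll_B = -math.log2(p_B)
ll_U = -math.log2(p_U)

# Information Gain: the "cost-saving" from encoding E with network B.
ig_B = ll_U - ll_B

print(f"LL_B(E) = {ll_B:.2f} bits")   # ~13.29
print(f"LL_U(E) = {ll_U:.2f} bits")   # ~20.93
print(f"IG_B(E) = {ig_B:.2f} bits")   # ~7.64
```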

The report window contains two histograms of the Log-Loss values computed from all observations in the dataset given:

  • the "current model", i.e., the to-be-evaluated Bayesian network $B$ (blue bars);
  • the "straw model", i.e., the unconnected network $U$ (red bars).

Furthermore, the report window includes numerous related measures:

  • Entropy $H_B$, based on the current model.
  • Entropy $H_U$, based on the "straw model."
  • Mean Information Gain, i.e., the arithmetic mean of the Information Gain of each observation/evidence $E$ in the dataset.
  • Mean Compression, i.e., the arithmetic mean of the Compression of each observation/evidence $E$ in the dataset.
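
The histograms themselves are generated by BayesiaLab. As a rough illustration only, the following Python sketch overlays the two Log-Loss distributions with the report's blue/red color coding, using the ten per-observation values from the example table later in this section.

```python
import matplotlib.pyplot as plt

# Per-observation Log-Loss values taken from the example table below.
ll_current = [13.42, 13.55, 11.93, 11.92, 11.81, 13.69, 12.91, 13.21, 11.16, 10.85]
ll_straw = [22.06, 21.68, 19.40, 17.73, 17.73, 16.93, 16.93, 16.93, 14.70, 14.70]

# Overlay the two distributions, mirroring the report's color coding.
plt.hist(ll_current, bins=8, color="blue", alpha=0.6, label="Current network B")
plt.hist(ll_straw, bins=8, color="red", alpha=0.6, label="Straw model U")
plt.xlabel("Log-Loss (bits)")
plt.ylabel("Number of observations")
plt.legend()
plt.show()
```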

Compression

Compression is a new concept in this context. It is defined as:

$$Cmpr_B(E) = \frac{IG_B(E)}{LL_U(E)}$$

So, by dividing the Information Gain $IG_B(E)$ by the Log-Loss $LL_U(E)$, we obtain the Compression measure.
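
For instance, for the first observation in the table below, $IG_B(E) = 8.63$ and $LL_U(E) = 22.06$, so:

$$Cmpr_B(E) = \frac{8.63}{22.06} \approx 0.39 = 39\%$$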

The following table illustrates the calculation of all measures.

We use the same data and network as in the example in Overall Network Performance.

The first six columns contain the evidence $E$ from the dataset; the remaining four columns contain the computed measures.

| Month | Hour | Temperature | Shortwave Radiation (W/m²) | Wind Speed (m/s) | Energy Demand (MWh) | Log-Loss (Bayesian Network) $LL_B(E)$ | Log-Loss (Unconnected Network) $LL_U(E)$ | Information Gain $IG_B(E) = LL_U(E) - LL_B(E)$ | Compression $Cmpr_B(E) = \frac{IG_B(E)}{LL_U(E)}$ |
|---|---|---|---|---|---|---|---|---|---|
| 8 | 18 | 36.5 | 721 | 3.62 | 1574 | 13.42 | 22.06 | 8.63 | 39% |
| 8 | 19 | 36.04 | 105.91 | 1.9 | 1574 | 13.55 | 21.68 | 8.13 | 38% |
| 8 | 20 | 34.71 | 42.72 | 2.14 | 1485 | 11.93 | 19.4 | 7.47 | 39% |
| 8 | 21 | 33.94 | 0 | 2.75 | 1470 | 11.92 | 17.73 | 5.81 | 33% |
| 8 | 22 | 33.19 | 0 | 3.55 | 1378 | 11.81 | 17.73 | 5.92 | 33% |
| 8 | 23 | 32.38 | 0 | 4.21 | 1249 | 13.69 | 16.93 | 3.23 | 19% |
| 8 | 0 | 31.56 | 0 | 4.5 | 1110 | 12.91 | 16.93 | 4.02 | 24% |
| 8 | 1 | 30.6 | 0 | 4.8 | 1031 | 13.21 | 16.93 | 3.71 | 22% |
| 8 | 2 | 29.66 | 0 | 4.9 | 975 | 11.16 | 14.7 | 3.54 | 24% |
| 8 | 3 | 29.02 | 0 | 4.6 | 944 | 10.85 | 14.7 | 3.85 | 26% |

The summary statistics are computed over all observations in the dataset, not only the ten rows shown above:

| | Log-Loss $LL_B(E)$ | Log-Loss $LL_U(E)$ | Information Gain | Compression |
|---|---|---|---|---|
| Mean | 13.17 (= Entropy $H_B$) | 17.46 (= Entropy $H_U$) | 4.29 (Mean Information Gain) | 24% (Mean Compression) |
| Std. Dev. | 2.08 | 2.17 | 2.33 | |
| Minimum | 9.75 | 14.37 | -12.5 | |
| Maximum | 31.78 | 31.06 | 16.3 | |
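
Purely as an illustration of how these measures fit together, the following Python fragment recomputes the per-observation Information Gain and Compression from the Log-Loss pairs in the table above. Because the report's summary statistics use the entire dataset, the means printed below will not match the summary table.

```python
from statistics import mean, stdev

# (LL_B, LL_U) pairs for the ten observations shown in the table above.
rows = [
    (13.42, 22.06), (13.55, 21.68), (11.93, 19.40), (11.92, 17.73),
    (11.81, 17.73), (13.69, 16.93), (12.91, 16.93), (13.21, 16.93),
    (11.16, 14.70), (10.85, 14.70),
]

# Per-observation Information Gain and Compression.
ig = [ll_u - ll_b for ll_b, ll_u in rows]
cmpr = [(ll_u - ll_b) / ll_u for ll_b, ll_u in rows]

print(f"Mean Information Gain: {mean(ig):.2f}")    # over these ten rows only
print(f"Mean Compression:      {mean(cmpr):.0%}")  # over these ten rows only
print(f"Std. Dev. of IG:       {stdev(ig):.2f}")
```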

Updated Terminology

Please note the updated terminology when referring to earlier versions of BayesiaLab.

| Deprecated | Current |
|---|---|
| Consistency | Information Gain |
| Consistency Gain | Mean Compression |
