Network Performance Analysis Overall — Learning & Test Set

Context

This Overall Performance Report evaluates a network with regard to a dataset that does have a Learning/Test Set split.
If your dataset does not have a Learning/Test Set split, please see Report for Learning Set.
Given that most performance measures here are the same as in the Report for Learning Set, we refer to that topic when appropriate rather than duplicating the content.
In this topic, we focus on the additional features and objectives related to the Learning/Test Set split.

Notation

$B$ denotes the Bayesian network to be evaluated.
$D$ $D$ represents the entire Dataset associated with the Bayesian network $B$ $B$ . The Dataset $D$ $D$ is split into two partitions:
- $D_L$ represents the Learning Set of the Dataset from which network $B$ was learned.
- $D_T$ represents the Test Set (or holdout sample), i.e., the portion of the Dataset that will be used for evaluation network $B$ .
- $E_L$ represents an n-dimensional observation (Evidence), i.e., one row or record in the Learning Set $D_L$ , from which the Bayesian network $B$ was learned.
- $E_T$ represents an n-dimensional observation (Evidence), i.e., one row or record in the Test Set $D_T$ , which will be used to evaluate network $B$ .
- $N_L$ refers to the number of observations $E_L$ in the Learning Set $D_L$ .
- $N_T$ refers to the number of observations $E_T$ in the Test Set $D_T$ .
$C$ refers to a Complete or fully connected network, in which all nodes have a direct link to all other nodes. Therefore, the complete network $C$ is an exact representation of the chain rule. As such, it does not utilize any conditional independence assumptions for representing the Joint Probability Distribution.
$U$ represents an Unconnected network, in which there are no connections between nodes, which means that all nodes are marginally independent.

Example

To explain and illustrate the Overall Performance Report, we use a Bayesian network model that was generated with one of BayesiaLab's Unsupervised Learning algorithms. This network is available for download here:

ElPasoLT.xbl

Overall Performance Report

The Report window consists of three tabs

Test Dataset
Learning Dataset
Comparison

which feature two views each:

Density Function
- The x-axis represents the Log-Loss values in increasing order.
- The y-axis shows the probability density for each Log-Loss value on the x-axis.
Distribution Function
- The observations $E$ $E$ in the dataset $D$ $D$ are sorted in ascending order according to their Log-Loss values:
  - The x-axis shows the observation number.
  - The y-axis shows the Log-Loss value corresponding to each observation.

Test Dataset Evaluation	Learning Dataset Evaluation	Comparison

The radio buttons on the bottom-left of the window allow you to switch the view between the Density function (Histogram) and the Distribution function.

Either view provides a visualization of the Log-Loss values for all observations in the dataset $D$ given the to-be-evaluated Bayesian network $B$ . Thus, the plots provide you with a visual representation of how well the network $B$ fits the dataset $D$ .

Comparing Density & Distribution Functions

In the topic, Overall Performance Report for the Learning Set, the focus was primarily on how well the network $B$ fits dataset $D$ . All measures were about goodness-of-fit.

Our objective for this topic is broader. We are still looking for a good fit, but also want to understand how well the learned network model generalizes beyond the dataset from which it was learned.

In this context, the Comparison tab provides a key visual. You can see the histograms of the Log-Losses of the Learning Set and Test Set overlaid on top of each other.

Log-Loss Computation

The computation of Log-Loss values is at the very core of this Overall Performance Report. In the Report for Learning Set, the Log-Loss values were computed for the entire dataset $D$ .

Given the Learning/Test Set split, BayesiaLab now needs to compute all metrics separately for the Learning Set and the Test Set.

And, in addition to Log-Loss $LL_B(E)$ for the to-be-evaluated network $B$ , BayesiaLab also needs to compute the Log-Loss values $LL_C(E)$ for the complete network $C$ and Log-Loss $LL_U(E)$ and the unconnected network $U$ .

So, to produce the plots and all related metrics, BayesiaLab has to perform the following computations:

$LL_B(E_L)$ , the Log-Loss value for each observation/evidence in the Learning Set based on the learned and to be-evaluated Bayesian network B.
$LL_C(E_L)$ , the Log-Loss value for each observation/evidence in the Learning Set based on the complete network $C$ .
$LL_U(E_L)$ , the Log-Loss value for each observation/evidence in the Learning Set based on the unconnected network $U$ .
$LL_B(E_T)$ , the Log-Loss value for each observation/evidence in the Test Set based on the learned and to be-evaluated Bayesian network $B$ .
$LL_C(E_T)$ , the Log-Loss value for each observation/evidence in the Test Set based on the complete network $C$ .
$LL_U(E_T)$ , the Log-Loss value for each observation/evidence in the Test Set based on the unconnected network $U$ .

The following Log-Loss Table is an extract of the first ten rows each from the Learning Set $D_L$ and the Test Set $D_T$ along with the computed Log-Loss values for each record:

Log-Loss Table

Learning/Test	Month	Hour	Temperature	Shortwave Radiation (W/m²)	Wind Speed (m/s)	Energy Demand (MWh)	Log-Loss (Bayesian Network)	Log-Loss (Complete Network)	Log-Loss (Unconnected Network)
							$L{L_B}(E_L)$	$L{L_C}(E_L)$	$LL_U(E_L)$
learning	8	18	36.57	213.60	2.00	1574.00	13.43	14.68	21.96
learning	8	19	36.04	105.91	1.90	1574.00	13.85	14.68	21.61
learning	8	20	34.71	42.72	2.14	1485.00	12.13	11.87	19.41
learning	8	21	33.94	0.00	2.75	1470.00	11.88	11.87	17.71
learning	8	22	33.19	0.00	3.55	1378.00	11.89	11.09	17.72
learning	8	23	32.38	0.00	4.21	1249.00	14.12	12.68	16.93
learning	8	0	31.56	0.00	4.50	1110.00	13.05	12.36	16.94
learning	8	2	29.66	0.00	4.90	975.00	11.22	11.68	14.66
learning	8	3	29.02	0.00	4.60	944.00	10.91	11.36	14.66
learning	8	5	27.16	0.00	3.11	927.00	11.29	10.98	14.65
⁞	⁞	⁞	⁞	⁞	⁞	⁞	⁞	⁞	⁞
							Entropy $H_B(D_L)$	Entropy $H_C(D_L)$	Entropy $H_U(D_L)$
							Mean	13.16	12.49
							Std. Dev.	2.06	1.34
							Minimum	9.69	9.43
							Maximum	30.88	14.68
							Normalized	68.369%	64.893%

Learning/Test	Month	Hour	Temperature	Shortwave Radiation (W/m²)	Wind Speed (m/s)	Energy Demand (MWh)	Log-Loss (Bayesian Network)	Log-Loss (Complete Network)	Log-Loss (Unconnected Network)
							$LL_B(E_T)$	$LL_C(E_T)$	$LL_U(E_T)$
test	8	1	30.60	0.00	4.80	1031.00	13.41	14.68	16.90
test	8	4	28.16	0.00	3.70	926.00	10.74	10.68	14.68
test	8	15	33.20	318.62	2.56	1554.00	14.62	12.68	20.63
test	8	18	32.71	192.24	2.13	1468.00	13.74	14.68	19.37
test	8	0	27.09	0.00	4.75	1113.00	11.09	11.09	16.44
test	8	1	25.87	0.00	7.53	1033.00	13.69	12.68	17.62
test	8	4	23.27	0.00	8.90	928.00	15.82	14.68	15.77
test	8	5	23.10	0.00	6.82	928.00	11.71	11.51	15.04
test	8	14	28.61	353.33	3.19	1412.00	19.10	?	19.35
test	8	20	27.70	27.59	4.12	1459.00	12.39	12.09	19.00
⁞	⁞	⁞	⁞	⁞	⁞	⁞	⁞	⁞	⁞
							Entropy $H_B(D_T)$	Entropy $H_C(D_T)$	Entropy $H_U(D_T)$
							Mean	13.23	12.41
							Std. Dev.	2.12	1.31
							Minimum	9.69	9.43
							Maximum	26.07	14.68
							Normalized	68.725%	64.499%

The complete Log-Loss Table serves as the basis for calculating the measures that are reported at the bottom of the report.

The measures reported at the bottom of the Test Set and Learning Set tab, are shown side by side on the Comparison tab:

For clarity, we match up the report's labels to the notation introduced at the beginning of this topic and the corresponding definitions.

Label in Report	Definition in Context of Test Set	Definition in Context of Learning Set
Entropy (H), i.e., mean of Log-Loss Values of all observations	${H_B}({D_T}) = \overline{LL_B({E_T})}$	${H_B}({D_L}) = \overline{LL_B({E_L})}$
Normalized Entropy (Hn)
Hn(Complete)
Hn(Unconnected)
Contingency Table Fit
Deviance	$De{v_B}({D_T}) = 2N \times \ln(2) \times \left( {H_B}({D_T}) - {H_C}({D_T}) \right)$	$De{v_B}({D_L}) = 2N \times \ln(2) \times \left( {H_B}({D_L}) - {H_C}({D_L}) \right)$
Number of Processed Observations	$N(D_T)$	$N(D_L)$

Impossible Observations

Impossible Observations refer to observations/evidence $E_T$ in the Test Set $D_T$ , which are entirely incompatible with the network $B$ that was learned from the Learning Set $D_L$ .

Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov Test (KS Test) is a common tool for comparing statistical distributions. Here, we use it to compare the distribution similarity between two samples, i.e. learning and test, so this test is only computed for the comparison panel.

More specifically, it compares the distributions of the Log-Losses. The values Z, D and the corresponding p-value are displayed.

Extract Data Set

The final element in the report window is the Extract Data Set button. This is a practical tool for identifying and examining outliers, e.g., those at the far end of the right tail of the histogram.

Clicking the Extract Data Set button brings up a new window that allows you to extract observations from the dataset according to the criteria you define:
Right Tail Extraction selects the specified percentage of observations, beginning with the highest Log-Loss value.
Interval Extraction allows you to specify a lower and upper boundary of Log-Loss values to be included.
Upon selecting either method and clicking OK, you are prompted to choose a file name and location.
BayesiaLab saves the observations that meet the criteria in csv format.
Note that the Log-Loss values that are used for extraction are not included in the saved dataset.

Analysis Network Performance Overall Learning Set Target