# Network Performance Analysis Overall — Learning & Test Set

## Context

- This **Overall Performance Report** evaluates a network with regard to a dataset that does have a **Learning/Test Set** split.
- If your dataset does not have a **Learning/Test Set** split, please see the Report for the Learning Set.
- Given that most performance measures here are the same as in the Report for the Learning Set, we refer to that topic where appropriate rather than duplicating the content.
- In this topic, we focus on the additional features and objectives related to the **Learning/Test Set** split.

## Notation

- $B$ denotes the **B**ayesian network to be evaluated.
- $D$ represents the entire **D**ataset associated with the Bayesian network $B$. The **Dataset** $D$ is split into two partitions:
  - $D_L$ represents the **Learning Set** of the **D**ataset, from which network $B$ was learned.
  - $D_T$ represents the **Test Set** (or holdout sample), i.e., the portion of the **D**ataset that will be used for evaluating network $B$.
- $E_L$ represents an n-dimensional observation (**E**vidence), i.e., one row or record in the **Learning Set** $D_L$, from which the Bayesian network $B$ was learned.
- $E_T$ represents an n-dimensional observation (**E**vidence), i.e., one row or record in the **Test Set** $D_T$, which will be used to evaluate network $B$.
- $N_L$ refers to the number of observations $E_L$ in the **Learning Set** $D_L$.
- $N_T$ refers to the number of observations $E_T$ in the **Test Set** $D_T$.
- $C$ refers to a **C**omplete or fully connected network, in which all nodes have a direct link to all other nodes. Therefore, the complete network $C$ is an exact representation of the chain rule. As such, it does not utilize any conditional independence assumptions for representing the **Joint Probability Distribution**.
- $U$ represents an **U**nconnected network, in which there are no connections between nodes, which means that all nodes are marginally independent.
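
To make the partitioning of $D$ into $D_L$ and $D_T$ concrete, here is a minimal Python sketch of a random **Learning/Test Set** split. The function name, test fraction, and seed are illustrative assumptions; BayesiaLab performs this split internally.

```python
import random

def split_dataset(records, test_fraction=0.3, seed=42):
    """Split a dataset D into a Learning Set D_L and a Test Set D_T.

    `records` is a list of observations (rows); `test_fraction` is the
    share of rows held out as the Test Set for evaluation.
    """
    rng = random.Random(seed)
    shuffled = records[:]        # work on a copy so D itself is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    d_test = shuffled[:n_test]   # D_T: holdout sample used to evaluate B
    d_learn = shuffled[n_test:]  # D_L: rows from which network B is learned
    return d_learn, d_test

# Example: a dataset of 100 observations with 30% held out
D = list(range(100))
D_L, D_T = split_dataset(D)
```

Together, the two partitions always reconstitute the full dataset: $N_L + N_T = N$.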

## Example

To explain and illustrate the **Overall Performance Report**, we use a Bayesian network model that was generated with one of BayesiaLab's Unsupervised Learning algorithms. This network is available for download here:

## Overall Performance Report

The Report window consists of three tabs:

- **Test Dataset**
- **Learning Dataset**
- **Comparison**

Each tab features two views:

- **Density Function**
  - The x-axis represents the **Log-Loss** values in increasing order.
  - The y-axis shows the probability density for each **Log-Loss** value on the x-axis.
- **Distribution Function**
  - The observations $E$ in the dataset $D$ are sorted in ascending order according to their **Log-Loss** values:
    - The x-axis shows the observation number.
    - The y-axis shows the **Log-Loss** value corresponding to each observation.

*Screenshots: Test Dataset Evaluation, Learning Dataset Evaluation, and Comparison tabs.*

The radio buttons on the bottom-left of the window allow you to switch the view between the **Density** function (Histogram) and the **Distribution** function.

Either view provides a visualization of the **Log-Loss** values for all observations in the dataset $D$ given the to-be-evaluated Bayesian network $B$. Thus, the plots provide you with a visual representation of how well the network $B$ fits the dataset $D$.
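
As an illustration, both views can be derived directly from the per-observation **Log-Loss** values. The sketch below uses hypothetical values and a simple equal-width histogram; the numbers and bin count are assumptions, not BayesiaLab output.

```python
# Hypothetical Log-Loss values (in bits) for the observations in a dataset D
log_losses = [11.2, 13.4, 12.1, 14.9, 11.8, 13.0, 12.5, 21.3]

# Distribution-function view: observations sorted by ascending Log-Loss
# (x-axis: observation number, y-axis: that observation's Log-Loss)
distribution = sorted(log_losses)

# Density-function view: a coarse histogram of the Log-Loss values
# (x-axis: Log-Loss bins, y-axis: share of observations per bin)
lo, hi, bins = min(log_losses), max(log_losses), 4
width = (hi - lo) / bins
density = [0.0] * bins
for ll in log_losses:
    idx = min(int((ll - lo) / width), bins - 1)  # clamp the maximum into the last bin
    density[idx] += 1 / len(log_losses)
```

An outlier such as the 21.3 value above shows up in the far right tail of the density view and as the last, highest point of the distribution view.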

## Comparing Density & Distribution Functions

In the topic, Overall Performance Report for the Learning Set, the focus was primarily on how well the network $B$ fits dataset $D$. All measures were about goodness-of-fit.

Our objective for this topic is broader. We are still looking for a good fit, but also want to understand how well the learned network model generalizes beyond the dataset from which it was learned.

In this context, the **Comparison** tab provides a key visual. You can see the histograms of the **Log-Losses** of the **Learning Set** and **Test Set** overlaid on top of each other.

### Log-Loss Computation

The computation of **Log-Loss** values is at the very core of this **Overall Performance Report**. In the Report for Learning Set, the Log-Loss values were computed for the entire dataset $D$.

Given the **Learning/Test Set** split, BayesiaLab now needs to compute all metrics separately for the **Learning Set** and the **Test Set**.

In addition to the **Log-Loss** values $LL_B(E)$ for the to-be-evaluated network $B$, BayesiaLab also needs to compute the **Log-Loss** values $LL_C(E)$ for the complete network $C$ and the **Log-Loss** values $LL_U(E)$ for the unconnected network $U$.

So, to produce the plots and all related metrics, BayesiaLab has to perform the following computations:

- $LL_B(E_L)$, the **Log-Loss** value for each observation/evidence in the **Learning Set**, based on the learned and to-be-evaluated Bayesian network $B$.
- $LL_C(E_L)$, the **Log-Loss** value for each observation/evidence in the **Learning Set**, based on the complete network $C$.
- $LL_U(E_L)$, the **Log-Loss** value for each observation/evidence in the **Learning Set**, based on the unconnected network $U$.
- $LL_B(E_T)$, the **Log-Loss** value for each observation/evidence in the **Test Set**, based on the learned and to-be-evaluated Bayesian network $B$.
- $LL_C(E_T)$, the **Log-Loss** value for each observation/evidence in the **Test Set**, based on the complete network $C$.
- $LL_U(E_T)$, the **Log-Loss** value for each observation/evidence in the **Test Set**, based on the unconnected network $U$.

The following **Log-Loss Table** is an extract of the first ten rows each from the **Learning Set** $D_L$ and the **Test Set** $D_T$ along with the computed **Log-Loss** values for each record:

### Log-Loss Table

| Learning/Test | Month | Hour | Temperature | Shortwave Radiation (W/m²) | Wind Speed (m/s) | Energy Demand (MWh) | Log-Loss (Bayesian Network) | Log-Loss (Complete Network) | Log-Loss (Unconnected Network) |
|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | $LL_B(E_L)$ | $LL_C(E_L)$ | $LL_U(E_L)$ |
| learning | 8 | 18 | 36.57 | 213.60 | 2.00 | 1574.00 | 13.43 | 14.68 | 21.96 |
| learning | 8 | 19 | 36.04 | 105.91 | 1.90 | 1574.00 | 13.85 | 14.68 | 21.61 |
| learning | 8 | 20 | 34.71 | 42.72 | 2.14 | 1485.00 | 12.13 | 11.87 | 19.41 |
| learning | 8 | 21 | 33.94 | 0.00 | 2.75 | 1470.00 | 11.88 | 11.87 | 17.71 |
| learning | 8 | 22 | 33.19 | 0.00 | 3.55 | 1378.00 | 11.89 | 11.09 | 17.72 |
| learning | 8 | 23 | 32.38 | 0.00 | 4.21 | 1249.00 | 14.12 | 12.68 | 16.93 |
| learning | 8 | 0 | 31.56 | 0.00 | 4.50 | 1110.00 | 13.05 | 12.36 | 16.94 |
| learning | 8 | 2 | 29.66 | 0.00 | 4.90 | 975.00 | 11.22 | 11.68 | 14.66 |
| learning | 8 | 3 | 29.02 | 0.00 | 4.60 | 944.00 | 10.91 | 11.36 | 14.66 |
| learning | 8 | 5 | 27.16 | 0.00 | 3.11 | 927.00 | 11.29 | 10.98 | 14.65 |
| ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ |
| | | | | | | | Entropy $H_B(D_L)$ | Entropy $H_C(D_L)$ | Entropy $H_U(D_L)$ |
| Mean | | | | | | | 13.16 | 12.49 | |
| Std. Dev. | | | | | | | 2.06 | 1.34 | |
| Minimum | | | | | | | 9.69 | 9.43 | |
| Maximum | | | | | | | 30.88 | 14.68 | |
| Normalized | | | | | | | 68.369% | 64.893% | |

| Learning/Test | Month | Hour | Temperature | Shortwave Radiation (W/m²) | Wind Speed (m/s) | Energy Demand (MWh) | Log-Loss (Bayesian Network) | Log-Loss (Complete Network) | Log-Loss (Unconnected Network) |
|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | $LL_B(E_T)$ | $LL_C(E_T)$ | $LL_U(E_T)$ |
| test | 8 | 1 | 30.60 | 0.00 | 4.80 | 1031.00 | 13.41 | 14.68 | 16.90 |
| test | 8 | 4 | 28.16 | 0.00 | 3.70 | 926.00 | 10.74 | 10.68 | 14.68 |
| test | 8 | 15 | 33.20 | 318.62 | 2.56 | 1554.00 | 14.62 | 12.68 | 20.63 |
| test | 8 | 18 | 32.71 | 192.24 | 2.13 | 1468.00 | 13.74 | 14.68 | 19.37 |
| test | 8 | 0 | 27.09 | 0.00 | 4.75 | 1113.00 | 11.09 | 11.09 | 16.44 |
| test | 8 | 1 | 25.87 | 0.00 | 7.53 | 1033.00 | 13.69 | 12.68 | 17.62 |
| test | 8 | 4 | 23.27 | 0.00 | 8.90 | 928.00 | 15.82 | 14.68 | 15.77 |
| test | 8 | 5 | 23.10 | 0.00 | 6.82 | 928.00 | 11.71 | 11.51 | 15.04 |
| test | 8 | 14 | 28.61 | 353.33 | 3.19 | 1412.00 | 19.10 | ? | 19.35 |
| test | 8 | 20 | 27.70 | 27.59 | 4.12 | 1459.00 | 12.39 | 12.09 | 19.00 |
| ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ | ⁞ |
| | | | | | | | Entropy $H_B(D_T)$ | Entropy $H_C(D_T)$ | Entropy $H_U(D_T)$ |
| Mean | | | | | | | 13.23 | 12.41 | |
| Std. Dev. | | | | | | | 2.12 | 1.31 | |
| Minimum | | | | | | | 9.69 | 9.43 | |
| Maximum | | | | | | | 26.07 | 14.68 | |
| Normalized | | | | | | | 68.725% | 64.499% | |

The complete **Log-Loss Table** serves as the basis for calculating the measures that are reported at the bottom of the report.

The measures reported at the bottom of the **Test Set** and **Learning Set** tabs are shown side by side on the **Comparison** tab:

For clarity, we match up the report's labels to the notation introduced at the beginning of this topic and the corresponding definitions.

| Label in Report | Definition in Context of Test Set | Definition in Context of Learning Set |
|---|---|---|
| Entropy (H), i.e., mean of the Log-Loss values of all observations | $H_B(D_T) = \overline{LL_B(E_T)}$ | $H_B(D_L) = \overline{LL_B(E_L)}$ |
| Normalized Entropy (Hn) | $H_{BN}(D_T) = \frac{H_B(D_T)}{\log_2(S_D)}$ | $H_{BN}(D_L) = \frac{H_B(D_L)}{\log_2(S_D)}$ |
| Hn(Complete) | $H_{CN}(D_T) = \frac{H_C(D_T)}{\log_2(S_D)}$ | $H_{CN}(D_L) = \frac{H_C(D_L)}{\log_2(S_D)}$ |
| Hn(Unconnected) | $H_{UN}(D_T) = \frac{H_U(D_T)}{\log_2(S_D)}$ | $H_{UN}(D_L) = \frac{H_U(D_L)}{\log_2(S_D)}$ |
| Contingency Table Fit | $CTF_B = 100 \times \frac{H_U(D_T) - H_B(D_T)}{H_U(D_T) - H_C(D_T)}$ | $CTF_B = 100 \times \frac{H_U(D_L) - H_B(D_L)}{H_U(D_L) - H_C(D_L)}$ |
| Deviance | $Dev_B(D_T) = 2 N_T \times \ln(2) \times \left( H_B(D_T) - H_C(D_T) \right)$ | $Dev_B(D_L) = 2 N_L \times \ln(2) \times \left( H_B(D_L) - H_C(D_L) \right)$ |
| Number of Processed Observations | $N(D_T)$ | $N(D_L)$ |
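
These summary measures can be reproduced from the per-observation **Log-Loss** values. The sketch below implements the formulas above; the function name is an assumption, and `s_d` stands for the $S_D$ term in the normalization denominator $\log_2(S_D)$.

```python
import math

def report_measures(ll_b, ll_c, ll_u, s_d):
    """Compute the report's summary measures from per-observation Log-Loss
    values (in bits) for networks B, C, and U on one dataset.

    s_d is the S_D term used in the normalization denominator log2(S_D).
    """
    n = len(ll_b)
    h_b = sum(ll_b) / n  # Entropy H_B(D): mean of the Log-Loss values
    h_c = sum(ll_c) / n  # Entropy H_C(D) for the complete network
    h_u = sum(ll_u) / n  # Entropy H_U(D) for the unconnected network
    h_bn = h_b / math.log2(s_d)                   # Normalized Entropy Hn
    ctf = 100 * (h_u - h_b) / (h_u - h_c)         # Contingency Table Fit
    deviance = 2 * n * math.log(2) * (h_b - h_c)  # Deviance
    return h_b, h_bn, ctf, deviance
```

Called once with the Log-Losses of the **Test Set** and once with those of the **Learning Set**, this yields the two columns of the table above.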

### Impossible Observations

**Impossible Observations** refer to observations/evidence $E_T$ in the **Test Set** $D_T$ that are entirely incompatible with the network $B$ learned from the **Learning Set** $D_L$, i.e., $P_B(E_T) = 0$. For such observations, the **Log-Loss** $LL_B(E_T) = -\log_2(P_B(E_T))$ is infinite.
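
A minimal sketch of how such observations can be flagged from their probabilities under $B$; the probability values are hypothetical, and how BayesiaLab subsequently treats these rows is not shown here.

```python
import math

def log_loss_or_none(p_e):
    """Return the Log-Loss -log2 P(E) in bits, or None for an impossible
    observation, i.e., one with probability zero under the network."""
    if p_e == 0.0:
        return None  # P_B(E_T) = 0: evidence incompatible with network B
    return -math.log2(p_e)

# Hypothetical probabilities P_B(E_T) for three Test Set rows
probs = [0.25, 0.0, 0.5]
lls = [log_loss_or_none(p) for p in probs]
n_impossible = lls.count(None)  # number of Impossible Observations
```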

### Kolmogorov-Smirnov Test

The **Kolmogorov-Smirnov Test** (KS test) is a common tool for comparing statistical distributions. Here, we use it to measure the similarity between the distributions of two samples, i.e., the **Learning Set** and the **Test Set**, so this test is only computed on the **Comparison** tab.

More specifically, it compares the distributions of the **Log-Loss** values. The statistics $Z$ and $D$, along with the corresponding p-value, are displayed.
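
For reference, the $D$ statistic is the largest absolute gap between the two empirical cumulative distribution functions. The stdlib-only sketch below computes it for two hypothetical **Log-Loss** samples; the $Z$ statistic and p-value (which require the asymptotic Kolmogorov distribution) are omitted.

```python
import bisect

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D statistic: the maximum absolute
    difference between the two empirical cumulative distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # share of the sample with values <= x
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    # The maximum gap can only occur at an observed value of either sample
    return max(abs(ecdf(a, v) - ecdf(b, v)) for v in set(a) | set(b))

# Hypothetical Log-Loss samples for the Learning Set and the Test Set
ll_learning = [11.2, 12.1, 12.5, 13.0, 13.4]
ll_test = [11.8, 12.4, 13.1, 13.7, 14.9]
D_stat = ks_statistic(ll_learning, ll_test)
```

A small $D$ (with a large p-value) suggests the **Log-Loss** distributions of the two sets are similar, i.e., the network generalizes beyond its **Learning Set**.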

### Extract Data Set

The final element in the report window is the **Extract Data Set** button. This is a practical tool for identifying and examining outliers, e.g., those at the far end of the right tail of the histogram.

- Clicking the **Extract Data Set** button brings up a new window that allows you to extract observations from the dataset according to the criteria you define:
  - **Right Tail Extraction** selects the specified percentage of observations, beginning with the highest **Log-Loss** value.
  - **Interval Extraction** allows you to specify a lower and upper boundary of **Log-Loss** values to be included.
- Upon selecting either method and clicking OK, you are prompted to choose a file name and location.
- BayesiaLab saves the observations that meet the criteria in CSV format.
- Note that the **Log-Loss** values that are used for extraction are not included in the saved dataset.
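
Conceptually, the two extraction modes amount to simple selections on the **Log-Loss** column. The sketch below uses made-up row labels and values, and unlike BayesiaLab's export it returns the rows in memory rather than writing a CSV file.

```python
def right_tail_extraction(rows, log_losses, percentage):
    """Select the given percentage of rows with the highest Log-Loss values."""
    n_keep = round(len(rows) * percentage / 100)
    ranked = sorted(zip(log_losses, rows), reverse=True)  # highest Log-Loss first
    return [row for _, row in ranked[:n_keep]]

def interval_extraction(rows, log_losses, lower, upper):
    """Select rows whose Log-Loss falls within [lower, upper]."""
    return [row for row, ll in zip(rows, log_losses) if lower <= ll <= upper]

# Hypothetical observations and their Log-Loss values
rows = ["r1", "r2", "r3", "r4", "r5"]
lls = [11.0, 19.5, 12.2, 21.3, 13.1]
tail = right_tail_extraction(rows, lls, 40)       # top 40% by Log-Loss
mid = interval_extraction(rows, lls, 12.0, 14.0)  # Log-Loss between 12 and 14
```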