
Network Performance Analysis Overall — Learning Set

Context

  • This Overall Performance Report evaluates a network with regard to a dataset that does not have a Learning/Test Set split.
  • If your dataset does have a Learning/Test Set split, please see Report for Learning and Test Set.

Notation

  • $B$ denotes the Bayesian network to be evaluated.
  • $D_L$ represents the dataset from which network $B$ was learned.
  • $E_L$ represents evidence. Evidence refers to an n-dimensional observation, i.e., one row or record in the dataset $D_L$, from which the Bayesian network $B$ was learned.
  • $N_L$ refers to the number of observations $E_L$ in the dataset $D_L$.
  • $C$ refers to a Complete or fully connected network, in which all nodes have a direct link to all other nodes. Therefore, the complete network $C$ is an exact representation of the chain rule. As such, it does not utilize any conditional independence assumptions for representing the Joint Probability Distribution.
  • $U$ represents an Unconnected network, in which there are no connections between nodes, which means that all nodes are marginally independent.

Example

To explain and illustrate the Overall Performance Report, we use a Bayesian network model that was generated with one of BayesiaLab's Unsupervised Learning algorithms. This network is available for download here:

ElPaso.xbl

Overall Performance Report

Density & Distribution Function

The top part of the Report features a plot area, which offers two views:

  • Density Function
    • The x-axis represents the Log-Loss values in increasing order.
    • The y-axis shows the probability density for each Log-Loss value on the x-axis.
  • Distribution Function
    • The observations $E_L$ in the dataset $D_L$ are sorted in ascending order according to their Log-Loss values:
      • The x-axis shows the observation number.
      • The y-axis shows the Log-Loss value corresponding to each observation.

[Screenshots: Density Function (Histogram) view and Distribution Function view]

The radio buttons on the bottom-left of the window allow you to switch the view between the Density Function (Histogram) and the Distribution Function.

Either view provides a visualization of the Log-Loss values for all observations in the dataset $D_L$ given the to-be-evaluated Bayesian network $B$. Thus, the plots provide you with a visual representation of how well the network $B$ fits the dataset $D_L$.

The Density view, in particular, allows you to judge the fit of the network by looking at the shape of the Log-Loss histogram.

The bars at the low end of the x-axis represent well-fitting observations. Conversely, the bars that are part of the long tail on the right represent poorly-fitting observations.

While in the Density view, you can adjust the Number of Intervals used for the histogram within a range from 1 to 100.
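If you want to reproduce the two views outside of BayesiaLab, for instance on exported Log-Loss values, the following minimal matplotlib sketch shows both; the gamma-distributed sample is only a stand-in for real Log-Loss values.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in Log-Loss values: right-skewed, like the report's histogram.
rng = np.random.default_rng(0)
log_losses = rng.gamma(shape=9.0, scale=1.5, size=5000)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Density Function view: histogram of Log-Loss values (x: Log-Loss, y: density).
ax1.hist(log_losses, bins=50, density=True)
ax1.set_xlabel("Log-Loss")
ax1.set_ylabel("Density")
ax1.set_title("Density Function (Histogram)")

# Distribution Function view: observations sorted by ascending Log-Loss
# (x: observation number, y: Log-Loss).
ax2.plot(np.sort(log_losses))
ax2.set_xlabel("Observation number (sorted)")
ax2.set_ylabel("Log-Loss")
ax2.set_title("Distribution Function")

plt.tight_layout()
plt.show()
```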

Log-Loss

The computation of Log-Loss values is at the very core of this Overall Performance Report.

The Log-Loss value reflects the number of bits required to encode the n-dimensional evidence $E_L$, i.e., an observation, row, or record in the dataset $D_L$, given the to-be-evaluated Bayesian network $B$:

$$LL_B(E_L) = -\log_2\left(P_B(E_L)\right)$$

where $P_B(E_L)$ is the joint probability of evidence $E_L$ computed by the Bayesian network $B$:

$$P_B(E_L) = P_B(e_1, \ldots, e_n)$$

In other words, the lower the probability of evidence $E_L$ given the Bayesian network $B$, the higher is the Log-Loss $LL_B(E_L)$. As such, the Log-Loss value of an observation represents its fit to network $B$.
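As a quick numeric illustration of this relationship (a sketch in Python, not BayesiaLab functionality):

```python
import math

def log_loss(p_evidence: float) -> float:
    """Log-Loss LL(E) = -log2(P(E)): bits needed to encode evidence E."""
    return -math.log2(p_evidence)

# A more probable observation costs fewer bits than a rare one.
print(log_loss(0.01))    # ≈ 6.64 bits
print(log_loss(0.0001))  # ≈ 13.29 bits (lower probability -> higher Log-Loss)
```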

So, to produce the plots and all related metrics, BayesiaLab has to perform the following computations (illustrated in the sketch after this list):

  • $LL_B(E_L)$, the Log-Loss value for each observation/evidence in the Learning Set based on the learned and to-be-evaluated Bayesian network $B$.
  • $LL_C(E_L)$, the Log-Loss value for each observation/evidence in the Learning Set based on the complete network $C$.
  • $LL_U(E_L)$, the Log-Loss value for each observation/evidence in the Learning Set based on the unconnected network $U$.
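The following sketch illustrates these three quantities on a toy dataset. The variable names, the structure assumed for $B$, and the frequency-count parameter estimates are all hypothetical stand-ins for what BayesiaLab computes internally:

```python
import math
from collections import Counter

# Toy Learning Set with three discrete variables (hypothetical data).
# Network B assumes the structure  Hour -> Radiation, Hour -> Demand,
# i.e., P(h, r, d) = P(h) * P(r | h) * P(d | h).
data = [
    ("day", "sun", "high"), ("day", "sun", "high"), ("day", "cloud", "high"),
    ("day", "sun", "low"), ("night", "dark", "low"), ("night", "dark", "low"),
    ("night", "dark", "high"), ("night", "dark", "low"),
]
n = len(data)

joint = Counter(data)                        # complete network C: empirical joint
c_h = Counter(h for h, _, _ in data)         # P(Hour)
c_hr = Counter((h, r) for h, r, _ in data)   # for P(Radiation | Hour)
c_hd = Counter((h, d) for h, _, d in data)   # for P(Demand | Hour)
c_r = Counter(r for _, r, _ in data)         # marginals for the unconnected network U
c_d = Counter(d for _, _, d in data)

def ll(p: float) -> float:
    """Log-Loss in bits."""
    return -math.log2(p)

for e in data:
    h, r, d = e
    # Network B: P(h) * P(r|h) * P(d|h), per its assumed structure.
    p_b = (c_h[h] / n) * (c_hr[(h, r)] / c_h[h]) * (c_hd[(h, d)] / c_h[h])
    p_c = joint[e] / n                                   # complete network C
    p_u = (c_h[h] / n) * (c_r[r] / n) * (c_d[d] / n)     # unconnected network U
    print(e, f"LL_B={ll(p_b):.2f}  LL_C={ll(p_c):.2f}  LL_U={ll(p_u):.2f}")
```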

The following Log-Loss Table shows the first ten rows of the Learning Set $D_L$ with the computed Log-Loss values for each record:

Log-Loss Table

The first six columns show the evidence $E_L$ from the dataset $D_L$; the last three columns show the computed Log-Loss values.

| Month | Hour | Temperature | Shortwave Radiation (W/m²) | Wind Speed (m/s) | Energy Demand (MWh) | Log-Loss (Bayesian Network) $LL_B(E_L)$ | Log-Loss (Complete Network) $LL_C(E_L)$ | Log-Loss (Unconnected Network) $LL_U(E_L)$ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 8 | 18 | 36.5 | 721 | 3.62 | 1574 | 13.42 | 15.00 | 22.06 |
| 8 | 19 | 36.04 | 105.91 | 1.9 | 1574 | 13.55 | 15.00 | 21.68 |
| 8 | 20 | 34.71 | 42.72 | 2.14 | 1485 | 11.93 | 11.68 | 19.4 |
| 8 | 21 | 33.94 | 0 | 2.75 | 1470 | 11.92 | 12.00 | 17.73 |
| 8 | 22 | 33.19 | 0 | 3.55 | 1378 | 11.81 | 11.09 | 17.73 |
| 8 | 23 | 32.38 | 0 | 4.21 | 1249 | 13.69 | 12.41 | 16.93 |
| 8 | 0 | 31.56 | 0 | 4.5 | 1110 | 12.91 | 12.19 | 16.93 |
| 8 | 1 | 30.6 | 0 | 4.8 | 1031 | 13.21 | 13.41 | 16.93 |
| 8 | 2 | 29.66 | 0 | 4.9 | 975 | 11.16 | 11.68 | 14.7 |
| 8 | 3 | 29.02 | 0 | 4.6 | 944 | 10.85 | 11.19 | 14.7 |

The summary statistics of the three Log-Loss columns, computed over the entire dataset:

| | $H_B(D_L)$ | $H_C(D_L)$ | $H_U(D_L)$ |
| --- | --- | --- | --- |
| Mean | 13.17 | 12.56 | 17.46 |
| Std. Dev. | 2.08 | 1.40 | 2.17 |
| Minimum | 9.75 | 9.48 | 14.37 |
| Maximum | 31.78 | 15.00 | 31.06 |
| Normalized by $\log_2(S_{\mathcal X}) = 19.2467$ | 68.44% | 65.27% | 90.73% |

Performance Measures

Below the plot area of the window, the Overall Performance Report shows a range of quality measures.

For clarity, we match up the report's labels to the notation introduced at the beginning of this topic.

| Label in Report | Notation in this Topic | Explanation |
| --- | --- | --- |
| Entropy (H) | $H_B(D_L)$ | Mean of the Log-Loss values of all observations $E_L$ in the dataset $D_L$ |
| Normalized Entropy (Hn) | $H_{BN}(D_L)$ | see Normalized Entropies |
| Hn(Complete) | $H_{CN}(D_L)$ | see Normalized Entropies |
| Hn(Unconnected) | $H_{UN}(D_L)$ | see Normalized Entropies |
| Contingency Table Fit | $CTF_B(D_L)$ | see Contingency Table Fit (CTF) |
| Deviance | $Dev_B(D_L)$ | $Dev_B = 2N \times \ln(2) \times \left(H_B(D_L) - H_C(D_L)\right)$ |
| Number of Processed Observations | $N(D_L)$ | Size of the dataset $D_L$ |

Entropy

The first item, Entropy $(H)$, refers to the evaluated network $B$. Hence, it is also denoted Entropy $H_B$ elsewhere in this topic for clarity.

More specifically, Entropy $H_B(D_L)$ is the arithmetic mean of all Log-Loss values $LL_B(E_L)$ of each observation in the dataset $D_L$ given network $B$. In the Log-Loss Table above, Entropy $H_B(D_L)$ appears in the Mean row.
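In code, this is a plain arithmetic mean. The sketch below uses only the ten $LL_B(E_L)$ values from the Log-Loss Table as stand-in data, so its result differs from the full-dataset Entropy of 13.17:

```python
import numpy as np

# Per-record Log-Loss values LL_B(E_L); just the ten rows shown above,
# not the full Learning Set.
log_losses = np.array([13.42, 13.55, 11.93, 11.92, 11.81,
                       13.69, 12.91, 13.21, 11.16, 10.85])

H_B = log_losses.mean()          # Entropy H_B(D_L): mean of the Log-Loss values
print(f"H_B = {H_B:.4f} bits")   # ≈ 12.45 on this ten-row extract
```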

Normalized Entropies

With Entropy not being directly interpretable as a standalone value, the report includes the Normalized Entropy $(H_n)$. Here, Normalized Entropy $(H_n)$ also refers to the evaluated network $B$.

Note that in the standalone topic on Entropy, we defined Normalized Entropy on the basis of a single variable with one set of states.

Here, however, we need to consider that we have several variables with differing numbers of states. So, we require a more general definition of Normalized Entropy:

$$H_{BN}(\mathcal{X}) = \frac{H_B(\mathcal{X})}{\log_2(S_{\mathcal{X}})}$$

where

  • $\mathcal{X}$ is the set of variables in network $B$.
  • $S_{\mathcal{X}}$ is the size of the Joint Probability Distribution, i.e., the number of state combinations defined by all variables in $B$.

With that, we can calculate the value:

$$H_{BN}(\mathcal{X}) = \frac{H_B(\mathcal{X})}{\log_2(S_{\mathcal{X}})} = \frac{13.1733}{19.2467} = 68.44\%$$
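The calculation is easy to reproduce. In the sketch below, the per-variable state counts are hypothetical, chosen only so that $\log_2(S_{\mathcal X})$ reproduces the report's 19.2467; the actual discretization in ElPaso.xbl may differ:

```python
import math

# Hypothetical state counts for Month, Hour, Temperature, Shortwave Radiation,
# Wind Speed, and Energy Demand; 12 * 24 * 6 * 6 * 6 * 10 = 622,080 state
# combinations, whose log2 happens to equal the report's 19.2467.
states_per_variable = [12, 24, 6, 6, 6, 10]
S_X = math.prod(states_per_variable)       # size of the Joint Probability Distribution

H_B = 13.1733                              # Entropy from the report
H_BN = H_B / math.log2(S_X)                # Normalized Entropy
print(f"log2(S_X) = {math.log2(S_X):.4f}") # 19.2467
print(f"H_BN = {H_BN:.2%}")                # 68.44%
```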

Furthermore, the report provides the Normalized Entropies for the complete (fully-connected) network $C$ and the unconnected network $U$.

Complete (Fully-Connected) Network $C$

$H_n(Complete)$ refers to the Normalized Entropy computed from all observations with a complete network $C$, which is the best-fitting representation of the observations.

Unconnected Network $U$

$H_n(Unconnected)$ is the Normalized Entropy obtained with an unconnected network $U$, which is the worst-fitting representation of the observations.

Contingency Table Fit (CTF)

Contingency Table Fit (CTF) measures the quality of the representation of the Joint Probability Distribution by a Bayesian network $B$ in comparison to a complete network $C$.

BayesiaLab's CTF is defined as:

$$CTF_B(\mathcal{D}) = 100 \times \frac{H_U(\mathcal{D}) - H_B(\mathcal{D})}{H_U(\mathcal{D}) - H_C(\mathcal{D})}$$

where

  • $H_U(\mathcal{D})$ is the entropy of the dataset with the unconnected network $U$.
  • $H_B(\mathcal{D})$ is the entropy of the dataset with the network $B$.
  • $H_C(\mathcal{D})$ is the entropy of the dataset with the complete network $C$.
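Plugging in the entropies from the Log-Loss Table above gives a quick check of this definition (a sketch, not BayesiaLab output):

```python
# Entropies (bits) from the Log-Loss Table above.
H_U, H_B, H_C = 17.46, 13.17, 12.56

# Contingency Table Fit: 100 when H_B reaches H_C, 0 when H_B equals H_U.
CTF = 100 * (H_U - H_B) / (H_U - H_C)
print(f"CTF_B(D_L) = {CTF:.2f}")  # ≈ 87.55
```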

Interpretation

  • $CTF_B$ is equal to 100 if the Joint Probability Distribution is represented without any approximation, i.e., the entropy of the evaluated network $B$ is the same as the one obtained with the complete network $C$.
  • $CTF_B$ is equal to 0 if the Joint Probability Distribution is represented by considering that all the variables are independent, i.e., the entropy of the evaluated network $B$ is the same as the one obtained with the unconnected network $U$.
  • $CTF_B$ can also be negative if the parameters of network $B$ do not correspond to the dataset.
  • The dimensions represented by Not-Observable Nodes are excluded from this computation.

Deviance

  • The Deviance measure is based on the difference between the Entropy of the to-be-evaluated network $B$ and the Entropy of the complete (i.e., fully-connected) network $C$.

Definition

Deviance is formally defined as:

$$D_B = 2N \times \ln(2) \times \left(H_B(\mathcal{D}) - H_C(\mathcal{D})\right)$$

where

  • $H_B(\mathcal{D})$ is the Entropy of the dataset given the to-be-evaluated network $B$.
  • $H_C(\mathcal{D})$ is the Entropy of the dataset given the complete (i.e., fully-connected) network $C$.
  • $N$ is the size of the dataset.

Using the values from the Log-Loss Table above, we obtain:

$$D_B = 2N \times \ln(2) \times (H_B - H_C) = 2 \times 32{,}759 \times 0.6932 \times (13.1733 - 12.5627) = 27{,}735.579$$
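The same arithmetic in a few lines; the small difference from the report's 27,735.579 comes from the rounded entropy values used here:

```python
import math

N = 32_759                    # Number of Processed Observations
H_B, H_C = 13.1733, 12.5627   # Entropies (bits) from the Log-Loss Table

D_B = 2 * N * math.log(2) * (H_B - H_C)
print(f"Deviance = {D_B:,.3f}")  # ≈ 27,729.6 with ln(2) at full precision
```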

Interpretation

  • The closer the Deviance value is to 0, the better the network $B$ represents the dataset.

Report Footer

Extract Data Set

The final element in the report window is the Extract Data Set button. This is a practical tool for identifying and examining outliers, e.g., those at the far end of the right tail of the histogram.

  • Clicking the Extract Data Set button brings up a new window that allows you to extract observations from the dataset according to the criteria you define:

  • Right Tail Extraction selects the specified percentage of observations, beginning with the highest Log-Loss value.

  • Interval Extraction allows you to specify a lower and upper boundary of Log-Loss values to be included.

  • Upon selecting either method and clicking OK, you are prompted to choose a file name and location.

  • BayesiaLab saves the observations that meet the criteria in CSV format.

  • Note that the Log-Loss values that are used for extraction are not included in the saved dataset.
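BayesiaLab performs this extraction internally. For illustration only, the pandas sketch below mimics the two selection modes on a dataframe with a hypothetical log_loss column:

```python
import pandas as pd

def right_tail_extraction(df: pd.DataFrame, pct: float) -> pd.DataFrame:
    """Keep the pct% of rows with the highest Log-Loss values."""
    k = max(1, int(round(len(df) * pct / 100)))
    return df.nlargest(k, "log_loss")

def interval_extraction(df: pd.DataFrame, low: float, high: float) -> pd.DataFrame:
    """Keep rows whose Log-Loss lies within [low, high]."""
    return df[df["log_loss"].between(low, high)]

df = pd.DataFrame({"energy_demand": [1574, 1485, 1470],
                   "log_loss": [13.42, 11.93, 11.92]})

# As in the report footer, the extraction criterion column is dropped on save.
right_tail_extraction(df, 10).drop(columns="log_loss").to_csv("outliers.csv", index=False)
```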

