bayesia logo
BayesiaLab
Relationship with Target Node

Relationship with the Target Node

Context

  • The Correlation with Target Node report is available for any target node.

Usage

  • You can run this report either from the Analysis menu, or by using the shortcut R.
  • This report, which can be saved in HTML format, has several sections:

Analysis Context

  • This section lists any variable observations that apply while analysis is performed.

Marginal Probability Distribution

  • This section lists the probability distribution of the Target Node, given the observed variables listed in the Analysis Context.

Distribution on the Learning Set

  • If a database is associated with the network, the Target Node's probability distribution is computed directly from the database.

Target Mean Purity

  • The Target Node's Mean Purity is computed if a database is associated with the network. For each record in the database, we check if the network correctly predicts the actual Target State.

The table underneath shows, state-by-state, the results of the predictions. The column Neighborhood indicates for what state the network prediction was wrong, along with the corresponding percentage.

Performance Indices

  • If a database is associated, two performance indices are computed for the network:
  • The Contingency Table Fit represents the degree of fit between the network's joint probability distribution and the associated data. The better the network represents the database, the higher is the value of the Contingency Table Fit. This measure, computed from the database's mean log-likelihood is equal to 100% if the joint is completely represented (as in a fully connected network), or 0% if the joint is represented by a fully disconnected network. The dimensions represented by any not-observable nodes are excluded from the computation.
  • The Deviance measure is computed from the difference between the network's mean log-likelihood and the database's mean log-likelihood. More the value tends towards 0 the more the network is close to the database.

Node Significance with Respect to the Information Gain Brought by the Node to the Knowledge of the Target Node

  • This shows a list of nodes, sorted by descending order according to the information they bring to the knowledge of the Target Node. The nodes that do not provide any information do not appear in this list. This corresponds to the Target Node Analysis.

  • Mutual Information is the amount of information brought by each variable to the Target Node.

  • Mutual Information (%) is the amount of information brought by each variable to the Target Node,** as compared to an unconnected network.

  • Relative Significance: ratio between the Mutual Information brought by each variable and the maximum Mutual Information, i.e., the value shown at the top of the list.

  • The Mean Value column displays the nodes' mean values. Each node's mean is computed as follows:

    • if the node has values associated with its states, the mean is computed from them.
    • If the node is continuous, its mean is computed from the intervals.
    • If the node is discrete with integer or real-values states, the mean is computed from them.
    • Otherwise, a default set of values from 0 to the number of states minus one is used.
  • Chi2-test or G-test:

  • Independence tests, Chi2 or G-test, are computed from the network for each variable and the Target Node.

  • It is possible to change the type of independence test in Statistical Tools settings in `Menu > .

  • Degrees of Freedom indicates the degree of freedom between each variable and the Target Node in the network.

  • p-value: represents the independence probability between each variable with the Target Node in the network.

  • Chi2 or G-test (Data): The independence tests, Chi2 or G-test, are computed as above, but from the associated database, if applicable.

  • Degrees of Freedom (Data): The degree of freedom is computed as above, but from the associated database, if applicable.

  • p-value (Data): The independence probability of the test is computed as above, but from the associated database, if applicable.

Node Relative Significance with Respect to the Information Gain Brought by the Node to the Knowledge of the Target Value

  • For each value of the target, except for the filtered state, list of nodes, sorted by descending order according to their relative contribution to the knowing of the target value (if the node has only two states, this list is identical to the preceding one).

  • The nodes that do not bring any information do not appear in this list. That corresponds to the Target State Analysis.

  • Binary Mutual Information: Amount of information brought by each variable to the knowledge of the state of the target variable.

  • Binary Mutual Information (%): Amount of information brought by each variable to the knowledge of the state of the target variable compared to an unconnected network.

  • Binary Relative Significance: Ratio between the mutual information brought by each variable and the greater mutual information.

  • Mean Value: Displays the nodes' mean value of each target's state.

  • ModalValue: For each influencing node, description of the modal value (the most probable) with respect to the context and the observed state of the target node. This modal value comes with its probability. This section allows establishing the profile of this target value.

  • A priori Modal Value: For each influencing node, description of the modal value when the target node is unobserved (but knowing the context). That makes it possible to define the profile when the target variable is unobserved.

  • Variation: Measure indicating the variation between the a priori modal value and the modal value when we know what the value of the target variable is. The used formula is: log2(P(X=modalvalue))+log2(P(X=modalvalueTarget=observedvalue))-log_2(P(X=modal value))+log_2(P(X=modal value|Target=observed value)). In Information Theory, this measure represents how many bits are won to represent the probability of X when the target value is known. Values printed in blue represent positive variations (the posterior probability of the modal value is greater than the prior one) while values printed in red represent negative variations. Obviously, no variation is reported if the posterior modal value is different from the prior modal value. The modal value is then displayed in blue.

  • Maximum Positive/Negative Variation: Measures indicating the states that have been the most impacted by the observation of the corresponding target state. The first one indicates the state with the greatest increase whereas the second one indicates the state with the greatest decrease.

These measures correspond to the longest (right and left) grey arrows in the monitor (Inference mode) showing the probability variation of states. These measures are particularly useful for the variables that have more than two states!

When a node has an associated comment, this node appears as a HTML link, and the comment can be displayed by pointing on the node.

Mapping

  • If the target variable is a hidden variable, as it is the case for example for the variable Cluster induced by Data Clustering, a button Mapping allows generating a mapping of the values of this variable.

Quadrants

A button Quadrants displays the Quadrant chart of the node's relative significance with the target relatively to each node's mean:

The points represent the variables. If a node has a color then the point will be displayed in this color. The node's name is displayed at the right of the point. When moving the mouse on the point, the coordinates are displayed in the top panel. The top panel shows the number of points displayed as well as the evidence context. It is possible to zoom in the chart as in the scatter of points.

The chart's contextual menu allows the user to:

  • Display or not the nodes' comment instead of their names
  • Copy the chart as image or as table of points (text or html)
  • Print the chart

The chart is automatically resized when the size of the window changes. This chart is cut into four quadrants whose separations are:

  • Along X-axis: the mean of the variables' means
  • Along Y-axis: the mean of the relative significance of the nodes with the target
  • Quadrant 1: Top right: contains the important variables greater than the mean
  • Quadrant 2: Bottom right: contains the important variables few or not important but greater than the mean
  • Quadrant 3: Bottom left: contains the few or not important variables that are below the mean
  • Quadrant 4: Top left: contains the important variables but that stay below the mean It is possible to save the image of the chart in a file with the corresponding button.

Copyright © 2025 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.