Chapter 4: Analyzing a Bayesian network
The readability of Bayesian networks is often put forward as a major advantage. In practice, however, Bayesian networks are not that easy to read: their readability depends mainly on the number of nodes and arcs, and an arc between two variables only tells us that the two connected nodes are probably dependent.
To improve the readability of Bayesian networks, BayesiaLab provides powerful automatic positioning algorithms and a complete and original analysis toolbox.
4.1 Automatic positioning
(View menu, Automatic positioning sub-menu, Dynamic positioning item).
(View menu, Automatic positioning sub-menu, Genetic algorithm item).
Two algorithms are available to improve the readability of the graph layout:
- A dynamic algorithm, particularly efficient on tree-like structures and weakly connected Bayesian networks, that takes into account:
  - the relationships between the nodes (parents tend to be placed above their children),
  - the crossings between arcs, which it tries to avoid,
  - the force of the arcs (cf. 4.2),
  - the weight of the nodes, defined by the number of children and parents of each node.
- A genetic algorithm for highly connected Bayesian networks. This algorithm can take into account the following factors, whose relative weights can be set in the preferences:
  - the relationships between the nodes (parents tend to be placed above their children),
  - the overlapping of the nodes,
  - the force of the arcs (cf. 4.2),
  - the intersections of the arcs with other arcs and with the nodes.
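As an illustration of the first criterion only (parents above their children), the sketch below places each node one row below its deepest parent. It is not BayesiaLab's positioning algorithm: the function names, the spacing values and the arcs of the example network are assumptions made for this example.

```python
def layers(parents):
    """Return the row of each node: 0 for roots, else 1 + deepest parent row.

    `parents` maps each node name to the list of its parents (a DAG).
    """
    memo = {}

    def depth(node):
        if node not in memo:
            ps = parents[node]
            memo[node] = 0 if not ps else 1 + max(depth(p) for p in ps)
        return memo[node]

    return {node: depth(node) for node in parents}


def positions(parents, dx=100, dy=80):
    """Assign (x, y) screen coordinates; y grows downward, so parents
    always end up above their children."""
    next_col = {}
    pos = {}
    # Sort by row, then by name, to get a deterministic left-to-right order.
    for node, row in sorted(layers(parents).items(), key=lambda kv: (kv[1], kv[0])):
        col = next_col.get(row, 0)
        next_col[row] = col + 1
        pos[node] = (col * dx, row * dy)
    return pos


# Hypothetical arcs, chosen only for illustration.
example_parents = {
    "Smoker": [],
    "Cancer": ["Smoker"],
    "Bronchitis": ["Smoker"],
    "XRay": ["Cancer"],
    "Dyspnea": ["Cancer", "Bronchitis"],
}

for name, (x, y) in positions(example_parents).items():
    print(f"{name:10s} x={x:3d} y={y:3d}")
```

A real layout would also have to handle the other criteria listed above (arc crossings, arc force, node weight), which this sketch ignores.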
4.2 Analysis of the arcs
(Validation mode, Inference menu, Arc Analysis item).
This global analysis tool highlights the role that each arc plays in the overall structure. The thickness of an arc is proportional to the strength of the probabilistic relation that it represents in the joint probability distribution. The automatic positioning algorithms use this information: a short arc indicates a strong probabilistic relation.
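This section does not give the formula behind the arc strength. One plausible way to quantify it, shown here for a single arc X -> Y, is the Kullback-Leibler divergence between the joint distribution with the arc and the joint distribution obtained once the arc is removed. The sketch below rests on that assumption; the function names, the probability values and the linear thickness mapping are hypothetical.

```python
import math


def arc_strength(p_parent, cpt):
    """KL divergence (in bits) between the joint P(X, Y) defined by the arc
    X -> Y and the joint without the arc, P(X) * P(Y).

    p_parent[i] = P(X = i); cpt[i][j] = P(Y = j | X = i).
    """
    # Marginal of the child, obtained by summing out the parent.
    p_child = [
        sum(p_parent[i] * cpt[i][j] for i in range(len(p_parent)))
        for j in range(len(cpt[0]))
    ]
    kl = 0.0
    for i, px in enumerate(p_parent):
        for j, py in enumerate(p_child):
            joint = px * cpt[i][j]
            if joint > 0:
                kl += joint * math.log2(joint / (px * py))
    return kl


def thickness(strength, max_strength, min_px=1.0, max_px=10.0):
    """Map a strength to a line width, linearly between min_px and max_px."""
    if max_strength == 0:
        return min_px
    return min_px + (max_px - min_px) * strength / max_strength


# Hypothetical parameters: P(Smoker) and two conditional tables.
p_smoker = [0.7, 0.3]                    # [no, yes]
cpt_strong = [[0.95, 0.05], [0.4, 0.6]]  # child depends strongly on Smoker
cpt_weak = [[0.90, 0.10], [0.85, 0.15]]  # child depends weakly on Smoker

strong = arc_strength(p_smoker, cpt_strong)
weak = arc_strength(p_smoker, cpt_weak)
print(thickness(strong, strong), thickness(weak, strong))
```

With this measure, an arc whose conditional table has identical rows gets a strength of zero, since removing it does not change the joint distribution.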
4.3 Analysis of the target node
(Validation mode, Inference menu, Target Node Analysis item).
This second analysis tool is more local than the arc analysis, since it focuses on the target variable. The target node analysis shows how much information each node contributes to the knowledge of the target node. The lighter the shade of the square inside the node, the greater the amount of information it carries.
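A common way to measure the information that a node brings about a target is the mutual information between the two variables, expressed as a fraction of the target's entropy. The sketch below ranks three nodes that way. It assumes this measure for illustration only (the text above does not name the one used by BayesiaLab), and the joint tables are invented.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def mutual_information(joint):
    """Mutual information in bits, from a table joint[x][t] = P(X=x, T=t)."""
    px = [sum(row) for row in joint]
    pt = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log2(p / (px[i] * pt[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


# Hypothetical joint tables P(node, Cancer); columns are Cancer = [no, yes].
joints = {
    "Smoker": [[0.68, 0.02], [0.22, 0.08]],
    "XRay": [[0.85, 0.01], [0.05, 0.09]],
    "Age": [[0.27, 0.03], [0.63, 0.07]],  # independent of Cancer
}

h_target = entropy([0.90, 0.10])  # entropy of Cancer, same marginal in all tables
ranking = sorted(joints, key=lambda n: mutual_information(joints[n]), reverse=True)
for node in ranking:
    share = mutual_information(joints[node]) / h_target
    print(f"{node:8s} {share:6.1%} of the target's uncertainty")
```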
4.4 Analysis of the target state
(Validation mode, Inference menu, Target State Analysis item).
BayesiaLab has a third analysis tool, even more local than the target node analysis, since it focuses on a single state of the target variable. For each node, this analysis shows at a glance two pieces of information about its probabilistic relation with the target:
- the type of influence of the variable on a particular state of the target variable, shown by the symbol inside the node,
- the information gain brought by the node to the knowledge of the target state, shown by the lightness of the symbol.
The symbol inside the node sketches how the conditional probability of the target state evolves across the modalities of the node. The exact evolution is available through the node contextual menu “Influence analysis wrt target modality”.
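The evolution summarized by the symbol can be sketched as follows: for each modality of a node, compute the conditional probability of the target state, then label the overall trend. The function names, the trend labels and the joint table are assumptions made for this example, and the modalities are assumed to be ordered.

```python
def target_state_profile(joint, state):
    """P(target = state | node = modality) for each modality of the node.

    joint[x][t] = P(node = x, target = t); every row must have a
    positive total probability.
    """
    return [row[state] / sum(row) for row in joint]


def trend(profile, tol=1e-9):
    """Label how the probability evolves across the ordered modalities."""
    diffs = [b - a for a, b in zip(profile, profile[1:])]
    if all(d > tol for d in diffs):
        return "increasing"
    if all(d < -tol for d in diffs):
        return "decreasing"
    if all(abs(d) <= tol for d in diffs):
        return "flat"
    return "non-monotonic"


# Hypothetical joint P(Age, Cancer); rows are young, adult, senior,
# columns are Cancer = [no, yes].
age_cancer = [[0.29, 0.01], [0.38, 0.02], [0.23, 0.07]]

profile = target_state_profile(age_cancer, state=1)  # target state: Cancer = yes
print([round(p, 3) for p in profile], trend(profile))
```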
4.5 Target Analysis Report
(Validation mode, Inference menu, Target Analysis Report item).
Besides these three analysis tools, BayesiaLab can also generate an analysis report for a target node. This HTML report contains:
- A description of the observed variables when the analysis is carried out.
- The probability distribution of the target variable knowing the observed variables (context).
- The list of nodes that have a probabilistic dependence with the target, sorted in descending order of their relative contribution to the knowledge of the target variable.
- For each value of the target, the list of nodes that have a probabilistic dependence with the target, sorted in descending order of their relative contribution to the knowledge of that target value.
- A profile for each value of the target, described by the modal value of each influencing node. These profiles are compared with the a priori modal values of the nodes, i.e. their modal values when the target variable is unobserved.
The analysis report below was generated on our physician's Bayesian network. The analysis is restricted to young patients, and the target node is Cancer.
4.6 Evidence Analysis Report
(Validation mode, Inference menu, Evidence Analysis Report item).
BayesiaLab can generate a second report dedicated to the analysis of the evidence set. The goal of this analysis is to determine whether the pieces of evidence contradict each other or are in full agreement. This report, which requires a root evidence to be specified as the reference conclusion, contains:
- The context of the analysis.
- A global measure that indicates whether the observations are contradictory.
- An analysis of each piece of evidence with respect to the root evidence: the evidence that confirms the root evidence, the evidence that contradicts it, and the evidence that is neutral.
The report below corresponds to an evidence analysis on a patient suffering from Dyspnea (root evidence). It indicates a global contradiction in the evidence set, which splits into two subsets: smoker and Dyspnea on one side, no bronchitis and normal XRay on the other.
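The text does not specify the global measure used in the report. A classical choice in the Bayesian network literature is the conflict measure, which compares the product of the marginal probabilities of the observations with their joint probability: a positive value suggests that the observations are less likely together than they would be if independent. The sketch below uses that measure as an assumption, together with a simple confirm/contradict test against the root evidence; all probabilities and function names are hypothetical.

```python
import math


def conflict(marginals, joint_prob):
    """log2( P(e_1) * ... * P(e_n) / P(e_1, ..., e_n) ).

    A value above 0 suggests contradictory evidence; below 0, evidence
    that is more likely together than if it were independent.
    """
    return math.log2(math.prod(marginals) / joint_prob)


def relation_to_root(p_root, p_root_given_e, tol=1e-9):
    """Classify one piece of evidence by its effect on the root evidence."""
    if p_root_given_e > p_root + tol:
        return "confirms"
    if p_root_given_e < p_root - tol:
        return "contradicts"
    return "neutral"


# Hypothetical marginal probabilities of the four observations:
# smoker, Dyspnea, no bronchitis, normal XRay, and their joint probability.
score = conflict([0.30, 0.40, 0.55, 0.86], joint_prob=0.02)
print(f"conflict = {score:.2f} bits")

# Hypothetical values of P(Dyspnea | evidence), compared with P(Dyspnea) = 0.40.
p_dyspnea = 0.40
for name, p in {"smoker": 0.55, "no bronchitis": 0.15, "normal XRay": 0.37}.items():
    print(name, relation_to_root(p_dyspnea, p))
```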
4.7 Causal analysis
(Validation mode, Inference menu, Show the edges item).
(Validation mode, Inference menu, Global Edges Orientation item).
The last analysis tool deals with causal analysis. It removes the orientation of every arc whose direction can be reversed without changing the joint probability distribution. For example, with a database that records only whether patients smoke and whether they suffer from cancer, the data alone cannot tell which is the cause and which is the consequence; all that can be established is a probabilistic dependence between the two variables. However, since Bayesian networks are directed graphs, the learning algorithms have to assign an orientation to every relation, and some of these orientations are chosen at random. Displaying the edges therefore highlights the arcs whose orientation does not result from a random choice (cf. 3.3).
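The smoker and cancer example can be checked numerically: the network Smoker -> Cancer and the reversed network Cancer -> Smoker encode exactly the same joint distribution, so data cannot distinguish them. The sketch below reverses the arc with Bayes' rule and compares the two joints; the probability values and function names are assumptions made for this example.

```python
import math


def joint_from(p_parent, cpt):
    """joint[i][j] = P(parent = i) * P(child = j | parent = i)."""
    return [[p_parent[i] * pj for pj in cpt[i]] for i in range(len(p_parent))]


def reverse_arc(p_parent, cpt):
    """Parameters of the reversed network child -> parent (Bayes' rule).

    Assumes every child state has a positive marginal probability.
    """
    joint = joint_from(p_parent, cpt)
    p_child = [sum(col) for col in zip(*joint)]
    cpt_rev = [
        [joint[i][j] / p_child[j] for i in range(len(p_parent))]
        for j in range(len(p_child))
    ]
    return p_child, cpt_rev


# Hypothetical parameters for Smoker -> Cancer.
p_smoker = [0.7, 0.3]                         # [no, yes]
cancer_given_smoker = [[0.97, 0.03], [0.75, 0.25]]

forward = joint_from(p_smoker, cancer_given_smoker)     # indexed [smoker][cancer]
p_cancer, smoker_given_cancer = reverse_arc(p_smoker, cancer_given_smoker)
backward = joint_from(p_cancer, smoker_given_cancer)    # indexed [cancer][smoker]

same = all(
    math.isclose(forward[s][c], backward[c][s], abs_tol=1e-12)
    for s in range(2)
    for c in range(2)
)
print("Same joint distribution in both orientations:", same)
```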




