Pearson Correlation

Context

  • In BayesiaLab's approach to learning and analyzing Bayesian networks, statistical concepts play a secondary role compared to concepts from the field of Information Theory (see Key Concepts).
  • Nevertheless, statistical measures, such as correlation, can provide certain insights that are not available from non-statistical measures.

Definition

The Pearson Correlation Coefficient $r$ between two nodes $X$ and $Y$ is defined as the covariance of the two corresponding variables divided by the product of their standard deviations:

$$r = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y}$$

where the covariance is defined by:

$$\operatorname{cov}(X,Y) = \sum_{x,y} p(x,y) \times (V_x - v_X) \times (V_y - v_Y)$$

and the standard deviation:

$$\sigma_X = \sqrt{\sum_x p_x \times (V_x - v_X)^2}$$
  • $V_x$ is the value associated with state $x$.
  • $v_X$ is the Expected Value of node $X$.
  • $p_x$ is the marginal probability of state $x$ returned by the Bayesian network.
  • $p(x,y)$ is the joint probability of states $x$ and $y$ returned by the Bayesian network.
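
The definition can be checked by hand on a small joint probability table. The following is a minimal sketch in plain Python, not BayesiaLab code: the function name `pearson_from_joint` and the example table are assumptions made for illustration. It takes the joint probabilities $p(x,y)$ and the state values $V_x$ and $V_y$, derives the marginals and Expected Values, and then applies the covariance and standard deviation formulas.

```python
import math


def pearson_from_joint(joint, values_x, values_y):
    """Pearson correlation of X and Y from a joint probability table.

    joint[i][j] is p(x_i, y_j); values_x[i] and values_y[j] are the
    numerical values V_x and V_y associated with each state.
    (Hypothetical helper for illustration only.)
    """
    # Marginal probabilities p_x and p_y
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]

    # Expected values v_X and v_Y
    mean_x = sum(p * v for p, v in zip(p_x, values_x))
    mean_y = sum(p * v for p, v in zip(p_y, values_y))

    # Covariance: sum over all state pairs of p(x,y) * (V_x - v_X) * (V_y - v_Y)
    cov = sum(
        joint[i][j] * (values_x[i] - mean_x) * (values_y[j] - mean_y)
        for i in range(len(values_x))
        for j in range(len(values_y))
    )

    # Standard deviations: sqrt of the probability-weighted squared deviations
    sd_x = math.sqrt(sum(p * (v - mean_x) ** 2 for p, v in zip(p_x, values_x)))
    sd_y = math.sqrt(sum(p * (v - mean_y) ** 2 for p, v in zip(p_y, values_y)))

    return cov / (sd_x * sd_y)


# Illustrative joint distribution of two binary nodes with values {0, 1}
joint = [
    [0.4, 0.1],
    [0.1, 0.4],
]
r = pearson_from_joint(joint, [0, 1], [0, 1])
print(round(r, 3))
```

For this table, the covariance is 0.15 and both standard deviations are 0.5, so $r$ is approximately 0.6. The sketch does not guard against a zero standard deviation, which occurs when a node has all of its probability mass on a single value.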

Special Considerations

  • For calculating the Pearson Correlation $r$, BayesiaLab must use the values of node states.
  • In BayesiaLab, there are Discrete Nodes and Continuous Nodes with discretized numerical states. As a result, the value of a node's state may not always be apparent:
    • For Discrete Nodes and Continuous Nodes that have states with integer or real values, BayesiaLab uses these numerical values directly.
    • For Discrete Nodes and Continuous Nodes that have states without values, e.g., {red, green, blue}, BayesiaLab uses the indices of the states as values, i.e., {red, green, blue} would have the values {0, 1, 2} for the purpose of calculating $r$. Note that the index of states starts at 0.
    • For Continuous Nodes, BayesiaLab uses the mean value of each interval.
  • Please see Mean, Value, and Standard Deviations for a detailed discussion.
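
The first two value-assignment rules above can be illustrated with a short sketch. This is not BayesiaLab's implementation: `state_values` is a hypothetical helper, and treating a node as "without values" whenever any state is non-numeric is an assumption made for the example. Interval means for discretized Continuous Nodes are not covered here because they depend on the underlying data.

```python
def state_values(states):
    """Return the numerical value used for each state of a node.

    Numeric states are used directly. If any state is non-numeric
    (e.g., a label such as "red"), the zero-based state indices are
    used for all states instead. (Hypothetical helper; the fallback
    condition is an assumption.)
    """
    def is_number(s):
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(s, (int, float)) and not isinstance(s, bool)

    if all(is_number(s) for s in states):
        return [float(s) for s in states]
    return [float(i) for i in range(len(states))]


print(state_values(["red", "green", "blue"]))  # indices 0, 1, 2 are used
print(state_values([10, 20.5, 40]))            # numeric values are used directly
```

Because state indices stand in for values, the order of symbolic states affects the resulting correlation: reordering {red, green, blue} changes the values assigned to each state.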

Usage

  • To display the Pearson Correlation on the arcs of the network, select Menu > Analysis > Visual > Overall > Arc > Pearson Correlation or press the G key as a shortcut.
  • The width of each arc in the network is now proportional to the Pearson Correlation.
  • An additional control panel is available in the Toolbar, which allows you to define the Pearson Correlation threshold for the arcs.
  • When you move the slider or type in a specific value, BayesiaLab grays out all arcs that fall below that threshold.
  • Alternatively, you can use the previous and next buttons to step through the specific thresholds at which arcs are added and disappear respectively.
  • Furthermore, you can specify the following options in the control panel:
    • The negative-correlation mode displays only those arcs that have a negative correlation whose absolute value is at least the value specified as a threshold. So, in this mode, a threshold of 0.5 means that correlations in the range of $-1 \le r \le -0.5$ will be shown.
    • The absolute-value mode displays only those arcs that have a correlation with an absolute value of at least the one specified as a threshold. So, in this mode, a threshold of 0.5 means that correlations in the range of $-1 \le r \le -0.5$ and $0.5 \le r \le 1$ will be shown.
    • The positive-correlation mode displays only those arcs that have a positive correlation of at least the value specified as a threshold. So, in this mode, a threshold of 0.5 means that correlations in the range of $0.5 \le r \le 1$ will be shown.
  • Click the Arc Comment icon on the Toolbar to display the Pearson Correlation values as a comment label on each arc. Alternatively, you can select Menu > View > Show Arc Comments.
  • Positive and negative correlations are marked blue and red respectively. This color assignment reflects the convention followed in BayesiaLab.
  • By clicking the checkmark icon for validation, you can save all computed Pearson Correlation values as Arc Comments so that they will be retained even after this analysis concludes. This validation also saves the widths of the arcs as a graphical property.
⚠️ Note that once values have been saved as Arc Comments, they are merely static text labels, which will not be updated if the network changes.

  • Clicking the cancel icon concludes the analysis without saving any information from the analysis.
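
The three threshold modes in the control panel amount to simple range tests on the correlation value. The sketch below is illustrative only: the function `arc_is_shown`, the mode strings `"negative"`, `"absolute"`, and `"positive"`, and the arc names are hypothetical and are not BayesiaLab identifiers. Treating the threshold as inclusive is an assumption based on the ranges given for a threshold of 0.5.

```python
def arc_is_shown(r, threshold, mode):
    """Decide whether an arc with correlation r passes the threshold.

    mode is one of "negative", "absolute", or "positive"
    (hypothetical labels for the three control panel options).
    """
    if mode == "negative":
        return r <= -threshold       # e.g., -1 <= r <= -0.5
    if mode == "absolute":
        return abs(r) >= threshold   # e.g., r <= -0.5 or r >= 0.5
    if mode == "positive":
        return r >= threshold        # e.g., 0.5 <= r <= 1
    raise ValueError(f"unknown mode: {mode!r}")


# Made-up correlations for three arcs
arcs = {"A->B": -0.7, "B->C": 0.3, "C->D": 0.8}

for mode in ("negative", "absolute", "positive"):
    shown = [name for name, r in arcs.items() if arc_is_shown(r, 0.5, mode)]
    print(mode, shown)
```

With a threshold of 0.5, only `A->B` passes in the negative mode, only `C->D` passes in the positive mode, and both pass in the absolute mode. Arcs that fail the test correspond to the ones that are grayed out.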

Copyright © 2025 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.