Stratification
Context
- In many research questions, the cases of interest are very rare compared to regular observations.
- For example, when modeling fraud, the number of fraudulent transactions is presumably small compared to the number of legitimate transactions.
- As a result, it would be difficult for a learning algorithm to detect associations between nodes related to those rare instances of fraud.
- With Stratification, you can modify the probability distributions within nodes by creating internal weights for specific states, i.e., the rare but important states.
- The probability distributions that are modified in this way push the learning algorithm towards discovering a network that is structurally more complex and can, thus, better represent rare observations.
- However, once the structure is learned, the parameters, i.e., the Conditional Probability Tables, are estimated on the original, unstratified dataset.
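The idea of weighting rare states can be illustrated with a minimal Python sketch. It is not the software's internal implementation, which this page does not describe. It assumes that a state's weight is the ratio of its target proportion to its observed proportion; the function name `stratification_weights` and the sample data are hypothetical.

```python
from collections import Counter


def stratification_weights(values, target):
    """Return a weight per state so that weighted frequencies match `target`.

    Assumption for illustration: weight = target proportion / observed proportion.
    """
    counts = Counter(values)
    n = len(values)
    weights = {}
    for state, p_target in target.items():
        if counts[state] == 0:
            # A state that never occurs cannot be reweighted to a positive share.
            raise ValueError(f"state {state!r} does not occur in the data")
        p_observed = counts[state] / n
        weights[state] = p_target / p_observed
    return weights


# Hypothetical data: 2 fraudulent transactions out of 100.
transactions = ["fraud"] * 2 + ["legitimate"] * 98
weights = stratification_weights(
    transactions, {"fraud": 0.5, "legitimate": 0.5}
)

# Weighted share of each state after applying the weights.
total = sum(weights[v] for v in transactions)
weighted_share = {
    state: sum(weights[v] for v in transactions if v == state) / total
    for state in weights
}
```

In this sketch each fraudulent record counts 25 times as much as before, so both states end up with an equal weighted share. This only illustrates the reweighting used for structure learning; as noted above, the parameters are still estimated on the original, unstratified data.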
Usage
- Select the nodes to be stratified.
- Go to Menu > Learning > Stratification.
- A dialog box opens in which you can specify the proportions of each state of the selected nodes.
- The marginal distributions of the selected nodes are shown in separate panels.
- At the bottom of each panel, the Entropy values that correspond to the distributions are displayed.
- Move the sliders to set the proportions to the desired levels or type in the percentages directly.
- As you change the probabilities, the Entropy values are updated.
- Once you confirm the probabilities by clicking OK, the Stratification is set.
- All stratified nodes are now marked with the Stratification indicator.
- Additionally, the database icon in the Status Bar is tagged with a Stratification icon.
- You can remove the Stratification by right-clicking on the icon in the Status Bar and then selecting Remove Stratification from the Contextual Menu.
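To see why the Entropy values in the dialog change as the proportions are adjusted, the following Python sketch computes the Shannon entropy of a marginal distribution. It assumes entropy is measured in bits (logarithm base 2); the function name `entropy_bits` and the example distributions are hypothetical.

```python
import math


def entropy_bits(distribution):
    """Shannon entropy in bits; zero-probability states contribute nothing."""
    return -sum(p * math.log2(p) for p in distribution if p > 0)


# Hypothetical two-state node before and after setting equal proportions.
original = entropy_bits([0.02, 0.98])
stratified = entropy_bits([0.5, 0.5])
```

Here the original, highly skewed distribution has an entropy of roughly 0.14 bits, while the balanced distribution reaches the two-state maximum of 1 bit. Moving the proportions toward balance therefore raises the displayed Entropy.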