Supervised Multivariate

Context

Algorithm Details & Recommendations

  • The Supervised Multivariate discretization algorithm focuses on representing the multivariate probabilistic dependencies involving a Target variable.

  • It utilizes Random Forests to find the most useful thresholds for predicting the Target variable.

  • Its function can be summarized as follows:

    • Data Perturbation generates a range of datasets.

    • For each perturbed dataset, a multivariate tree is learned to predict the Target variable using a subset of the variables. If a structure is already defined, it is used to bias the selection of variables for each dataset.

    • Extracting the most frequent thresholds produces the final discretization.

  • The Supervised Multivariate algorithm takes into account the Minimum Interval Weight and can improve the generalization capability of the model.

  • Being based on Random Forests, this algorithm is computationally expensive and stochastic by nature, so repeated runs on the same data can yield different thresholds.

  • After completing the Data Import Wizard, the Supervised Multivariate discretization algorithm is also available from Main Menu > Learning > Discretization.

  • Note that the Supervised Multivariate discretization algorithm is not available via Node Context Menu > Node Editor > States > Curve > Generate a Discretization.
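
The three steps summarized above (perturb the data, learn a tree per perturbed dataset on a subset of variables, keep the most frequent thresholds) can be illustrated with a minimal Python sketch. It is not BayesiaLab's implementation: the bootstrap resampling used as the perturbation, the Gini-based trees, the rounding used to count thresholds, and every function and parameter name (discretize, min_weight, n_thresholds, and so on) are assumptions made for illustration. The structure-based bias in variable selection is not modeled; variables are sampled uniformly.

```python
import random
from collections import Counter, defaultdict


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, variables, min_count):
    """Return the (variable, threshold) with the lowest weighted Gini, or None."""
    n = len(rows)
    best, best_score = None, gini(labels)
    for v in variables:
        values = sorted({r[v] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[v] <= t]
            right = [y for r, y in zip(rows, labels) if r[v] > t]
            # Assumed analogue of a minimum interval weight:
            # reject splits that leave too few samples on either side.
            if min(len(left), len(right)) < min_count:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (v, t), score
    return best


def grow(rows, labels, variables, depth, min_count, found):
    """Grow a small tree recursively, recording every threshold it uses."""
    if depth == 0 or len(set(labels)) < 2:
        return
    split = best_split(rows, labels, variables, min_count)
    if split is None:
        return
    v, t = split
    found.append((v, t))
    for side in (True, False):
        idx = [i for i, r in enumerate(rows) if (r[v] <= t) == side]
        grow([rows[i] for i in idx], [labels[i] for i in idx],
             variables, depth - 1, min_count, found)


def discretize(rows, labels, n_trees=30, n_vars=2, depth=2,
               min_weight=0.05, n_thresholds=2, precision=1, seed=0):
    """Return {variable index: sorted list of its most frequent thresholds}."""
    rng = random.Random(seed)
    n, all_vars = len(rows), list(range(len(rows[0])))
    counts = defaultdict(Counter)
    for _ in range(n_trees):
        # Step 1: data perturbation, here a bootstrap resample (an assumption).
        idx = [rng.randrange(n) for _ in range(n)]
        sample = [rows[i] for i in idx]
        target = [labels[i] for i in idx]
        # Step 2: learn a tree on a random subset of the variables.
        variables = rng.sample(all_vars, min(n_vars, len(all_vars)))
        found = []
        grow(sample, target, variables, depth, min_weight * n, found)
        # Round thresholds so that near-identical cut points are counted together.
        for v, t in found:
            counts[v][round(t, precision)] += 1
    # Step 3: keep the most frequent thresholds per variable.
    return {v: sorted(t for t, _ in c.most_common(n_thresholds))
            for v, c in counts.items()}


# Synthetic data: only variable 0 drives the target, with a true cut at 5.
data_rng = random.Random(1)
rows = [tuple(data_rng.uniform(0, 10) for _ in range(3)) for _ in range(150)]
labels = [int(r[0] > 5) for r in rows]
thresholds = discretize(rows, labels)
print(thresholds)
```

In this synthetic example only the first variable determines the target, so its most frequent thresholds should fall close to 5. Changing the seed changes the bootstrap samples and variable subsets, which mirrors the stochastic behavior noted above.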

Bayesia USA

info@bayesia.us

Bayesia S.A.S.

info@bayesia.com

Bayesia Singapore

info@bayesia.com.sg

Copyright © 2024 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.