Joint Probability & Joint Probability Distribution (JPD)

Definition

  • A Joint Probability is the probability of specific values of variables jointly occurring in a domain.
  • A Joint Probability Distribution assigns a Joint Probability to every possible combination of values of the variables in a domain.

Example

  • We observe the variables $Hair\,Color$ and $Eye\,Color$ in a population of college students.

  • Joint Probability refers to the probability of specific values for $Hair\,Color$ and $Eye\,Color$ jointly occurring in this population.

  • For instance,

    • $P(Eye\,Color=Blue,\, Hair\,Color=Blond)=15.88\%$ means that the probability of a student having blue eyes and blond hair in the given population is 15.88%.
    • $P(Eye\,Color=Green,\, Hair\,Color=Black)=0.84\%$ means that the probability of having green eyes and black hair in that population is only 0.84%.
  • We can now look across all possible combinations of $Hair\,Color$ and $Eye\,Color$, compute all Joint Probabilities, and list them in a Joint Probability Table, with one row for each combination of the states of the variables.

  • In this example, the size of the Joint Probability Table is manageable:

    $\text{Number of States}(Hair\,Color) \times \text{Number of States}(Eye\,Color) = 4 \times 4 = 16$

  • This Joint Probability Table is a direct and complete representation of the Joint Probability Distribution for the variables $Hair\,Color$ and $Eye\,Color$:

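    | Hair Color | Eye Color | Joint Probability |
    |------------|-----------|-------------------|
    | Black      | Brown     | 11.49%            |
    | Brown      | Brown     | 20.10%            |
    | Red        | Brown     | 4.39%             |
    | Blond      | Brown     | 1.18%             |
    | Black      | Blue      | 3.38%             |
    | Brown      | Blue      | 14.19%            |
    | Red        | Blue      | 2.87%             |
    | Blond      | Blue      | 15.88%            |
    | Black      | Hazel     | 2.53%             |
    | Brown      | Hazel     | 9.12%             |
    | Red        | Hazel     | 2.36%             |
    | Blond      | Hazel     | 1.69%             |
    | Black      | Green     | 0.84%             |
    | Brown      | Green     | 4.90%             |
    | Red        | Green     | 2.36%             |
    | Blond      | Green     | 2.70%             |
    | Sum        |           | 100.00%           |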
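As an illustration, the Joint Probability Table above can be held in a plain Python dictionary with one entry per combination of states. This is only a minimal sketch: the names `HAIR`, `EYE`, `PERCENT`, and `build_joint_table` are hypothetical and not part of BayesiaLab, and the values are the rounded percentages copied from the table.

```python
from itertools import product

# States of the two variables, in the order used by the table.
HAIR = ["Black", "Brown", "Red", "Blond"]
EYE = ["Brown", "Blue", "Hazel", "Green"]

# Joint probabilities in percent, copied from the table.
# Each row is one eye color (EYE order); each column is one hair color (HAIR order).
PERCENT = [
    [11.49, 20.10, 4.39, 1.18],   # Brown eyes
    [3.38, 14.19, 2.87, 15.88],   # Blue eyes
    [2.53, 9.12, 2.36, 1.69],     # Hazel eyes
    [0.84, 4.90, 2.36, 2.70],     # Green eyes
]


def build_joint_table():
    """Return {(hair, eye): probability}, one entry per combination of states."""
    return {
        (hair, eye): PERCENT[i][j] / 100.0
        for (i, eye), (j, hair) in product(enumerate(EYE), enumerate(HAIR))
    }


joint = build_joint_table()
print(len(joint))                            # number of rows in the table
print(f"{sum(joint.values()):.4f}")          # close to 1; the inputs are rounded
print(f"{joint[('Blond', 'Blue')]:.2%}")     # one Joint Probability
```

Because the percentages are rounded to two decimals, the entries sum to slightly less than 1 rather than exactly 1.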

Relevance

  • As the Joint Probability Distribution covers all possible combinations, it represents all regularities and patterns (or the lack thereof) within a domain.
  • Knowing the Joint Probability Distribution is required for performing two key operations for data analysis and inference:
    • Marginalization, which is calculating the marginal probability of a variable by summing the Joint Probabilities over all states of the other variables, e.g., $P(Hair\,Color=Black)=18.24\%$.
    • Conditioning, which refers to inferring the probability of the values of a variable, given a specific value of another variable, e.g., $P(Hair\,Color=Blond \mid Eye\,Color=Blue)=43.7\%$.
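Both operations reduce to sums and one division over the Joint Probability Table from the Example section. The following minimal sketch assumes the rounded table values and uses hypothetical helper names (`marginal_hair`, `marginal_eye`, `conditional_hair_given_eye`); it is not how BayesiaLab itself performs inference.

```python
HAIR = ["Black", "Brown", "Red", "Blond"]
EYE = ["Brown", "Blue", "Hazel", "Green"]

# Rounded joint probabilities in percent from the Example table.
# Each row is one eye color (EYE order); each column is one hair color (HAIR order).
PERCENT = [
    [11.49, 20.10, 4.39, 1.18],
    [3.38, 14.19, 2.87, 15.88],
    [2.53, 9.12, 2.36, 1.69],
    [0.84, 4.90, 2.36, 2.70],
]

joint = {
    (hair, eye): PERCENT[i][j] / 100.0
    for i, eye in enumerate(EYE)
    for j, hair in enumerate(HAIR)
}


def marginal_hair(joint, hair):
    """Marginalization: sum the joint probabilities over all eye colors."""
    return sum(p for (h, _), p in joint.items() if h == hair)


def marginal_eye(joint, eye):
    """Marginalization: sum the joint probabilities over all hair colors."""
    return sum(p for (_, e), p in joint.items() if e == eye)


def conditional_hair_given_eye(joint, hair, eye):
    """Conditioning: P(hair | eye) = P(hair, eye) / P(eye)."""
    return joint[(hair, eye)] / marginal_eye(joint, eye)


print(f"{marginal_hair(joint, 'Black'):.2%}")
print(f"{conditional_hair_given_eye(joint, 'Blond', 'Blue'):.1%}")
```

With the rounded table values, the two results come out at roughly 18.2% and 43.7%, in line with the examples given for Marginalization and Conditioning.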

Challenge

  • In high-dimensional domains, however, calculating and listing the Joint Probabilities in a Joint Probability Table can become intractable.
  • The size of a Joint Probability Table grows exponentially with the number of variables. For example, 20 variables with 4 states each would require a Joint Probability Table of $4^{20} \approx 1.1$ trillion rows.
  • While the arithmetic is straightforward, the sheer number of calculations can easily exceed the available computational power, both for generating the Joint Probability Table and for performing Marginalization and Conditioning.
  • "The only way to deal with such large distributions is to constrain the nature of the variable interactions in some manner, both to render specification and ultimately inference in such systems tractable. The key idea is to specify which variables are independent of others, leading to a structured factorisation of the joint probability distribution. Bayesian Belief Networks are a convenient framework for representing such factorisations into local conditional distributions." (Barber, 2012)
  • This means that Bayesian networks are a practical way to represent and approximate Joint Probability Distributions in complex, high-dimensional problem domains.
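The growth described above is easy to check numerically. The sketch below compares the number of rows in a full Joint Probability Table with the number of entries needed by a factorised representation. The chain structure X1 → X2 → ... → Xn is purely an assumption chosen for illustration (each variable depends only on its predecessor), and both function names are hypothetical.

```python
def joint_table_rows(num_variables, states_per_variable):
    """Rows in a full Joint Probability Table: one per combination of states."""
    return states_per_variable ** num_variables


def chain_table_entries(num_variables, states_per_variable):
    """Total entries in the local tables of an assumed chain X1 -> X2 -> ... -> Xn.

    The first variable needs a marginal table with k entries; each of the
    remaining variables needs a conditional table with k * k entries.
    """
    k = states_per_variable
    return k + (num_variables - 1) * k * k


print(joint_table_rows(2, 4))       # the Hair Color / Eye Color example
print(joint_table_rows(20, 4))      # 20 variables with 4 states each
print(chain_table_entries(20, 4))   # same variables under the chain assumption
```

Under this assumed structure, the factorised representation needs a few hundred entries where the full table needs more than a trillion rows. Real networks have other structures, so the savings depend on how many parents each variable has.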

References

  • Barber, D. (2012). Bayesian Reasoning and Machine Learning. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511804779

Copyright © 2024 Bayesia S.A.S., Bayesia USA, LLC, and Bayesia Singapore Pte. Ltd. All Rights Reserved.