Jointly Distributed Random Variables

When you have a single random variable $X$, its distribution tells you the probability of any event of the form $\{X \in B\}$ for Borel sets $B \subseteq \mathbb{R}$. But two random variables $X$ and $Y$ defined on the same probability space can depend on each other in ways their individual distributions never reveal: knowing $X$ can change what you expect from $Y$. To capture this, you need their joint distribution: the probability measure that the pair $(X, Y)$ induces on $\mathbb{R}^2$.

Joint distribution and joint CDF

Let $X, Y : \Omega \to \mathbb{R}$ be random variables on the same probability space $(\Omega, \mathcal{F}, P)$. The pair $(X, Y)$ is naturally viewed as a single random vector taking values in $\mathbb{R}^2$. Its joint distribution is the push-forward measure $P_{(X,Y)}$ on $(\mathbb{R}^2, \mathcal{B}(\mathbb{R}^2))$:

$$P_{(X,Y)}(A) \coloneqq P\bigl(\{\omega : (X(\omega), Y(\omega)) \in A\}\bigr), \quad A \in \mathcal{B}(\mathbb{R}^2).$$

The joint cumulative distribution function (joint CDF) is

$$F_{X,Y}(x, y) \coloneqq P(X \leq x,\, Y \leq y), \quad (x, y) \in \mathbb{R}^2.$$

It is non-decreasing in each argument, right-continuous in each argument, tends to $0$ when either argument goes to $-\infty$, and tends to $1$ as both arguments go to $+\infty$. The probability of a rectangle $(a, b] \times (c, d]$ is recovered by inclusion-exclusion:

$$P(a < X \leq b,\, c < Y \leq d) = F_{X,Y}(b, d) - F_{X,Y}(a, d) - F_{X,Y}(b, c) + F_{X,Y}(a, c).$$
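As a minimal sketch of the rectangle formula, the Python snippet below evaluates the four-term inclusion-exclusion expression for a given joint CDF. The function names `rectangle_probability`, `clamp01` and `uniform_square_cdf` are illustrative, and the CDF used here (a point chosen uniformly from the unit square, for which the joint CDF is the product of the two clamped coordinates) is an assumed example rather than part of the definition.

```python
def rectangle_probability(cdf, a, b, c, d):
    """P(a < X <= b, c < Y <= d) from a joint CDF via inclusion-exclusion."""
    return cdf(b, d) - cdf(a, d) - cdf(b, c) + cdf(a, c)


def clamp01(t):
    """Clamp t to the interval [0, 1]."""
    return min(max(t, 0.0), 1.0)


def uniform_square_cdf(x, y):
    """Joint CDF of a point uniform on [0, 1]^2 (assumed example)."""
    return clamp01(x) * clamp01(y)


# The rectangle (0.25, 0.75] x (0.5, 1.0] has area 0.5 * 0.5 = 0.25,
# which is its probability under the uniform distribution on the square.
prob = rectangle_probability(uniform_square_cdf, 0.25, 0.75, 0.5, 1.0)
print(prob)
```

Any function satisfying the CDF properties listed above can be passed as `cdf`; the formula itself does not depend on the uniform example.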

Discrete case: joint PMF

$(X, Y)$ is jointly discrete if there is a countable set $S \subseteq \mathbb{R}^2$ such that $P((X,Y) \in S) = 1$. The joint probability mass function (joint PMF) is

$$p_{X,Y}(x, y) \coloneqq P(X = x,\, Y = y).$$

It satisfies $p_{X,Y}(x,y) \geq 0$ for all $(x,y)$ and

$$\sum_{(x,y) \in S} p_{X,Y}(x, y) = 1.$$

Example. Roll two fair dice independently. Let $X$ be the result of the first die and $Y$ the result of the second. The joint PMF is $p_{X,Y}(i, j) = \frac{1}{36}$ for $i, j \in \{1, \ldots, 6\}$ and $0$ otherwise.
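The dice example can be checked with a short Python sketch. It assumes the joint PMF is stored as a dictionary keyed by outcome pairs; the variable names and the event "the two dice sum to 7" are illustrative choices, not part of the example above.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two fair dice, stored as {(i, j): probability}.
faces = range(1, 7)
joint_pmf = {(i, j): Fraction(1, 36) for i, j in product(faces, faces)}

# A valid joint PMF is non-negative and sums to 1 over its support.
assert all(p >= 0 for p in joint_pmf.values())
total = sum(joint_pmf.values())

# Probability of an event = sum of the PMF over the outcomes in the event.
prob_sum_is_7 = sum(p for (i, j), p in joint_pmf.items() if i + j == 7)

print(total, prob_sum_is_7)
```

Using `Fraction` keeps the arithmetic exact, so the normalization check is an equality rather than a floating-point approximation.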

Absolutely continuous case: joint PDF

$(X, Y)$ is jointly absolutely continuous if there exists a non-negative measurable function $f_{X,Y} : \mathbb{R}^2 \to [0, \infty)$ such that for every Borel set $A \subseteq \mathbb{R}^2$:

$$P\bigl((X, Y) \in A\bigr) = \iint_A f_{X,Y}(x, y) \, dx \, dy.$$

The function $f_{X,Y}$ is the joint probability density function (joint PDF). It satisfies

$$\iint_{\mathbb{R}^2} f_{X,Y}(x, y) \, dx \, dy = 1.$$

The joint CDF is recovered by integration:

$$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(s, t) \, dt \, ds,$$

and when $f_{X,Y}$ is continuous, $\dfrac{\partial^2}{\partial x \, \partial y} F_{X,Y}(x,y) = f_{X,Y}(x, y)$.

Example. The uniform distribution on the unit square has joint PDF $f_{X,Y}(x, y) = 1$ for $(x, y) \in [0,1]^2$ and $0$ elsewhere. Then $P(X \leq \tfrac{1}{2},\, Y \leq \tfrac{1}{2}) = \tfrac{1}{4}$.
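The value $\tfrac{1}{4}$ follows from the CDF formula, because the density vanishes outside the unit square. Written out as a worked step:

$$F_{X,Y}\bigl(\tfrac{1}{2}, \tfrac{1}{2}\bigr) = \int_{-\infty}^{1/2} \int_{-\infty}^{1/2} f_{X,Y}(s, t) \, dt \, ds = \int_{0}^{1/2} \int_{0}^{1/2} 1 \, dt \, ds = \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{4}.$$

The same computation gives $F_{X,Y}(x, y) = xy$ for every $(x, y) \in [0,1]^2$, and differentiating once in each variable returns the density $1$.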

Marginal distributions

Given the joint distribution, you can recover the distribution of each variable individually by integrating out (or summing out) the other variable. These are called the marginal distributions.

Discrete case

$$p_X(x) = P(X = x) = \sum_{y} p_{X,Y}(x, y), \qquad p_Y(y) = P(Y = y) = \sum_{x} p_{X,Y}(x, y).$$
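To make summing out concrete, here is a small Python sketch. It uses an assumed example that does not appear in the text: roll two fair dice, let $X$ be the first die and $M$ the larger of the two results. The helper name `marginals` is illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def marginals(joint_pmf):
    """Return (p_X, p_Y), each obtained by summing the joint PMF over the other variable."""
    p_x, p_y = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint_pmf.items():
        p_x[x] += p  # sum over y for fixed x
        p_y[y] += p  # sum over x for fixed y
    return dict(p_x), dict(p_y)


# Assumed example: X = first die, M = maximum of the two dice.
joint_pmf = defaultdict(Fraction)
for i, j in product(range(1, 7), repeat=2):
    joint_pmf[(i, max(i, j))] += Fraction(1, 36)

p_x, p_m = marginals(joint_pmf)
print(p_x[3], p_m[6])
```

In this example the marginal of $X$ is uniform on $\{1, \ldots, 6\}$, while the marginal of $M$ is not: $P(M = m) = \frac{2m - 1}{36}$.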

Absolutely continuous case

$$f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y) \, dy, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y) \, dx.$$

In either case, the marginal CDF satisfies $F_X(x) = \lim_{y \to +\infty} F_{X,Y}(x, y)$, and symmetrically $F_Y(y) = \lim_{x \to +\infty} F_{X,Y}(x, y)$, consistent with both formulas.
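As a worked illustration of integrating out a variable, take the assumed density $f_{X,Y}(x, y) = x + y$ on $[0,1]^2$ and $0$ elsewhere (an example chosen here, not one from the text). It is a valid joint PDF because

$$\int_0^1 \int_0^1 (x + y) \, dy \, dx = \int_0^1 \bigl(x + \tfrac{1}{2}\bigr) \, dx = \tfrac{1}{2} + \tfrac{1}{2} = 1,$$

and the marginal density of $X$ is

$$f_X(x) = \int_{0}^{1} (x + y) \, dy = x + \tfrac{1}{2} \quad \text{for } x \in [0, 1], \qquad f_X(x) = 0 \text{ otherwise}.$$

By symmetry, $f_Y(y) = y + \tfrac{1}{2}$ on $[0, 1]$.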

Expectation of functions of two variables

For a jointly absolutely continuous pair $(X, Y)$ with joint PDF $f_{X,Y}$, the expectation of a measurable function $g : \mathbb{R}^2 \to \mathbb{R}$ is

$$E[g(X, Y)] = \iint_{\mathbb{R}^2} g(x, y)\, f_{X,Y}(x, y)\, dx\, dy,$$

provided the integral converges absolutely. In the discrete case, replace the integral with $\sum_{(x,y) \in S} g(x,y)\, p_{X,Y}(x,y)$. This is the two-variable extension of the law of the unconscious statistician (LOTUS) from Expectation.
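The discrete formula translates almost word for word into code. The sketch below reuses the joint PMF of two fair dice; the helper name `expectation` and the two choices of $g$ are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product


def expectation(g, joint_pmf):
    """E[g(X, Y)] as the sum over the support of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())


# Joint PMF of two independent fair dice: 1/36 on each pair (i, j).
joint_pmf = {(i, j): Fraction(1, 36) for i, j in product(range(1, 7), repeat=2)}

mean_sum = expectation(lambda x, y: x + y, joint_pmf)      # E[X + Y]
mean_product = expectation(lambda x, y: x * y, joint_pmf)  # E[XY]
print(mean_sum, mean_product)
```

Note that the distribution of $g(X, Y)$ itself is never computed; the sum runs directly over the joint PMF.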

The joint carries more information than the marginals

Knowing both marginal distributions $P_X$ and $P_Y$ does not determine the joint distribution $P_{(X,Y)}$.

Counterexample. Let $U \sim \operatorname{Uniform}(0, 1)$ and define two pairs:

  • Pair 1: $(X_1, Y_1) = (U, U)$, so the two coordinates are always equal.
  • Pair 2: $(X_2, Y_2) = (U, 1 - U)$, so the two coordinates always sum to $1$.

All four variables $X_1, Y_1, X_2, Y_2$ have the same $\operatorname{Uniform}(0,1)$ distribution, because $1 - U$ is again uniform on $(0, 1)$, so the two pairs have identical marginals. Yet the joint distributions differ: the first is concentrated on the diagonal, $P(X_1 = Y_1) = 1$, while the second is concentrated on the anti-diagonal, $P(X_2 + Y_2 = 1) = 1$, and therefore $P(X_2 = Y_2) = P(U = \tfrac{1}{2}) = 0$. The joint law encodes the dependence structure, which is information the marginals throw away.
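A quick simulation makes the counterexample tangible. This is a sketch under stated assumptions: the sample size, the seed, the evaluation point $(\tfrac{1}{2}, \tfrac{1}{2})$ and the helper name `empirical_cdf` are all illustrative choices. The exact values it approximates are $P(U \leq \tfrac{1}{2}) = \tfrac{1}{2}$ for the joint CDF of pair 1 and $P(U \leq \tfrac{1}{2},\, 1 - U \leq \tfrac{1}{2}) = 0$ for pair 2.

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 100_000
us = [rng.random() for _ in range(n)]

pair1 = [(u, u) for u in us]        # (X1, Y1) = (U, U)
pair2 = [(u, 1.0 - u) for u in us]  # (X2, Y2) = (U, 1 - U)


def empirical_cdf(pairs, x, y):
    """Fraction of sampled pairs with X <= x and Y <= y."""
    return sum(1 for a, b in pairs if a <= x and b <= y) / len(pairs)


big = float("inf")
# Marginal CDF of Y at 1/2 (x taken to +infinity): close to 0.5 for both pairs.
marg1 = empirical_cdf(pair1, big, 0.5)
marg2 = empirical_cdf(pair2, big, 0.5)
# Joint CDF at (1/2, 1/2): close to 0.5 for pair 1, close to 0 for pair 2.
joint1 = empirical_cdf(pair1, 0.5, 0.5)
joint2 = empirical_cdf(pair2, 0.5, 0.5)
print(marg1, marg2, joint1, joint2)
```

The marginal estimates agree up to sampling noise, while the joint estimates do not, which is exactly the gap between marginals and joint law.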

Summary

  • The joint distribution of $(X,Y)$ is the push-forward measure $P_{(X,Y)}$ on $\mathbb{R}^2$; the joint CDF is $F_{X,Y}(x,y) = P(X \leq x, Y \leq y)$.
  • For jointly discrete $(X,Y)$: the joint PMF $p_{X,Y}(x,y) = P(X=x, Y=y)$ sums to $1$; marginals are obtained by summing over the other variable.
  • For jointly absolutely continuous $(X,Y)$: the joint PDF $f_{X,Y}$ integrates to $1$; marginals are obtained by integrating out the other variable: $f_X(x) = \int f_{X,Y}(x,y)\,dy$.
  • $E[g(X,Y)] = \iint g\, f_{X,Y}\,dx\,dy$ (absolutely continuous case) or $\sum g\, p_{X,Y}$ (discrete case).
  • The joint distribution is strictly richer than the pair of marginals: different joint laws can share identical marginals.