When you have a single random variable $X$, its distribution tells you the probability of any event of the form $\{X \in B\}$. But two random variables $X$ and $Y$ defined on the same probability space can depend on each other in ways their individual distributions never reveal: knowing $X$ can change what you expect from $Y$. To capture this, you need their joint distribution, the probability measure that the pair $(X, Y)$ induces on $\mathbb{R}^2$.
Joint distribution and joint CDF
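Let $X, Y : \Omega \to \mathbb{R}$ be random variables on the same probability space $(\Omega, \mathcal{F}, P)$. The pair $(X, Y)$ is naturally viewed as a single random vector taking values in $\mathbb{R}^2$. Its joint distribution is the push-forward measure $P_{(X,Y)}$ on $(\mathbb{R}^2, \mathcal{B}(\mathbb{R}^2))$:
$$P_{(X,Y)}(A) := P(\{\omega : (X(\omega), Y(\omega)) \in A\}), \quad A \in \mathcal{B}(\mathbb{R}^2).$$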
The joint cumulative distribution function (joint CDF) is
$$F_{X,Y}(x, y) := P(X \le x,\ Y \le y), \quad (x, y) \in \mathbb{R}^2.$$
It is non-decreasing in each argument, right-continuous in each argument, tends to $0$ when either argument goes to $-\infty$, and tends to $1$ as both arguments go to $+\infty$. The probability of a rectangle $(a, b] \times (c, d]$ is recovered by inclusion-exclusion:
$$P(a < X \le b,\ c < Y \le d) = F_{X,Y}(b, d) - F_{X,Y}(a, d) - F_{X,Y}(b, c) + F_{X,Y}(a, c).$$
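The rectangle formula can be checked numerically. The sketch below assumes, purely for illustration, that X and Y are independent Uniform(0,1) variables, so that the joint CDF is the product of the clipped coordinates; the function names are hypothetical and not part of the text.

```python
def joint_cdf(x: float, y: float) -> float:
    """Joint CDF of two independent Uniform(0,1) variables (assumed example)."""
    def clip(t: float) -> float:
        return min(max(t, 0.0), 1.0)
    return clip(x) * clip(y)


def rectangle_probability(F, a: float, b: float, c: float, d: float) -> float:
    """P(a < X <= b, c < Y <= d) by inclusion-exclusion on the joint CDF F."""
    return F(b, d) - F(a, d) - F(b, c) + F(a, c)


# Rectangle (0.25, 0.75] x (0.5, 1.0]: width 0.5 times height 0.5.
p = rectangle_probability(joint_cdf, 0.25, 0.75, 0.5, 1.0)
print(p)  # 0.25
```

For this assumed distribution the result equals the area of the rectangle, which is what a uniform law on the unit square should give.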
Discrete case: joint PMF
$(X, Y)$ is jointly discrete if there is a countable set $S \subseteq \mathbb{R}^2$ such that $P((X, Y) \in S) = 1$. The joint probability mass function (joint PMF) is
$$p_{X,Y}(x, y) := P(X = x,\ Y = y).$$
It satisfies $p_{X,Y}(x, y) \ge 0$ for all $(x, y)$ and
$$\sum_{(x,y) \in S} p_{X,Y}(x, y) = 1.$$
Example. Roll two fair dice independently. Let $X$ be the result of the first die and $Y$ the result of the second. The joint PMF is $p_{X,Y}(i, j) = \frac{1}{36}$ for $i, j \in \{1, \dots, 6\}$ and $0$ otherwise.
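The two-dice example is small enough to tabulate exactly. The following is a minimal sketch using exact fractions; the dictionary layout and the event "the sum is 7" are illustrative choices, not something the text prescribes.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Joint PMF of two independent fair dice: every ordered pair has mass 1/36.
joint_pmf = {(i, j): Fraction(1, 36) for i, j in product(faces, faces)}


def pmf(i: int, j: int) -> Fraction:
    """p_{X,Y}(i, j), with value 0 outside the support."""
    return joint_pmf.get((i, j), Fraction(0))


total = sum(joint_pmf.values())  # normalization check
p_sum_7 = sum(q for (i, j), q in joint_pmf.items() if i + j == 7)

print(total)    # 1
print(p_sum_7)  # 1/6
```

Any event about the pair is evaluated the same way: add up the joint masses of the outcomes it contains.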
Absolutely continuous case: joint PDF
$(X, Y)$ is jointly absolutely continuous if there exists a non-negative measurable function $f_{X,Y} : \mathbb{R}^2 \to [0, \infty)$ such that for every Borel set $A \subseteq \mathbb{R}^2$:
$$P((X, Y) \in A) = \iint_A f_{X,Y}(x, y)\,dx\,dy.$$
The function $f_{X,Y}$ is the joint probability density function (joint PDF). It satisfies
$$\iint_{\mathbb{R}^2} f_{X,Y}(x, y)\,dx\,dy = 1.$$
The joint CDF is recovered by integration:
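$$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(s, t)\,dt\,ds,$$
and when $f_{X,Y}$ is continuous, $\dfrac{\partial^2 F_{X,Y}}{\partial x\,\partial y}(x, y) = f_{X,Y}(x, y)$.
Example. The uniform distribution on the unit square has joint PDF $f_{X,Y}(x, y) = 1$ for $(x, y) \in [0, 1]^2$ and $0$ elsewhere. Then $P(X \le \tfrac{1}{2},\ Y \le \tfrac{1}{2}) = \tfrac{1}{4}$.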
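The value one quarter in this example can be reproduced by integrating the density numerically. The sketch below uses a plain midpoint Riemann sum, chosen only for simplicity; the helper names and the grid size are assumptions made for illustration.

```python
def uniform_square_pdf(x: float, y: float) -> float:
    """Joint PDF of the uniform distribution on the unit square [0, 1]^2."""
    return 1.0 if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0


def box_probability(pdf, x_lo, x_hi, y_lo, y_hi, n=300):
    """Midpoint Riemann sum of pdf over [x_lo, x_hi] x [y_lo, y_hi], n-by-n grid."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += pdf(x, y)
    return total * hx * hy


# The density vanishes outside the unit square, so the lower limit -1
# stands in for minus infinity in both coordinates.
approx = box_probability(uniform_square_pdf, -1.0, 0.5, -1.0, 0.5)
print(approx)  # close to 0.25
```

The same helper applied to a box that contains the whole unit square should return a value close to 1, which is the normalization condition for a joint PDF.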
Marginal distributions
Given the joint distribution, you can recover the distribution of each variable individually by integrating out (or summing out) the other variable. These are called the marginal distributions.
Discrete case
$$p_X(x) = P(X = x) = \sum_{y} p_{X,Y}(x, y), \qquad p_Y(y) = P(Y = y) = \sum_{x} p_{X,Y}(x, y).$$
Absolutely continuous case
$$f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dx.$$
The marginal CDF satisfies $F_X(x) = \lim_{y \to +\infty} F_{X,Y}(x, y)$, consistent with both formulas.
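Summing out a variable is easiest to see on a pair that is not independent. As an assumed illustration that does not appear in the text, take X to be the first of two fair dice and M the maximum of the two dice; the sketch builds the joint PMF of (X, M) and recovers both marginals from it.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Joint PMF of (X, M): X is the first die, M = max of both dice (assumed example).
joint_pmf = defaultdict(Fraction)
for i, j in product(faces, faces):
    joint_pmf[(i, max(i, j))] += Fraction(1, 36)

# Marginals: sum the joint PMF over the other coordinate.
marginal_x = defaultdict(Fraction)
marginal_m = defaultdict(Fraction)
for (x, m), q in joint_pmf.items():
    marginal_x[x] += q
    marginal_m[m] += q

print(dict(marginal_x))  # every face of X has mass 1/6
print(dict(marginal_m))  # P(M = m) = (2m - 1)/36
```

The marginal of X is uniform on the six faces, as it must be for a fair die, while the marginal of M puts more mass on larger values.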
Expectation of functions of two variables
For a jointly absolutely continuous pair $(X, Y)$ with joint PDF $f_{X,Y}$, the expectation of a measurable function $g$ is
$$E[g(X, Y)] = \iint_{\mathbb{R}^2} g(x, y)\, f_{X,Y}(x, y)\,dx\,dy,$$
provided the integral converges absolutely. In the discrete case, replace the integral with $\sum_{(x,y) \in S} g(x, y)\, p_{X,Y}(x, y)$. This is the two-variable extension of LOTUS from Expectation.
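In the discrete case the formula is a weighted sum over the support. Here is a small sketch for the two fair dice from the earlier example; the helper name `expectation` and the two choices of g (the product and the maximum) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
joint_pmf = {(i, j): Fraction(1, 36) for i, j in product(faces, faces)}


def expectation(g, pmf):
    """E[g(X, Y)] = sum over the support of g(x, y) * p_{X,Y}(x, y)."""
    return sum(g(x, y) * q for (x, y), q in pmf.items())


e_product = expectation(lambda x, y: x * y, joint_pmf)
e_max = expectation(lambda x, y: max(x, y), joint_pmf)

print(e_product)  # 49/4
print(e_max)      # 161/36
```

Note that the expectation is computed directly from the joint PMF, without first deriving the distribution of g(X, Y).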
Knowing both marginal distributions $P_X$ and $P_Y$ does not determine the joint distribution $P_{(X,Y)}$.
Counterexample. Let $U \sim \mathrm{Uniform}(0, 1)$ and define two pairs:
- Pair 1: $(X_1, Y_1) = (U, U)$, so both variables are always equal.
- Pair 2: $(X_2, Y_2) = (U, 1 - U)$, so knowing one determines the other exactly.
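All four variables $X_1, X_2, Y_1, Y_2$ have the same $\mathrm{Uniform}(0, 1)$ marginal distribution, since $1 - U$ is again uniform on $(0, 1)$. Yet the joint distributions are completely different: $P(X_1 = Y_1) = 1$, whereas $P(X_2 + Y_2 = 1) = 1$ and therefore $P(X_2 = Y_2) = P(U = \tfrac{1}{2}) = 0$. The joint law encodes the dependence structure, which is information the marginals throw away.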
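A quick simulation makes the contrast visible. The sketch below is only a Monte Carlo illustration under an arbitrary fixed seed and sample size; it compares empirical marginal CDF values and empirical joint CDF values for the two pairs, and the helper name is hypothetical.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
n = 100_000
u = [random.random() for _ in range(n)]

pair1 = [(t, t) for t in u]        # (X1, Y1) = (U, U)
pair2 = [(t, 1.0 - t) for t in u]  # (X2, Y2) = (U, 1 - U)


def empirical_joint_cdf(pairs, x, y):
    """Fraction of sampled pairs with first coordinate <= x and second <= y."""
    return sum(1 for a, b in pairs if a <= x and b <= y) / len(pairs)


# Marginal CDF of the second coordinate at 0.3 (first argument set to 1,
# which places no restriction on the first coordinate).
marg_y1 = empirical_joint_cdf(pair1, 1.0, 0.3)
marg_y2 = empirical_joint_cdf(pair2, 1.0, 0.3)

# Joint CDFs at the point (0.5, 0.5).
joint1 = empirical_joint_cdf(pair1, 0.5, 0.5)
joint2 = empirical_joint_cdf(pair2, 0.5, 0.5)

print(marg_y1, marg_y2)  # both close to 0.3
print(joint1, joint2)    # close to 0.5 and close to 0
```

The marginal values agree up to sampling noise, while the joint CDF at (0.5, 0.5) is about one half for the first pair and essentially zero for the second, because U at most 0.5 and 1 − U at most 0.5 can hold together only when U equals 0.5.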
Summary
- The joint distribution of $(X, Y)$ is the push-forward measure $P_{(X,Y)}$ on $\mathbb{R}^2$; the joint CDF is $F_{X,Y}(x, y) = P(X \le x,\ Y \le y)$.
- For jointly discrete $(X, Y)$: the joint PMF $p_{X,Y}(x, y) = P(X = x,\ Y = y)$ sums to 1; marginals are obtained by summing over the other variable.
- For jointly absolutely continuous $(X, Y)$: the joint PDF $f_{X,Y}$ integrates to 1; marginals are obtained by integrating out the other variable: $f_X(x) = \int_{-\infty}^{+\infty} f_{X,Y}(x, y)\,dy$.
- $E[g(X, Y)] = \iint g\, f_{X,Y}\,dx\,dy$ in the absolutely continuous case, or $\sum g\, p_{X,Y}$ in the discrete case.
- The joint distribution is strictly richer than the pair of marginals: different joint laws can share identical marginals.