Independence of Random Variables

Independence is the cleanest possible relationship between two random variables: knowing the value of one tells you nothing about the other. The formal definition translates this intuition into a factorisation condition on the joint distribution, and from it the entire theory of independent sums, product expectations, and variance additivity follows.

Definition

Two random variables $X$ and $Y$ are independent if for every pair of Borel sets $B_1, B_2 \in \mathcal{B}(\mathbb{R})$:

P(X \in B_1,\, Y \in B_2) = P(X \in B_1) \cdot P(Y \in B_2).

Equivalently, the joint distribution $P_{(X,Y)}$ is the product measure $P_X \otimes P_Y$.

The rectangles $B_1 \times B_2$ form a $\pi$-system that generates $\mathcal{B}(\mathbb{R}^2)$, so two probability measures that agree on all rectangles agree on every Borel set. Checking the product condition on rectangles therefore determines the entire joint law.

In terms of the CDF

An equivalent characterisation: $X$ and $Y$ are independent if and only if

F_{X,Y}(x, y) = F_X(x) \cdot F_Y(y) \quad \text{for all } (x, y) \in \mathbb{R}^2. \tag{1}

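The forward direction is the definition applied to $B_1 = (-\infty, x]$ and $B_2 = (-\infty, y]$; the converse holds because the quadrants $(-\infty, x] \times (-\infty, y]$ form a $\pi$-system generating $\mathcal{B}(\mathbb{R}^2)$. This is often the most convenient form to check.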

Discrete case

For jointly discrete $(X, Y)$, independence is equivalent to the joint PMF factorising:

p_{X,Y}(x, y) = p_X(x) \cdot p_Y(y) \quad \text{for all } (x, y).
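The factorisation condition can be checked mechanically for a finite joint PMF. The following is a minimal sketch, assuming the joint PMF is stored as a dictionary keyed by pairs `(x, y)`; the helper names `marginals` and `is_independent` and the two example distributions are illustrative and not part of the text above. Exact rational arithmetic avoids floating-point comparison issues.

```python
from fractions import Fraction
from itertools import product


def marginals(joint):
    """Return the marginal PMFs (p_X, p_Y) of a joint PMF {(x, y): prob}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0) + p
        p_y[y] = p_y.get(y, 0) + p
    return p_x, p_y


def is_independent(joint):
    """Check p_{X,Y}(x, y) == p_X(x) * p_Y(y) for every pair of support points."""
    p_x, p_y = marginals(joint)
    return all(
        joint.get((x, y), 0) == p_x[x] * p_y[y]
        for x, y in product(p_x, p_y)
    )


# Illustrative example 1: two fair dice, every pair has probability 1/36.
two_dice = {(i, j): Fraction(1, 36) for i in range(1, 7) for j in range(1, 7)}

# Illustrative example 2: X is a fair bit and Y is a copy of X.
copied_bit = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(is_independent(two_dice))    # True
print(is_independent(copied_bit))  # False
```

Note that the check runs over the full product of the two supports, including pairs such as $(0, 1)$ in the second example where the joint PMF is zero; those are exactly the points where the factorisation fails.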

Absolutely continuous case

For jointly absolutely continuous $(X, Y)$, independence is equivalent to the joint PDF factorising:

f_{X,Y}(x, y) = f_X(x) \cdot f_Y(y) \quad \text{for almost every } (x, y). \tag{2}

Example. The uniform distribution on the unit square has $f_{X,Y}(x,y) = 1$ on $[0,1]^2$, and both marginals are uniform on $[0,1]$: $f_X(x) = 1$ and $f_Y(y) = 1$ there. Since $1 = 1 \cdot 1$ on the square, and both sides vanish outside it, the PDF factorises and $X$ and $Y$ are independent.

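Contrast this with the uniform distribution on the unit disk $\{x^2 + y^2 \leq 1\}$: the density is $1/\pi$ inside the disk and $0$ outside. The marginal of $X$ has density $f_X(x) = \frac{2}{\pi}\sqrt{1-x^2}$ for $x \in [-1,1]$, and by symmetry $f_Y$ has the same form. The product $f_X(x)\, f_Y(y)$ is therefore strictly positive on the whole open square $(-1,1)^2$, whereas the joint density vanishes on the part of that square lying outside the disk, a set of positive area. So the joint density does not factorise, and $X$ and $Y$ are not independent.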
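The disk example can be checked numerically. The sketch below compares the closed-form marginal with a midpoint Riemann sum of the joint density, then evaluates both sides of (2) at the point $(0.9, 0.9)$, which lies inside the square $[-1,1]^2$ but outside the disk. The function names, the test point, and the grid size are illustrative choices.

```python
import math


def joint_density(x, y):
    """Uniform density on the unit disk: 1/pi inside, 0 outside."""
    return 1 / math.pi if x * x + y * y <= 1 else 0.0


def marginal_density(x):
    """Closed-form marginal (2/pi) * sqrt(1 - x^2) on [-1, 1]; same for X and Y."""
    return 2 / math.pi * math.sqrt(1 - x * x) if abs(x) <= 1 else 0.0


def marginal_by_integration(x, n=20_000):
    """Midpoint Riemann sum of the joint density over y in [-1, 1]."""
    h = 2 / n
    return h * sum(joint_density(x, -1 + (k + 0.5) * h) for k in range(n))


# The numerical marginal should be close to the closed form.
approx = marginal_by_integration(0.5)
exact = marginal_density(0.5)

# A point in the square but outside the disk: 0.9^2 + 0.9^2 = 1.62 > 1.
x0, y0 = 0.9, 0.9
joint_value = joint_density(x0, y0)                            # zero
product_value = marginal_density(x0) * marginal_density(y0)    # strictly positive

print(abs(approx - exact) < 1e-3)
print(joint_value, product_value > 0)
```

A single point does not by itself contradict an almost-everywhere statement; the point stands in for the whole corner region of the square outside the disk, which has positive area.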

Independence implies factorisation of expectations

Theorem. If $X$ and $Y$ are independent and $g, h : \mathbb{R} \to \mathbb{R}$ are bounded measurable functions, then

E[g(X)\, h(Y)] = E[g(X)] \cdot E[h(Y)]. \tag{3}

Proof. Since $P_{(X,Y)} = P_X \otimes P_Y$ and the bounded function $g(x)\,h(y)$ is integrable, the Fubini–Tonelli theorem gives:

E[g(X) h(Y)] = \iint g(x)\, h(y)\, d(P_X \otimes P_Y)(x, y) = \int g(x)\, dP_X(x) \cdot \int h(y)\, dP_Y(y) = E[g(X)] \cdot E[h(Y)].

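The same argument applies to unbounded $g, h$ whenever $g(X)$ and $h(Y)$ are integrable: Tonelli's theorem applied to $|g(x)\,h(y)|$ shows that the product is integrable, and Fubini's theorem then yields (3). Taking $g = h = \operatorname{id}$ gives the special case most often used in practice: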

E[XY] = E[X] \cdot E[Y] \quad \text{when } X, Y \text{ are independent and integrable.} \tag{4}

This identity will appear in Covariance and Correlation, where it immediately implies $\operatorname{Cov}(X, Y) = 0$ for independent variables.
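Identities (3) and (4) can be verified exactly on a small discrete example. The sketch below builds a joint PMF as the product of two marginals, so independence holds by construction, and compares both sides using rational arithmetic. The marginals and the functions `g` and `h` are hypothetical choices made for illustration.

```python
from fractions import Fraction

# Hypothetical marginals; the joint PMF is their product (independent by construction).
p_x = {-1: Fraction(1, 4), 0: Fraction(1, 4), 2: Fraction(1, 2)}
p_y = {1: Fraction(1, 3), 3: Fraction(2, 3)}
joint = {(x, y): px * py for x, px in p_x.items() for y, py in p_y.items()}


def g(x):
    return x * x


def h(y):
    return y + 1


def expect(pmf, f):
    """E[f(V)] for a one-dimensional PMF {value: prob}."""
    return sum(f(v) * p for v, p in pmf.items())


# Identity (3): E[g(X) h(Y)] versus E[g(X)] * E[h(Y)].
lhs = sum(g(x) * h(y) * p for (x, y), p in joint.items())
rhs = expect(p_x, g) * expect(p_y, h)

# Identity (4): E[XY] versus E[X] * E[Y].
e_xy = sum(x * y * p for (x, y), p in joint.items())
e_x_e_y = expect(p_x, lambda v: v) * expect(p_y, lambda v: v)

print(lhs, rhs)        # 15/2 15/2
print(e_xy, e_x_e_y)   # 7/4 7/4
```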

Independence of functions

If $X$ and $Y$ are independent and $g, h : \mathbb{R} \to \mathbb{R}$ are measurable, then $g(X)$ and $h(Y)$ are also independent. The key observation is that $\{g(X) \in B_1\} = \{X \in g^{-1}(B_1)\}$, where $g^{-1}(B_1)$ is again a Borel set because $g$ is measurable. Every event about $g(X)$ is thus an event about $X$, and the product rule for $X$ and $Y$ carries over to $g(X)$ and $h(Y)$.
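Written out, the preimage argument is a three-line computation. For Borel sets $B_1, B_2$, using only the definition of independence and the measurability of $g$ and $h$:

```latex
\begin{align*}
P\bigl(g(X) \in B_1,\; h(Y) \in B_2\bigr)
  &= P\bigl(X \in g^{-1}(B_1),\; Y \in h^{-1}(B_2)\bigr) \\
  &= P\bigl(X \in g^{-1}(B_1)\bigr)\, P\bigl(Y \in h^{-1}(B_2)\bigr) \\
  &= P\bigl(g(X) \in B_1\bigr)\, P\bigl(h(Y) \in B_2\bigr).
\end{align*}
```

The middle equality is independence of $X$ and $Y$ applied to the Borel sets $g^{-1}(B_1)$ and $h^{-1}(B_2)$; the outer two are the preimage identity.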

Pairwise vs mutual independence

For a collection of three or more random variables $X_1, X_2, \ldots, X_n$ , there are two distinct notions:

Pairwise independence: every pair $X_i, X_j$ with $i \neq j$ is independent.
Mutual independence: for every non-empty subset $I \subseteq \{1, \ldots, n\}$ and every collection of Borel sets $(B_i)_{i \in I}$ :

P\!\left(\bigcap_{i \in I} \{X_i \in B_i\}\right) = \prod_{i \in I} P(X_i \in B_i).

Pairwise independence does not imply mutual independence.

Counterexample. Let $X_1, X_2$ be independent $\operatorname{Bernoulli}(\tfrac{1}{2})$ variables and set $X_3 = X_1 \oplus X_2$ (XOR, i.e. addition mod 2). Each pair is independent: for instance, $P(X_1 = a, X_3 = b) = \tfrac{1}{4}$ for every $a, b \in \{0,1\}$, matching $\tfrac{1}{2} \cdot \tfrac{1}{2}$. But the triple fails mutual independence because knowing both $X_1$ and $X_2$ determines $X_3$ completely:

P(X_1 = 0,\, X_2 = 0,\, X_3 = 0) = \tfrac{1}{4} \neq \tfrac{1}{2} \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{8}.

Unless stated otherwise, “independence” for $n \geq 3$ variables means mutual independence.
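The XOR counterexample is small enough to verify by exhaustive enumeration. The sketch below builds the joint PMF of $(X_1, X_2, X_3)$ over its four possible outcomes and tests the product rule for each pair and for the full triple. The helper names `prob` and `factorises` are illustrative, and indices are 0-based in the code. Checking single points is enough here because the variables are discrete.

```python
from fractions import Fraction
from itertools import product

# Joint PMF of (X1, X2, X3): X1, X2 independent fair bits, X3 = X1 XOR X2.
joint = {(a, b, a ^ b): Fraction(1, 4) for a, b in product((0, 1), repeat=2)}


def prob(constraints):
    """P(X_i = v for every (i, v) in constraints), with 0-based index i."""
    return sum(
        p for outcome, p in joint.items()
        if all(outcome[i] == v for i, v in constraints.items())
    )


def factorises(indices):
    """Check the product rule for the variables in `indices` at every point of {0,1}^k."""
    for values in product((0, 1), repeat=len(indices)):
        lhs = prob(dict(zip(indices, values)))
        rhs = Fraction(1)
        for i, v in zip(indices, values):
            rhs *= prob({i: v})
        if lhs != rhs:
            return False
    return True


pairwise = all(factorises(pair) for pair in [(0, 1), (0, 2), (1, 2)])
mutual = factorises((0, 1, 2))

print(pairwise)  # True
print(mutual)    # False
print(prob({0: 0, 1: 0, 2: 0}))  # 1/4, not 1/8
```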

Summary

$X$ and $Y$ are independent when $P_{(X,Y)} = P_X \otimes P_Y$, equivalently $F_{X,Y}(x,y) = F_X(x)\, F_Y(y)$.
For discrete variables: joint PMF factorises, $p_{X,Y}(x,y) = p_X(x)\, p_Y(y)$ .
For absolutely continuous variables: joint PDF factorises, $f_{X,Y}(x,y) = f_X(x)\, f_Y(y)$ a.e.
Independence implies $E[g(X)\,h(Y)] = E[g(X)]\,E[h(Y)]$ for bounded measurable $g, h$; likewise $E[XY] = E[X]\,E[Y]$ when $X$ and $Y$ are integrable.
$g(X)$ and $h(Y)$ are independent whenever $X$ and $Y$ are.
Pairwise independence does not imply mutual independence: the XOR counterexample demonstrates this gap.