Bernoulli Distribution

Tags: Probability, Random Variables, Distributions

The Bernoulli distribution describes the simplest non-trivial random variable: a single binary trial that either succeeds or fails. Several standard discrete distributions, including the Binomial, Geometric, and Negative Binomial, are defined in terms of sequences of independent Bernoulli trials.

Definition

A random variable $X$ follows a Bernoulli distribution with parameter $p \in [0, 1]$, written $X \sim \text{Bernoulli}(p)$, if it takes only the values $0$ and $1$ with

$$P(X = 1) = p, \qquad P(X = 0) = 1 - p.$$

The value $1$ is conventionally called success and $0$ is called failure. The single parameter $p$ is the success probability.
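As a minimal sketch of the definition, a Bernoulli trial can be simulated by comparing a uniform draw on $[0, 1)$ with $p$. The function name `sample_bernoulli`, the seed, and the sample size are illustrative choices, not part of the definition.

```python
import random

def sample_bernoulli(p, rng=random):
    """Draw one Bernoulli(p) value: 1 (success) with probability p, else 0 (failure)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so the event {U < p} has probability p.
    return 1 if rng.random() < p else 0

rng = random.Random(0)  # fixed seed (arbitrary choice) so the run is reproducible
draws = [sample_bernoulli(0.3, rng) for _ in range(10_000)]
print(sum(draws) / len(draws))  # observed fraction of successes, expected to be near 0.3
```

Because the uniform draw never equals 1, the edge cases behave as the definition requires: $p = 0$ always yields failure and $p = 1$ always yields success.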

PMF in closed form

The two cases can be combined into a single formula:

$$P(X = k) = p^k (1-p)^{1-k}, \qquad k \in \{0, 1\}.$$
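The single-formula PMF translates almost literally into code. The sketch below uses a hypothetical helper `bernoulli_pmf` and returns zero outside the support $\{0, 1\}$; it relies on Python evaluating `0 ** 0` as `1`, which matches the convention the formula needs at $p = 0$ and $p = 1$.

```python
import math

def bernoulli_pmf(k, p):
    """P(X = k) for X ~ Bernoulli(p); zero for any k outside {0, 1}."""
    if k not in (0, 1):
        return 0.0
    # p^k (1 - p)^(1 - k): the exponent selects p when k = 1 and (1 - p) when k = 0.
    return p**k * (1 - p) ** (1 - k)

p = 0.3
print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))
# The two probabilities sum to 1 up to floating-point rounding.
print(math.isclose(bernoulli_pmf(0, p) + bernoulli_pmf(1, p), 1.0))
```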

CDF

The cumulative distribution function is piecewise constant:

$$F(x) \coloneqq P(X \le x) = \begin{cases} 0 & x < 0, \\ 1 - p & 0 \le x < 1, \\ 1 & x \ge 1. \end{cases}$$
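The three branches of the CDF map directly onto a chain of comparisons. The following is an illustrative sketch; the name `bernoulli_cdf` is an assumption made for this example.

```python
def bernoulli_cdf(x, p):
    """F(x) = P(X <= x) for X ~ Bernoulli(p), for any real x."""
    if x < 0:
        return 0.0      # no probability mass lies below 0
    if x < 1:
        return 1.0 - p  # only the outcome 0 has been reached
    return 1.0          # both outcomes 0 and 1 are included

for x in (-0.5, 0.0, 0.5, 1.0, 2.0):
    print(x, bernoulli_cdf(x, 0.3))
```

The function jumps by $1 - p$ at $x = 0$ and by $p$ at $x = 1$, and is flat everywhere else.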

Mean

The expected value of $X$ follows directly from the definition of expectation for a discrete random variable:

$$E[X] = 0 \cdot P(X = 0) + 1 \cdot P(X = 1) = 0 \cdot (1-p) + 1 \cdot p = p.$$

So $E[X] = p$: the mean is simply the success probability.

Variance

To compute $\text{Var}(X) = E[X^2] - (E[X])^2$, first note that because $X \in \{0, 1\}$ we have $X^2 = X$, so $E[X^2] = E[X] = p$. Therefore

$$\text{Var}(X) = p - p^2 = p(1-p).$$

The variance is maximised at $p = \tfrac{1}{2}$ (maximum uncertainty) and collapses to zero at $p = 0$ or $p = 1$ (the outcome is certain).
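The mean and variance can also be checked numerically by summing over the two outcomes. This is a rough sketch: the helper `bernoulli_mean_var` and the grid of $p$ values are illustrative assumptions.

```python
import math

def bernoulli_mean_var(p):
    """Return (E[X], Var(X)) computed from the PMF over the outcomes {0, 1}."""
    pmf = {0: 1 - p, 1: p}
    mean = sum(k * w for k, w in pmf.items())              # E[X]
    second_moment = sum(k**2 * w for k, w in pmf.items())  # E[X^2]
    return mean, second_moment - mean**2                   # Var(X) = E[X^2] - (E[X])^2

mean, var = bernoulli_mean_var(0.3)
print(math.isclose(mean, 0.3), math.isclose(var, 0.3 * 0.7))

# Scan p over a grid of 0.00, 0.01, ..., 1.00 to find where the variance peaks.
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=lambda q: bernoulli_mean_var(q)[1])
print(best_p)  # grid point with the largest variance
```

On this grid the largest variance occurs at $p = 0.5$, where it equals $0.25$, and the variance is zero at both endpoints.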

Moment generating function

The moment generating function (MGF) of $X$ is

$$M(t) \coloneqq E[e^{tX}] = e^{t \cdot 0}(1-p) + e^{t \cdot 1} p = (1 - p) + p e^t.$$

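This compact expression makes the MGF of the Binomial distribution easy to derive. A Binomial$(n, p)$ variable is the sum of $n$ independent $\text{Bernoulli}(p)$ variables, and the MGF of a sum of independent variables is the product of their MGFs, so the Binomial MGF is $\big((1 - p) + p e^t\big)^n$.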
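The link between the Bernoulli and Binomial MGFs can be checked numerically. The sketch below compares the $n$-th power of the Bernoulli MGF with the Binomial MGF computed directly from the Binomial PMF $\binom{n}{k} p^k (1-p)^{n-k}$. The function names and the values of $n$, $p$, and $t$ are arbitrary choices made for illustration.

```python
import math

def bernoulli_mgf(t, p):
    """M(t) = (1 - p) + p e^t for X ~ Bernoulli(p)."""
    return (1 - p) + p * math.exp(t)

def binomial_mgf_from_pmf(t, n, p):
    """E[e^{tS}] for S ~ Binomial(n, p), summed term by term over the Binomial PMF."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) * math.exp(t * k)
        for k in range(n + 1)
    )

n, p, t = 5, 0.3, 0.7
product_form = bernoulli_mgf(t, p) ** n        # product of n identical Bernoulli MGFs
direct_form = binomial_mgf_from_pmf(t, n, p)   # definition of the MGF applied to the sum
print(math.isclose(product_form, direct_form))
```

The two values agree up to floating-point rounding, which is the binomial theorem applied to $\big((1 - p) + p e^t\big)^n$.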

Summary

  • $X \sim \text{Bernoulli}(p)$ models a single binary trial with success probability $p \in [0,1]$.
  • PMF: $P(X = k) = p^k(1-p)^{1-k}$ for $k \in \{0,1\}$.
  • Mean: $E[X] = p$.
  • Variance: $\text{Var}(X) = p(1-p)$, maximised at $p = \tfrac{1}{2}$.
  • MGF: $M(t) = (1-p) + pe^t$.
  • The Bernoulli distribution is the atomic building block for the Binomial, Geometric, and Negative Binomial distributions.