Normal Distribution

Tags: Probability, Random Variables, Distributions

Measure many naturally occurring quantities (adult heights, measurement errors, test scores, the average of many independent observations) and the same bell-shaped curve keeps appearing. The Normal distribution (also called the Gaussian distribution) is the limiting distribution of suitably standardised averages, and understanding it is essential to virtually every branch of applied mathematics and statistics.

The Gaussian integral

Before defining the Normal distribution, we need one classical result: the integral

$$I \coloneqq \int_{-\infty}^{\infty} e^{-x^2}\, dx = \sqrt{\pi}.$$

Proof via polar coordinates. Consider $I^2$:

$$I^2 = \left(\int_{-\infty}^{\infty} e^{-x^2}\, dx\right)\!\left(\int_{-\infty}^{\infty} e^{-y^2}\, dy\right) = \iint_{\mathbb{R}^2} e^{-(x^2 + y^2)}\, dx\, dy.$$

Convert to polar coordinates $x = r\cos\theta$, $y = r\sin\theta$, with $r \geq 0$ and $\theta \in [0, 2\pi)$. The Jacobian is $r$, and $x^2 + y^2 = r^2$:

$$I^2 = \int_0^{2\pi}\int_0^{\infty} e^{-r^2} r\, dr\, d\theta = 2\pi \int_0^{\infty} r e^{-r^2}\, dr.$$

Substitute $u = r^2$, $du = 2r\, dr$:

$$I^2 = 2\pi \int_0^{\infty} \frac{1}{2} e^{-u}\, du = \pi \cdot \bigl[-e^{-u}\bigr]_0^{\infty} = \pi.$$

Since $I > 0$, we conclude $I = \sqrt{\pi}$. $\square$
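
The value of the Gaussian integral can also be checked numerically. The following is a minimal sketch, not part of the proof: it truncates the integral to a finite interval and applies the composite midpoint rule. The function name `gaussian_integral` and the parameters `half_width` and `steps` are illustrative choices, not anything defined in the text.

```python
import math


def gaussian_integral(half_width: float = 10.0, steps: int = 20_000) -> float:
    """Approximate the integral of exp(-x^2) over [-half_width, half_width]
    with the composite midpoint rule.

    Assumption: the tails beyond |x| = half_width are negligible
    (exp(-100) for the default value).
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += math.exp(-x * x)
    return total * h


approx = gaussian_integral()
exact = math.sqrt(math.pi)
print(approx, exact)
```

The two printed numbers should agree to many decimal places, because the integrand is smooth and decays extremely fast.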

A useful rescaled form follows immediately: substituting $x = t/\sqrt{2}$, so that $dx = dt/\sqrt{2}$, gives

$$\int_{-\infty}^{\infty} e^{-t^2/2}\, dt = \sqrt{2}\, I = \sqrt{2\pi}.$$

Standard Normal distribution

A standard Normal random variable $Z \sim N(0, 1)$ has probability density function (PDF)

$$\varphi(x) \coloneqq \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad x \in \mathbb{R}.$$

Verification that $\varphi$ integrates to 1

$$\int_{-\infty}^{\infty} \varphi(x)\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\, dx = \frac{1}{\sqrt{2\pi}} \cdot \sqrt{2\pi} = 1,$$

using the Gaussian integral result above. The factor $1/\sqrt{2\pi}$ is precisely the normalising constant.

General Normal distribution

Let $\mu \in \mathbb{R}$ be a location parameter (the mean) and let $\sigma^2 > 0$ be the variance, whose positive square root $\sigma$ acts as a scale parameter. A random variable $X$ follows a Normal distribution with mean $\mu$ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$, if

$$X = \mu + \sigma Z \qquad \text{for some } Z \sim N(0, 1).$$

Equivalently, $X$ has PDF

$$f(x) \coloneqq \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad x \in \mathbb{R}.$$

Verification. Substituting $z = (x - \mu)/\sigma$, so that $dx = \sigma\, dz$, transforms the integral $\int_{-\infty}^\infty f(x)\,dx$ into $\int_{-\infty}^\infty \varphi(z)\,dz = 1$.

The parameter $\sigma \coloneqq \sqrt{\sigma^2}$ is the standard deviation.
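
As an illustration, the density formula can be transcribed directly into code and compared with Python's standard-library `statistics.NormalDist`. This is a sketch under stated assumptions: the helper name `normal_pdf` and the parameter values are made up for the example. Note that `NormalDist` is parameterised by the standard deviation $\sigma$, not the variance $\sigma^2$.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma2: float = 1.0) -> float:
    """Density of N(mu, sigma2) at x, transcribed from the formula for f(x)."""
    if sigma2 <= 0:
        raise ValueError("sigma2 must be positive")
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


# Illustrative parameter values (assumed, not taken from the text).
mu, sigma2 = 1.5, 4.0
reference = NormalDist(mu, math.sqrt(sigma2))  # second argument is sigma, not sigma^2

points = (-2.0, 0.0, 1.5, 3.0)
for x in points:
    print(x, normal_pdf(x, mu, sigma2), reference.pdf(x))
```

With the default arguments, `normal_pdf` reduces to the standard Normal density $\varphi$.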

Mean

$$E[X] = E[\mu + \sigma Z] = \mu + \sigma E[Z].$$

By the symmetry of $\varphi$ about zero, $E[Z] = 0$ (the integrand $x \varphi(x)$ is an odd, absolutely integrable function). Therefore

$$E[X] = \mu.$$

Variance

We need $\operatorname{Var}(Z) = E[Z^2]$ for the standard Normal (since $E[Z] = 0$). Apply integration by parts with $u = x$ and $dv = x e^{-x^2/2}\,dx$, so $v = -e^{-x^2/2}$:

$$E[Z^2] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} x^2 e^{-x^2/2}\, dx = \frac{1}{\sqrt{2\pi}} \left(\Bigl[-x e^{-x^2/2}\Bigr]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-x^2/2}\, dx\right).$$

The boundary term vanishes since $x e^{-x^2/2} \to 0$ as $|x| \to \infty$. The remaining integral equals $\sqrt{2\pi}$, so

$$E[Z^2] = \frac{1}{\sqrt{2\pi}} \cdot \sqrt{2\pi} = 1.$$

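Hence $\operatorname{Var}(Z) = 1$. For the general case, use $X = \mu + \sigma Z$ together with the fact that adding a constant leaves the variance unchanged while scaling by $\sigma$ multiplies it by $\sigma^2$:

$$\operatorname{Var}(X) = \operatorname{Var}(\mu + \sigma Z) = \sigma^2 \operatorname{Var}(Z) = \sigma^2.$$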
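
The mean and variance results can be sanity-checked by integrating against the density numerically. The sketch below is only a check, not a replacement for the derivation: it uses a truncated midpoint rule, and the helper names `normal_pdf` and `moment_integral` as well as the parameter values are assumptions chosen for illustration.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2); sigma is the standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def moment_integral(g, mu: float, sigma: float,
                    half_width_sds: float = 12.0, steps: int = 40_000) -> float:
    """Midpoint-rule approximation of the integral of g(x) * f(x) dx,
    truncated to mu +/- half_width_sds standard deviations."""
    lo = mu - half_width_sds * sigma
    h = 2.0 * half_width_sds * sigma / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += g(x) * normal_pdf(x, mu, sigma)
    return total * h


# Illustrative parameters (assumed, not taken from the text).
mu, sigma = -0.7, 2.0
mean = moment_integral(lambda x: x, mu, sigma)
variance = moment_integral(lambda x: (x - mean) ** 2, mu, sigma)
print(mean, variance)
```

The printed values should be very close to $\mu$ and $\sigma^2$ respectively.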

Affine stability

Theorem. If $X \sim N(\mu, \sigma^2)$ and $a, b \in \mathbb{R}$ with $a \neq 0$, then

$$aX + b \sim N(a\mu + b,\; a^2\sigma^2).$$

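Proof. Write $X = \mu + \sigma Z$ with $Z \sim N(0,1)$. Then

$$aX + b = a(\mu + \sigma Z) + b = (a\mu + b) + (a\sigma) Z.$$

If $a > 0$, this is of the form $\mu' + \sigma' Z$ with $\mu' = a\mu + b$ and $\sigma' = a\sigma > 0$. If $a < 0$, write $(a\sigma) Z = (|a|\sigma)(-Z)$ and note that $-Z \sim N(0,1)$ by the symmetry of $\varphi$, so the same form holds with $\sigma' = |a|\sigma$. In either case $(\sigma')^2 = a^2\sigma^2$, so $aX + b \sim N(a\mu + b,\, a^2\sigma^2)$. $\square$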

Corollary. Any $X \sim N(\mu, \sigma^2)$ can be standardised: taking $a = 1/\sigma$ and $b = -\mu/\sigma$ gives $(X - \mu)/\sigma \sim N(0, 1)$.
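
The theorem and its corollary can be illustrated with Python's standard-library `statistics.NormalDist`, which supports adding and multiplying by constants. This is a sketch with made-up parameter values, and it demonstrates the statement rather than proving it. The coefficient `a` is deliberately negative, which is the case where the new standard deviation is $|a|\sigma$ rather than $a\sigma$.

```python
from statistics import NormalDist

# Illustrative parameters (assumed, not taken from the text).
X = NormalDist(mu=3.0, sigma=2.0)
a, b = -1.5, 4.0  # a < 0 on purpose

# NormalDist arithmetic with constants: scale, then shift.
Y = X * a + b
print(Y.mean, Y.stdev, Y.variance)

# Independent check via the CDF of X: since a < 0, the event {aX + b <= y}
# is the event {X >= (y - b) / a}.
y = 1.0
direct = 1.0 - X.cdf((y - b) / a)
print(direct, Y.cdf(y))

# Standardisation: (X - mu) / sigma should be the standard Normal.
Z = (X - X.mean) / X.stdev
print(Z.mean, Z.stdev)
```

Here `Y.variance` equals $a^2\sigma^2$ even though `a` is negative, matching the theorem.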

Central Limit Theorem

The Normal distribution is not just one of many distributions: it is the limit in distribution of standardised sums of i.i.d. random variables with finite variance. The Central Limit Theorem (CLT) makes this precise.

Theorem (CLT). Let $X_1, X_2, \ldots$ be independent and identically distributed (i.i.d.) random variables with mean $\mu$ and finite variance $\sigma^2 > 0$. Define the standardised sum

$$Z_n \coloneqq \frac{(X_1 + X_2 + \cdots + X_n) - n\mu}{\sigma\sqrt{n}}.$$

Then $Z_n \xrightarrow{d} N(0, 1)$ as $n \to \infty$: for every $z \in \mathbb{R}$,

$$\lim_{n \to \infty} P(Z_n \leq z) = \Phi(z) \coloneqq \int_{-\infty}^{z} \varphi(t)\, dt.$$

Why this makes the Normal ubiquitous. An observed quantity that is the aggregate effect of many small, independent contributions (measurement noise, biological traits, financial returns) is often well approximated by a Normal distribution, regardless of the shape of the individual contributing distributions, provided those contributions have finite variance and no single one dominates. The CLT is the mathematical explanation for the bell curve’s prevalence throughout science.
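
A small simulation can make the CLT statement concrete. The sketch below is an illustration under assumed settings, not a proof: it draws i.i.d. Exponential(1) variables (a strongly skewed distribution with mean 1 and variance 1), forms the standardised sum $Z_n$, and compares the empirical distribution function with $\Phi$ at a few points. The sample sizes, the seed, and the helper name `standardised_sum` are arbitrary choices.

```python
import math
import random
from statistics import NormalDist


def standardised_sum(rng: random.Random, n: int) -> float:
    """Z_n for n i.i.d. Exponential(1) draws, which have mean 1 and variance 1."""
    mu, sigma = 1.0, 1.0
    total = sum(rng.expovariate(1.0) for _ in range(n))
    return (total - n * mu) / (sigma * math.sqrt(n))


rng = random.Random(12345)  # fixed seed so the run is repeatable
n, trials = 100, 10_000     # illustrative sizes (assumed)
samples = [standardised_sum(rng, n) for _ in range(trials)]

std_normal = NormalDist()   # standard Normal; std_normal.cdf plays the role of Phi
check_points = (-1.0, 0.0, 1.0, 2.0)
for z in check_points:
    empirical = sum(s <= z for s in samples) / trials
    print(z, round(empirical, 3), round(std_normal.cdf(z), 3))
```

The empirical and theoretical columns should be close but not identical: at finite $n$ the skewness of the exponential distribution still leaves a small visible discrepancy, which shrinks as `n` grows.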

Summary

  • $Z \sim N(0,1)$ has PDF $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$; the normalising constant $1/\sqrt{2\pi}$ follows from the Gaussian integral $\int_{-\infty}^\infty e^{-t^2/2}\,dt = \sqrt{2\pi}$.
  • $X \sim N(\mu, \sigma^2)$ has PDF $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\!\bigl(-(x-\mu)^2/(2\sigma^2)\bigr)$ and satisfies $X = \mu + \sigma Z$ with $Z \sim N(0,1)$.
  • Mean: $E[X] = \mu$ (by symmetry of the standard Normal).
  • Variance: $\operatorname{Var}(X) = \sigma^2$ (derived via integration by parts).
  • Affine stability: $aX + b \sim N(a\mu + b, a^2\sigma^2)$; in particular $(X-\mu)/\sigma \sim N(0,1)$.
  • Central Limit Theorem: the standardised sum of $n$ i.i.d. variables with finite, positive variance converges in distribution to $N(0,1)$ as $n \to \infty$, explaining the Normal’s ubiquity.