Gamma Distribution

Tags: Probability, Random Variables, Distributions

Suppose events occur in a Poisson process at rate $\lambda$ and you want to know how long until the $\alpha$-th event arrives. The waiting time for the first event is Exponential; the waiting time for the $\alpha$-th event follows the Gamma distribution, a two-parameter family that generalises the Exponential to arbitrary numbers of events.

The gamma function

Before defining the distribution, recall the gamma function $\Gamma : (0, \infty) \to (0, \infty)$:

$$\Gamma(\alpha) \coloneqq \int_0^{\infty} t^{\alpha - 1} e^{-t}\, dt.$$

Three properties make $\Gamma$ the right normalising constant:

Recursion. Integration by parts gives $\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha)$ for all $\alpha > 0$.

Integer values. Combined with $\Gamma(1) = \int_0^\infty e^{-t}\,dt = 1$, the recursion yields

$$\Gamma(n) = (n-1)! \quad \text{for every positive integer } n.$$

Half-integer. A celebrated result gives $\Gamma(1/2) = \sqrt{\pi}$, an example of how $\Gamma$ extends the factorial to non-integer arguments.
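
The three properties above can be spot-checked numerically. The following is a minimal sketch using `math.gamma` from Python's standard library; the test points and the tolerance are arbitrary choices for illustration, and a floating-point check is of course not a proof.

```python
import math

# Recursion: Gamma(a + 1) == a * Gamma(a) at a few non-integer points a > 0.
for a in (0.5, 1.7, 4.2):
    assert math.isclose(math.gamma(a + 1), a * math.gamma(a), rel_tol=1e-9)

# Integer values: Gamma(n) == (n - 1)! for n = 1, ..., 10.
for n in range(1, 11):
    assert math.isclose(math.gamma(n), math.factorial(n - 1), rel_tol=1e-9)

# Half-integer: Gamma(1/2) should agree with sqrt(pi) up to rounding.
gamma_half = math.gamma(0.5)
sqrt_pi = math.sqrt(math.pi)
print(gamma_half, sqrt_pi)
```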

Definition

Let $\alpha > 0$ be the shape parameter and $\lambda > 0$ be the rate parameter. A random variable $X$ follows a Gamma distribution, written $X \sim \operatorname{Gamma}(\alpha, \lambda)$, if its probability density function is

$$f(x) \coloneqq \frac{\lambda^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda x}, \quad x > 0,$$

with $f(x) = 0$ for $x \le 0$.

When $\alpha = 1$ this reduces to $f(x) = \lambda e^{-\lambda x}$, recovering $\operatorname{Exp}(\lambda)$. Larger $\alpha$ shifts probability mass toward larger values, reflecting a longer wait for more events.
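
As an illustration, the density can be written directly from the formula. This is a sketch only: the function names `gamma_pdf` and `exp_pdf` and the sample values are hypothetical, chosen here to check the reduction to the Exponential at shape 1.

```python
import math

def gamma_pdf(x, alpha, lam):
    """Density of Gamma(alpha, lam) with shape alpha and rate lam; 0 for x <= 0."""
    if alpha <= 0 or lam <= 0:
        raise ValueError("alpha and lam must be positive")
    if x <= 0:
        return 0.0
    return lam**alpha / math.gamma(alpha) * x**(alpha - 1) * math.exp(-lam * x)

def exp_pdf(x, lam):
    """Density of Exp(lam) with rate lam; 0 for x <= 0."""
    return lam * math.exp(-lam * x) if x > 0 else 0.0

# With alpha = 1 the Gamma density should coincide with the Exponential density.
lam = 2.0  # illustrative rate
for x in (0.1, 0.5, 1.0, 3.0):
    assert math.isclose(gamma_pdf(x, 1.0, lam), exp_pdf(x, lam), rel_tol=1e-12)
```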

Verification that $f$ is a valid PDF

Non-negativity is immediate. For the integral, substitute $u \coloneqq \lambda x$ (so $x = u/\lambda$, $dx = du/\lambda$):

$$\int_0^{\infty} \frac{\lambda^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda x}\, dx = \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^{\infty} \left(\frac{u}{\lambda}\right)^{\alpha-1} e^{-u}\, \frac{du}{\lambda} = \frac{1}{\Gamma(\alpha)} \int_0^{\infty} u^{\alpha-1} e^{-u}\, du = \frac{\Gamma(\alpha)}{\Gamma(\alpha)} = 1.$$
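
The normalisation can also be checked numerically. The sketch below integrates the density with a simple midpoint rule; the parameter values, the truncation point 40 and the step count are assumptions made for this illustration, and the helper names are hypothetical.

```python
import math

def gamma_pdf(x, alpha, lam):
    """Density of Gamma(alpha, lam) at x > 0, as defined in the text."""
    return lam**alpha / math.gamma(alpha) * x**(alpha - 1) * math.exp(-lam * x)

def midpoint_integral(f, a, b, steps):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

alpha, lam = 2.5, 1.5  # illustrative shape and rate
# The tail beyond x = 40 is negligible here because of the factor exp(-lam * x).
total = midpoint_integral(lambda x: gamma_pdf(x, alpha, lam), 0.0, 40.0, 200_000)
print(total)  # should be very close to 1
```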

Sum of exponential random variables

The most concrete way to understand the Gamma distribution is through its relationship to the Exponential. For integer $\alpha = n$:

Theorem. If $X_1, X_2, \ldots, X_n$ are independent with $X_i \sim \operatorname{Exp}(\lambda)$, then

$$S_n \coloneqq X_1 + X_2 + \cdots + X_n \sim \operatorname{Gamma}(n, \lambda).$$

Proof via moment generating functions. The MGF of $X_i \sim \operatorname{Exp}(\lambda)$ is

$$M_{X_i}(t) = E[e^{tX_i}] = \frac{\lambda}{\lambda - t}, \quad t < \lambda.$$

Since the $X_i$ are independent, the MGF of their sum factors:

$$M_{S_n}(t) = \prod_{i=1}^n M_{X_i}(t) = \left(\frac{\lambda}{\lambda - t}\right)^n.$$

One can verify that $\operatorname{Gamma}(n, \lambda)$ has the same MGF (by computing $E[e^{tX}]$ via the substitution $u = (\lambda - t)x$), and since the MGF uniquely determines the distribution, $S_n \sim \operatorname{Gamma}(n, \lambda)$. $\square$
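
For completeness, the verification mentioned in the proof can be written out as follows. This is a sketch of the standard computation for $X \sim \operatorname{Gamma}(n, \lambda)$ and $t < \lambda$, using the substitution $u = (\lambda - t)x$, so that $dx = du/(\lambda - t)$:

```latex
\begin{align*}
M_X(t) = E[e^{tX}]
  &= \int_0^{\infty} e^{tx}\, \frac{\lambda^n}{\Gamma(n)}\, x^{n-1} e^{-\lambda x}\, dx
   = \frac{\lambda^n}{\Gamma(n)} \int_0^{\infty} x^{n-1} e^{-(\lambda - t)x}\, dx \\
  &= \frac{\lambda^n}{\Gamma(n)} \cdot \frac{1}{(\lambda - t)^n}
     \int_0^{\infty} u^{n-1} e^{-u}\, du
   = \frac{\lambda^n}{\Gamma(n)} \cdot \frac{\Gamma(n)}{(\lambda - t)^n}
   = \left(\frac{\lambda}{\lambda - t}\right)^n .
\end{align*}
```

The condition $t < \lambda$ is what makes the integral converge. Nothing in the computation uses that $n$ is an integer, so the same steps go through with a general shape $\alpha > 0$ in place of $n$.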

In the Poisson process interpretation, $S_n$ is the time of the $n$-th arrival, and the theorem confirms it is $\operatorname{Gamma}(n, \lambda)$.
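
The theorem can be illustrated by simulation. The sketch below adds up $n$ Exponential waiting times many times over and compares the empirical fraction of sums below a threshold with a numerical integral of the Gamma density. The values of `n`, `lam`, the thresholds and the trial count are illustrative assumptions, and the agreement is only up to Monte Carlo noise.

```python
import math
import random

def gamma_pdf(x, shape, rate):
    """Density of Gamma(shape, rate) at x > 0, as defined in the text."""
    return rate**shape / math.gamma(shape) * x**(shape - 1) * math.exp(-rate * x)

def gamma_cdf_numeric(x, shape, rate, steps=20_000):
    """Approximate P(X <= x) by a midpoint-rule integral of the density over (0, x]."""
    h = x / steps
    return h * sum(gamma_pdf((i + 0.5) * h, shape, rate) for i in range(steps))

rng = random.Random(0)            # fixed seed so the run is repeatable
n, lam, trials = 4, 2.0, 50_000   # illustrative values, not from the text

# Each trial adds up n independent Exp(lam) waiting times.
sums = [sum(rng.expovariate(lam) for _ in range(n)) for _ in range(trials)]

results = []
for x in (1.0, 2.0, 3.0):
    empirical = sum(s <= x for s in sums) / trials
    results.append((x, empirical, gamma_cdf_numeric(x, n, lam)))

for x, empirical, integrated in results:
    print(f"x={x}: empirical {empirical:.4f}, integrated density {integrated:.4f}")
```

Note that `random.expovariate` takes the rate $\lambda$ (not the mean) as its argument, which matches the parameterisation used here.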

Mean

The mean of $X \sim \operatorname{Gamma}(\alpha, \lambda)$ can be read off from the recursion of $\Gamma$:

$$E[X] = \int_0^{\infty} x \cdot \frac{\lambda^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\lambda x}\, dx = \frac{\lambda^\alpha}{\Gamma(\alpha)} \int_0^{\infty} x^{\alpha} e^{-\lambda x}\, dx.$$

Substitute $u = \lambda x$ to obtain $\int_0^\infty x^\alpha e^{-\lambda x}\,dx = \Gamma(\alpha+1)/\lambda^{\alpha+1}$, so

$$E[X] = \frac{\lambda^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+1)}{\lambda^{\alpha+1}} = \frac{\Gamma(\alpha+1)}{\lambda\,\Gamma(\alpha)} = \frac{\alpha\,\Gamma(\alpha)}{\lambda\,\Gamma(\alpha)} = \frac{\alpha}{\lambda}.$$

For integer $\alpha = n$ this matches the intuition: the expected total of $n$ independent $\operatorname{Exp}(\lambda)$ waiting times is $n \cdot (1/\lambda) = n/\lambda$.

Variance

Similarly, compute $E[X^2]$ by the same substitution, using $\Gamma(\alpha+2) = (\alpha+1)\,\alpha\,\Gamma(\alpha)$:

$$E[X^2] = \frac{\lambda^\alpha}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha+2)}{\lambda^{\alpha+2}} = \frac{\alpha(\alpha+1)}{\lambda^2}.$$

Therefore

$$\operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{\alpha(\alpha+1)}{\lambda^2} - \frac{\alpha^2}{\lambda^2} = \frac{\alpha}{\lambda^2}.$$
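
The mean and variance formulas can be sanity-checked by sampling. This sketch uses `random.gammavariate` from Python's standard library, which is parameterised by shape and scale, so the scale passed in is $1/\lambda$. The parameter values, seed and sample size are illustrative assumptions.

```python
import random

rng = random.Random(1)                  # fixed seed so the run is repeatable
alpha, lam, trials = 2.5, 1.5, 100_000  # illustrative shape, rate, sample size

# random.gammavariate takes (shape, scale); the scale is 1 / rate.
samples = [rng.gammavariate(alpha, 1.0 / lam) for _ in range(trials)]

sample_mean = sum(samples) / trials
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (trials - 1)

print(f"mean: sample {sample_mean:.4f}, formula {alpha / lam:.4f}")
print(f"variance: sample {sample_var:.4f}, formula {alpha / lam**2:.4f}")
```

The sample values should land close to $\alpha/\lambda$ and $\alpha/\lambda^2$, with the gap shrinking as the sample size grows.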

Additive property

A consequence of the MGF argument above is the additive property: if $X_1 \sim \operatorname{Gamma}(\alpha_1, \lambda)$ and $X_2 \sim \operatorname{Gamma}(\alpha_2, \lambda)$ are independent with the same rate, then

$$X_1 + X_2 \sim \operatorname{Gamma}(\alpha_1 + \alpha_2,\, \lambda).$$

The shapes add while the rate is preserved. This is consistent with the sum-of-exponentials interpretation: for integer shapes, pooling $\alpha_1$ and $\alpha_2$ i.i.d. $\operatorname{Exp}(\lambda)$ waiting times yields a $\operatorname{Gamma}(\alpha_1 + \alpha_2, \lambda)$ waiting time. Note that the additive property fails if the two rates differ: the sum is then no longer Gamma-distributed.
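
The MGF computation behind the additive property can be sketched as follows. It assumes the general-shape form of the Gamma MGF, $M_X(t) = (\lambda/(\lambda - t))^{\alpha}$ for $t < \lambda$, which is the formula from the proof above with $n$ replaced by $\alpha$. The symbols $\lambda_1, \lambda_2$ in the second line are introduced here only to show the unequal-rate case.

```latex
\begin{align*}
M_{X_1 + X_2}(t) &= M_{X_1}(t)\, M_{X_2}(t)
  = \left(\frac{\lambda}{\lambda - t}\right)^{\alpha_1}
    \left(\frac{\lambda}{\lambda - t}\right)^{\alpha_2}
  = \left(\frac{\lambda}{\lambda - t}\right)^{\alpha_1 + \alpha_2},
  \quad t < \lambda, \\[4pt]
\text{whereas for } \lambda_1 \neq \lambda_2: \quad
M_{X_1 + X_2}(t) &=
    \left(\frac{\lambda_1}{\lambda_1 - t}\right)^{\alpha_1}
    \left(\frac{\lambda_2}{\lambda_2 - t}\right)^{\alpha_2},
  \quad t < \min(\lambda_1, \lambda_2).
\end{align*}
```

The first line is the MGF of $\operatorname{Gamma}(\alpha_1 + \alpha_2, \lambda)$. The second cannot be written as $(\lambda/(\lambda - t))^{\alpha}$ for any single pair $(\alpha, \lambda)$, which is why the property needs a common rate.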

Summary

  • The gamma function $\Gamma(\alpha) = \int_0^\infty t^{\alpha-1} e^{-t}\,dt$ satisfies $\Gamma(\alpha+1) = \alpha\,\Gamma(\alpha)$ and $\Gamma(n) = (n-1)!$ for positive integers $n$.
  • $X \sim \operatorname{Gamma}(\alpha, \lambda)$ has PDF $f(x) = \lambda^\alpha x^{\alpha-1} e^{-\lambda x} / \Gamma(\alpha)$ for $x > 0$, with shape $\alpha > 0$ and rate $\lambda > 0$.
  • For integer $\alpha = n$: a sum of $n$ independent $\operatorname{Exp}(\lambda)$ variables is $\operatorname{Gamma}(n, \lambda)$.
  • Mean: $E[X] = \alpha/\lambda$.
  • Variance: $\operatorname{Var}(X) = \alpha/\lambda^2$.
  • Additive property: independent $\operatorname{Gamma}(\alpha_1, \lambda)$ and $\operatorname{Gamma}(\alpha_2, \lambda)$ sum to $\operatorname{Gamma}(\alpha_1 + \alpha_2, \lambda)$.