A generating function packages infinitely many numbers into a single analytic object. The moment generating function (MGF) packages all the moments of a random variable into one power series. Because many useful distributions have simple MGFs, and because the MGF of a sum of independent variables is the product of their MGFs, the MGF is a powerful tool for computing distributions of sums and proving limit theorems.
Definition
The moment generating function of a random variable X is
$$M_X(t) := E\left[e^{tX}\right], \qquad t \in \mathbb{R},$$
defined for all t in an open interval around 0 where the expectation is finite. For a discrete distribution:
$$M_X(t) = \sum_{k} e^{t x_k}\, p_k.$$
For an absolutely continuous distribution:
$$M_X(t) = \int_{-\infty}^{+\infty} e^{tx} f_X(x)\, dx.$$
Both are instances of the two-sided Laplace transform of the distribution, evaluated at $s = -t$.
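As a rough numerical check of the integral formula, the sketch below approximates $\int e^{tx} f_X(x)\,dx$ with a midpoint rule for an exponential density $f(x) = \lambda e^{-\lambda x}$ on $x \ge 0$ and compares it with the standard closed form $\lambda/(\lambda - t)$. The helper name `mgf_by_quadrature`, the truncation point 60, and the step count are illustrative assumptions, not part of the text.

```python
import math

def mgf_by_quadrature(t, density, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(t*x) * density(x)."""
    width = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * width
        total += math.exp(t * x) * density(x)
    return total * width

lam = 2.0   # rate of the exponential distribution (illustrative value)
t = 0.5     # must satisfy t < lam for the integral to converge

def exp_density(x):
    return lam * math.exp(-lam * x)

# The upper limit 60 truncates a tail that is negligible for these values.
approx = mgf_by_quadrature(t, exp_density, 0.0, 60.0)
exact = lam / (lam - t)
print(approx, exact)
```

For $t \ge \lambda$ the integrand no longer decays, so the truncated integral would keep growing with the upper limit instead of settling.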
Recovering moments from the MGF
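Expand $e^{tX}$ as a power series:
$$e^{tX} = \sum_{k=0}^{\infty} \frac{(tX)^k}{k!} = \sum_{k=0}^{\infty} \frac{X^k}{k!}\, t^k.$$
Taking expectations (justified by dominated convergence when $M_X(t)$ is finite near 0):
$$M_X(t) = \sum_{k=0}^{\infty} \frac{E[X^k]}{k!}\, t^k = \sum_{k=0}^{\infty} \frac{\mu'_k}{k!}\, t^k. \tag{1}$$
This shows that $M_X(t)$ is a power series in $t$ whose coefficients encode the raw moments $\mu'_k = E[X^k]$. Differentiating $k$ times and evaluating at $t = 0$:
$$M_X^{(k)}(0) = E[X^k] = \mu'_k. \tag{2}$$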
So the k-th moment is the k-th derivative of the MGF at zero. This is the defining property: the MGF generates the moments.
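The derivative property can be checked numerically. The sketch below builds the MGF of a fair six-sided die (an example chosen here for illustration) from the discrete formula, approximates its first two derivatives at 0 by central differences, and compares them with $E[X]$ and $E[X^2]$ computed directly. The step size `h` is an assumption; finite differences give only approximate derivatives.

```python
import math

def mgf_discrete(t, values, probs):
    """M_X(t) = sum_k exp(t * x_k) * p_k for a finite discrete distribution."""
    return sum(p * math.exp(t * x) for x, p in zip(values, probs))

# Fair six-sided die (illustrative choice).
values = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

def M(t):
    return mgf_discrete(t, values, probs)

h = 1e-4  # finite-difference step (assumed; trades truncation against rounding error)

# Central-difference approximations to M'(0) and M''(0).
first_derivative = (M(h) - M(-h)) / (2 * h)
second_derivative = (M(h) - 2 * M(0.0) + M(-h)) / h**2

# Moments computed directly from the definition.
mean_direct = sum(p * x for x, p in zip(values, probs))
second_moment_direct = sum(p * x**2 for x, p in zip(values, probs))

print(first_derivative, mean_direct)
print(second_derivative, second_moment_direct)
```

The two printed pairs should agree to several decimal places. A symbolic differentiation tool would give exact values, at the cost of an extra dependency.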
MGFs of standard distributions
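| Distribution | $M_X(t)$ | Domain |
|---|---|---|
| Bernoulli($p$) | $(1-p) + p e^{t}$ | $t \in \mathbb{R}$ |
| Bin($n, p$) | $(1 - p + p e^{t})^{n}$ | $t \in \mathbb{R}$ |
| Poisson($\lambda$) | $\exp\big(\lambda(e^{t} - 1)\big)$ | $t \in \mathbb{R}$ |
| Exp($\lambda$) | $\dfrac{\lambda}{\lambda - t}$ | $t < \lambda$ |
| Gamma($\alpha, \lambda$) | $\left(\dfrac{\lambda}{\lambda - t}\right)^{\alpha}$ | $t < \lambda$ |
| $N(\mu, \sigma^2)$ | $\exp\left(\mu t + \dfrac{\sigma^2 t^2}{2}\right)$ | $t \in \mathbb{R}$ |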
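Two of the discrete entries can be sanity-checked by summing $e^{tk}\,P(X = k)$ directly and comparing with the closed form. This is a minimal sketch: the parameter values and the truncation of the Poisson sum at 200 terms are assumptions made for illustration.

```python
import math

def binomial_mgf_direct(t, n, p):
    """Sum exp(t*k) * P(X = k) over the full support of Bin(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) * math.exp(t * k)
        for k in range(n + 1)
    )

def binomial_mgf_closed(t, n, p):
    return (1 - p + p * math.exp(t)) ** n

def poisson_mgf_direct(t, lam, terms=200):
    """Truncated sum of exp(t*k) * P(X = k); the pmf is updated iteratively."""
    total = 0.0
    pmf = math.exp(-lam)          # P(X = 0)
    for k in range(terms):
        total += pmf * math.exp(t * k)
        pmf *= lam / (k + 1)      # P(X = k + 1) from P(X = k)
    return total

def poisson_mgf_closed(t, lam):
    return math.exp(lam * (math.exp(t) - 1))

t = 0.5
binomial_gap = abs(binomial_mgf_direct(t, 10, 0.3) - binomial_mgf_closed(t, 10, 0.3))
poisson_gap = abs(poisson_mgf_direct(t, 3.0) - poisson_mgf_closed(t, 3.0))
print(binomial_gap, poisson_gap)
```

Both gaps should be at the level of floating-point rounding.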
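Derivation for $N(0,1)$. With density $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, completing the square $tx - \frac{x^2}{2} = -\frac{(x-t)^2}{2} + \frac{t^2}{2}$ gives
$$M_X(t) = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\, e^{tx - x^2/2}\, dx = e^{t^2/2} \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(x-t)^2/2}\, dx = e^{t^2/2},$$
because the remaining integrand is the $N(t,1)$ density, which integrates to 1.
Sums of independent variables. If $X$ and $Y$ are independent, then $e^{tX}$ and $e^{tY}$ are independent, so
$$M_{X+Y}(t) = E\left[e^{tX} e^{tY}\right] = M_X(t)\, M_Y(t). \tag{3}$$
Applications. Property (3) makes it easy to identify the distribution of a sum of independent variables by comparing MGFs: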
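$X \sim \mathrm{Bin}(m,p)$, $Y \sim \mathrm{Bin}(n,p)$ independent: $M_{X+Y}(t) = (1 - p + p e^{t})^{m+n}$, so $X + Y \sim \mathrm{Bin}(m+n, p)$.
$X \sim \mathrm{Poisson}(\lambda)$, $Y \sim \mathrm{Poisson}(\mu)$ independent: $M_{X+Y}(t) = e^{(\lambda+\mu)(e^{t}-1)}$, so $X + Y \sim \mathrm{Poisson}(\lambda + \mu)$.
$X \sim N(\mu_1, \sigma_1^2)$, $Y \sim N(\mu_2, \sigma_2^2)$ independent: $M_{X+Y}(t) = e^{(\mu_1+\mu_2)t + (\sigma_1^2+\sigma_2^2)t^2/2}$, so $X + Y \sim N(\mu_1+\mu_2,\ \sigma_1^2+\sigma_2^2)$.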
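The binomial case can be cross-checked without MGFs. The distribution of a sum of independent integer-valued variables is the convolution of their probability mass functions, so convolving the Bin($m,p$) and Bin($n,p$) pmfs should reproduce the Bin($m+n,p$) pmf. The sketch below does this for illustrative parameter values; the helper names are hypothetical.

```python
import math

def binomial_pmf(n, p):
    """List of P(X = k) for k = 0..n with X ~ Bin(n, p)."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def convolve(a, b):
    """pmf of X + Y for independent X, Y supported on 0..len-1."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

m, n, p = 4, 6, 0.3   # illustrative parameters
pmf_sum = convolve(binomial_pmf(m, p), binomial_pmf(n, p))
pmf_target = binomial_pmf(m + n, p)

# Largest pointwise difference between the two pmfs.
max_gap = max(abs(a - b) for a, b in zip(pmf_sum, pmf_target))
print(max_gap)
```

The MGF argument reaches the same conclusion in one line, which is the practical appeal of the multiplicative property.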
Uniqueness: the MGF determines the distribution
Theorem. If $M_X(t)$ is finite for all $t$ in some open interval $(-\delta, \delta)$ with $\delta > 0$, then $M_X$ uniquely determines the distribution of $X$.
More precisely: if $M_X(t) = M_Y(t)$ for all $t \in (-\delta, \delta)$, then $P_X = P_Y$ (the distributions are identical).
This is why you can safely identify distributions by their MGFs: the equality $M_{X+Y} = M_{N(\mu_1+\mu_2,\ \sigma_1^2+\sigma_2^2)}$ in the calculation above really does imply that $X + Y$ is normal.
The existence caveat. The MGF need not exist: $E[e^{tX}]$ may be $+\infty$ for all $t \neq 0$. The Cauchy distribution has no MGF. The log-normal distribution has an MGF that is $+\infty$ for every $t > 0$. When the MGF does not exist in a neighbourhood of 0, the moment sequence may not determine the distribution (the log-normal is the classic example). In such cases, the characteristic function $\varphi_X(t) = E[e^{itX}]$ (with $i = \sqrt{-1}$) always exists and always determines the distribution, making it the more general tool for theoretical work.
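To see why the log-normal MGF diverges for $t > 0$, write $X = e^{Y}$ with $Y \sim N(0,1)$ (the standard log-normal, assumed here for simplicity). Then $E[e^{tX}]$ is the integral over $y$ of $\exp(t e^{y} - y^2/2)/\sqrt{2\pi}$. The sketch below evaluates the logarithm of that integrand, working in logs to avoid overflow, and shows that for a small positive $t$ it grows without bound as $y$ increases, so the integrand cannot be integrable.

```python
import math

def log_integrand(t, y):
    """ln of exp(t * e^y) times the N(0,1) density at y, where X = e^Y."""
    return t * math.exp(y) - 0.5 * y * y - 0.5 * math.log(2 * math.pi)

t = 0.01  # any t > 0 gives the same qualitative picture
ys = [5, 10, 15, 20]
values = [log_integrand(t, y) for y in ys]
for y, v in zip(ys, values):
    print(y, v)

# For t < 0 the same quantity tends to minus infinity instead.
negative_case = log_integrand(-0.01, 20)
print(negative_case)
```

The term $t e^{y}$ eventually dominates $y^2/2$ for every $t > 0$, however small, which is the reason no neighbourhood of 0 works.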
Cumulants
The cumulant generating function is the logarithm of the MGF:
$$K_X(t) := \ln M_X(t) = \ln E\left[e^{tX}\right].$$
Its derivatives at 0 are the cumulants $\kappa_k := K_X^{(k)}(0)$. The first two cumulants are the mean and variance:
$$\kappa_1 = E[X] = \mu, \qquad \kappa_2 = \mathrm{Var}(X) = \sigma^2.$$
For independent $X, Y$: $K_{X+Y}(t) = K_X(t) + K_Y(t)$, so cumulants add under independence, just like variances. This additivity makes cumulants particularly convenient when working with sums of independent variables.
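A small numerical sketch of additivity, using a fair die and a biased coin as illustrative independent variables: the cumulant generating function of their sum, computed from the convolved pmf, should equal the sum of the individual cumulant generating functions, and the second cumulant of the sum should equal the sum of the variances. The helper names and the finite-difference step `h` are assumptions.

```python
import math

def cgf(t, pmf):
    """K(t) = ln E[exp(tX)] for a finite pmf given as {value: probability}."""
    return math.log(sum(p * math.exp(t * x) for x, p in pmf.items()))

def pmf_of_sum(pmf_x, pmf_y):
    """pmf of X + Y for independent X and Y."""
    out = {}
    for x, px in pmf_x.items():
        for y, py in pmf_y.items():
            out[x + y] = out.get(x + y, 0.0) + px * py
    return out

die = {k: 1 / 6 for k in range(1, 7)}   # variance 35/12
coin = {0: 0.7, 1: 0.3}                 # variance 0.3 * 0.7 = 0.21
total = pmf_of_sum(die, coin)

# Additivity of the cumulant generating function at one test point.
t = 0.4
cgf_gap = abs(cgf(t, total) - (cgf(t, die) + cgf(t, coin)))

# Second cumulant via a central second difference at 0.
h = 1e-4
def kappa2(pmf):
    return (cgf(h, pmf) - 2 * cgf(0.0, pmf) + cgf(-h, pmf)) / h**2

print(cgf_gap)
print(kappa2(total), 35 / 12 + 0.21)
```

The second printed pair agrees only approximately because the derivative is estimated by finite differences.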
Summary
The MGF $M_X(t) = E[e^{tX}]$ encodes all moments as derivatives at 0: $M_X^{(k)}(0) = E[X^k]$.
Standard MGFs: Binomial $(1 - p + p e^{t})^{n}$; Poisson $e^{\lambda(e^{t}-1)}$; Exponential $\lambda/(\lambda - t)$; Normal $e^{\mu t + \sigma^2 t^2/2}$.
Multiplicative property: $M_{X+Y} = M_X \cdot M_Y$ for independent $X, Y$; this identifies distributions of sums.
Uniqueness: when $M_X$ is finite near 0, it uniquely determines the distribution of $X$.
The MGF may not exist (e.g. Cauchy, log-normal); in such cases the characteristic function $E[e^{itX}]$ always exists and always determines the distribution.
The cumulant generating function $\ln M_X(t)$ has derivatives at 0 equal to the cumulants, which add under independence.