Illustration of the Central Limit Theorem

applet-magic.com Thayer Watkins Silicon Valley & Tornado Alley USA

The display below shows the histogram for 2000 repetitions of drawing a sample of n random variables and computing their sum. Each time the display is refreshed, a new set of 2000 repetitions is generated.

The random variable is uniformly distributed between -0.5 and +0.5. The sum is normalized by dividing by the square root of the sample size n. This keeps the dispersion of the distribution constant. Otherwise with larger n the distribution would be more spread out. As the sample size n gets larger the distribution more closely approximates the shape of the normal distribution with mean equal to zero.
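The experiment described above can be sketched in a few lines of Python. This is only an illustrative reconstruction using the standard library, not the code behind the original display; the function names, the bin count and the plotting range of -1 to +1 are assumptions chosen for the sketch.

```python
import random
import statistics

def normalized_sums(n, repetitions=2000, seed=None):
    """Return `repetitions` values of (z_1 + ... + z_n)/sqrt(n), z_i uniform on [-0.5, 0.5]."""
    rng = random.Random(seed)
    scale = n ** 0.5
    return [sum(rng.uniform(-0.5, 0.5) for _ in range(n)) / scale
            for _ in range(repetitions)]

def text_histogram(values, bins=15, lo=-1.0, hi=1.0, width=50):
    """Crude text histogram over [lo, hi); values outside the range are dropped."""
    counts = [0] * bins
    step = (hi - lo) / bins
    for v in values:
        if lo <= v < hi:
            # min() guards against a rounding artifact right at the upper edge
            counts[min(bins - 1, int((v - lo) / step))] += 1
    peak = max(counts) or 1
    return "\n".join(
        f"{lo + i * step:+.2f} {'#' * round(width * c / peak)}"
        for i, c in enumerate(counts)
    )

if __name__ == "__main__":
    for n in (1, 2, 12):
        sums = normalized_sums(n, seed=0)
        print(f"n = {n}, sample standard deviation = {statistics.pstdev(sums):.3f}")
        print(text_histogram(sums))
```

For n = 1 the histogram is flat between -0.5 and +0.5, for n = 2 it is triangular, and for larger n it takes on the bell shape, while the sample standard deviation stays close to the same value throughout.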

The History of the Central Limit Theorem

William J. Adams, in his book The Life and Times of the Central Limit Theorem, says that the germination of the Central Limit Theorem began with Abraham de Moivre, a French Huguenot refugee in London.

Abraham de Moivre
De Moivre was a superb mathematician who fled the renewed persecution of Protestants after the revocation of the Edict of Nantes. In London he became acquainted with the top English scientists and mathematicians, including Isaac Newton, but he could not secure an academic appointment. To support himself he worked as a consultant on problems of finance, insurance and probability, the latter being for gamblers. He investigated the limits of the binomial distribution as the number of trials increases without bound and found that the function exp(-x²) came up in connection with this problem. In particular, de Moivre sought to determine the probability of the most frequent occurrence in a binomial distribution, which he found to be approximated by

2/(B√n) for large values of n
and where
log B = 1 - 1/12 + 1/360 - 1/1260 + 1/1680

James Stirling discovered that B is equal to √(2π).
Now it is well known that the peak of the binomial distribution for two equally likely events is of the form:

The probability of m=n/2 successes in n=2m trials is
((2m)!/((m!)(m!)))(1/2^(2m))

The use of Stirling's formula for the factorial, which apparently was essentially discovered by de Moivre, gives the result found by de Moivre.
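As a rough numerical check, the exact probability of m = n/2 successes in n = 2m fair trials, (2m)!/((m!)(m!)) divided by 2^(2m), can be compared with the approximation 2/(B√n) using B = √(2π). The sketch below also evaluates B from the five terms of the series for log B quoted above. The function names are illustrative only.

```python
import math

def central_binomial_probability(n):
    """Exact probability of n/2 successes in n fair trials (n must be even)."""
    if n % 2:
        raise ValueError("n must be even")
    return math.comb(n, n // 2) / 2 ** n

def de_moivre_approximation(n, B=math.sqrt(2 * math.pi)):
    """Large-n approximation 2/(B*sqrt(n)) to the central binomial probability."""
    return 2 / (B * math.sqrt(n))

# B computed from the five quoted terms of the series for log B
B_series = math.exp(1 - 1/12 + 1/360 - 1/1260 + 1/1680)

if __name__ == "__main__":
    print(f"B from series = {B_series:.5f}, sqrt(2*pi) = {math.sqrt(2 * math.pi):.5f}")
    for n in (10, 100, 1000):
        exact = central_binomial_probability(n)
        approx = de_moivre_approximation(n)
        print(f"n = {n}: exact = {exact:.6f}, approximation = {approx:.6f}")
```

The relative error of the approximation shrinks roughly in proportion to 1/n, which is why it is stated for large values of n.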
While de Moivre found the role played by exp(-x²/2) as the limit of other distributions, he did not think of
(1/√(2π))exp(-x²/2) as being a distribution in its own right.
The formulation of the normal distribution came with Thomas Simpson in connection with the distribution of errors in astronomical observation. This idea was expanded upon by the German mathematician Carl Friedrich Gauss, who then developed the principle of least squares. Independently, the French mathematicians Pierre Simon de Laplace and Adrien-Marie Legendre also developed these ideas. In some countries, including Germany, the normal distribution is known as the Gaussian distribution and in France it is known as the Laplacian distribution.

Carl Friedrich Gauss

Pierre Simon de Laplace

It was with Laplace's work that the first inklings of the Central Limit Theorem appeared. But the rigorous proof of the Central Limit Theorem came from the Russian mathematicians. P.L. Chebyshev started the project to obtain a rigorous development of the Central Limit Theorem, and his students, Andrei A. Markov and Alexander M. Lyapunov, carried it forward.

P.L. Chebyshev

Andrei A. Markov

Alexander M. Lyapunov

It was Lyapunov's analysis that led to the modern characteristic function approach to the Central Limit Theorem.

Illustration of the Central Limit Theorem in Terms of Characteristic Functions

Consider the probability density function

p(z) = 1 if -1/2 ≤ z ≤ +1/2
= 0 otherwise

which was the basis for the previous illustrations of the Central Limit Theorem. This distribution has a mean value of zero, so its variance is the integral of z² from -1/2 to +1/2, which is 2(1/2)³/3 = 1/12. Its standard deviation is thus 1/√12 or 1/(2√3).
For n independent variables z₁, ..., z_n with this distribution, the variance of

(z₁ + z₂ + ... + z_n)/√n
is
(1/√n)²(n·Var(z)) = Var(z).

The division of the sum by √n results in the normalized sum having a constant variance. Thus the distributions of the normalized sums all have a variance of 1/12.
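The constancy of the variance can be checked by simulation. The following is a minimal sketch under the assumption that 20000 repetitions are enough for a stable estimate; the function name is hypothetical, and the exact value 1/12 is computed with exact fractions for comparison.

```python
import random
import statistics
from fractions import Fraction

# Exact variance of the uniform variable on [-1/2, 1/2]: 2*(1/2)^3/3
EXACT_VARIANCE = 2 * Fraction(1, 2) ** 3 / 3

def variance_of_normalized_sum(n, repetitions=20000, seed=0):
    """Sample variance of (z_1 + ... + z_n)/sqrt(n) over many repetitions."""
    rng = random.Random(seed)
    scale = n ** 0.5
    sums = [sum(rng.uniform(-0.5, 0.5) for _ in range(n)) / scale
            for _ in range(repetitions)]
    return statistics.pvariance(sums)

if __name__ == "__main__":
    print(f"exact variance = {EXACT_VARIANCE} = {float(EXACT_VARIANCE):.5f}")
    for n in (1, 5, 25):
        print(f"n = {n}: sample variance = {variance_of_normalized_sum(n):.5f}")
```

Without the division by √n the sample variance would instead grow in proportion to n.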
The characteristic function of p(z)

The background for the theory of characteristic functions is given elsewhere.
The characteristic function φ(ω) for the distribution is the expected value of exp(iωz). Its real part is

Re(φ(ω)) = ∫_{-1/2}^{1/2} cos(ωz) dz
= 2sin(ω/2)/ω

and its imaginary part is

Im(φ(ω)) = ∫_{-1/2}^{1/2} sin(ωz) dz = 0

Thus φ(ω) = sin(ω/2)/(ω/2). The function sin(x)/x comes up so often that it has been given a name, sinc(x). Therefore φ(ω) = sinc(ω/2).
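The closed form can be compared against a direct numerical integration of exp(iωz) over the interval from -1/2 to +1/2. The sketch below uses a simple midpoint rule; the function names and the number of steps are assumptions made for illustration. Note that sinc here means the unnormalized sin(x)/x; some libraries define sinc as sin(πx)/(πx) instead.

```python
import cmath
import math

def phi_numeric(omega, steps=10000):
    """Midpoint-rule approximation of the integral of exp(i*omega*z) over [-1/2, 1/2]."""
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        z = -0.5 + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += cmath.exp(1j * omega * z)
    return total * h

def phi_closed_form(omega):
    """sin(omega/2)/(omega/2), with the limiting value 1 at omega = 0."""
    if omega == 0:
        return 1.0
    return math.sin(omega / 2) / (omega / 2)

if __name__ == "__main__":
    for omega in (0.0, 1.0, 3.0, 10.0):
        numeric = phi_numeric(omega)
        print(f"omega = {omega}: numeric = {numeric.real:.6f} + {numeric.imag:.1e}i, "
              f"closed form = {phi_closed_form(omega):.6f}")
```

The imaginary part vanishes up to rounding because the midpoints are placed symmetrically about zero and sin(ωz) is an odd function of z.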
The characteristic function of the sum of two independent random variables is the product of their characteristic functions. Thus the characteristic function of the sum of n independent variables with the above distribution is:

[φ(ω)]ⁿ = sincⁿ(ω/2)

The variable plotted in the previous illustrations is the normalized sum of n variables, where the sum is divided by √n. The effect on the characteristic function of multiplying the random variable by a scaling factor s is to multiply the parameter in the characteristic function by the scaling factor. Therefore the characteristic function of the normalized sum of n random variables with distribution p(z) is

φ_n(ω) = sincⁿ((ω/2)/√n)

The logarithm of this characteristic function is

log(φ_n(ω)) = n log( sinc((ω/2)/√n))

Now the problem is to find the limit of this function as n increases without bound. As n so increases, the above expression approaches the indeterminate form ∞·0. To find the limit in such cases L'Hospital's Rule can be applied. For applying L'Hospital's Rule it is best to represent the above log-characteristic function as:

log(φ_n(ω)) =
log( sinc((ω/2)/√n))/(1/n)

This approaches the indeterminate form 0/0 as n → ∞.
To facilitate the determination of the limit it is convenient to let (ω/2)/√n be represented as ζ. Also note that,

since sinc(ζ) = sin(ζ)/ζ,
sinc'(ζ) = cos(ζ)/ζ - sin(ζ)/ζ²
sinc"(ζ) = - sin(ζ)/ζ - cos(ζ)/ζ² - cos(ζ)/ζ² + 2sin(ζ)/ζ³
= - sin(ζ)/ζ - 2cos(ζ)/ζ² + 2sin(ζ)/ζ³
= - sin(ζ)/ζ - 2(ζcos(ζ) - sin(ζ))/ζ³
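The derivative formulas above can be checked numerically against central finite differences. This is only a sanity check under the assumption that a step of 1e-5 is adequate in double precision; the helper names are hypothetical.

```python
import math

def sinc(x):
    """Unnormalized sinc: sin(x)/x, with the limiting value 1 at x = 0."""
    return math.sin(x) / x if x != 0 else 1.0

def sinc_prime(x):
    """First derivative, valid for x != 0."""
    return math.cos(x) / x - math.sin(x) / x ** 2

def sinc_double_prime(x):
    """Second derivative in the final form given above, valid for x != 0."""
    return -math.sin(x) / x - 2 * (x * math.cos(x) - math.sin(x)) / x ** 3

def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

if __name__ == "__main__":
    for x in (0.3, 1.0, 2.5):
        print(f"x = {x}: sinc' = {sinc_prime(x):.8f} "
              f"(difference quotient {central_difference(sinc, x):.8f}), "
              f"sinc'' = {sinc_double_prime(x):.8f} "
              f"(difference quotient {central_difference(sinc_prime, x):.8f})")
    # For small arguments the second derivative settles near -1/3
    print(f"sinc''(0.01) = {sinc_double_prime(0.01):.6f}")
```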

Furthermore, note that ∂ζ/∂n = -(1/2)(ω/2)n^(-3/2). Now taking derivatives of numerator and denominator with respect to n gives:

[sinc'(ζ)(ω/2)(-1/2)n^(-3/2)/sinc(ζ)] / (-1/n²)

This reduces to the form

[(1/2)(ω/2)(sinc'(ζ)/sinc(ζ))] / (1/√n)

This again approaches the indeterminate form 0/0 so it is necessary to again take derivatives of the numerator and denominator.
The derivative of the numerator of the above form is

(1/2)(ω/2)[sinc"(ζ)/sinc(ζ) - (sinc'(ζ)/sinc(ζ))²]∂ζ/∂n

The derivative of the denominator is -(1/2)n^(-3/2). This cancels out the term involving n in ∂ζ/∂n, leaving a factor of (ω/2).
Thus the ratio is

(1/2)(ω/2)[sinc"(ζ)/sinc(ζ) - (sinc'(ζ)/sinc(ζ))²](ω/2)

The limit of ζ as n increases without bound is zero and since

sinc(0) = 1
sinc'(0) = 0
sinc"(0) = -1 + 2/3 = -1/3

the ratio approaches the limit of

-(1/2)(1/3)(ω/2)²
= -(1/12)ω²/2

Since the log-characteristic function of a normal distribution with mean zero and standard deviation σ is

- σ²ω²/2

the limiting log-characteristic function of the normalized sum of variables with the density function p(z) as n→∞ is that of a normal distribution with mean of zero and variance of 1/12. This is an instance of the Central Limit Theorem.
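The limit can also be observed numerically by evaluating n log(sinc((ω/2)/√n)) for increasing n and comparing it with -(1/12)ω²/2. The sketch below is only an illustration; the function names are hypothetical, and it assumes (ω/2)/√n is less than π so that the sinc value is positive and its logarithm is defined.

```python
import math

def log_phi_n(omega, n):
    """n*log(sinc((omega/2)/sqrt(n))); assumes (omega/2)/sqrt(n) < pi."""
    zeta = (omega / 2) / math.sqrt(n)
    s = math.sin(zeta) / zeta if zeta != 0 else 1.0
    return n * math.log(s)

def log_phi_normal(omega, variance=1 / 12):
    """Log-characteristic function of a zero-mean normal: -variance*omega^2/2."""
    return -variance * omega ** 2 / 2

if __name__ == "__main__":
    omega = 2.0
    print(f"limit = {log_phi_normal(omega):.8f}")
    for n in (1, 10, 100, 1000, 10000):
        print(f"n = {n}: n*log(sinc) = {log_phi_n(omega, n):.8f}")
```

The gap between the two values shrinks roughly in proportion to 1/n, consistent with the limit found above by L'Hospital's Rule.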

