Statistics homework 7
Theory: History and derivation of the normal distribution. Touch on at least the following three important perspectives, putting them into a historical context to understand how the idea developed:
- as approximation of binomial (De Moivre)
- as error curve (Gauss)
- as limit of sum of independent r.v.’s (Laplace)
Practice:
- Create a simulation with graphics to convince yourself of the pointwise convergence of the empirical CDF to the theoretical distribution (Glivenko-Cantelli theorem). Use a simple random variable of your choice for such a demonstration.
- Generate sample paths of jump processes which, at each time considered t = 1, …, n, perform jumps computed as:
- σ sqrt(1/n) * R(t), where R(t) is a {-1, +1} Rademacher random variable (https://en.wikipedia.org/wiki/Rademacher_distribution).
- σ sqrt(1/n) * Z(t), where Z(t) is a N(0,1) random variable (https://en.wikipedia.org/wiki/Normal_distribution)
and see what happens as n (simulation parameter) becomes larger.
Practice theory: Do some research on the random walk process and its properties. Compare your findings with your applications, drawing your personal conclusions. Based on your exercise, explain the behaviour of the distribution of the stochastic process (check out “Donsker’s invariance principle”). What are, in particular, its mean and variance at time n?
Theory
The normal distribution
A probability density function is a function \(f(x)\) such that:
\[P(a < x < b) = \int_{a}^{b} f(x)\,dx\]
The normal distribution is a family of distributions, given by:
\[f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\]
The standard normal has \(\mu = 0\) and \(\sigma = 1\), i.e.:
\[f(x) = \frac{1}{\sqrt{2\pi}}e^{-\frac{x^{2}}{2}}\]
Changing \(\mu\) changes the location of the curve, and changing \(\sigma\) changes the spread of the curve.
History
In 1733, De Moivre noted that \(n! \sim Bn^{n+\frac{1}{2}}e^{-n}\); James Stirling later determined that \(B = \sqrt{2\pi}\). Therefore, if we consider the binomial distribution, the probability of k successes in n trials is:
\[\binom{n}{k}p^{k}(1-p)^{n-k} = \frac{n!}{k!(n-k)!}p^{k}(1-p)^{n-k} \sim \frac{1}{\sqrt{2\pi np(1-p)}}e^{-\frac{(k-np)^{2}}{2np(1-p)}}\]
Let \(\sigma^{2} = np(1-p)\) and \(\mu = np\), and you have:
\[\binom{n}{k}p^{k}(1-p)^{n-k} \simeq \frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{(k-\mu)^{2}}{2\sigma^{2}}}\]
Therefore, De Moivre first used the normal distribution as an approximation for the probabilities of binomial experiments where n is very large.
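To make the approximation concrete, a quick numerical check can be run (a minimal sketch in Python; the choice of n = 100 and p = 1/2 is my own illustration, not part of the original derivation):

```python
import math

# Compare the exact binomial pmf with De Moivre's normal approximation.
n, p = 100, 0.5            # illustrative values (assumed)
mu = n * p                 # binomial mean
sigma2 = n * p * (1 - p)   # binomial variance

for k in (40, 50, 60):
    exact = math.comb(n, k) * p**k * (1 - p)**(n - k)
    approx = math.exp(-(k - mu)**2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
    print(f"k={k}: exact={exact:.6f}, normal approx={approx:.6f}")
```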
In 1774, Laplace proposed the first of his probability distributions for errors: a density \(\phi\) that must be symmetric about \(0\) and monotone decreasing in \(|x|\). Having no reason to suppose a different law for the ordinates than for their differences, he assumed \(\frac{d\phi(x)}{dx} \propto \phi(x)\). Thus:
\[\phi(x) = \frac{m}{2}e^{-m|x|}\]
Recall:
\[f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}\]
In 1778, while deriving the central limit theorem, Laplace arrived at the same distribution De Moivre had found. Laplace showed that even if a distribution is not normal, the means of repeated samples from it would be very nearly normal, and the larger the sample size, the closer their distribution would be to a normal distribution.
On January 1, 1801, Giuseppe Piazzi spotted an object he believed to be a new planet, which he named Ceres, but six weeks later he lost it. Gauss suggested looking in a different area of the sky than most other astronomers did, and he was right.
Gauss concluded that the probability density for the error is:
\[\phi(x) = \frac{h}{\sqrt{\pi}}e^{-h^{2}x^{2}}\]
Here, let \(p\) be the true but unknown value, let \(M_{1}, \dots, M_{n}\) be estimates of \(p\), and let \(\phi(x)\) be the probability density function of the random error (recall: \(P(a < x < b) = \int_{a}^{b} \phi(x)\,dx\)). Gauss assumed that:
- \(\phi(x)\) has maximum at \(x = 0\)
- \(\phi(-x) = \phi(x)\)
- The average, \(\bar{M} = \frac{1}{n}\sum_{i=1}^{n} M_{i}\), is the most likely value of p
[1] https://www.math.utah.edu/~kenkel/normaldistributiontalk.pdf
[2] https://www.youtube.com/watch?v=BXof869EC68
Practice 1
I used a set of Bernoulli processes:
How it works
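The original plots are generated by the simulation; the following is a minimal sketch of the idea (assuming Bernoulli(p) samples, NumPy, and matplotlib; the specific p and sample sizes are my own illustrative choices). For growing n, the empirical CDF is drawn on top of the theoretical Bernoulli CDF.

```python
import numpy as np
import matplotlib.pyplot as plt

p = 0.3                          # Bernoulli success probability (assumed)
rng = np.random.default_rng(0)

# Theoretical Bernoulli(p) CDF on a grid around {0, 1}.
x = np.linspace(-0.5, 1.5, 400)
F_theory = np.where(x < 0, 0.0, np.where(x < 1, 1 - p, 1.0))

for n in (10, 100, 1000):
    sample = rng.binomial(1, p, size=n)                    # n Bernoulli(p) draws
    F_emp = (sample[None, :] <= x[:, None]).mean(axis=1)   # empirical CDF on the grid
    plt.step(x, F_emp, where="post", label=f"empirical, n={n}")

plt.plot(x, F_theory, "k--", label="theoretical CDF")
plt.legend()
plt.title("Empirical CDF converging to the Bernoulli CDF")
plt.show()
```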
Practice 2
How it works
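Again, a minimal sketch of the simulation described in the assignment (NumPy and matplotlib assumed; the paths are drawn against rescaled time t/n so that different n are comparable):

```python
import numpy as np
import matplotlib.pyplot as plt

sigma = 1.0
rng = np.random.default_rng(1)

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
for n in (10, 100, 1000):
    t = np.arange(1, n + 1) / n          # rescaled time in (0, 1]
    R = rng.choice([-1.0, 1.0], size=n)  # Rademacher: +/-1 with probability 1/2
    Z = rng.standard_normal(n)           # Z(t) ~ N(0, 1)
    axes[0].step(t, np.cumsum(sigma * np.sqrt(1 / n) * R), label=f"n={n}")
    axes[1].step(t, np.cumsum(sigma * np.sqrt(1 / n) * Z), label=f"n={n}")

axes[0].set_title("Rademacher jumps")
axes[1].set_title("Gaussian jumps")
for ax in axes:
    ax.legend()
plt.show()
```

As n grows, the two families of paths become visually indistinguishable: they look like samples of the same limiting process.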
Practice theory
In mathematics, as said in [1], a random walk is a mathematical object, known as a stochastic or random process, that describes a path consisting of a succession of random steps on some mathematical space, such as the integers.
These are some examples of random walks:
- A path starting from 0 and moving at each step of -1 or +1 with equal probability
- The path described by a molecule that travels through a liquid (Brownian motion)
- The search path of a foraging animal
- The price of a fluctuating stock and the financial status of a gambler
From these examples, it is easy to see that random walk processes are widely used in many applications and fields.
Lattice random walks
The most popular type of random walk is the lattice-based one, where at each step the location jumps to another site according to some probability distribution. In a simple random walk, the location can only jump to neighboring sites of the lattice, forming a lattice path. There exist two different types: the **simple symmetric random walk**, where the probabilities of the location jumping to each one of its immediate neighbors are the same, and the **simple bordered symmetric random walk**, where the state space is limited to finite dimensions.
Gaussian random walk
A random walk having a step size that varies according to a normal distribution is used as a model for real-world time series data such as financial markets.
Here, the step size is given by the inverse cumulative normal distribution \(\Phi^{-1}(z,\mu,\sigma)\), where \(0 \leq z \leq 1\) is a uniformly distributed random number, and \(\mu\) and \(\sigma\) are the mean and standard deviation of the normal distribution, respectively.
If \(\mu\) is nonzero, the random walk will vary about a linear trend. If \(v_{s}\) is the starting value of the random walk, the expected value after n steps will be \(v_{s} + n\mu\).
For the special case where \(\mu\) is equal to zero, after n steps, the translation distance’s probability distribution is given by \(N(0, n\sigma^{2})\), where \(N()\) is the notation for the normal distribution, n is the number of steps, and \(\sigma\) is from the inverse cumulative normal distribution as given above.
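Both formulas are easy to check by simulation. A minimal sketch, assuming NumPy and SciPy (whose `norm.ppf` implements \(\Phi^{-1}\)); the parameter values are illustrative:

```python
import numpy as np
from scipy.stats import norm

mu, sigma, n, v_s = 0.1, 1.0, 500, 0.0   # illustrative values (assumed)
rng = np.random.default_rng(2)

# Steps via the inverse cumulative normal: Phi^{-1}(z, mu, sigma) with z ~ U(0, 1).
z = rng.uniform(0.0, 1.0, size=(5_000, n))   # 5,000 independent walks, n steps each
steps = norm.ppf(z, loc=mu, scale=sigma)
endpoints = v_s + steps.sum(axis=1)          # walk positions after n steps

print("empirical mean:", endpoints.mean(), "| theory:", v_s + n * mu)
print("empirical var :", endpoints.var(), "| theory:", n * sigma**2)
```

This also answers the assignment's question: at time n the walk has mean \(v_{s} + n\mu\) and variance \(n\sigma^{2}\).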
Conclusions from the applications
From Donsker's theorem:
Let \(X_{1},X_{2}, \dots\) be a sequence of independent and identically distributed (i.i.d.) random variables with mean 0 and variance 1. Let \(S_{n} = \sum_{i=1}^{n} X_{i}\). The stochastic process \(S = (S_{n})_{n \in \mathbb{N}}\) is known as a random walk.
Define the diffusively rescaled random walk (partial-sum process) by:
\[W^{(n)}(t) = \frac{S_{\lfloor nt \rfloor}}{\sqrt{n}}, \quad t \in [0, 1]\]
The central limit theorem asserts that \(W^{(n)}(1)\) converges in distribution to a standard Gaussian random variable \(W(1)\) as \(n \to \infty\).
Donsker’s invariance principle extends this convergence to the whole function \(W^{(n)} = (W^{(n)}(t))_{t \in [0,1]}\). More precisely, in its modern form, Donsker’s invariance principle states that: As random variables taking values in the Skorokhod space \(D[0, 1]\), the random function \(W^{(n)}\) converges in distribution to a standard Brownian motion.
This is exactly what application 2 shows: as n becomes large, the rescaled random walk converges in distribution to a standard Brownian motion.
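To see the rescaling at work, here is a minimal sketch (NumPy and matplotlib assumed) that plots \(W^{(n)}\) for increasing n, using \(\pm 1\) steps with mean 0 and variance 1:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)

# W^(n)(t) = S_floor(nt) / sqrt(n) for a +/-1 random walk (mean 0, variance 1).
for n in (50, 500, 5000):
    steps = rng.choice([-1.0, 1.0], size=n)
    S = np.concatenate(([0.0], np.cumsum(steps)))   # S_0 = 0, S_1, ..., S_n
    t = np.arange(n + 1) / n
    plt.step(t, S / np.sqrt(n), where="post", label=f"n={n}")

plt.xlabel("t")
plt.ylabel(r"$W^{(n)}(t)$")
plt.legend()
plt.show()
```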