Statistics homework 6
Theory:
- Do some research on the following topics:
- The law of large numbers (LLN) and the various definitions of convergence
- The convergence of the binomial to the normal and Poisson distributions
- The central limit theorem [in anticipation of a topic we will study later]
Practice: Exercise (also partially described in video 04)
Generate and represent m “sample paths” of n points each (m and n are program parameters), where each point represents a pair of:
time index t, and relative frequency of success f(t),
where f(t) is the relative frequency (the mean) of t Bernoulli random variables with distribution B(x, p) = p^x(1-p)^(1-x) observed at the various times up to t: j = 1, …, t.
At time n (the last time) and at one other chosen inner time 1 < j < n (where j is a user parameter), represent the distribution of f(t) with a histogram.
See also what happens if you replace the relative frequency f(t) with the absolute frequency n(t) or with the standardized relative frequency (f(t) - p) / sqrt(p(1-p)/t) [or simply the “normalized” sum of Bernoulli r.v.’s: n(t) / Math.sqrt(t)].
Comment briefly on the convergence results you see.
Practice theory: Do some web research on the various methods proposed to compute the running median (one-pass, online algorithms). Store (citing all sources and attributions) the algorithm(s) that you think is (are) a good candidate, explaining briefly how it works, and possibly try a quick demo.
Theory 1
The law of large numbers
In probability theory, the law of large numbers describes the result of performing the same experiment a large number of times. The law states that the average of the results obtained from the trials tends to be close to the expected value, and the more trials we perform, the closer the average gets to that value.
There are two versions of the law: the weak one and the strong one.
Weak law
It states that the average of the trials’ results converges in probability to the expected value:
\[\bar{X}_{n} \xrightarrow{P} \mu \ \text{as } n \to \infty\]
\[\forall \epsilon > 0: \ \lim_{n \to \infty} Pr(|\bar{X}_{n} - \mu| < \epsilon ) = 1\]
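For i.i.d. variables with finite variance \(\sigma^{2}\), this can be seen directly from Chebyshev’s inequality, since \(Var(\bar{X}_{n}) = \sigma^{2}/n\):
\[Pr(|\bar{X}_{n} - \mu| \geq \epsilon) \leq \frac{Var(\bar{X}_{n})}{\epsilon^{2}} = \frac{\sigma^{2}}{n\epsilon^{2}} \to 0 \ \text{as } n \to \infty\]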
Strong law
The difference from the weak law is that here the average converges almost surely to the expected value (a stronger type of convergence, which implies convergence in probability):
\[\bar{X}_{n} \xrightarrow{a.s.} \mu \ \text{as } n \to \infty\]
\[Pr\left(\lim_{n \to \infty} \bar{X}_{n} = \mu\right) = 1\]
The convergence of the binomial
To the Poisson distribution
The Poisson limit theorem describes this convergence. It states that the Poisson distribution may be used as an approximation of the binomial distribution under certain conditions.
Theorem
From [1]: let \(p_{n}\) be a sequence of real numbers in \([0,1]\) such that the sequence \(np_{n}\) converges to a finite limit \(\lambda\). Then:
\[\lim_{n \to \infty} \binom{n}{k} p_{n}^{k}(1 - p_{n})^{n-k} = e^{-\lambda}\frac{\lambda^{k}}{k!}\]
Proof
\[\lim_{n \to \infty} \binom{n}{k} p_{n}^{k}(1 - p_{n})^{n-k} \simeq \lim_{n \to \infty} \frac{n(n-1)(n-2) \dots (n - k + 1)}{k!}\left(\frac{\lambda}{n} \right)^{k}\left( 1 - \frac{\lambda}{n} \right)^{n - k} =\]
\[= \lim_{n \to \infty} \frac{n^{k} + O(n^{k-1})}{k!}\frac{\lambda^{k}}{n^{k}}\left( 1 - \frac{\lambda}{n} \right)^{n - k} =\]
\[= \lim_{n \to \infty} \frac{\lambda^{k}}{k!}\left( 1 - \frac{\lambda}{n} \right)^{n - k}\]
Since
\[\lim_{n \to \infty} \left( 1 - \frac{\lambda}{n} \right)^{n} = e^{-\lambda}\]
and
\[\lim_{n \to \infty} \left( 1 - \frac{\lambda}{n} \right)^{k} = 1\]
this leaves
\[\binom{n}{k} p_{n}^{k}(1 - p_{n})^{n-k} \simeq \frac{\lambda^{k}e^{-\lambda}}{k!}\]
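As a quick numerical check of this approximation, here is a minimal sketch in JavaScript (the function names and parameter values are illustrative, not from the exercise code):

```js
// Compare the Bin(n, p) pmf with its Poisson approximation (lambda = n * p),
// computed in log space for numerical stability.
function binomialPmf(n, k, p) {
  let logCoeff = 0; // log of the binomial coefficient C(n, k)
  for (let i = 1; i <= k; i++) {
    logCoeff += Math.log(n - k + i) - Math.log(i);
  }
  return Math.exp(logCoeff + k * Math.log(p) + (n - k) * Math.log(1 - p));
}

function poissonPmf(lambda, k) {
  let logP = -lambda + k * Math.log(lambda);
  for (let i = 2; i <= k; i++) logP -= Math.log(i); // subtract log(k!)
  return Math.exp(logP);
}

// n large, p small, n * p moderate: the two pmfs should nearly coincide.
const n = 1000, p = 0.003, lambda = n * p; // lambda = 3
for (let k = 0; k <= 8; k++) {
  console.log(k, binomialPmf(n, k, p).toFixed(6), poissonPmf(lambda, k).toFixed(6));
}
```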
To the normal distribution
Every normal distribution is defined by two parameters: the mean, which indicates the center of the distribution, and the standard deviation, which measures its spread.
If we perform \(n\) trials, each with success probability \(p\), we can use a normal distribution to approximate the binomial distribution as follows:
- The mean is equal to \(np\)
- The standard deviation is equal to \(\sqrt{np(1-p)}\)
As stated in [2], it can be shown mathematically that a few conditions must hold for the normal approximation to the binomial distribution to be usable: the number of observations \(n\) must be large enough, and the value of \(p\) must be such that both \(np\) and \(n(1 - p)\) are greater than or equal to 10.
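For example, with \(n = 100\) and \(p = 0.5\) both conditions hold:
\[np = 50 \geq 10, \quad n(1-p) = 50 \geq 10\]
\[B(100,\ 0.5) \approx N\left(\mu = 50,\ \sigma = \sqrt{100 \cdot 0.5 \cdot 0.5} = 5\right)\]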
The central limit theorem
In probability theory, the central limit theorem (CLT) establishes that, in many situations, when independent random variables are summed, their properly normalized sum tends toward a normal distribution, even if the original variables themselves are not normally distributed.
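In its classical form, for i.i.d. variables with mean \(\mu\) and finite variance \(\sigma^{2}\), the standardized average converges in distribution to a standard normal:
\[\frac{\bar{X}_{n} - \mu}{\sigma/\sqrt{n}} \xrightarrow{d} N(0, 1) \ \text{as } n \to \infty\]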
[1] https://en.wikipedia.org/wiki/Law_of_large_numbers
[2] https://www.thoughtco.com/normal-approximation-to-the-binomial-distribution-3126589
Practice
The code can be found here
The program illustrates the law of large numbers: as we increase the number of experiments, all the paths tend toward the expected value, indicated by the red line.
How it works
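The actual implementation is in the linked code; the following is a minimal sketch of the idea (function names and parameter values here are illustrative, not the real ones):

```js
// Minimal sketch: m sample paths of the relative frequency of success
// f(t) of Bernoulli(p) trials, for t = 1, ..., n.
// m, n, p, j are the program parameters described in the exercise.
function samplePaths(m, n, p) {
  const paths = [];
  for (let i = 0; i < m; i++) {
    const path = [];
    let successes = 0;
    for (let t = 1; t <= n; t++) {
      if (Math.random() < p) successes++; // one Bernoulli(p) trial
      path.push(successes / t);           // relative frequency f(t)
    }
    paths.push(path);
  }
  return paths;
}

// Distribution of f(t) across the m paths at a fixed time t
// (used for the histograms at the inner time j and at the last time n).
function valuesAtTime(paths, t) {
  return paths.map(path => path[t - 1]);
}

const m = 50, n = 1000, p = 0.5, j = 100;
const paths = samplePaths(m, n, p);
console.log('f(j) across paths:', valuesAtTime(paths, j));
console.log('f(n) across paths:', valuesAtTime(paths, n));
// Plotting (paths, the red line at p, histograms) is done with the
// charting library of choice and is omitted here.
```

For the standardized variant described in the exercise, each stored value becomes `(successes / t - p) / Math.sqrt(p * (1 - p) / t)` instead of `successes / t`.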
Practice theory
As described in [1], a moving average is a calculation used to analyze data points by creating a series of averages of different subsets of the full data set. There are different kinds of moving averages (a small sketch of two of them follows the list):
- Simple moving average: an unweighted mean of the previous \(k\) data points out of a total of \(n\) (\(k < n\)): \(SMA_{k,next} = SMA_{k,prev} + \frac{1}{k}(p_{n+1} - p_{n-k+1})\)
- Cumulative moving average: used when we want to compute the mean of all values received up to a certain moment: \(CMA_{n+1} = CMA_{n} + \frac{x_{n+1} - CMA_{n}}{n+1}\)
- Weighted moving average: adds weights to data at different positions in the sample window. Every time we get a new value \(p_{M+1}\):
- \[Total_{M+1} = Total_{M} + p_{M+1} - p_{M-n+1}\]
- \[Numerator_{M+1} = Numerator_{M} + n p_{M+1} - Total_{M}\]
- \[WMA_{M+1} = \frac{Numerator_{M+1}}{n + (n-1) + (n-2) + \dots + 2 + 1}\]
- Exponential moving average: a first-order infinite impulse response filter that applies weighting factors which decrease exponentially:
- \[EMA_{t} = \begin{cases} Y_{1} & t = 1 \\ \alpha Y_{t} + (1-\alpha) \cdot EMA_{t-1} & t > 1 \end{cases}\]
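A minimal sketch of the cumulative and exponential variants in JavaScript (the function names are illustrative, not from any specific library):

```js
// Cumulative moving average: CMA_{n+1} = CMA_n + (x_{n+1} - CMA_n) / (n + 1)
function cumulativeMovingAverage(values) {
  let cma = 0;
  return values.map((x, n) => (cma += (x - cma) / (n + 1)));
}

// Exponential moving average:
// EMA_1 = Y_1, EMA_t = alpha * Y_t + (1 - alpha) * EMA_{t-1}
function exponentialMovingAverage(values, alpha) {
  let ema;
  return values.map((y, t) => (ema = t === 0 ? y : alpha * y + (1 - alpha) * ema));
}

const data = [3, 5, 8, 2, 6, 4];
console.log(cumulativeMovingAverage(data));       // running mean of each prefix
console.log(exponentialMovingAverage(data, 0.5)); // recent values weigh more
```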