edoardo@home:~$

Statistics homework 5

Theory:

  1. Think about and explain, in your own words, the role that probability plays in statistics and the relation between observed distributions and frequencies and their “theoretical” counterparts. Give some practical examples explaining how the concepts of an abstract probability space relate to more “concrete” and “real-world” objects when doing statistics.

  2. Explain Bayes’ theorem and its key role in statistical induction. Describe the different paradigms that can be found within statistical inference (such as “Bayesian” and “frequentist” [Fisher, Neyman]).

Practice: Given 2 variables from a CSV file, compute and represent the statistical regression lines (X on Y and vice versa) and the scatterplot. Optionally, also represent the histograms on the “sides” of the chart (one could be drawn vertically and the other horizontally, in the positions you prefer). [Remember that all our charts must always be drawn within “dynamic viewports” (movable/resizable rectangles). No third-party libraries, to ensure ownership of the creative process. You may choose the language you prefer.]

Practice theory: Do some web research about the various methods to generate, from a Uniform([0,1)), all the most important random variables (discrete and continuous). Collect all the source code of such algorithms that you think might be useful (keep credits and attributions wherever applicable), as it will be useful for our next simulations.



Theory 1

Probability theory is the mathematical abstraction on which statistics is built: it describes in idealized terms the frequencies that statistics observes in real data.

From [1] a probability space is a triple \((\Omega, \Sigma, P)\) where:

  • \(\Omega\) is the sample space, an arbitrary non-empty set
  • \(\Sigma\) is a \(\sigma\)-algebra on \(\Omega\), i.e. a subset of the powerset of \(\Omega\) that contains \(\Omega\) and is closed under complement and countable unions
  • \(P : \Sigma \to [0, 1]\) is a function on \(\Sigma\) such that:
    • it is countably additive: \(P(\bigcup_{i=1}^{\infty} A_{i}) = \sum_{i=1}^{\infty} P(A_{i})\) whenever the \(A_{i}\) are pairwise disjoint (\(A_{i} \cap A_{j} = \emptyset\) for \(i \neq j\))
    • the measure of the entire sample space is one: \(P(\Omega) = 1\)

Let’s now consider a generic empirical distribution of a variable used in a statistical analysis. We can match the elements of this empirical distribution to the triple described above as follows:

  • \(\Omega\) corresponds to the values that the variable can assume
  • \(\Sigma\) corresponds to all the possible events that can be formed from the values in \(\Omega\)
  • \(P\) gives the relative frequency of an event

The function \(P\) also answers the question of how probable it is that a certain event happens in this empirical distribution. For instance, consider a variable that records a grade range with three values (low, medium, high) and the following empirical data:

Value    Abs. frequency   Rel. frequency
low      10               0.17
medium   30               0.50
high     20               0.33

If we want to know the probability of getting a low score, we already know that it is equal to the relative frequency of that value in the empirical distribution: \(P(\text{low}) = 10/60 \approx 0.17\).
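As a quick check, here is a minimal sketch (in Python; my own illustrative code, not part of the assignment) that turns the absolute counts of the table above into the relative frequencies we use as empirical probabilities:

    # Counts from the grade-range example above.
    counts = {"low": 10, "medium": 30, "high": 20}

    total = sum(counts.values())          # 60 observations in total
    rel_freq = {value: n / total for value, n in counts.items()}

    for value, p in rel_freq.items():
        print(f"P({value}) = {p:.2f}")
    # P(low) = 0.17, P(medium) = 0.50, P(high) = 0.33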

Theory 2

In inferential statistics we are given an empirical distribution, that is, a sample from a bigger unknown distribution, and we want to infer this bigger distribution. The small empirical set that we know may have been generated by more than one distribution with different parameters (e.g. for a normal distribution we would have two parameters: \(\mu\) and \(\sigma\)). So we have a set of candidate distributions, each described by a certain number of parameters; we denote by \(\Theta_{i}\) the vector of parameters that characterizes the i-th distribution.

More formally, we have n unknown distributions \(\{ \Theta_{1}, \Theta_{2}, \dots, \Theta_{n} \}\) and a known empirical distribution \(E\), and we assume there exists \(i \in \{1, \dots, n\}\) such that \(E\) was generated by the distribution with parameters \(\Theta_{i}\); we need to find out which of the unknown distributions generated \(E\).

Here Bayes’ theorem comes to help us find the most probable unknown distribution given \(E\):

\[P(\Theta|E) = \frac{P(\Theta \cap E)}{P(E)} = \frac{P(E|\Theta) \, P(\Theta)}{P(E)}\]

The term on the left of the first equals sign, \(P(\Theta|E)\), is called the posterior probability of the state of nature \(\Theta\) given \(E\), while \(P(\Theta)\) is called the prior probability of \(\Theta\); this second probability is completely unknown, and assumptions are usually made about it.
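To make the formula concrete, here is a minimal sketch of a discrete Bayesian update in Python; the three candidate parameters, the uniform priors, and the observed evidence are illustrative assumptions of mine:

    # Three candidate "states of nature": possible biases of a coin.
    thetas = [0.3, 0.5, 0.7]      # candidate P(heads) for each distribution
    prior  = [1/3, 1/3, 1/3]      # P(Theta_i): uniform, since nothing is known

    heads, tails = 7, 3           # observed evidence E: 7 heads in 10 flips

    # Likelihood P(E | Theta_i), up to a constant binomial coefficient.
    likelihood = [t**heads * (1 - t)**tails for t in thetas]

    # P(E) via the law of total probability, then the posterior P(Theta_i | E).
    evidence  = sum(l * p for l, p in zip(likelihood, prior))
    posterior = [l * p / evidence for l, p in zip(likelihood, prior)]

    for t, p in zip(thetas, posterior):
        print(f"P(Theta = {t} | E) = {p:.3f}")   # the posterior favors Theta = 0.7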

Two main paradigms can be found in statistical inference: the Bayesian one and the frequentist one.

The first consists in inferring the unknown distribution from the known one by applying Bayes’ theorem as shown above, which requires assuming prior probabilities.

The second, as described in [1], is a type of statistical inference that draws conclusions from sample data by emphasizing the frequency or proportion of the data. In this kind of statistics we can analyze only repeatable random events; in fact, after a large number of repetitions the relative frequency of each event tends to the probability that the event occurs.
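A minimal simulation (again my own illustrative Python, assuming a fair six-sided die) shows this convergence of relative frequency to probability:

    import random

    random.seed(0)                        # reproducible runs
    hits = 0
    for n in range(1, 100_001):
        if random.randint(1, 6) == 6:     # one repetition of the random event
            hits += 1
        if n in (100, 10_000, 100_000):
            print(f"n = {n:>7}: relative frequency = {hits / n:.4f}")
    # The printed values approach P(six) = 1/6 ≈ 0.1667 as n grows.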

[1] https://reflectivedata.com/dictionary/frequentist-statistics/



Practice

The code can be found here.

How it works
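As required by the assignment, the program has to compute the two regression lines (Y on X and X on Y) and draw them with the scatterplot inside a dynamic viewport, without third-party libraries. The following Python is only a sketch of the underlying computation (not the linked code itself); the file name data.csv and the two-column layout with a header row are illustrative assumptions:

    import csv

    # Read the two variables from the CSV file (standard library only).
    xs, ys = [], []
    with open("data.csv", newline="") as f:
        reader = csv.reader(f)
        next(reader)                        # skip the header row
        for row in reader:
            xs.append(float(row[0]))
            ys.append(float(row[1]))

    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov   = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n

    # Y-on-X line (y = a + b*x): minimizes vertical distances.
    b = cov / var_x
    a = my - b * mx
    # X-on-Y line (x = c + d*y): minimizes horizontal distances.
    d = cov / var_y
    c = mx - d * my

    print(f"y on x: y = {a:.3f} + {b:.3f}x")
    print(f"x on y: x = {c:.3f} + {d:.3f}y")

Both lines pass through the point of means \((\bar{x}, \bar{y})\), which is why they can be drawn from the same center with two different slopes.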



Practice Theory

This GitHub repository contains the most common discrete and continuous distributions in C#.
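As a preview of what those algorithms look like, here is a minimal Python sketch (my own, based on the classic textbook methods rather than on that repository) of three standard ways to generate random variables from Uniform([0,1)) draws: inverse-transform sampling for the exponential, the Box-Muller transform for the standard normal, and a threshold comparison for the Bernoulli:

    import math
    import random

    def exponential(lam: float) -> float:
        """Inverse transform: if U ~ Uniform([0,1)), then -ln(1-U)/lam ~ Exp(lam)."""
        u = random.random()
        return -math.log(1.0 - u) / lam

    def standard_normal() -> float:
        """Box-Muller: two independent uniforms give a standard normal deviate."""
        u1, u2 = random.random(), random.random()
        while u1 == 0.0:                  # guard against log(0)
            u1 = random.random()
        return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)

    def bernoulli(p: float) -> int:
        """Discrete case: compare a uniform draw against the success probability."""
        return 1 if random.random() < p else 0

    if __name__ == "__main__":
        print(exponential(2.0), standard_normal(), bernoulli(0.5))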