Stats Cheat Sheet


This is a list of useful formulas from ST101: Introduction to Statistics. It is meant as a reference and does not explain the formulas.


Statistical Formulas


Empirical Mean


\mu = \frac{1}{N} \cdot \sum_{i=1}^N \quad X_i

Properties (for independent random variables X and Y):

  1. Mean(X + Y) = Mean(X) + Mean(Y)
  2. Mean(X \times Y) = Mean(X) \times Mean(Y)


Variance


\sigma^2 = Var(X)

\sigma^2 = \frac{1}{N}\sum_{i=1}^N \quad (X_i -\mu)^2

\sigma^2 = \frac{\Sigma \quad X_i^2}{N} - \frac{(\Sigma \quad X_i)^2}{N^2}


Standard Deviation


\sigma = \sqrt{Var(X)}

\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^N (X_i-\mu)^2}
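
As a quick numerical check of the mean, variance, and standard deviation formulas above, here is a minimal Python sketch. The function names and the sample data are illustrative and do not come from the course.

```python
import math

def mean(xs):
    # mu = (1/N) * sum of X_i
    return sum(xs) / len(xs)

def variance(xs):
    # sigma^2 = (1/N) * sum of (X_i - mu)^2
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def variance_shortcut(xs):
    # sigma^2 = (sum of X_i^2) / N - (sum of X_i)^2 / N^2
    n = len(xs)
    return sum(x * x for x in xs) / n - sum(xs) ** 2 / n ** 2

def std_dev(xs):
    # sigma = sqrt(Var(X))
    return math.sqrt(variance(xs))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
print(mean(data))               # 5.0
print(variance(data))           # 4.0
print(variance_shortcut(data))  # 4.0
print(std_dev(data))            # 2.0
```

Both variance forms give the same value here; on large or widely spread data the shortcut form can lose floating-point precision, so the first form is usually the safer choice.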

Properties (for random variables X and Y):

Cov(X,Y) = \frac{1}{N}\sum_{i=1}^N \quad [(X_i - \bar{X}) \times (Y_i - \bar{Y})]

Var(X + Y) = Var(X) + Var(Y) + 2 \times Cov(X,Y)

Var(\alpha X) = \alpha^2 Var(X)

Note:

  • Cov(X,Y) = 0 for independent X and Y variables
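
The covariance and variance identities above can be checked on a small data set. The sketch below is only an illustration: the helper names and the paired samples are assumptions, not course material.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # Cov(X,Y) = (1/N) * sum of (X_i - mean(X)) * (Y_i - mean(Y))
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def variance(xs):
    # Var(X) is the covariance of X with itself
    return covariance(xs, xs)

xs = [1, 2, 3, 4]  # made-up paired samples
ys = [2, 1, 4, 3]

# Var(X + Y) = Var(X) + Var(Y) + 2 * Cov(X,Y)
sums = [x + y for x, y in zip(xs, ys)]
lhs = variance(sums)
rhs = variance(xs) + variance(ys) + 2 * covariance(xs, ys)
print(lhs, rhs)  # 4.0 4.0

# Var(alpha * X) = alpha^2 * Var(X), here with alpha = 3
alpha = 3
scaled = [alpha * x for x in xs]
print(variance(scaled), alpha ** 2 * variance(xs))  # 11.25 11.25
```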


Bernoulli Distribution


The Bernoulli distribution (a special case of the binomial distribution with n=1) is a discrete probability distribution with two possible outcomes X_i \in \{0,1\}, such that X_i=1 occurs with probability p and X_i=0 with probability q=(1-p). An example given in the course is a coin flip.

\mu = p

\sigma^2 = p(1-p)


Binomial Coefficient


\binom{n}{k} = \frac{n!}{k!(n-k)!}


Binomial Probability Distribution


P(n, k, p) = \frac{n!}{k!(n-k)!} p^k (1-p)^{(n-k)}
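
A minimal Python sketch of the binomial coefficient and the binomial probability, assuming Python 3.8 or later for `math.comb`. The function name `binomial_probability` and the example values are illustrative.

```python
import math

def binomial_probability(n, k, p):
    # P(n, k, p) = n! / (k! (n-k)!) * p^k * (1-p)^(n-k)
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Number of ways to choose 2 items out of 5
print(math.comb(5, 2))  # 10

# Probability of exactly 2 heads in 5 flips of a fair coin
print(binomial_probability(5, 2, 0.5))  # 0.3125

# The probabilities over all k = 0..n sum to 1 (up to rounding)
total = sum(binomial_probability(5, k, 0.3) for k in range(6))
```

With n = 1 the same function reduces to the Bernoulli case: `binomial_probability(1, 1, p)` is p and `binomial_probability(1, 0, p)` is 1 - p.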


Normal Distribution


N(x; \mu, \sigma^2) = \frac{1}{\sqrt{2 \cdot \pi \cdot \sigma^2}} \cdot e^{[-\frac{1}{2} \cdot \frac{(x - \mu)^2}{\sigma^2}]}
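
The density above translates directly into code. This is a small sketch with an illustrative function name; note that the third argument is the variance \sigma^2, not the standard deviation.

```python
import math

def normal_pdf(x, mu, sigma2):
    # N(x; mu, sigma^2) = 1 / sqrt(2 * pi * sigma^2) * exp(-(x - mu)^2 / (2 * sigma^2))
    return math.exp(-0.5 * (x - mu) ** 2 / sigma2) / math.sqrt(2 * math.pi * sigma2)

# Peak of the density for mu = 0, sigma^2 = 1, which equals 1 / sqrt(2 * pi)
peak = normal_pdf(0.0, 0.0, 1.0)
print(round(peak, 4))  # 0.3989

# The density is symmetric around mu
left = normal_pdf(1.0, 2.0, 4.0)
right = normal_pdf(3.0, 2.0, 4.0)
```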


Confidence Interval


CI = 1.96\sqrt{\frac{p(1-p)}{N}}

General Form:

Size of CI = a \sqrt{\frac{\sigma^2}{N}}

\frac{1}{N} \sum{X_i} \pm a \sqrt{\frac{\sigma^2}{N}}

Note:

  • a=1.96 for a 95% confidence level when N \ge 30
  • For smaller N, a is the t-value computed for (N-1) degrees of freedom at the chosen confidence level.
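
A hedged sketch of the interval for a proportion, using a = 1.96 and therefore assuming a large sample (N \ge 30) and a 95% confidence level. The function name and the poll numbers are made up for illustration.

```python
import math

def proportion_ci(p_hat, n, a=1.96):
    # Margin: a * sqrt(p(1-p) / N), reported as (lower, upper) around p_hat
    margin = a * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 60 of 100 respondents answer "yes"
lower, upper = proportion_ci(0.6, 100)
print(round(lower, 3), round(upper, 3))  # 0.504 0.696
```

For small samples the default `a=1.96` would need to be replaced by the appropriate t-value, which this sketch does not compute.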


Linear Regression


y = bx + a

b = \frac{\sum_{i=1}^n[(x_i - \bar{x})(y_i - \bar{y})]}{\sum_{i=1}^n(x_i - \bar{x})^2}

a = \bar{y} - b\bar{x}

r = \frac{\sum_{i}[(x_i - \bar{x})(y_i - \bar{y})]}{\sqrt{\sum_{i}(x_i - \bar{x})^2\sum_{i}(y_i - \bar{y})^2}}

Z_x = \frac{x_i - \bar{x}}{\sigma_x}

Note:

  • r is the correlation coefficient.
  • Z_x is the standard score.
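
The slope, intercept, and correlation coefficient share the same sums, so they can be computed together. The sketch below uses an illustrative function name and made-up data that lie exactly on the line y = 2x + 1.

```python
import math

def linear_regression(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx                   # slope
    a = y_bar - b * x_bar           # intercept
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return a, b, r

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b, r = linear_regression(xs, ys)
print(a, b, r)  # 1.0 2.0 1.0
```

Because the points are perfectly linear, r comes out as 1; noisy data would give a value strictly between -1 and 1.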

Probability Rules



Joint Occurrence


P(B\cdot C \mid A) = P(B \mid A) \times P(C \mid A\cdot B)

Note:

P(A \cdot B \cdot C) = P(B \cdot C \mid A) \times P(A)

P(A \cdot B \cdot C) = P(C \mid A \cdot B) \times P(A \cdot B)

P(A \cdot B \cdot C) = P(C \mid A \cdot B) \times P(B \mid A) \times P(A)

P(B \cdot C \mid A) \times P(A) = P(C \mid A \cdot B) \times P(B \mid A) \times P(A)

P(B\cdot C \mid A) = P(B \mid A) \times P(C \mid A\cdot B)
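
The identity can be verified on any joint distribution. The sketch below builds a small hypothetical distribution over three binary events (the weights are arbitrary) and uses exact fractions so that both sides compare equal without rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over (a, b, c): weights 1..8, normalised by their sum 36
outcomes = list(product([0, 1], repeat=3))
joint = {o: Fraction(w, 36) for o, w in zip(outcomes, range(1, 9))}

def prob(pred):
    # Probability of the event described by pred(a, b, c)
    return sum(p for o, p in joint.items() if pred(*o))

p_a = prob(lambda a, b, c: a == 1)
p_ab = prob(lambda a, b, c: a == 1 and b == 1)
p_abc = prob(lambda a, b, c: a == 1 and b == 1 and c == 1)

lhs = p_abc / p_a                    # P(B·C | A)
rhs = (p_ab / p_a) * (p_abc / p_ab)  # P(B | A) × P(C | A·B)
print(lhs == rhs)  # True
```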


Independent Joint Probabilities


P(B \cdot C \mid A) = P(B \mid A) \times P(C \mid A)


Total Probability


P(A) = P(A \mid B) \times P(B) + P(A \mid \neg B) \times P(\neg B)

P(C \mid A) = P(C \mid A \cdot B) \times P(B \mid A) + P(C \mid A \cdot \neg B) \times P(\neg B \mid A)


Bayes' Rule


P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}
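
Bayes' rule is often combined with total probability to obtain the denominator P(B). The sketch below does this with the roles of A and B swapped relative to the total probability formula above, that is P(B) = P(B \mid A) \times P(A) + P(B \mid \neg A) \times P(\neg A). The function name and all numbers are invented for illustration.

```python
def posterior(p_a, p_b_given_a, p_b_given_not_a):
    # Total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
    # Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
    return p_b_given_a * p_a / p_b

# Hypothetical test: prior P(A) = 0.01, P(B|A) = 0.9, P(B|not A) = 0.2
p_a_given_b = posterior(0.01, 0.9, 0.2)
print(round(p_a_given_b, 4))  # 0.0435
```

Even with a fairly reliable test, the small prior keeps the posterior low, which is the usual lesson of this kind of example.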

