Stats Cheat Sheet

This is a list of useful formulas from ST101: Introduction to Statistics. It is meant as a reference and does not explain the formulas.

Statistical Formulas

Empirical Mean

\mu = \frac{1}{N}\sum_{i=1}^N x_i

Properties (for independent random variables X and Y):

1. \mu (X + Y) = \mu (X) + \mu (Y)
2. \mu (XY) = \mu (X) \times \mu (Y)

Variance

1. \sigma^2 = Var(X)
2. \sigma^2 = \frac{1}{N}\sum_{i=1}^N \quad (x_i -\mu)^2

Standard Deviation

\sigma = \sqrt{Var(X)}

\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^N (x_i-\mu)^2}

Properties (for random variables X and Y):

1. Cov(X,Y) = \frac{1}{N}\sum_{i=1}^N [(x_i - \bar{x}) \times (y_i - \bar{y})]
2. Var(X + Y) = Var(X) + Var(Y) + 2 \times Cov(X,Y)
3. Var(\alpha X) = \alpha^2 Var(X)

Note:

• Cov(X,Y) = 0 for independent variables X and Y, so Var(X + Y) = Var(X) + Var(Y) in that case.
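
The variance and covariance formulas above can be checked numerically. The following is a minimal sketch, assuming the population (1/N) definitions used on this page; the helper names and the sample data are made up for illustration.

```python
def mean(values):
    """Empirical mean: sum of the values divided by N."""
    return sum(values) / len(values)

def variance(values):
    """Population variance: average squared deviation from the mean."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

def covariance(xs, ys):
    """Population covariance of two equally long samples."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Illustrative data only (not from the course).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]
sums = [x + y for x, y in zip(xs, ys)]

# Var(X + Y) = Var(X) + Var(Y) + 2 * Cov(X, Y)
lhs = variance(sums)
rhs = variance(xs) + variance(ys) + 2 * covariance(xs, ys)

# Var(alpha * X) = alpha^2 * Var(X), here with alpha = 3
scaled = variance([3 * x for x in xs])
```

For this data both sides of the sum identity come out to 4.0, and `scaled` equals 9 times `variance(xs)`.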

Bernoulli Distribution

The Bernoulli distribution (a special case of the binomial distribution with n=1) is a discrete probability distribution with two possible outcomes X_i \in \{0,1\}, such that X_i=1 occurs with probability p and X_i=0 with probability q=(1-p). An example given in the course is a coin flip.

1. \mu=p
2. \sigma^2=p(1-p)
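
Both moments follow from summing over the two outcomes. The sketch below assumes X = 1 occurs with probability p (which is what \mu = p implies); the function name `bernoulli_moments` is hypothetical.

```python
def bernoulli_moments(p):
    """Return (mean, variance) of a Bernoulli variable by direct enumeration."""
    # Outcome -> probability; X = 1 with probability p, X = 0 with 1 - p.
    outcomes = {0: 1 - p, 1: p}
    mu = sum(x * prob for x, prob in outcomes.items())
    var = sum((x - mu) ** 2 * prob for x, prob in outcomes.items())
    return mu, var

mu, var = bernoulli_moments(0.3)
# mu matches p, and var matches p * (1 - p) up to floating-point rounding.
```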

Binomial Coefficient

\binom{n}{k} = \frac{n!}{k!(n-k)!}

Binomial Probability Distribution

P(n, k, p) = \frac{n!}{k!(n-k)!} p^k (1-p)^{(n-k)}
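
As a small illustration, the binomial probability can be evaluated with Python's standard library. The function name `binomial_pmf` and the example numbers are assumptions for this sketch.

```python
from math import comb

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: exactly 3 heads in 10 flips of a fair coin.
prob = binomial_pmf(10, 3, 0.5)  # 120 / 1024 = 0.1171875

# The probabilities over all k from 0 to n should sum to 1.
total = sum(binomial_pmf(10, k, 0.5) for k in range(11))
```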

Normal Distribution

N(x; \mu, \sigma^2) = \frac{1}{\sqrt{2 \pi \cdot \sigma^2}} e^{[-\frac{1}{2} \cdot \frac{(x - \mu)^2}{\sigma^2}]}
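
The density above translates directly into code. This is a sketch only; the function name `normal_pdf` is an assumption, and the second argument is the variance \sigma^2, not the standard deviation.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Normal density N(x; mu, sigma^2), with sigma2 the variance."""
    normalizer = 1.0 / math.sqrt(2 * math.pi * sigma2)
    return normalizer * math.exp(-0.5 * (x - mu) ** 2 / sigma2)

# With mu = 0 and sigma^2 = 1 this is the standard normal density,
# whose peak value at x = 0 is 1 / sqrt(2 * pi).
peak = normal_pdf(0.0, 0.0, 1.0)
```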

Confidence Interval

CI = 1.96\sqrt{\frac{p(1-p)}{N}}

General Form:

1. Size of CI = a \sqrt{\frac{\sigma^2}{N}}
2. \frac{1}{N} \sum{X_i} \pm a \sqrt{\frac{\sigma^2}{N}}

Note:

• a=1.96 for N \ge 30 (95% confidence level).
• For smaller N, a is the t-value computed for (N-1) degrees of freedom at the chosen confidence level.
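
The general form can be sketched as follows, assuming the population (1/N) variance used elsewhere on this page and a = 1.96 as the default (95% confidence, N \ge 30). The function name and the sample data are hypothetical; for small samples the t-value would have to be supplied by the caller.

```python
import math

def confidence_interval(samples, a=1.96):
    """Return (low, high) = mean -/+ a * sqrt(variance / N)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    half_width = a * math.sqrt(variance / n)
    return mean - half_width, mean + half_width

# Illustrative 0/1 data: 24 successes in 36 trials, so p = 2/3.
samples = [0] * 12 + [1] * 24
low, high = confidence_interval(samples)
```

For 0/1 data the variance reduces to p(1-p), so the half-width here agrees with 1.96\sqrt{\frac{p(1-p)}{N}}.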

Linear Regression

y = bx + a

b = \frac{\sum_{i=1}^n[(x_i - \bar{x})(y_i - \bar{y})]}{\sum_{i=1}^n(x_i - \bar{x})^2}

a = \bar{y} - b\bar{x}

r = \frac{\sum_{i}[(x_i - \bar{x})(y_i - \bar{y})]}{\sqrt{\sum_{i}(x_i - \bar{x})^2\sum_{i}(y_i - \bar{y})^2}}

Z_x = \frac{x_i - \bar{x}}{\sigma_x}

Note:

• r is the correlation coefficient.
• Z_x is the standard score.
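
The slope, intercept, and correlation coefficient share the same sums, so they can be computed together. The following is a minimal sketch; the function name `linear_fit` and the data points are assumptions for illustration.

```python
import math

def linear_fit(xs, ys):
    """Return (a, b, r) for the line y = b*x + a and correlation r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx                     # slope
    a = y_bar - b * x_bar             # intercept
    r = sxy / math.sqrt(sxx * syy)    # correlation coefficient
    return a, b, r

# Points lying exactly on y = 2x + 1, so the fit should recover
# b = 2, a = 1 and a correlation of 1.
a, b, r = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
```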

Probability Rules

Joint Occurrence

P(B\cdot C \mid A) = P(B \mid A) \times P(C \mid A\cdot B)

Note:

P(A \cdot B \cdot C) = P(B \cdot C \mid A) \times P(A)

P(A \cdot B \cdot C) = P(C \mid A \cdot B) \times P(A \cdot B)

P(A \cdot B \cdot C) = P(C \mid A \cdot B) \times P(B \mid A) \times P(A)

P(B \cdot C \mid A) \times P(A) = P(C \mid A \cdot B) \times P(B \mid A) \times P(A)

P(B\cdot C \mid A) = P(B \mid A) \times P(C \mid A\cdot B)

Independent Joint Probabilities

P(B\cdot C\mid A) = P(B\mid A) \times P(C\mid A)

Total Probability

P(A) = P(A \mid B) \times P(B) + P(A \mid \neg B) \times P(\neg B)

P(C \mid A) = P(C \mid A \cdot B) \times P(B \mid A) + P(C \mid A \cdot \neg B) \times P(\neg B \mid A)

Bayes' Rule

P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}
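
Bayes' rule is usually combined with total probability to obtain the denominator P(B). The sketch below does that with P(B) = P(B \mid A) P(A) + P(B \mid \neg A) P(\neg A); the function name and the input probabilities are made up for illustration.

```python
def bayes_posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) from the prior P(A) and the two likelihoods."""
    # Total probability: P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    # Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
    return p_b_given_a * prior_a / p_b

# Hypothetical numbers: P(A) = 0.01, P(B|A) = 0.9, P(B|not A) = 0.1.
posterior = bayes_posterior(0.01, 0.9, 0.1)
```

With these inputs P(B) = 0.108, so the posterior is 0.009 / 0.108, roughly 0.083.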
