testing
- wonderful summarizing blog post
basics
- data snooping - decide which hypotheses to test after examining data
- null hypothesis $H_0$ vs. alternative hypothesis $H_1$
- types
    - simple hypothesis: $\theta = \theta_0$
    - composite hypothesis: $\theta > \theta_0$ or $\theta < \theta_0$
    - two-sided test: $H_0: \theta = \theta_0$ vs. $H_1: \theta \neq \theta_0$
    - one-sided test: $H_0: \theta \leq \theta_0$ vs. $H_1: \theta > \theta_0$
- significance levels
    - stat. significant: p < 0.05
    - highly stat. significant: p < 0.01
- errors
    - $\alpha$ - type 1 - reject $H_0$ but $H_0$ is true
    - $\beta$ - type 2 - fail to reject $H_0$ but $H_0$ is false
- p-value = probability, calculated assuming the null hypothesis is true, of obtaining a value of the test statistic at least as contradictory to $H_0$ as the value calculated from the available sample
- power: $1 - \beta$
- adjustments for multiple testing (both sketched below)
    - Bonferroni procedure - if we are doing 3 tests at an overall 5% significance level, we run each test at 5/3% so that the total type 1 error rate stays at 5%
    - Benjamini–Hochberg procedure - controls the false discovery rate (FDR)
- note: ranking is often more important than actual FDR control (because we just need to know which experiments to do)
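A minimal numpy sketch of both adjustment procedures (the p-values are toy numbers for illustration, not from any real experiment):

```python
import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject p-values below alpha / (number of tests): controls the
    family-wise error rate, i.e. P(any false rejection) <= alpha."""
    pvals = np.asarray(pvals)
    return pvals < alpha / len(pvals)

def benjamini_hochberg(pvals, alpha=0.05):
    """Reject the k smallest p-values, where k is the largest rank
    (1-based, in sorted order) with p_(k) <= alpha * k / n: controls
    the expected fraction of rejections that are false (FDR)."""
    pvals = np.asarray(pvals)
    n = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= alpha * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if passed.any():
        k = np.nonzero(passed)[0].max()   # largest rank passing its threshold
        reject[order[:k + 1]] = True
    return reject

pvals = [0.001, 0.012, 0.018, 0.04, 0.6]
print(bonferroni(pvals))           # only 0.001 survives alpha/5 = 0.01
print(benjamini_hochberg(pvals))   # first four survive: FDR control is less strict
```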
 
gaussian theory
- normal theory: assume $\epsilon_i \sim N(0, \sigma^2)$
- distributions
    - suppose $U_1, \dots, U_{d+1} \sim$ iid $N(0, 1)$
    - chi-squared: $\chi_d^2 \sim \sum_{i=1}^d U_i^2$ with d degrees of freedom
        - for the sample variance of n iid normals, $(n-1)S^2/\sigma^2 \sim \chi_{n-1}^2$
    - student's t: $U_{d+1} / \sqrt{d^{-1} \sum_{i=1}^d U_i^2}$ with d degrees of freedom
 
- t-test: test if a coefficient (mean) is nonzero (see the sketch after this section)
    - test null $\theta_k = 0$ with $t = \hat{\theta}_k / \hat{SE}$ where $\hat{SE} = \hat{\sigma} \cdot \sqrt{(X^TX)^{-1}_{kk}}$
    - t-test: reject if |t| is large
    - when n-p is large, the t distribution is approximately normal and the test is called the z-test
    - under the null hypothesis, t follows a t-distribution with n-p degrees of freedom
    - here, $\hat{\theta}$ has a normal distribution with mean $\theta$ and covariance matrix $\sigma^2 (X^TX)^{-1}$
        - e is independent of $\hat{\theta}$ and $||e||^2 \sim \sigma^2 \chi^2_d$ with d = n-p
    - observed stat. significance level = p-value = area of the normal curve beyond $\pm \hat{\theta}_k / \hat{SE}$
    - if 2 variables are statistically significant, they are said to have independent effects on Y
 
- f-test: test if any of a set of coefficients is nonzero
    - null hypothesis: $\theta_i = 0$ for $i = p-p_0+1, \dots, p$
    - alternative hypothesis: $\theta_i \neq 0$ for at least one $i \in \{p-p_0+1, \dots, p\}$
    - $F = \frac{(||X\hat{\theta}||^2 - ||X\hat{\theta}^{(s)}||^2) / p_0}{||e||^2 / (n-p)}$ where $\hat{\theta}^{(s)}$ has its last $p_0$ entries set to 0
    - under the null hypothesis, $F \sim \frac{U/p_0}{V/(n-p)}$ where $U = ||X\hat{\theta}||^2 - ||X\hat{\theta}^{(s)}||^2 \sim \sigma^2 \chi^2_{p_0}$, $V = ||e||^2 \sim \sigma^2 \chi^2_{n-p}$, and $U$ is independent of $V$
    - there is also a partial f-test
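A minimal sketch of both tests with statsmodels on simulated data (the coefficient names x2, x3 are just statsmodels' defaults for an unnamed design matrix):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, p = 100, 4
X = sm.add_constant(rng.normal(size=(n, p - 1)))  # design matrix with intercept
theta = np.array([1.0, 2.0, 0.0, 0.0])            # last p0 = 2 coefficients truly zero
y = X @ theta + rng.normal(size=n)                # eps_i ~ N(0, 1)

fit = sm.OLS(y, X).fit()
print(fit.tvalues)   # t = theta_hat_k / SE_hat, df = n - p
print(fit.pvalues)   # two-sided p-values from the t distribution

# f-test that the last two coefficients are simultaneously zero
print(fit.f_test("x2 = 0, x3 = 0"))
```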
 
 
statistical intervals
- interval estimates come with confidence levels
 - $Z=\frac{\bar{X}-\mu}{\sigma / \sqrt{n}}$
- for p not close to 0.5, use the Wilson score confidence interval (has extra terms)
- confidence interval - if multiple samples of trained typists were selected and an interval constructed for each sample mean, 95 percent of these intervals would contain the true preferred keyboard height
    - frequentist idea (simulated below)
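The frequentist interpretation can be checked by simulation; a minimal sketch with made-up parameters (true mean 80, known $\sigma$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, trials = 80.0, 2.0, 25, 10_000  # illustrative numbers
z = stats.norm.ppf(0.975)                     # two-sided 95% critical value

covered = 0
for _ in range(trials):
    x = rng.normal(mu, sigma, size=n)
    half = z * sigma / np.sqrt(n)             # from Z = (xbar - mu) / (sigma / sqrt(n))
    covered += abs(x.mean() - mu) <= half

print(covered / trials)  # ~0.95: about 95% of the intervals contain the true mean
```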
 
 
tests on hypotheses
- $Var(\bar{X}-\bar{Y}) = \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}$
- tail refers to the side on which we reject (e.g. upper-tailed: $H_a: \theta > \theta_0$)
- we try to make the null hypothesis a statement of equality
- upper-tailed - reject for large values of the test statistic
- $\alpha$ is computed using the probability distribution of the test statistic when $H_0$ is true, whereas determining $\beta$ requires knowing the test statistic's distribution when $H_0$ is false
- type 1 error is usually more serious, so pick the $\alpha$ level first, then constrain $\beta$
- can standardize values and test these instead (see the sketch below)
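A minimal sketch of an upper-tailed two-sample z test using the variance formula above (sample sizes, sds, and data are made up):

```python
import numpy as np
from scipy import stats

m, n = 40, 50
sigma1, sigma2 = 1.5, 2.0                 # known population sds (illustrative)
rng = np.random.default_rng(2)
x = rng.normal(5.3, sigma1, size=m)
y = rng.normal(5.0, sigma2, size=n)

# standardize: z = (xbar - ybar) / sqrt(sigma1^2/m + sigma2^2/n)
se = np.sqrt(sigma1**2 / m + sigma2**2 / n)
z = (x.mean() - y.mean()) / se

# upper-tailed: H0: mu1 - mu2 = 0 vs Ha: mu1 - mu2 > 0, reject for large z
p_value = stats.norm.sf(z)                # area to the right of z
print(z, p_value)
```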
 
testing LR coefficients
- confidence interval construction
    - confidence interval (CI) is the range of values likely to include the true value of a parameter of interest
    - confidence level (CL) - probability that the procedure used to determine the CI will produce an interval that covers the value of the parameter - if we remade it 100 times, 95 would contain the true $\beta_1$
- $\hat{\beta}_0 \pm t_{n-2,\alpha/2} \cdot s.e.(\hat{\beta}_0)$
- for $\beta_1$ (unknown-$\sigma$ case sketched below)
    - with known $\sigma$
        - $\frac{\hat{\beta}_1 - \beta_1}{\sigma(\hat{\beta}_1)} \sim N(0,1)$
        - derive CI
    - with unknown $\sigma$
        - $\frac{\hat{\beta}_1 - \beta_1}{s(\hat{\beta}_1)} \sim t_{n-2}$
        - derive CI
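A minimal sketch of the unknown-$\sigma$ case, following the $t_{n-2}$ pivot above (simulated data with illustrative true coefficients):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 30
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)     # true beta0 = 2, beta1 = 0.5

Sxx = np.sum((x - x.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / Sxx                             # beta1_hat
b0 = y.mean() - b1 * x.mean()              # beta0_hat

resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))  # sigma_hat = sqrt(SSE / (n-2))
se_b1 = s / np.sqrt(Sxx)                   # s(beta1_hat)

t_crit = stats.t.ppf(0.975, df=n - 2)      # t_{n-2, alpha/2} for a 95% CI
print(b1 - t_crit * se_b1, b1 + t_crit * se_b1)
```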
 
 
ANOVA (analysis of variance)
- y - called the dependent or response variable
- x - the independent, explanatory, or predictor variable
- notation: $E(Y|x^*) = \mu_{Y \cdot x^*} =$ mean value of Y when $x = x^*$
- $Y = f(x) + \epsilon$
- linear: $Y = \beta_0 + \beta_1 x + \epsilon$
- logistic: $odds = \frac{p(x)}{1-p(x)} = e^{\beta_0 + \beta_1 x}$
 - we minimize least squares: $SSE = \sum_{i=1}^n (y_i-(b_0+b_1x_i))^2$
 - $b_1=\hat{\beta_1}=\frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sum (x_i-\bar{x})^2} = \frac{S_{xy}}{S_{xx}}$
 - $b_0=\bar{y}-\hat{\beta_1}\bar{x}$
 - $S_{xy}=\sum x_iy_i-\frac{(\sum x_i)(\sum y_i)}{n}$
 - $S_{xx}=\sum x_i^2 - \frac{(\sum x_i)^2}{n}$
 - residuals: $y_i-\hat{y_i}$
 - SSE = $\sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_iy_i$
 - SST = total sum of squares = $S_{yy} = \sum (y_i-\bar{y})^2 = \sum y_i^2 - (\sum y_i)^2/n$
 - $r^2 = 1-\frac{SSE}{SST}=\frac{SSR}{SST}$ - proportion of observed variation that can be explained by regression
 - $\hat{\sigma}^2 = \frac{SSE}{n-2}$
- $T = \frac{\hat{\beta}_1 - \beta_1}{S / \sqrt{S_{xx}}}$ has a t distr. with n-2 df
- $s_{\hat{\beta}_1} = \frac{s}{\sqrt{S_{xx}}}$
- $s_{\hat{\beta}_0 + \hat{\beta}_1 x^*} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{xx}}}$
- sample correlation coefficient $r = \frac{S_{xy}}{\sqrt{S_{xx}}\sqrt{S_{yy}}}$ (computed in the sketch below)
- this is a point estimate for the population correlation coefficient $\rho = \frac{Cov(X,Y)}{\sigma_X \sigma_Y}$
- the Fisher transformation of r gives a test statistic for the population correlation
- degrees of freedom
    - one-sample t: n-1
    - t procedures with paired data: n-1
    - t procedures for 2 independent populations: use the formula; roughly the smaller of $n_1-1$ and $n_2-1$
    - variance: n-2
- use a z-test if you know the standard deviation
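A small numpy check of the computational formulas above, on toy data (numbers are made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

Sxx = np.sum(x**2) - np.sum(x)**2 / n
Syy = np.sum(y**2) - np.sum(y)**2 / n            # = SST
Sxy = np.sum(x * y) - np.sum(x) * np.sum(y) / n

b1 = Sxy / Sxx                                   # slope beta1_hat
b0 = y.mean() - b1 * x.mean()                    # intercept beta0_hat

SSE = np.sum(y**2) - b0 * np.sum(y) - b1 * np.sum(x * y)
r2 = 1 - SSE / Syy                               # proportion of variation explained
r = Sxy / (np.sqrt(Sxx) * np.sqrt(Syy))          # sample correlation coefficient
sigma2_hat = SSE / (n - 2)                       # error variance estimate

print(b1, b0, r2, r, sigma2_hat)                 # note r**2 == r2 up to rounding
```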