\section{Generalisation from one-sample $T$-test to other models \label{App.generalisation}}
In Section~\ref{ss.est} we describe how all $Z$ statistics can be written in the form $Z =\frac{ b }{ \widetilde{SE}(b)} \sqrt{n}$,
where $\widetilde{SE}(b) = SE(b) \sqrt{n}$. For a one-sample $T$-test, $\widetilde{SE}(b) = \sigma$. Here we explain how this approach can be applied to other types of models.
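As a brief check of this form, consider the one-sample case under the assumptions (stated here for illustration) that the null value is zero, so that $b = \bar{X}$, and that $SE(b) = \sigma/\sqrt{n}$ is the usual standard error of a sample mean:
$$
Z = \frac{\bar{X}}{\sigma/\sqrt{n}} = \frac{\bar{X}}{\sigma}\sqrt{n},
$$
so that $\widetilde{SE}(b) = SE(b)\sqrt{n} = \sigma$ does not depend on $n$.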
\subsection{Two-sample $T$-test}
For a two-sample $T$-test, the $Z$ statistic is:
$$
Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} +\frac{\sigma_2^2}{n_2}}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1/n} +\frac{\sigma_2^2}{n_2/n}}}\sqrt{n},
$$
where $\bar{X}_1$ and $\bar{X}_2$ are the sample means of each group, $\sigma_1^2$ and $\sigma_2^2$ are the respective variances, $n_1$ and $n_2$ are the respective sample sizes and $n =n_1+n_2$ is the total sample size.
Here $b=\bar{X}_1-\bar{X}_2$ and $\widetilde{SE}(b) = \sqrt{\frac{\sigma_1^2}{n_1/n}+\frac{\sigma_2^2}{n_2/n}}$. Given the group variances, $\widetilde{SE}(b)$ depends only on the relative sample sizes $n_1/n$ and $n_2/n$, not on the total sample size $n$. The power can therefore be computed for any total sample size, and will be appropriate for a new study that has the same group variances and relative sample sizes.
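As an illustrative special case (an assumption made here for concreteness, not a requirement of the approach), suppose the design is balanced, $n_1 = n_2 = n/2$, with a common variance $\sigma_1^2 = \sigma_2^2 = \sigma^2$. Then
$$
\widetilde{SE}(b) = \sqrt{\frac{\sigma^2}{1/2} + \frac{\sigma^2}{1/2}} = 2\sigma, \qquad Z = \frac{\bar{X}_1 - \bar{X}_2}{2\sigma}\sqrt{n},
$$
which agrees with the familiar standard error $SE(b) = 2\sigma/\sqrt{n}$ for a balanced two-group comparison. Changing $n$ while keeping the allocation at one half per group leaves $\widetilde{SE}(b)$ unchanged.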
\subsection{Linear regression}
In linear regression, the statistic $Z$ for an $n \times p$ design matrix $X$ and contrast $c$ can be written as:
$$
Z = \frac{c\hat\beta}{\sqrt{c(X'X)^{-1}c'\sigma^2}} = \frac{c\hat\beta}{\sqrt{c(\frac{1}{n}\sum_ix_i'x_i)^{-1}c'\sigma^2}}\sqrt{n},
$$
where $x_i$ is the $i$th row of $X$, and thus $\widetilde{SE}(b) = \sqrt{c(\frac{1}{n}\sum_ix_i'x_i)^{-1}c'\sigma^2}$. To consider an arbitrary number of subjects, we need to define a $p$-dimensional distribution $F$ to generate possible covariate values, $x_i \sim F$. The
$p \times p$ term in the denominator can be seen as the uncentered second moment of this distribution:
$$
\frac{1}{n}\sum_i x_i'x_i \approx \mu'\mu+\text{cov}(x),
$$
where $\mu=E(x)$. Thus the sample size can be set arbitrarily, with the assumption that new observations of $x$ can be drawn from $F$.
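As a minimal worked sketch, consider simple linear regression with an intercept and a single covariate. The symbols $w_i$, $m$ and $s^2$ are introduced here purely for illustration and do not appear elsewhere: let $x_i = (1, w_i)$, where $w_i$ is drawn from a distribution with mean $m$ and variance $s^2$. Then $\mu = (1, m)$ and
$$
\mu'\mu + \text{cov}(x) = \begin{pmatrix} 1 & m \\ m & m^2 \end{pmatrix} + \begin{pmatrix} 0 & 0 \\ 0 & s^2 \end{pmatrix} = \begin{pmatrix} 1 & m \\ m & m^2 + s^2 \end{pmatrix},
$$
whose determinant is $s^2$ and whose inverse is
$$
\left(\mu'\mu + \text{cov}(x)\right)^{-1} = \frac{1}{s^2}\begin{pmatrix} m^2 + s^2 & -m \\ -m & 1 \end{pmatrix}.
$$
For the slope contrast $c = (0, 1)$ this gives $c\left(\mu'\mu + \text{cov}(x)\right)^{-1}c' = 1/s^2$, so that
$$
\widetilde{SE}(b) \approx \frac{\sigma}{s}, \qquad Z \approx \frac{c\hat\beta}{\sigma/s}\sqrt{n}.
$$
Under these assumptions $\widetilde{SE}(b)$ depends on the covariate distribution only through $s$, and not on $n$.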