The sprtt package implements sequential probability ratio tests that use the associated t-statistic (sprtt). This vignette describes the theoretical background of these tests.
Other recommended vignettes cover:
a general guide on how to use the package, and
an extended use case.
With a sequential approach, data is continuously collected and an analysis is performed after each data point, which can lead to three different results (Wald, 1945):
The data collection is terminated because enough evidence has been collected for the null hypothesis (H0).
The data collection is terminated because enough evidence has been collected for the alternative hypothesis (H1).
The data collection will continue as there is not yet enough evidence for either of the two hypotheses.
In principle, it is not necessary to perform an analysis after each data point; several data points can also be added at once. However, this affects the sample size (N) and the error rates (Schnuerch & Erdfelder, 2020).
The efficiency of sequential designs has already been examined. Reductions in sample size of 50% or more were found in comparison to analyses with fixed sample sizes (Schnuerch & Erdfelder, 2020; Wald, 1945). Sequential hypothesis testing is therefore particularly suitable when resources are limited, because the required sample size is reduced without compromising the predefined error probabilities.
The sequential t-test is based on the Sequential Probability Ratio Test (SPRT) by Abraham Wald (1947), which is a highly efficient sequential hypothesis test. However, the use of Wald's SPRT is limited in the case of normally distributed data, because the variance has to be known or specified in the hypotheses. Rushton (1950, 1952) and Hajnal (1961) developed the SPRT further by using the t-statistic. The basic idea is to transform the sequence of observations (which depends on the variance) into a sequence of the associated t-statistics (which is independent of the variance).
In the SPRT, the null and alternative hypotheses are defined as follows, with θ representing the model parameter:
\[ H_0:\ \theta\ =\ \theta_0 \\ H_1:\ \theta\ =\ \theta_1 \]
The test statistic of the SPRT is based on a likelihood ratio, which is a measure of the relative evidence in the data for the given hypotheses. More specifically, it is the ratio of the likelihood of the alternative hypothesis to the likelihood of the null hypothesis at the m-th step of the sampling process (\(LR_m\)).
\[ LR_{m} = \frac {f(data_m | H_1)} {f(data_m | H_0)} = \frac {f(x_1,...,x_m | \theta_1)} {f(x_1,...,x_m | \theta_0)} \]
Before the transformation into the t-statistic, the model parameter θ contains the parameters of a normal distribution: the mean (µ) and the standard deviation (σ). Therefore, the Wald SPRT requires prior knowledge about the variance (σ²) or a specification in the hypotheses.
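To make the dependence on the variance concrete, the following is a minimal sketch of the likelihood ratio for normally distributed data with a known standard deviation. The function name `wald_lr` and the values for `mu0`, `mu1`, and `sigma` are illustrative assumptions and are not part of the package.

```r
# Likelihood ratio LR_m for normal data when sigma is assumed to be known.
# wald_lr() is a hypothetical helper used only for illustration.
wald_lr <- function(x, mu0, mu1, sigma) {
  # Sum log-densities to avoid numerical underflow for long sequences
  log_lr <- sum(dnorm(x, mean = mu1, sd = sigma, log = TRUE)) -
    sum(dnorm(x, mean = mu0, sd = sigma, log = TRUE))
  exp(log_lr)
}

x <- c(0.8, -0.2, 1.1, 0.4)  # made-up observations

# The same data yield different evidence depending on the assumed sigma
wald_lr(x, mu0 = 0, mu1 = 0.5, sigma = 1)
wald_lr(x, mu0 = 0, mu1 = 0.5, sigma = 2)
```

Because the result changes with `sigma`, a wrong guess about the variance changes the evidence, which is the limitation the t-statistic transformation removes.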
After the transformation of the observed values into the associated t-statistic, the model parameter θ contains the parameters of the non-central t-distribution: the degrees of freedom (df) and the non-centrality parameter (Δ).
\[ {f(x_1,...,x_m | \mu,\sigma)} \Rightarrow {f(t_2,...,t_m | df,\Delta)} \]
For the calculation of the degrees of freedom, only the sample size of the group(s) is needed. The non-centrality parameter additionally requires a specification of the expected effect size in the form of Cohen's d (d).
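As a rough illustration, both quantities can be computed with the standard textbook formulas for one-sample and two-sample t-tests. These formulas and the helper name `t_parameters` are assumptions made for this sketch; the vignette itself does not spell them out.

```r
# Degrees of freedom and non-centrality parameter for a given Cohen's d.
# t_parameters() is a hypothetical helper, not a function of the package.
t_parameters <- function(d, n1, n2 = NULL) {
  if (is.null(n2)) {
    # One-sample (or paired) design: df = n - 1, ncp = d * sqrt(n)
    list(df = n1 - 1, ncp = d * sqrt(n1))
  } else {
    # Two independent groups: df = n1 + n2 - 2
    list(df = n1 + n2 - 2, ncp = d * sqrt(n1 * n2 / (n1 + n2)))
  }
}

t_parameters(d = 0.5, n1 = 16)           # df = 15, ncp = 2
t_parameters(d = 0.5, n1 = 20, n2 = 20)  # df = 38, ncp = 0.5 * sqrt(10)
```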
To calculate the LR of the sequential t-test, only the current \(t_m\) statistic is necessary. Rushton (1950) demonstrated that an SPRT can be performed by simply considering the ratio of probability densities for the most recent \(t_m\) statistic under the alternative and null hypothesis at any m-th stage. Thus, the test statistic for a one-sided and a two-sided sequential t-test can be calculated as follows:
\[ LR_{m,\ one-sided\ sequential\ t-test} = \frac {f(t_m | \theta_1)} {f(t_m | \theta_0)} \\ LR_{m,\ two-sided\ sequential\ t-test} = \frac {f(t_m^2 | \theta_1)} {f(t_m^2 | \theta_0)}. \]
To account for the fact that the algebraic sign is unknown in a two-sided test, the t-value is squared (Rushton, 1952).
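Both ratios can be sketched with the non-central t density from base R (`dt()` with its `ncp` argument). The sketch assumes that the null hypothesis is d = 0, so the density under the null hypothesis is the central t density. For the two-sided case it uses the fact that the density of the squared t-value at t² is proportional to the sum of the t densities at +t and -t; the proportionality factor is the same under both hypotheses and cancels in the ratio. The function names are hypothetical.

```r
# One-sided LR: non-central t density (H1) over central t density (H0: d = 0)
lr_one_sided <- function(t_value, dof, ncp) {
  dt(t_value, df = dof, ncp = ncp) / dt(t_value, df = dof)
}

# Two-sided LR: ratio of the densities of t^2 under H1 and H0.
# The density of t^2 is [f(t) + f(-t)] / (2 * |t|); the factor 1 / (2 * |t|)
# cancels, and the central t density is symmetric, so f0(t) + f0(-t) = 2 * f0(t).
lr_two_sided <- function(t_value, dof, ncp) {
  (dt(t_value, df = dof, ncp = ncp) + dt(-t_value, df = dof, ncp = ncp)) /
    (2 * dt(t_value, df = dof))
}

# Illustrative values: 16 observations, expected d = 0.5, so ncp = 0.5 * sqrt(16)
lr_one_sided(t_value = 2, dof = 15, ncp = 2)
lr_two_sided(t_value = 2, dof = 15, ncp = 2)
lr_two_sided(t_value = -2, dof = 15, ncp = 2)  # same value: the sign is ignored
```

Writing the two-sided ratio in terms of the t densities also avoids the singularity of the density of t² at zero.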
After the calculation of the test statistic, the decision will be either to continue sampling or to terminate the sampling and accept one of the hypotheses. Wald (1945) defined the following rules for the SPRT:
Condition | Decision |
---|---|
\(LR_m \le B\) | accept H0 and reject H1 |
\(B < LR_m < A\) | continue sampling |
\(LR_m \ge A\) | accept H1 and reject H0 |
The A and B boundaries are calculated with the previously defined error rates α (Type I error) and β (Type II error) as follows:
\[ A = \left( \frac{1 - \beta}{\alpha} \right) \\ B = \left( \frac{\beta}{1 - \alpha} \right). \]
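A minimal sketch of the boundaries and the resulting decision rule, with hypothetical function names and α = β = 0.05 chosen purely as an example:

```r
# Wald's boundaries for given error probabilities
sprt_boundaries <- function(alpha, beta) {
  c(A = (1 - beta) / alpha, B = beta / (1 - alpha))
}

# Map a likelihood ratio to one of the three possible decisions
sprt_decision <- function(lr, alpha, beta) {
  b <- sprt_boundaries(alpha, beta)
  if (lr >= b[["A"]]) {
    "accept H1"
  } else if (lr <= b[["B"]]) {
    "accept H0"
  } else {
    "continue sampling"
  }
}

sprt_boundaries(alpha = 0.05, beta = 0.05)  # A = 19, B = 0.05 / 0.95 (about 0.0526)
sprt_decision(lr = 3.2, alpha = 0.05, beta = 0.05)
```

With these values, the data have to be at least 19 times more likely under one hypothesis than under the other before sampling stops.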
In summary, three specifications are required to calculate a sequential t-test:
the α error probability (usually 0.05 or less),
the β error probability (usually 0.20 or less), and
Cohen's d (either as the expected effect size or as the lower limit for a substantial effect).
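Putting the three specifications together, a two-sided one-sample sequential t-test could be sketched as below. This is an illustration under stated assumptions (null hypothesis d = 0, standard one-sample formulas for the degrees of freedom and the non-centrality parameter, one analysis after each observation) and not the package's implementation; the function name `sequential_t_test` and the data are made up.

```r
# Illustrative two-sided, one-sample sequential t-test (H0: d = 0)
sequential_t_test <- function(x, d, alpha = 0.05, beta = 0.05) {
  stopifnot(length(x) >= 2)          # a t-statistic needs at least two values
  A <- (1 - beta) / alpha            # upper boundary
  B <- beta / (1 - alpha)            # lower boundary
  lr <- NA_real_
  for (m in seq(2, length(x))) {
    x_m <- x[seq_len(m)]             # data available at step m
    t_m <- mean(x_m) / (sd(x_m) / sqrt(m))
    dof <- m - 1
    ncp <- d * sqrt(m)
    # Two-sided likelihood ratio based on the squared t-value
    lr <- (dt(t_m, dof, ncp) + dt(-t_m, dof, ncp)) / (2 * dt(t_m, dof))
    if (lr >= A) return(list(decision = "accept H1", n = m, lr = lr))
    if (lr <= B) return(list(decision = "accept H0", n = m, lr = lr))
  }
  # Data exhausted before a boundary was reached
  list(decision = "continue sampling", n = length(x), lr = lr)
}

set.seed(1)
x <- rnorm(100, mean = 0.5, sd = 1)  # simulated data with a true effect of 0.5
result <- sequential_t_test(x, d = 0.5)
result$decision
result$n
```

The function stops at the first step where a boundary is crossed, so `result$n` is the sample size actually used.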
Hajnal, J. (1961). A two-sample sequential t-test. Biometrika, 48(1/2), 65.
Rushton, S. (1950). On a sequential t-test. Biometrika, 37(3/4), 326.
Rushton, S. (1952). On a two-sided sequential t-test. Biometrika, 39(3/4), 302.
Schnuerch, M., & Erdfelder, E. (2020). Controlling decision errors with minimal costs: The sequential probability ratio t test. Psychological Methods, 25(2), 206–226.
Wald, A. (1945). Sequential tests of statistical hypotheses. The Annals of Mathematical Statistics, 16(2), 117–186.
Wald, A. (1947). Sequential analysis. New York, NY: John Wiley & Sons.