# Sequential One-Way ANOVA

#### Meike Steinhilber

#### 2024-01-08

Source:`vignettes/one_way_anova.Rmd`

`one_way_anova.Rmd`

## Overview

The `sprtt`

package is a **s**equential
**p**robability **r**atio
**t**ests **t**oolbox
(**sprtt**). This vignette describes the theoretical
background of these tests.

Other recommended vignettes cover:

a general guide, how to use the package and

an extended use case.

## What is a sequential test procedure?

With a sequential approach, data is continuously collected and an analysis is performed after each data point, which can lead to three different results (A. Wald, 1945):

The data collection is

*terminated*because enough evidence has been collected for the null hypothesis (H_{0}).The data collection is

*terminated*because enough evidence has been collected for the alternative hypothesis (H_{1}).The data collection

*will continue*as there is not yet enough evidence for either of the two hypotheses.

Basically it is not necessary to perform an analysis after each data point — several data points can also be added at once. However, this affects the sample size (N) and the error rates (Schnuerch et al., 2020).

The efficiency of sequential designs has already been examined. Reductions in the sample by 50% and more were found in comparison to analyses with fixed sample sizes (Schnuerch et al., 2020; A. Wald, 1945). Sequential hypothesis testing is therefore particularly suitable when resources are limited because the required sample size is reduced without compromising predefined error probabilities.

## What is the sequential one-way ANOVA?

The sequential one-way fixed effects ANOVA is based on the Sequential
Probability Ratio Test (SPRT) by Abraham Abraham
Wald (1947), which is a highly efficient sequential hypothesis
test. It can be used instead of *t*-tests if the means of two or
more groups are compared. For detailed information see the public
preprint (Steinhilber et al., 2023). Note:
keep in mind that the repeated measurement ANOVA is not implemented yet
in the sprtt package.

## How to plot the ANOVA results

### Scenario 1: Perfect data

In this case, we have data that are perfectly balanced and in a perfect sampling order. Here, we can use the default value of the ‘plot’ argument ‘single’ or we can choose ‘balanced’.

```
set.seed(333)
data <- sprtt::draw_sample_normal(3, f = 0.25, max_n = 30)
# calculate the SPRT -----------------------------------------------------------
anova_results <- sprtt::seq_anova(y~x, f = 0.25, data = data, plot = TRUE)
anova_results <- sprtt::seq_anova(y~x, f = 0.25, data = data, plot = TRUE, seq_steps = "single")
anova_results <- sprtt::seq_anova(y~x, f = 0.25, data = data, plot = TRUE, seq_steps = "balanced")
# plot the results -------------------------------------------------------------
sprtt::plot_anova(anova_results)
```

### Scenario 2: Unbalanced data in an imperfect order

In this case, we have a data set with unbalanced sample sizes between the groups and the data points are not in a perfect order. Because the order is not perfect, it does not make sense to use the ‘balanced’ option. Because the first data points (2*k_groups) are not equally distributed between the groups (some groups have less than 2 data points), the option ‘single’ would output an error.

**Thus, we need to define the sequential steps
ourselves.**

Here, we start later (after 12 data points) but then calculate the LR after every single data point.

```
set.seed(333)
data <- sprtt::draw_sample_normal(3, f = 0.25, max_n = 50, sample_ratio = c(1,2,2))
data <- data[sample(nrow(data)),] # destroy the perfect order of the data
# calculate the SPRT -----------------------------------------------------------
anova_results <- sprtt::seq_anova(y~x, f = 0.25, data = data,
plot = TRUE,
seq_steps = 12:nrow(data) # we start with the first 12 data points instead of the first 6
)
# plot the results -------------------------------------------------------------
sprtt::plot_anova(anova_results,
labels = TRUE,
position_labels_x = 0.2,
position_labels_y = 0.2,
position_lr_x = 0.1,
position_lr_y = -1.8,
font_size = 20,
line_size = 1,
highlight_color = "darkblue"
)
```

## References

*Journal of Mathematical Psychology*,

*95*, 102326. https://doi.org/10.1016/j.jmp.2020.102326

*PsyArXiv*. https://doi.org/10.31234/osf.io/m64ne

*The Annals of Mathematical Statistics*,

*16*(2), 117–186.

*Sequential analysis*. Wiley.