library(pwr)
4 Continuous Outcome
4.1 Comparing 2 groups
4.1.1 Overview
The pwr.t.test() function covers three designs:
- One-sample t test (type = "one.sample")
- Two-sample t test (type = "two.sample")
- Paired t test (type = "paired")
Cohen’s d is used as the effect size, with the following conventional benchmarks:
- Very small (d = 0.01)
- Small (d = 0.2)
- Medium (d = 0.5)
- Large (d = 0.8)
- Very large (d = 1.2)
- Huge (d = 2)
In the examples below, I will use these settings as default values:
- Medium effect size
- A two-tailed test
- A significance of 0.05 and a power of 80%
4.1.2 One-sample t-test
Cohen’s d:
\[ d = \frac{\mu_1 - \mu_0}{SD} \]
Where:
- \(\mu_0\) = mean under \(H_0\)
- \(\mu_1\) = mean under \(H_1\)
- \(SD\) = SD under \(H_0\)
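As a minimal sketch of this formula, the helper below computes d from the two hypothesized means and the SD. The function name `cohens_d_one_sample` and the blood-pressure numbers are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: Cohen's d for a one-sample design
cohens_d_one_sample <- function(mu1, mu0, sd) {
  stopifnot(sd > 0)
  (mu1 - mu0) / sd
}

# Assumed values: mean change of 0 mmHg under H0, -5 mmHg under H1, SD of 10 mmHg
d <- cohens_d_one_sample(mu1 = -5, mu0 = 0, sd = 10)
d        # -0.5
abs(d)   # 0.5, the magnitude that matters for a two-sided test
```

The sign only records the direction of the change; for a two-sided calculation the magnitude is what drives the required sample size.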
4.1.2.1 Ex 1: New dietary supplement
Does the introduction of a new dietary supplement reduce systolic blood pressure in patients with stage 1 hypertension more effectively than the currently recommended lifestyle modifications alone?
The primary outcome is the mean change in systolic blood pressure (mmHg) after a 12-week supplementation period.
Let’s assume a medium effect size (\(d = 0.5\))
cohen.ES(test = "t", size = "medium")
Conventional effect size from Cohen (1982)
test = t
size = medium
effect.size = 0.5
pwr.t.test(d = 0.5,
sig.level = 0.05, power = 0.8,
type = "one.sample", alternative = "two.sided")
One-sample t test power calculation
n = 33.36713
d = 0.5
sig.level = 0.05
power = 0.8
alternative = two.sided
- N = 34
- If dropout rate of 20%, a total of 43 samples are required
For a non-parametric test, adding a further 15% gives a total of 50 (43 × 1.15 = 49.45, rounded up).
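The rounding, dropout, and 15% adjustments can be sketched as one small helper. This is only an illustration: the name `adjust_n`, its arguments, and the order of the steps (round up, inflate for dropout, then multiply by 1.15) are assumptions, not part of the pwr package.

```r
# Hypothetical helper: round up, then inflate for dropout and (optionally) a non-parametric test
adjust_n <- function(n_raw, dropout = 0.20, nonparametric = FALSE) {
  eps <- 1e-9                               # guards against floating-point noise before ceiling()
  n <- ceiling(n_raw - eps)                 # whole subjects
  n <- ceiling(n / (1 - dropout) - eps)     # enough enrolled so the target remains after dropout
  if (nonparametric) {
    n <- ceiling(n * 1.15 - eps)            # rule-of-thumb 15% inflation
  }
  n
}

n_raw <- 33.36713                           # n reported by pwr.t.test() above
adjust_n(n_raw, dropout = 0)                # 34
adjust_n(n_raw)                             # 43
adjust_n(n_raw, nonparametric = TRUE)       # 50
```

Applying the adjustments in a different order, or to the unrounded n, can shift the result by one subject, so it is worth stating which convention a protocol uses.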
4.1.2.2 Ex 2: New DM Drug
Let’s propose a study of a new drug to reduce hemoglobin A1c in type 2 diabetes over a 1 year study period. You estimate that your recruited participants will have a mean baseline A1c of 9.0, which will be unchanged by your placebo, but reduced (on average) to 7.0 by the study drug.
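Let’s say 5.0 and 17.0 are the minimum and maximum HbA1c values, so the SD can be approximated as range / 4.
sd_approx <- (17 - 5) / 4      # rough SD: range / 4 = 3
d1 <- (9 - 7) / sd_approx      # Cohen's d = delta / sd
d1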
pwr.t.test(
n = NULL,
sig.level = 0.05,
type = "two.sample",
alternative = "two.sided",
power = 0.80,
d = d1
)
Two-sample t test power calculation
n = 36.30569
d = 0.6666667
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
N = 37 in each group. Assuming a 20% dropout rate in each arm, 37 × 5/4 = 46.25, so 47 subjects per arm are required.
If the study can enroll only 50 participants (25 per arm), what would the power be?
pwr.t.test(
n = 25, # note that n is per arm
sig.level = 0.05,
type = "two.sample",
alternative = "two.sided",
power = NULL, # ?
d = 0.66
)
Two-sample t test power calculation
n = 25
d = 0.66
sig.level = 0.05
power = 0.6280322
alternative = two.sided
NOTE: n is number in *each* group
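To see how power changes with enrollment, the same call can be repeated over several per-arm sizes. The sketch below assumes the pwr package is installed; the candidate sizes in `n_per_arm` are illustrative choices, not values from the study.

```r
library(pwr)

# Candidate per-arm sample sizes (illustrative choices)
n_per_arm <- c(25, 37, 50)

# Power of a two-sided, two-sample t test at d = 0.66 for each size
power_by_n <- sapply(n_per_arm, function(n) {
  pwr.t.test(n = n, d = 0.66, sig.level = 0.05,
             type = "two.sample", alternative = "two.sided")$power
})

data.frame(n_per_arm, power = round(power_by_n, 3))
```

The first row should reproduce the power of about 0.63 shown above for 25 participants per arm.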
4.1.3 Two-sample t-test
Cohen’s d for Welch:
\[ d = \frac{ \mu_1 - \mu_2 }{SD_{pool}} \]
Where
\[ SD_{pool} = \sqrt{ (SD_1^2 + SD_2^2)/2 } \]
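A minimal sketch of these two formulas in R follows. The helper name `cohens_d_two_sample` is hypothetical, and the inputs echo the HbA1c example (means of 9 and 7) with an assumed SD of 3 in both arms.

```r
# Hypothetical helper: Cohen's d using the equal-weight pooled SD defined above
cohens_d_two_sample <- function(mu1, mu2, sd1, sd2) {
  sd_pool <- sqrt((sd1^2 + sd2^2) / 2)
  (mu1 - mu2) / sd_pool
}

# Assumed values: means 9 and 7, SD of 3 in both arms
cohens_d_two_sample(mu1 = 9, mu2 = 7, sd1 = 3, sd2 = 3)   # 0.6666667
```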
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.8,
type = "two.sample",
alternative = "two.sided")
Two-sample t test power calculation
n = 63.76561
d = 0.5
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
Assuming a significance level of 0.05 and a power of 80% in a two-tailed test, with effect size d = 0.5:
N = 64 per group (128 in total)
With a dropout rate of 20%, total = 160 (80 per group)
Mann-Whitney U test:
- For the non-parametric test, adding a further 15% to each group (80 × 1.15 = 92) gives a total of 184 people.
4.1.4 Paired t-test
pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.8,
type = "paired",
alternative = "two.sided")
Paired t test power calculation
n = 33.36713
d = 0.5
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number of *pairs*
Assuming a significance level of 0.05 and a power of 80% in a two-tailed test, the minimum number of pairs required to detect an effect size of d = 0.5 is 34.
Allowing for a dropout rate of 20%, a total of 43 pairs are required.
Paired Wilcoxon test:
- For the non-parametric test, adding a further 15% gives a total of 50 pairs (43 × 1.15 = 49.45, rounded up).
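The paired calculation returns the same n as the one-sample one (33.37 in both outputs), which suggests reading d on the scale of the within-pair differences: mean difference divided by the SD of the differences. When only the per-condition SDs and their correlation are available, that SD can be derived as in the sketch below. The helper name `paired_d` and all input values are assumptions for illustration.

```r
# Hypothetical helper: effect size for paired data from per-condition SDs and their correlation
paired_d <- function(mean_diff, sd1, sd2, r) {
  sd_diff <- sqrt(sd1^2 + sd2^2 - 2 * r * sd1 * sd2)  # SD of the within-pair differences
  mean_diff / sd_diff
}

# Assumed values: mean difference 5, SD 10 at both time points, correlation 0.5
paired_d(mean_diff = 5, sd1 = 10, sd2 = 10, r = 0.5)   # 0.5
```

A higher correlation between the paired measurements shrinks the SD of the differences and so raises d for the same mean difference.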
4.2 Comparing ≥3 groups
4.2.1 ANOVA (Parametric)
Studies that compare averages of three or more groups.
- k: number of comparison groups
- f: the effect size (Cohen’s \(f\))
\[ f = \sqrt{ \frac{ \sum_{i=1}^{k} p_i \times (\mu_i - \mu)^2 }{\sigma^2} } \]
Effect Size (f-values)
- Small = 0.1
- Medium = 0.25
- Large = 0.4
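The formula for f can be sketched in R as below, assuming \(p_i\) is the proportion of subjects in group i, \(\mu\) is the weighted grand mean, and \(\sigma\) is the common within-group SD. The helper name `cohens_f` and the example means are hypothetical.

```r
# Hypothetical helper: Cohen's f from group means, group proportions, and the common within-group SD
cohens_f <- function(means, sigma, p = rep(1 / length(means), length(means))) {
  stopifnot(length(p) == length(means), abs(sum(p) - 1) < 1e-8, sigma > 0)
  grand_mean <- sum(p * means)                       # weighted overall mean (mu)
  sqrt(sum(p * (means - grand_mean)^2) / sigma^2)
}

# Assumed values: three equally sized groups with means 10, 12, 14 and sigma = 5
f <- cohens_f(means = c(10, 12, 14), sigma = 5)
round(f, 3)   # 0.327
```

On the benchmarks listed above, this illustrative value falls between a medium and a large effect.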
cohen.ES(test = "anov", size = "medium")
Conventional effect size from Cohen (1982)
test = anov
size = medium
effect.size = 0.25
pwr.anova.test(k = 3 , f = 0.25, sig.level = 0.05, power = 0.8)
Balanced one-way analysis of variance power calculation
k = 3
n = 52.3966
f = 0.25
sig.level = 0.05
power = 0.8
NOTE: n is number in each group
Assume a significance level of 0.05 and a power of 80% (the ANOVA F test has no one- or two-tailed option). With three comparison groups and an effect size of f = 0.25, the calculated number of subjects is 53 in each group.
Allowing for a dropout rate of 20%, 66 subjects per group are required (52.4 / 0.8, rounded up), for a total of 198.
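The dropout arithmetic can be reproduced directly from the pwr.anova.test() result. This sketch assumes the pwr package is installed and inflates the unrounded n before rounding up; the variable names are illustrative.

```r
library(pwr)

res <- pwr.anova.test(k = 3, f = 0.25, sig.level = 0.05, power = 0.8)

dropout <- 0.20
n_per_group <- ceiling(res$n / (1 - dropout))   # inflate the unrounded n, then round up
n_total <- n_per_group * res$k

c(per_group = n_per_group, total = n_total)     # 66 per group, 198 in total
```

Rounding n up to 53 before inflating would give 67 per group instead, so the convention should be stated alongside the number.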
4.2.2 Kruskal-Wallis test (Non-parametric)
For the non-parametric test, add a further 15% to each group (for example, 66 × 1.15 = 75.9, so 76 per group).