Statistical power

Completion status: this resource is ~50% complete.

Educational level: this is a tertiary (university) resource.

Statistical power is the likelihood that a statistical test will:

return a significant result based on a sample from a population in which there is a real effect.
reject the null hypothesis when the alternative hypothesis is true (i.e., it will not make a Type II error).

Power can range between 0 and 1, with higher values indicating a greater likelihood of detecting an effect.

What is statistical power?

Statistical power is the probability of correctly rejecting a false H₀ (i.e., getting a significant result when there is a real difference in the population).
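The definition can also be written in symbols. The sketch below uses β for the probability of a Type II error, which is the conventional symbol but is not otherwise used on this page.

```latex
\[
\text{power} = 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false}),
\qquad
\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})
\]
```

So a study with power of .80 has a .20 probability of missing an effect that really exists.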

Desirable power

Power ≥ .80 is generally considered desirable
Power ≥ .60 is typical of studies published in major psychology journals

Increasing power

Power will be higher when the:

effect size (ES) is larger
sample size (N) is larger
significance level (α) is larger (i.e., less stringent, such as .05 rather than .01)
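The effect of these three factors can be illustrated numerically. The following is a minimal sketch, not a definitive implementation: it assumes a two-sided comparison of two equal-sized independent groups and uses a normal approximation to the t-test, so its values will differ slightly from those of dedicated power software. The function name `approx_power` and the example values are illustrative choices.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison.

    Assumption: normal approximation to the independent-samples t-test,
    with d as the standardized mean difference (Cohen's d).
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = d * (n_per_group / 2) ** 0.5  # expected test statistic under H1
    # Probability that the test statistic falls in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


baseline = approx_power(d=0.5, n_per_group=30, alpha=0.05)
larger_es = approx_power(d=0.8, n_per_group=30, alpha=0.05)
larger_n = approx_power(d=0.5, n_per_group=60, alpha=0.05)
larger_alpha = approx_power(d=0.5, n_per_group=30, alpha=0.10)

for name, value in [("baseline", baseline), ("larger ES", larger_es),
                    ("larger N", larger_n), ("larger alpha", larger_alpha)]:
    print(f"{name}: {value:.2f}")
```

Each of the three variations yields higher power than the baseline, which is the pattern described in the list above.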

Jacob Cohen published the “bible” of power analysis,^[1] which has remained one of the definitive works on statistical power. In this book, he provides guidelines for what are typically considered “small,” “medium,” and “large” effect sizes. Cohen did not intend these numbers to be set in stone: they were suggestions, based on his experience with results published on various topics in major journals, and he believed that they should be ignored or replaced when more appropriate values were known for a specific research area.^[2] For better or worse, Cohen’s suggestions have been widely adopted, becoming as conventional as the “p < .05” rule for statistical significance, although unfortunately with little improvement in the power of typical published psychology research.^[3]

Statistical Test	Effect Size	Small	Medium	Large
Correlation	r	.10	.30	.50
t-test	d (or SMD)	.20	.50	.80
ANOVA	f (not F!)	.10	.25	.40
Chi-Square	w	.10	.30	.50
Multiple Regression	f²	.02	.15	.35
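To show how one of these indices is obtained in practice, here is a small sketch that computes d for two independent groups using the pooled standard deviation and then compares it with the t-test row of the table. The data are made up for illustration, and treating the table values as lower cut-offs is a simplifying assumption of this example, not a rule stated by Cohen.

```python
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


def label_d(d):
    """Label |d| against the benchmarks .20, .50 and .80 (used as cut-offs here)."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


# Hypothetical scores, invented purely for illustration
treatment = [5, 6, 7, 8, 9]
control = [4, 5, 6, 7, 8]

d = cohens_d(treatment, control)
print(f"d = {d:.2f} ({label_d(d)})")
```

As the surrounding text notes, such labels are only conventions and should give way to field-specific values where those are known.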

Estimating power

Statistical power can be calculated prospectively and retrospectively.

If possible, calculate expected power before conducting a study, based on:

Estimated N, the sample size you plan or expect to obtain
Critical α, the value below which a p value would be considered "significant" (i.e., rejecting the null hypothesis)
Expected or minimum ES (e.g., from related research)

It is possible to solve for the necessary sample size if the effect size, alpha, and desired power are known. This is often called an "a priori" power analysis. Ideally it is done when planning a study, and it often is done in grant proposals, which commonly use 80% power as a convention.
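An a priori calculation of this kind can be sketched as follows. This is a hedged illustration only: it assumes a two-sided comparison of two equal-sized independent groups and a normal approximation, so exact t-based software will typically return a slightly larger sample size. The function name `n_per_group` is an illustrative choice.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group comparison.

    Assumption: normal approximation, n = 2 * ((z_alpha + z_power) / d) ** 2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the chosen alpha
    z_power = z.inv_cdf(power)  # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Cohen's small, medium and large benchmarks for d
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} participants per group")
```

The required sample size rises steeply as the expected effect size shrinks, which is why the choice of effect size matters so much at the planning stage.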

It is also possible to solve for power after a study is done (when N, the effect size, and the critical α are all known). This is a "post hoc" power analysis, and it is generally considered the least helpful, since the study is already completed.

More helpfully, we can solve for the critical effect size that we would have had a set power to detect, given the critical α and a set N. This is sometimes referred to as a "sensitivity" power analysis.^[4]
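A sensitivity analysis rearranges the same relationship to solve for the effect size. The sketch below again assumes a two-sided comparison of two equal-sized independent groups under a normal approximation. The function name `detectable_d` is illustrative and the results are approximate.

```python
from statistics import NormalDist


def detectable_d(n_per_group, alpha=0.05, power=0.80):
    """Smallest standardized difference detectable with the given power.

    Assumption: normal approximation, d = (z_alpha + z_power) * sqrt(2 / n).
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return z_sum * (2 / n_per_group) ** 0.5


for n in (20, 50, 200):
    print(f"n = {n} per group: detectable d is about {detectable_d(n):.2f}")
```

Reading the output against Cohen's benchmarks shows which sizes of effect a completed or planned study could realistically have detected.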

Power calculators

Searching for terms such as "statistical power calculator", together with the type of test, should turn up useful pages such as:

The G*Power program, which is free software available for Windows and Macintosh. This software was sponsored by grants from the German government, and it is available in English as well as German language versions.^[4] Examples of estimating power for several commonly used statistical tests with G*Power are hosted on the UCLA website.
Statistical power calculators
One Sample Test Using Average Values
Post-hoc Statistical Power Calculator for Multiple Regression

References

[1] Cohen, Jacob (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, N.J.: L. Erlbaum Associates. ISBN 978-0-8058-0283-2. OCLC 17877467. https://www.worldcat.org/oclc/17877467.
[2] Cohen, Jacob (1992). "A power primer". Psychological Bulletin 112 (1): 155–159. doi:10.1037/0033-2909.112.1.155. ISSN 1939-1455. http://doi.apa.org/getdoi.cfm?doi=10.1037/0033-2909.112.1.155.
[3] Cohen, Jacob (1994-12). "The earth is round (p < .05)". American Psychologist 49 (12): 997–1003. doi:10.1037/0003-066X.49.12.997. ISSN 1935-990X. http://doi.apa.org/getdoi.cfm?doi=10.1037/0003-066X.49.12.997.
[4] Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg (2009-11). "Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses". Behavior Research Methods 41 (4): 1149–1160. doi:10.3758/BRM.41.4.1149. ISSN 1554-351X. http://link.springer.com/10.3758/BRM.41.4.1149.
