Post Hoc Tests and multiple testing
The goal of ANOVA and other statistical frameworks is to detect variation among groups, not to identify where that variation comes from. In ANOVA, for instance, the null hypothesis is that all group means are equal; rejecting it tells us only that at least two groups (within or across factors) differ significantly in their means, without telling us which ones. Post Hoc tests are procedures that allow one to identify which groups differ significantly (i.e., beyond the differences that could be expected by chance alone when sampling from the same statistical population). Post Hoc (Latin) means “after this”.
Post-hoc tests generate analytical challenges because multiple statistical tests are conducted, and false positives (i.e., type I errors: rejecting null hypotheses that are true) are expected. With the development of technology in Biology and other fields, one can end up dealing with thousands or even millions of statistical tests and associated p-values.
As such, we need to apply robust frameworks to avoid large numbers of false positives while also attempting to reduce the chances of false negatives (i.e., type II errors: failing to reject a false null hypothesis). The development of procedures for multiple testing is accordingly an old and extremely active field in statistics.
In this lecture we will develop an understanding of why it is important to adjust for inflated type I error, and cover routine and robust procedures for multiple testing.
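To see why the adjustment matters, consider that when a null hypothesis is true, its p-value is uniformly distributed between 0 and 1, so each test has an alpha = 0.05 chance of a false positive. A minimal simulation sketch (using only Python's standard library; the function name and simulation sizes are illustrative, not from the lecture) estimates the chance of at least one false positive as the number of tests grows:

```python
import random

random.seed(42)

def familywise_error_rate(n_tests, alpha=0.05, n_sims=10000):
    """Estimate P(at least one p-value < alpha) when ALL null
    hypotheses are true, in which case each p-value ~ Uniform(0, 1)."""
    hits = sum(
        any(random.random() < alpha for _ in range(n_tests))
        for _ in range(n_sims)
    )
    return hits / n_sims

# The theoretical rate is 1 - (1 - alpha)^m: 0.05, ~0.40, ~0.994
for m in (1, 10, 100):
    print(m, round(familywise_error_rate(m), 3))
```

With 100 tests and no adjustment, at least one false positive is nearly guaranteed, which is the inflation the procedures in this lecture are designed to control.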
Here is a pedagogical guide that I wrote to facilitate the understanding of the issues underlying multiple testing.
The Multiple Comparisons Problem, the Sprightly Pedagogue.
One of the most robust and widely used procedures for adjusting for inflated type I error due to multiple testing is FDR, the false discovery rate. We will cover FDR in detail in this lecture, but this video provides a similar yet different style of presentation for FDR.
False Discovery Rates, FDR, clearly explained; STATQUEST by the University of North Carolina.