Lecture 12: Statistical Hypothesis Testing

October 25th, 2022 (7th week of classes)
Read and watch everything

Statistical Hypothesis Testing: Generating evidence-based conclusions without complete biological knowledge

One of the most important roles of statistics is to generate evidence to support research hypotheses. That said, the framework of statistical hypothesis testing can also be used in many fields that are not research driven, such as election polls, marketing preferences, and temporal changes in society. Almost any question based on sample data can be framed as a statistical hypothesis test.

Evidence in general means information, facts or data supporting (or contradicting) a claim, prediction, assumption or hypothesis.

When referring to “evidence from the scientific literature” we mean the empirical studies published in peer-reviewed scholarly journals.

What is a research hypothesis?

A hypothesis is a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation (Oxford dictionary).

A hypothesis is a proposition made as a basis for reasoning, without any assumption of its truth (Oxford dictionary).

Hypotheses [plural form] can be thought of as educated guesses that have not yet been supported by data.

Hypotheses cannot be proven right or wrong from the data. They can only be said to be either refuted or supported by the data generated.

The framework used here for hypothesis testing is based on p-values, which belong to the frequentist approach to statistics.

The statistical hypothesis framework (most often involving statistical tests) is a quantitative method of statistical inference that allows us to generate evidence for or against a research hypothesis.

The research hypothesis is translated into a statistical question. The statistical question is then stated as two mutually exclusive hypotheses (called null and alternative hypotheses).

The framework most often involves estimating a probability value that serves as a quantitative indicator of support for or against the research hypothesis.
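To make the idea of a probability value concrete, here is a minimal sketch in Python. The coin-toss scenario, the numbers and the function name `one_sided_p_value` are assumptions made up for illustration; they are not part of the lecture material. The null hypothesis is that the coin is fair (probability of heads = 0.5), the alternative is that it favours heads, and the p-value is the probability, assuming the null hypothesis is true, of observing at least as many heads as we did.

```python
from math import comb

def one_sided_p_value(n_trials, n_successes, p_null=0.5):
    """P(X >= n_successes) for X ~ Binomial(n_trials, p_null), i.e. assuming H0 is true."""
    return sum(
        comb(n_trials, k) * p_null**k * (1 - p_null) ** (n_trials - k)
        for k in range(n_successes, n_trials + 1)
    )

# Hypothetical data: 15 heads observed in 20 tosses.
p_value = one_sided_p_value(20, 15)
print(f"p-value = {p_value:.4f}")  # prints "p-value = 0.0207"
```

A small p-value such as this one says that the observed data would be unusual if the null hypothesis were true. It does not say how likely the null hypothesis itself is.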

Some interesting videos that complement this lecture:

What is a P-Value and Why Does it Matter? by NOVA PBS Official. The famous Lady tasting tea problem: https://en.wikipedia.org/wiki/Lady_tasting_tea

Not Even Scientists Can Easily Explain p-values. Yet p-values are used by scientists every day! This is not great! Please make sure that you know what a p-value means by the end of this lecture.

After watching the video, read this short article: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5804470/

Mark Chang (2017) put it well: “A smaller p-value indicates a discrepancy between the hypothesis and the observed data. In this sense, p-value measures the strength of evidence against the null hypothesis. However, p-value is not a probability of a null hypothesis being true. Rejecting H0 at α = .05 level with p = .02 does not mean we have a 5% or 2% probability of making a mistake.” (Educational and Psychological Measurement, 77, 475-488).
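The lady tasting tea problem mentioned above gives a small worked example of what a p-value does measure. The sketch below assumes the classic design of 8 cups, 4 with milk poured first and 4 with tea poured first, from which the lady must pick the 4 milk-first cups. The function name `tea_p_value` is a hypothetical helper for illustration. Under the null hypothesis that she is only guessing, every choice of 4 cups is equally likely, so the p-value is simply the fraction of possible choices that are at least as good as the one observed.

```python
from math import comb

def tea_p_value(n_correct, cups_per_kind=4):
    """P(at least n_correct milk-first cups identified) if the lady is purely guessing (H0)."""
    c = cups_per_kind
    total_choices = comb(2 * c, c)  # all ways to pick c cups out of 2c
    # Ways to pick k true milk-first cups and (c - k) tea-first cups.
    at_least_as_good = sum(comb(c, k) * comb(c, c - k) for k in range(n_correct, c + 1))
    return at_least_as_good / total_choices

print(f"All 4 correct:      p = {tea_p_value(4):.4f}")  # 1/70
print(f"At least 3 correct: p = {tea_p_value(3):.4f}")  # 17/70
```

Getting all 4 cups right has a probability of 1/70 (about 0.014) under pure guessing, which is evidence against the null hypothesis. It is still a statement about the data given the null hypothesis, not the probability that the lady is guessing.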