Lecture 2: Statistical Hypothesis Testing

January 16, 2024 (1st week of classes)

Statistical Hypothesis Testing: Generating evidence-based conclusions without complete biological knowledge (part 1)

In your introductory statistics course, you should have covered the essentials of statistical hypothesis testing. This concept is crucial in statistics and revisiting it will provide a solid foundation as we embark on more advanced topics. Statistical hypothesis testing plays a vital role in generating evidence to support research hypotheses.

Moreover, the principles of hypothesis testing apply well beyond research-driven fields. They are used in electoral polls, in studies of marketing preferences, and in the analysis of societal changes over time, among many other areas. Essentially, any question that can be framed in terms of sample data can be addressed through statistical hypothesis testing.

The term ‘evidence’ generally refers to information, facts, or data that either support or contradict a claim, prediction, assumption, or hypothesis. Specifically, when we mention ‘evidence from the scientific literature,’ we are referring to empirical scientific studies published in peer-reviewed scholarly journals.

What exactly is a research hypothesis? According to the Oxford Dictionary, a hypothesis is a supposition or proposed explanation based on limited evidence, serving as a starting point for further investigation. It is a proposition made for the purpose of reasoning, without any presumption of its truth. Hypotheses, often considered educated guesses, have yet to be substantiated by data. They are not proven right or wrong solely through data but can be supported or refuted based on evidence generated from data.

The framework for hypothesis testing that we will use in this course is the frequentist approach based on P-values. This framework involves translating a research hypothesis into a statistical question, which is then articulated as two mutually exclusive hypotheses: the null and the alternative. The process typically includes calculating a probability value (P-value), a quantitative measure of how compatible the observed data are with the null hypothesis and therefore of the strength of evidence against it.

To elaborate, the transition from a research hypothesis to a statistical hypothesis begins with clearly defining a specific question or assertion derived from the broader research question. A research hypothesis typically proposes a relationship or effect, such as ‘Increased study time improves exam scores.’ This proposition suggests a direction of influence and anticipates a specific outcome based on theoretical understanding or previous research.

The next step is to formulate this research hypothesis into a testable statistical format, which involves setting up two opposing statements: the null hypothesis (H0) and the alternative hypothesis (H1). The null hypothesis typically states that there is no relationship or effect, directly challenging the research hypothesis. In our example, H0 would be ‘Increased study time has no effect on exam scores.’ Conversely, the alternative hypothesis supports the research hypothesis, stating that there is an effect or relationship. Hence, H1 would be ‘Increased study time improves exam scores.’
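
One possible way to write this pair of hypotheses formally is sketched below. It assumes that exam scores are compared through their population means, which the lecture does not specify; the symbols \mu_{\text{more}} and \mu_{\text{less}} (mean exam score with more and with less study time) are introduced here only for illustration. Because the research hypothesis is directional (‘improves’), the alternative is written as one-sided.

```latex
H_0:\ \mu_{\text{more}} = \mu_{\text{less}}
\qquad \text{versus} \qquad
H_1:\ \mu_{\text{more}} > \mu_{\text{less}}
```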

The statistical testing process then involves collecting data and applying an appropriate statistical test to calculate a P-value. The P-value is the probability of observing data at least as extreme as those collected, assuming the null hypothesis is true; it is not the probability that the null hypothesis itself is true. A small P-value (conventionally below a threshold such as 0.05 or 0.01) is taken as evidence against the null hypothesis, which in turn supports the alternative hypothesis and, by extension, the original research hypothesis.
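
To make the idea of ‘data at least as extreme, if the null hypothesis were true’ concrete, here is a minimal Python sketch for the study-time example. It uses an exact permutation test, which is one of several tests that could be used and is not necessarily the one covered in this lecture. The exam scores, the group names, and the function `permutation_p_value` are all hypothetical and were invented for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical exam scores, invented purely for illustration.
more_study = [78, 85, 90, 72, 88]
less_study = [70, 65, 80, 75, 68]


def permutation_p_value(treated, control):
    """Exact one-sided permutation P-value for mean(treated) - mean(control).

    Under H0 (no effect of study time), the group labels are arbitrary, so
    every way of splitting the pooled scores into two groups of the original
    sizes is equally likely. The P-value is the fraction of those splits
    whose difference in means is at least as large as the observed one.
    """
    pooled = treated + control
    n_treated = len(treated)
    n_control = len(control)
    total = sum(pooled)
    observed = mean(treated) - mean(control)

    n_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = group_sum / n_treated - (total - group_sum) / n_control
        # Small tolerance so floating-point ties count as "at least as extreme".
        if diff >= observed - 1e-12:
            n_extreme += 1
        n_splits += 1
    return observed, n_extreme / n_splits


observed_diff, p_value = permutation_p_value(more_study, less_study)
print(f"Observed difference in means: {observed_diff:.1f}")
print(f"One-sided P-value: {p_value:.4f}")
```

With only five scores per group there are 252 possible splits, so the P-value can be computed exactly rather than approximated by random shuffling. The observed split is always counted among them, so the P-value can never be exactly zero.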

Thus, the process from research hypothesis to statistical hypothesis involves framing a theoretical assertion in a way that is empirically testable, using statistical tools to assess the validity of that assertion based on data.

Some resources that complement this lecture:

What is a P-Value and Why Does it Matter? By NOVA PBS Official.

The famous Lady tasting tea problem: https://en.wikipedia.org/wiki/Lady_tasting_tea

Read this short article, ‘P-value: What is and what is not’: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5804470/
