Lecture 7: Sampling variation

September 27, 2022 (4th week of classes)
Read and watch everything

Estimating and making inferences with uncertainty

Statistics is the science of assisting decision making with incomplete knowledge. Our knowledge is incomplete because the values of the statistics of interest (mean, variance, median, etc.) differ among samples and from their values in the intended statistical population. This concept is critical to statistics and is known as sampling variation.

Critical understanding:

[1] Values for descriptive statistics based on samples are never (exactly) the same as their values for the populations (there is always “sampling error”).

[2] That does not mean that inferences based on samples are wrong (more on that later). Sample values can be a very good approximation of the true value.

[3] Approximations can be good (sample value close to the true population value) or bad (sample value far from the true value). You will understand why we use the terms “close” and “far” to describe samples in relation to their populations.

[4] But to feel “safe” in our inferences, it would be great to have a measure that estimates how wrong one could be.

[5] As we will see later (in other lectures), the variation among observations within samples (standard deviation) can inform us about how far sample means in general can be from the population mean (that is, it lets us estimate how wrong one could be).
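The points above can be illustrated with a small simulation. What follows is only a minimal sketch in Python, not part of the course material: the population (10,000 values drawn from a normal distribution with mean 50 and standard deviation 10), the sample size of 30, the number of samples, and the function name `simulate_sampling_variation` are all assumptions chosen for illustration. The idea is to take many samples from the same population and look at how much the sample means differ from each other and from the population mean.

```python
import random
import statistics


def simulate_sampling_variation(pop_size=10_000, sample_size=30,
                                n_samples=1_000, seed=1):
    """Draw many random samples from one population and record their means.

    All numbers here (population shape, sizes, seed) are illustrative
    assumptions, not values taken from the lecture.
    """
    rng = random.Random(seed)

    # Hypothetical statistical population: in real studies we almost
    # never get to see the whole population like this.
    population = [rng.gauss(50, 10) for _ in range(pop_size)]
    pop_mean = statistics.fmean(population)
    pop_sd = statistics.pstdev(population)

    # Take repeated random samples (without replacement within a sample)
    # and store the mean of each one.
    sample_means = []
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)
        sample_means.append(statistics.fmean(sample))

    return pop_mean, pop_sd, sample_means


pop_mean, pop_sd, sample_means = simulate_sampling_variation()

# Sampling error of each sample mean: how far it is from the true value.
errors = [m - pop_mean for m in sample_means]

print(f"Population mean:              {pop_mean:.2f}")
print(f"Population SD:                {pop_sd:.2f}")
print(f"Closest sample mean (error):  {min(errors, key=abs):+.3f}")
print(f"Farthest sample mean (error): {max(errors, key=abs):+.3f}")
print(f"Mean of the sample means:     {statistics.fmean(sample_means):.2f}")
print(f"SD of the sample means:       {statistics.stdev(sample_means):.2f}")
```

In a run of this sketch, no sample mean should match the population mean exactly (point 1), most should be close and a few relatively far (points 2 and 3), and the spread of the sample means should be noticeably smaller than the spread of the individual observations. That last observation is the link to point 5, which later lectures develop.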

Dancing statistics: explaining the statistical concept of sampling & standard error through dance. The concept of standard error will be explained in a later lecture. But this video provides a really cool explanation for the concept underlying sampling variation!

Survivorship bias: great video explaining sample bias (also covered in Whitlock & Schluter). It shows how a wrong understanding of sampling can lead to wrong decisions.