Lecture 7: Sampling variation
September 24, 2024 (4th week of classes)
Read and watch everything
Estimating and making inferences with uncertainty
Statistics is the science of aiding decision-making in the face of incomplete knowledge. Our knowledge is incomplete because we usually observe only a sample, and statistical measures (mean, variance, median, etc.) vary between samples and differ from the true values of the population. This key concept is known as sampling variation. Here are critical points to understand:
[1] Descriptive statistics derived from samples are almost never exactly the same as their corresponding population values; the difference is called ‘sampling error.’
[2] However, this doesn’t mean that inferences based on samples are incorrect (more on that later). In fact, sample values can be excellent approximations of the true population values.
[3] These approximations can be ‘good’ (sample value close to the population value) or ‘bad’ (sample value far from the true value). We’ll discuss why we use terms like ‘close’ and ‘far’ when comparing samples to populations.
[4] To feel more confident in our inferences, it would be useful to have a measure that estimates how much error we might be making.
[5] As we’ll explore in future lectures, the variation among observations within a sample (e.g., standard deviation) can provide an indication of how far sample means typically deviate from the population mean—helping to estimate the potential error in our inferences.
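To make points [1] to [5] concrete, here is a minimal simulation sketch in Python. It is not part of the lecture materials: the population (10,000 values drawn from a normal distribution with mean 50 and standard deviation 10), the sample size of 25, and the function name simulate_sample_means are all assumptions chosen for illustration. The sketch draws many random samples from one population and compares each sample mean with the population mean.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples (without replacement within each
    sample) and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population (an assumption for illustration only):
# 10,000 values from a normal distribution with mean 50 and SD 10.
pop_rng = random.Random(42)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Take 1,000 samples of 25 observations each and record every sample mean.
sample_means = simulate_sample_means(population, sample_size=25, n_samples=1000)

# Sampling error of each sample: sample mean minus population mean.
sampling_errors = [m - population_mean for m in sample_means]

print("Population mean:          ", round(population_mean, 2))
print("First five sample means:  ", [round(m, 2) for m in sample_means[:5]])
print("Smallest / largest mean:  ", round(min(sample_means), 2),
      "/", round(max(sample_means), 2))
print("Mean of the sample means: ", round(statistics.mean(sample_means), 2))
print("SD of the sample means:   ", round(statistics.stdev(sample_means), 2))
```

Running it should show sample means that differ from one another and from the population mean (points [1] and [3]), yet cluster around it (point [2]). The standard deviation of the sample means is one way to quantify how large the error typically is (point [4]). Changing sample_size and rerunning is a simple way to see how the spread of the sample means responds.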
Dancing Statistics: Explaining Sampling and Standard Error Through Dance. The concept of standard error will be covered in a later lecture, but this video offers a fun and creative explanation of the underlying concept of sampling variation!
Survivorship bias: a great video explaining sample bias (also covered in Whitlock & Schluter). It shows how a wrong understanding of sampling can lead to wrong decisions.
Lecture
Slides: