Lecture 8: Sampling distributions
September 26, 2024 (4th week of classes)
Read and watch everything
Estimating under uncertainty with certainty (i.e., with some confidence)
Statistics is the science of aiding decision-making with incomplete information. The values of key statistics (mean, variance, median, etc.) vary from sample to sample. They often differ from the true population values for discrete variables, and almost always differ for continuous variables. This variability, known as sampling variation, is a core concept in statistics.
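To make sampling variation concrete, here is a minimal Python sketch. The population (10,000 values drawn from a normal distribution with mean 50 and standard deviation 10), the sample size of 20, and all variable names are assumptions chosen for illustration; none of them come from the lecture. The sketch draws five samples from the same population and shows that each one yields a different mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

# Hypothetical population (an assumption for illustration only):
# 10,000 values from a normal distribution with mean 50 and SD 10.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.fmean(population)

# Draw five independent samples of n = 20 (without replacement within
# each sample) and record the mean of each one.
sample_means = [
    statistics.fmean(random.sample(population, 20)) for _ in range(5)
]

print(f"Population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"Sample {i} mean: {m:.2f} (differs by {m - population_mean:+.2f})")
```

Each run of the loop stands in for a different researcher sampling the same population. In practice we only ever see one of these lines.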
Take a moment to reflect on this quote:
"While nothing is more uncertain than a single life,
nothing is more certain than the average duration of
a thousand lives." Elizur Wright (mathematician & “the father of life insurance”)
Critical concepts covered in this lecture:
[1] We typically work with a single sample, which gives us only one mean value, \(\bar{Y}\). However, understanding how sampling distributions are constructed is essential for grasping the process of estimating uncertainty. This is because sample means vary from one sample to another, affecting the confidence we have in our inferences.
[2] The sampling distribution highlights that, while the population mean \(\mu\) is considered a constant, its estimate \(\bar{Y}\) is a variable.
[3] The average of the sample means across all possible samples equals the population mean, \(\mu\).
[4] This also means that the sampling distribution of the mean is centered exactly on the true population mean, \(\mu\), making the sample mean \(\bar{Y}\) an unbiased estimator of \(\mu\).
[5] Sample values for the standard deviation (and other statistics) also vary across samples. We will discuss this further in the next lecture, but it’s important to note that sample standard deviations are critical for estimating the uncertainty of a sample mean.
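The concepts above can be explored with a small simulation. What follows is a sketch under stated assumptions, not the procedure used in the lecture: the population parameters (MU = 100, SIGMA = 15), the sample size (N = 25), and the number of repeated samples (N_SAMPLES = 10,000) are all hypothetical values chosen for illustration. The simulation approximates the sampling distribution of the mean by drawing many samples, then checks that the average of the sample means lands close to \(\mu\) and that sample standard deviations also vary from sample to sample.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is reproducible

# Assumed values for illustration only (not from the lecture).
MU, SIGMA = 100.0, 15.0   # population mean and standard deviation
N = 25                    # size of each sample
N_SAMPLES = 10_000        # number of repeated samples

sample_means = []
sample_sds = []
for _ in range(N_SAMPLES):
    # One sample of N observations from a normal population.
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))
    sample_sds.append(statistics.stdev(sample))

# The collection of sample means approximates the sampling distribution.
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"Population mean (constant):     {MU:.2f}")
print(f"Average of the sample means:    {mean_of_means:.2f}")
print(f"Spread (SD) of the sample means: {sd_of_means:.2f}")
print(f"Smallest / largest sample mean: "
      f"{min(sample_means):.2f} / {max(sample_means):.2f}")
print(f"Smallest / largest sample SD:   "
      f"{min(sample_sds):.2f} / {max(sample_sds):.2f}")
```

Because only a finite number of samples is drawn, the average of the simulated means is close to, rather than exactly equal to, \(\mu\). The exact equality in [3] holds across all possible samples.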
What is a Sampling Distribution? By Mike Marin
Sampling from a normally distributed population. This is a great interactive tutorial that demonstrates the properties of samples, sampling distributions and estimators. Project leader: Mike Whitlock; programmers: Boris Dalstein, Mike Whitlock & Zahraa Almasslawi.
Lecture
Slides: