
STAT 218 - Week 4, Lecture 2
February 5th, 2024
The Central Limit Theorem states that, no matter what distribution Y may have in the population, if the sample size is large enough, then the sampling distribution of \(\bar{Y}\) will be approximately a normal distribution.
The significance of the Central Limit Theorem lies in its applicability when the shape of the population distribution is unknown, a common scenario in practical situations.
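A small simulation can make the theorem concrete. The sketch below is an illustration under assumptions that are not part of the lecture: the population is taken to be exponential with mean 1 (a strongly right-skewed distribution), and the helper names `sample_means` and `skewness` are made up for this example. It draws many samples of size n and reports how symmetric the resulting sample means are.

```python
import random
import statistics

def sample_means(n, reps=2000, seed=218):
    """Draw `reps` samples of size n from an assumed exponential population
    (mean 1) and return the list of their sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

def skewness(values):
    """Moment-based skewness: close to 0 for a symmetric, normal-like shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

for n in (2, 10, 50):
    means = sample_means(n)
    print(f"n={n:>2}  mean of sample means={statistics.fmean(means):.3f}  "
          f"skewness={skewness(means):.2f}")
```

Under these assumptions, the skewness of the sample means should shrink toward 0 as n grows, which is the approach to normality that the theorem describes.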
How large a sample size is required?


Let’s have a look at the wing areas of 14 male Monarch butterflies at Oceano Dunes State Park in California.
Suppose we consider these 14 observations as a random sample from a population, and our goal is to estimate the population mean \(\mu\).
Warning
We should be aware that any estimate computed from the sample, such as \(\bar{y}\), is subject to sampling error.
Sampling Error: the amount of discrepancy between \(\bar{y}\) and \(\mu\) is described (in a probability sense) by the sampling distribution of \(\bar{Y}\)
The standard error of the mean is defined as follows:
\[ SE_{\bar{Y}} = \frac{s}{\sqrt{n}} \]
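As a quick check of the formula, here is a minimal sketch in Python. The function name `standard_error` and the four data values are hypothetical; they are not the Monarch wing-area measurements from the lecture.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD, using the n - 1 denominator
    return s / math.sqrt(n)

# Made-up values for illustration only (not the lecture's data)
values = [30.0, 32.0, 34.0, 36.0]
print(f"SE = {standard_error(values):.3f}")
```

Because \(n\) appears under a square root in the denominator, quadrupling the sample size only halves the standard error if \(s\) stays about the same.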

To understand how to calculate confidence intervals, we need to have
Tip
If \(Z\) is a standard normal random variable, then the probability that \(Z\) is between \(-2\) and \(2\) is about 95% (remember the 68/95/99.7 rule). The exact cutoff for 95% is \(\pm 1.96\).
\[ Pr\{ -1.96 < \frac{\bar{Y}-\mu}{\sigma/\sqrt{n}} < 1.96 \} =0.95 \]
Solving the inequality inside the probability for \(\mu\) gives
\[ 95 \% \ CI = (\bar{Y} \pm 1.96 \sigma / \sqrt{n}) \]
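The algebra behind this interval can be written out step by step. The derivation below is a sketch that starts from the inequality inside the probability statement and uses only the symbols already defined (\(\bar{Y}\), \(\mu\), \(\sigma\), \(n\)):

\[
\begin{aligned}
-1.96 &< \frac{\bar{Y}-\mu}{\sigma/\sqrt{n}} < 1.96 \\
-1.96\,\frac{\sigma}{\sqrt{n}} &< \bar{Y}-\mu < 1.96\,\frac{\sigma}{\sqrt{n}} \\
-\bar{Y}-1.96\,\frac{\sigma}{\sqrt{n}} &< -\mu < -\bar{Y}+1.96\,\frac{\sigma}{\sqrt{n}} \\
\bar{Y}-1.96\,\frac{\sigma}{\sqrt{n}} &< \mu < \bar{Y}+1.96\,\frac{\sigma}{\sqrt{n}}
\end{aligned}
\]

The last line multiplies through by \(-1\), which reverses the inequalities. The two endpoints are then swapped so that the smaller one is written on the left.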
Exercise 5.2.8 The heights of a certain population of corn plants follow a normal distribution with mean 145 cm and standard deviation 22 cm. We collected data from 16 plants and calculated the sample mean as 135 cm.
If \(\bar{Y}\) represents the mean height of a random sample of 16 plants from the population (here the observed value is \(\bar{y} = 135\)), a 95% confidence interval (CI) for \(\mu\) can be calculated as follows:
\[ 95 \% \ CI = (\bar{Y} \pm 1.96 \times \sigma / \sqrt{n}) \]
\[ = (135\pm 1.96 \times 22 / \sqrt{16}) \]
\[ =(124.22,145.78) \]
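The same arithmetic can be reproduced in a few lines of Python. This is a minimal sketch; the function name `z_interval` is made up for this example, and it assumes that \(\sigma\) is known, as in the exercise.

```python
import math

def z_interval(ybar, sigma, n, z=1.96):
    """Interval ybar +/- z * sigma / sqrt(n), assuming sigma is known."""
    margin = z * sigma / math.sqrt(n)
    return ybar - margin, ybar + margin

# Values from Exercise 5.2.8: sample mean 135 cm, sigma 22 cm, n = 16
lower, upper = z_interval(ybar=135, sigma=22, n=16)
print(f"95% CI = ({lower:.2f}, {upper:.2f})")  # 95% CI = (124.22, 145.78)
```

The margin of error here is \(1.96 \times 22 / 4 = 10.78\) cm, so the interval extends 10.78 cm on either side of the sample mean.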
To help you visualize, imagine we have a population, and from that population we randomly select a sample of 20 observational units and calculate a confidence interval from it:
95% CI = (-44.47, 20.13)

If we repeat this process 100 times, creating 100 different samples of 20 observational units each, we would end up with 100 different samples drawn from the population.
If we calculate confidence intervals for each of these 100 samples, we will find that around 95% of these intervals capture the true population mean.
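The repeated-sampling idea can be tried out with a short simulation. The sketch below rests on assumptions that do not come from the lecture's figure: the population is taken to be normal with \(\mu = 145\) and \(\sigma = 22\) (borrowed from the corn exercise), \(\sigma\) is treated as known, and the function name `count_captures` is hypothetical.

```python
import math
import random

def count_captures(mu=145.0, sigma=22.0, n=20, reps=100, z=1.96, seed=218):
    """Draw `reps` samples of size n from an assumed normal population and
    count how many z-based 95% intervals contain the true mean mu."""
    rng = random.Random(seed)
    margin = z * sigma / math.sqrt(n)  # same width for every sample (sigma known)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        ybar = sum(sample) / n
        if ybar - margin < mu < ybar + margin:
            hits += 1
    return hits

print(count_captures(), "of 100 intervals capture mu")
```

The count will usually be close to 95 but not exactly 95, and it changes with the seed. The 95% figure describes the long-run behavior of the procedure rather than any single batch of 100 samples.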
We are 95% confident that the true population mean is in this confidence interval.