Comparing Two Means

STAT 218 - Week 5, Lecture 3

February 7^th, 2024

Introduction

So far, we have interested in the analysis of a single sample of a numeric variable.
- In practice, most of the scientific research involves the comparison of 2 or more samples from different populations.
If the observed variable is quantitative, the comparison of two samples can include
1. comparison of means,
2. comparison of standard deviations
3. comparison of shapes.
In this week, we will be dealing with the comparison of 2 means

Notation

We employed a parallel notations but to be able to differentiate two samples from each other, we will use subscript.

Figure 1. Naturally Occurring Populations

The two populations that we are interested in can be either
- naturally occurring populations (Figure 1) OR
- conceptual populations defined by certain experimental conditions.

Standard Error of (\(\bar{Y}_1\) - \(\bar{Y}_2\))

We have learned that the precision of a sample mean can be quantified by its standard error.

The formula that we have used so far is

\[ SE_{\bar{Y}} = \frac{s}{\sqrt{n}} \]

Naturally, we can say that taking the difference between two sample means is an estimate of the quantity (\(\mu_1\) - \(\mu_2\)).
However, the formula for the standard error of the difference (\(\bar{Y}_1\) - \(\bar{Y}_2\)) is a little different from what we initially thought.

\[ SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{SE_1^2 + SE_2^2} \]

\[ SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{ \frac {s_1^2}{n_1} + \frac {s_2^2}{n_2}} \]

CI for (\(\mu_1\) - \(\mu_2\))

Confidence Interval for (\(\mu_1\) - \(\mu_2\))

\[ \\95 \% \ CI = (\bar{y} \pm t_{0.025} \ \times \ SE_{\bar{y}}) \] Let’s revise our 95% confidence interval formula for comparing two means.

\[ \\95 \% \ CI = (\bar{y_1}-\bar{y_2}) \pm t_{0.025} \times SE_{\bar{Y}_1 - \bar{Y}_2} \]

Warning

You are not responsible for calculating degrees of freedom for this chapter. It has a complex formula in your book but we either calculate that \(df\) by using computer or I give you the \(df\) value directly.

An Example of Calculating CI for (\(\mu_1\) - \(\mu_2\))

The Wisconsin Fast Plant, Brassica campestris, has a very rapid growth cycle that makes it particularly well suited for the study of factors that affect plant growth.

In one such study, 7 plants were treated with the substance Ancymidol (ancy) and were compared to 8 control plants that were given ordinary water. Heights of all of the plants were measured, in cm, after 14 days of growth.

(\(df\) for this question is calculated as 12).

Calculate 95% confidence interval for (\(\mu_1\) - \(\mu_2\))

95% CI for (\(\mu_1\) - \(\mu_2\)) is (-0.46,10.26)

Is it reasonable to believe that the substance may affect plant growth? Discuss with your neighbor

Example (cont.d)

Ancymidol is a plant growth regulator that reduces plant growth by inhibiting gibberellin biosynthesis.
In our hypothetical example, we calculated 95% CI and
- We are 95% confident that the population average 14-day height of fast plants when water is used (\(\mu_1\)) is between 0.4 cm lower and 10.2 cm higher than the average 14-day height of fast plants when Ancymidol is used (\(\mu_2\)).
A paper on Effects of Ancymidol from 1976
National Library of Medicine

Assumptions and Validations

The confidence interval formula is valid if

the data can be regarded as coming from two independently chosen random samples,
the observations are independent within each sample, and
each of the populations is normally distributed.

If \(n_1\) and \(n_2\) are large, condition (3) is less important.

Food for Thought

Contextual Questions:

Is it reasonable to believe that the substance may affect the plant growth?
Did we find an evidence that the substance may affect the plant growth?
Are we attempting to prove or refute anything here?
What is our primary objective or purpose in this scenario, broadly speaking?
What factors could contribute to conflicting evidence in this example?

Big Questions:

Once scientists establish a theory or law, is it subject to change over time?
- If so, how does this impact our trust in scientific knowledge?
- If not, what motivates ongoing scientific inquiry?