Lesson 13: Inference for the Difference of Two Means (Two Independent Samples)

Lesson Outcomes

By the end of this lesson, you should be able to do the following.

Recognize when the difference of means (two independent samples) inferential procedure is appropriate
Create numerical and graphical summaries of the data
Perform a hypothesis test for the difference of means (two independent samples) using the following steps:
1. State the null and alternative hypotheses
2. Calculate the test statistic, degrees of freedom, and P-value of the test using software
3. Assess statistical significance in order to state the appropriate conclusion for the hypothesis test
4. Check the requirements for the hypothesis test
Create a confidence interval for the difference of means (two independent samples) using the following steps:
1. Calculate a confidence interval using software
2. Interpret the confidence interval
3. Check the requirements of the confidence interval

Independent Samples Versus Paired Data

In the previous reading, Lesson 12: Inference for Two Means: Paired Data, we studied confidence intervals and hypothesis tests for the difference of two means when the data are paired. One example of paired data is pre- and post-test scores, such as Mahon’s weight loss study. Another example is paired comparisons, like the nosocomial infection study. How can you tell if data are paired? The key characteristic of dependent samples (or matched pairs) is that knowing which subjects will be in Group 1 determines which subjects will be in Group 2. The data for each subject in Group 1 are paired with the data for a corresponding subject in Group 2. In the case of the weight loss study, the same subject provided weight data for both groups: once in the pre-test (Group 1) and once in the post-test (Group 2).

In contrast to dependent samples, two samples are independent if knowing which subjects are in Group 1 tells you nothing about which subjects will be in Group 2. With independent samples, there is no pairing between the groups. Suppose you want to compare the incomes of men and women in the general population. A random sample of men would be collected, and each would be asked to report their income. Similarly, a random sample of women would be drawn, and they would also be asked to report their income. Notice that the groups are independent. Knowing the names of the men who are selected tells you nothing about which women would be selected. This is an example of independent samples.

We can compare the mean income of men to the mean income of women using the procedures of this section. We will conduct hypothesis tests and compute confidence intervals for the difference in the true population means of two groups (\(\mu_1 - \mu_2\)).

Some students assume that samples are independent if they do not affect each other. This is incorrect. Instead, remember that samples are independent if knowing who was selected for Group 1 tells you nothing about who will be selected for Group 2.

Hypothesis Tests

Reading Practices of Children with Developmental or Behavioral Problems

Is there a difference in the amount of reading done by children with problematic behavior compared to other children?

Summarize the relevant background information

Researchers led by Arlene Butz published a study on the reading practices of children. They wanted to know if there was a difference in the reading practices of children with developmental or behavioral problems (the DEV group or Group 1) compared to children in the general population who do not have developmental problems (the GEN group or Group 2). One of the factors they considered was the number of nights each week that the children participated in reading in the home. Data representative of their results are given in the file ReadingPractices.xlsx.

State the null and alternative hypotheses and the level of significance

The null hypothesis is that there is no difference in the mean number of nights each week in which the two groups of children participate in reading in the home. The alternative hypothesis is that there is a difference in the mean number of nights that the children in the two groups participate in reading in the home. These hypotheses are expressed mathematically as: \[ \begin{align} H_0: &~~ \mu_1 = \mu_2 \\ H_a: &~~ \mu_1 \ne \mu_2 \end{align} \]

We will use the \(\alpha = 0.05\) level of significance.

Describe the data collection procedures

A group of children was enrolled in the study. Children who were identified as having developmental or behavioral problems were labeled as Group 1 (the DEV group). Children who did not display developmental or behavioral problems were labeled as Group 2 (the GEN group). A survey was administered to a parent of each of the children. One of the questions on the survey asked for the number of nights during the week on which either the child read or the parent read to the child. These data are found in the file ReadingPractices.xlsx.

Answer the following questions:

For which group do you think the mean number of nights of reading will be higher?

	DEV Group	GEN Group
Mean:	\(\bar x_1 = 4.1\)	\(\bar x_2 = 3.7\)
Standard Deviation:	\(s_1 = 2.4\)	\(s_2 = 2.5\)
Sample Size:	\(n_1 = 204\)	\(n_2 = 117\)
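The summary values in the table above are enough to sketch the test statistic by hand. The example below is a minimal illustration, not the lesson's prescribed software procedure. It assumes the unpooled (Welch) form of the two-sample t statistic with Welch-Satterthwaite degrees of freedom, which the lesson does not name explicitly. The function name `welch_summary` is hypothetical, and the P-value uses a normal approximation only because the Python standard library has no t distribution; statistical software using the t distribution will report a slightly larger P-value.

```python
import math
from statistics import NormalDist


def welch_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df, approximate two-sided P-value) from summary statistics.

    ASSUMPTION: unpooled (Welch) two-sample t statistic; hypothetical helper.
    """
    # Estimated variance of each sample mean
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2

    # Standard error of the difference in sample means
    se = math.sqrt(v1 + v2)

    # Test statistic for H0: mu1 = mu2
    t = (mean1 - mean2) / se

    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

    # ASSUMPTION: normal approximation to the two-sided P-value,
    # reasonable only when df is large; software would use the t distribution.
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# DEV group (Group 1) versus GEN group (Group 2), values from the table
t, df, p_approx = welch_summary(4.1, 2.4, 204, 3.7, 2.5, 117)
print(f"t = {t:.3f}, df = {df:.1f}, approximate P-value = {p_approx:.3f}")
```

For these values the statistic comes out near 1.40 with roughly 234 degrees of freedom, and the approximate two-sided P-value is about 0.16. Treat these figures as a check on the software output rather than a replacement for it.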

Time Period	Mean	Std. Deviation	Sample Size
Control	\(\bar x_1 = 14\)	\(s_1 = 4.2\)	\(n_1 = 182\)
World Cup	\(\bar x_2 = 19\)	\(s_2 = 9.8\)	\(n_2 = 91\)

	Community Group 1	Hospital Group 2
Mean:	\(\bar x_1 = 216.1\)	\(\bar x_2 = 283.4\)
Standard Deviation:	\(s_1 = 339.9\)	\(s_2 = 359.9\)
Sample Size:	\(n_1 = 76\)	\(n_2 = 85\)
