This is not designed to be a comprehensive review. There may be items on the exam that are not covered in this review. Similarly, there may be items in this review that are not tested on this exam. You are strongly encouraged to review the readings, homework exercises, and other activities from Units 1 and 2 as you prepare for the exam. In particular, you should go over the Lesson 8: Review for Exam 1.

## Unit 2 Lesson Outcomes

Show/Hide Outcomes

## Unit 2 Lesson Summaries

Here are the summaries for each lesson in Unit 2. Reviewing these key points from each lesson will help you in your preparation for the exam.

Show/Hide Summaries

Lesson 09 Recap
• The null hypothesis ($$H_0$$) is the foundational assumption about a population and represents the status quo. It is a statement of equality ($$=$$). The alternative hypothesis ($$H_a$$) is a different assumption about a population and is a statement of inequality ($$<$$, $$>$$, or $$\ne$$). Using a hypothesis test, we determine whether it is more likely that the null hypothesis or the alternative hypothesis is true.

• The $$P$$-value is the probability of getting a test statistic at least as extreme as the one you got, assuming $$H_0$$ is true. A $$P$$-value is calculated by finding the area under the normal distribution curve that is more extreme (farther away from the mean) than the z-score. The alternative hypothesis tells us whether we look at both tails or only one.

• The level of significance ($$\alpha$$) is the standard for determining whether or not the null hypothesis should be rejected. Typical values for $$\alpha$$ are $$0.05$$, $$0.10$$, and $$0.01$$. If the $$P$$-value is less than $$\alpha$$ we reject the null. If the $$P$$-value is not less than $$\alpha$$ we fail to reject the null.

• A Type I error is committed when we reject a null hypothesis that is, in reality, true. A Type II error is committed when we fail to reject a null hypothesis that is, in reality, not true. The value of $$\alpha$$ is the probability of committing a Type I error.

Lesson 10 Recap
• The margin of error gives an estimate of the variability of responses. It is calculated as $$\displaystyle{m=z^*\frac{\sigma}{\sqrt{n}}}$$ where $$z^*$$ represents a calculated z-score corresponding to a particular confidence level.

• A confidence interval is an interval estimator used to give a range of plausible values for a parameter. The width of a confidence interval depends on the chosen confidence level (and its corresponding value of $$z^*$$) as well as the sample size ($$n$$). This is the equation for calculating confidence intervals: $\displaystyle{\left(\bar x-z^*\frac{\sigma}{\sqrt{n}},~\bar x+z^*\frac{\sigma}{\sqrt{n}}\right)}$

• The sample size formula allows us to estimate the number of observations required to obtain a specific margin of error. $$\displaystyle{n=\left(\frac{z^*\sigma}{m}\right)^2}$$

Lesson 11 Recap
• In practice we rarely know the true standard deviation $$\sigma$$ and will therefore be unable to calculate a z-score. Student’s t-distribution gives us a new test statistic, $$t$$, that is calculated using the sample standard deviation ($$s$$) instead. $\displaystyle{ t = \frac {\bar x - \mu} {s / \sqrt{n}} }$

• The $$t$$-distribution is similar to a normal distribution in that it is bell-shaped and symmetrical, but the exact shape of the $$t$$-distribution depends on the degrees of freedom ($$df$$). $df=n-1$

• You will use Excel to carry out hypothesis testing and create confidence intervals involving $$t$$-distributions.

Lesson 12 Recap
• The key characteristic of dependent samples (or matched pairs) is that knowing which subjects will be in group 1 determines which subjects will be in group 2.

• We use slightly different variables when conducting inference using dependent samples:

Group 1 values: $$x_1$$  Group 2 values: $$x_2$$  Differences: $$d$$  Population mean: $$\mu_d$$  Sample mean: $$\bar d$$  Sample standard deviation: $$s_d$$

• When conducting hypothesis tests using dependent samples, the null hypothesis is always $$\mu_d=0$$, indicating that there is no change between the first population and the second population. The alternative hypothesis can be left-tailed ($$<$$), right-tailed($$>$$), or two-tailed($$\ne$$).

Lesson 13 Recap
• In contrast to dependent samples, two samples are independent if knowing which subjects are in group 1 tells you nothing about which subjects will be in group 2. With independent samples, there is no pairing between the groups.

• When conducting inference using independent samples we use $$\bar x_1$$, $$s_1$$, and $$n_1$$ for the mean, standard deviation, and sample size, respectively, of group 1. We use the symbols $$\bar x_2$$, $$s_2$$, and $$n_2$$ for group 2.

• When working with independent samples it is important to graphically illustrate each sample separately. Combining the groups to create a single graph is not appropriate.

• When conducting hypothesis tests using independent samples, the null hypothesis is always $$\mu_1=\mu_2$$, indicating that there is no difference between the two populations. The alternative hypothesis can be left-tailed ($$<$$), right-tailed($$>$$), or two-tailed($$\ne$$).

• Whenever zero is contained in the confidence interval of the difference of the true means we conclude that there is no significant difference between the two populations.

Lesson 14 Recap

ANOVA is used to compare the means for several groups. The hypotheses for the test are always: \begin{align} H_0: & ~ \textrm{All the means are equal} \\ H_a: & ~ \textrm{At least one of the means differs} \end{align}

• For ANOVA testing we use an $$F$$-distribution, which is right-skewed. The $$P$$-value of an ANOVA test is always the area to the right of the $$F$$-statistic.

• We can conduct ANOVA testing when the following three requirements are satisfied:

1. The data come from a simple random sample.
2. The data are normally distributed within each group.
• This is considered met unless one or more of the groups has a strongly skewed distribution.
3. The variance is constant.
• This is satisfied when the largest variance is not more than four times the smallest variance.