Lesson 20: Review for Exam 3

This is not designed to be a comprehensive review. There may be items on the exam that are not covered in this review. Similarly, there may be items in this review that are not tested on this exam. You are strongly encouraged to review the readings, homework exercises, and other activities from Units 1-3 as you prepare for the exam. In particular, you should go over the Review for Exam 1 and the Review for Exam 2.

Lesson 16

Outcomes

By the end of this lesson, you should be able to:

Calculate a sample proportion
Interpret a sample proportion
Summarize categorical data with graphical summaries
Identify when a sample proportion will follow a normal distribution
Determine the mean of the sampling distribution of the sample proportion for a given parent population
Determine the standard deviation of the sampling distribution of the sample proportion for a given parent population
Calculate the z-score for a sample proportion, given the population proportion and sample size
Calculate probabilities of a sample proportion using the normal distribution

Summary

Remember…

The sample proportion \(\widehat{p}\) is computed by dividing the number of successes (\(x\)) found in a sample by the sample size (\(n\)): \[ \text{Sample Proportion} \quad \widehat{p} = \frac{x}{n} \]
The sample proportion \(\widehat{p}\) is interpreted as the fraction of the observations in the sample that are successes. It is a point estimate of the true proportion, \(p\).
Two graphical summaries are typically used to show sample proportions.
- Pie charts are used when you want to represent the observations as part of a whole, where each slice (sector) of the pie chart represents a proportion or percentage of the whole.
- Bar charts are used when representing counts of how many times each category has been observed in the data.
The sampling distribution of the sample proportion can be considered to be normally distributed when both \(np \ge 10\) and \(n(1-p) \ge 10\). The value of \(np\) gives the expected number of successes for our sample, while the value of \(n(1-p)\) gives the expected number of failures for our sample.
The mean of the sampling distribution of the sample proportion is \(\mu_{\widehat{p}} = p\). In other words, sample proportions are centered around the true proportion.
The standard deviation of the sampling distribution of the sample proportion is given by \(\sigma_{\widehat{p}} = \displaystyle{\sqrt{\frac{p\cdot(1-p)}{n}}}\). This quantifies how far sample proportions spread away from the true proportion.
If \(np \ge 10\) and \(n(1-p) \ge 10\), probabilities for a sample proportion can be found with the Normal Probability Applet after computing the z-score \[ \displaystyle {z = \frac{\textrm{value} - \textrm{mean}}{\textrm{standard deviation}} = \frac{\widehat p - p}{\sqrt{\frac{p \cdot (1-p)}{n}}}} \]
To convert a z-score into a probability, we enter the z-score into the Normal Probability Applet and shade in the typical way.
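The Lesson 16 calculations above can be sketched in a few lines of Python. This is only an illustration, not the course's method (the course uses the Normal Probability Applet). The function names and the example numbers (\(p = 0.5\), \(n = 100\), 60 successes) are assumptions made up for this sketch, and the normal area is computed with the standard library's error function instead of the applet.

```python
import math

def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def sample_proportion_z(p_hat, p, n):
    """z-score of a sample proportion, given the true proportion p and sample size n."""
    # Normality check from the summary: np >= 10 and n(1-p) >= 10
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("need np >= 10 and n(1-p) >= 10 to use the normal distribution")
    mean = p                                  # mean of the sampling distribution
    sd = math.sqrt(p * (1 - p) / n)           # standard deviation of the sampling distribution
    return (p_hat - mean) / sd

# Hypothetical example: true proportion 0.5, sample of 100, 60 successes observed
x, n, p = 60, 100, 0.5
p_hat = x / n                                 # sample proportion
z = sample_proportion_z(p_hat, p, n)          # about 2.0
prob_above = 1 - normal_cdf(z)                # area to the right of z, about 0.023
print(f"p_hat = {p_hat}, z = {z:.3f}, P(sample proportion > {p_hat}) = {prob_above:.4f}")
```

Shading to the right in the applet corresponds to `1 - normal_cdf(z)` here, and shading to the left corresponds to `normal_cdf(z)`.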

Lesson 17

Outcomes

By the end of this lesson you should be able to do the following.

Recognize when a one proportion inferential procedure is appropriate
Create numerical and graphical summaries of the data
Perform a hypothesis test for one proportion using the following steps:
1. State the null and alternative hypotheses
2. Calculate the test-statistic and P-value of the test using software
3. Assess statistical significance in order to state the appropriate conclusion for the hypothesis test
4. Check the requirements for the hypothesis test
Create a confidence interval for one proportion using the following steps:
1. Calculate a confidence interval using software
2. Interpret the confidence interval
3. Check the requirements of the confidence interval
Calculate the sample size required to achieve a specified margin of error and level of confidence

Summary

Remember…

Use the Math 221 Statistics Toolbox to perform hypothesis testing and calculate confidence intervals for problems involving one proportion. One proportion situations arise whenever there is a single sample of categorical data consisting of just successes and failures. This is like heads and tails on a coin, or yes/no type questions, or various other situations where there are only two possible categories.
Numerical Summaries for a one proportion analysis include the sample size \(n\), number of successes \(x\), and sample proportion \(\widehat{p}\). Graphical Summaries consist of either a bar chart or pie chart, with either one being acceptable.
To perform a hypothesis test for one proportion use the following steps:
1. State the null hypothesis in the form \(H_0: p = 0.\#\), where \(0.\#\) is some value between 0 and 1. The alternative hypothesis takes one of the forms \(H_a: p \ne 0.\#\), \(H_a: p > 0.\#\), or \(H_a: p < 0.\#\).
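As a rough sketch of what a one-proportion hypothesis test computes, the Python below applies the z-score formula from Lesson 16 with the hypothesized value in place of \(p\). It is not the Math 221 Statistics Toolbox, and we have not confirmed that the Toolbox works exactly this way. The function name `one_proportion_test`, the `alternative` labels, and the example data are all hypothetical.

```python
import math

def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_proportion_test(x, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0 using the normal approximation."""
    # Hypothesis-test requirements use the hypothesized proportion p0
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need n*p0 >= 10 and n*(1-p0) >= 10")
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    if alternative == "less":            # Ha: p < p0
        p_value = normal_cdf(z)
    elif alternative == "greater":       # Ha: p > p0
        p_value = 1 - normal_cdf(z)
    elif alternative == "two-sided":     # Ha: p != p0
        p_value = 2 * (1 - normal_cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5 against Ha: p > 0.5
z, p_value = one_proportion_test(60, 100, 0.5, alternative="greater")
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
```

Compare the P-value to the level of significance \(\alpha\) to decide whether to reject the null hypothesis.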

The estimator of \(p\) is \(\displaystyle{ \widehat p = \frac {x}{n}}\), which is used for both confidence intervals and hypothesis testing.
The requirements for a confidence interval are \(n \widehat p \ge 10\) and \(n(1-\widehat p) \ge 10\). The requirements for hypothesis tests involving one proportion are \(np\ge10\) and \(n(1-p)\ge10\).
We can determine the sample size we need to obtain a desired margin of error using the formula \(\displaystyle{ n=\left(\frac{z^*}{m}\right)^2 p^*(1-p^*)}\) where \(p^*\) is a prior estimate of \(p\). If no prior estimate is available, the formula \(\displaystyle{ n=\left(\frac{z^*}{2m}\right)^2}\) is used.
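The confidence interval requirements and the two sample size formulas above can be sketched as follows. Several things here are assumptions not stated in this review: the interval is the standard large-sample form \(\widehat p \pm z^*\sqrt{\widehat p(1-\widehat p)/n}\), \(z^* = 1.96\) is used for 95% confidence, and the sample size is rounded up to the next whole number. The function names are hypothetical.

```python
import math

def one_proportion_ci(x, n, z_star=1.96):
    """Large-sample confidence interval for p (z_star = 1.96 assumes 95% confidence)."""
    p_hat = x / n
    # Confidence interval requirements use the sample proportion p_hat
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("need n*p_hat >= 10 and n*(1-p_hat) >= 10")
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def required_sample_size(m, z_star=1.96, p_star=None):
    """Sample size needed for margin of error m; p_star is a prior estimate of p, if any."""
    if p_star is None:
        n = (z_star / (2 * m)) ** 2                     # no prior estimate available
    else:
        n = (z_star / m) ** 2 * p_star * (1 - p_star)   # prior estimate p_star
    return math.ceil(n)                                 # round up (an assumption of this sketch)

# Hypothetical examples
lower, upper = one_proportion_ci(60, 100)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
print(required_sample_size(0.03))                # no prior estimate
print(required_sample_size(0.03, p_star=0.2))    # prior estimate of 0.2
```

Notice that the formula without a prior estimate gives the larger sample size, because \(p^*(1-p^*)\) is largest when \(p^* = 0.5\).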

Lesson 18

Outcomes

By the end of this lesson, you should be able to do the following.

Recognize when a difference of two proportions inferential procedure is appropriate
Create numerical and graphical summaries of the data
Perform a hypothesis test for the difference of two proportions using the following steps:
1. State the null and alternative hypotheses
2. Calculate the test-statistic and P-value of the test using software
3. Assess statistical significance in order to state the appropriate conclusion for the hypothesis test
4. Check the requirements for the hypothesis test
Create a confidence interval for the difference of two proportions using the following steps:
1. Calculate a confidence interval using software
2. Interpret the confidence interval
3. Check the requirements of the confidence interval

Summary

Remember…

When conducting hypothesis tests using two proportions, the null hypothesis is always \(p_1=p_2\), indicating that there is no difference between the two proportions. The alternative hypothesis can be left-tailed (\(<\)), right-tailed (\(>\)), or two-tailed (\(\ne\)).
For a hypothesis test and confidence interval of two proportions, we use the following symbols: \[ \begin{array}{lcl} \text{Sample proportion for group 1:} & \hat p_1 = \displaystyle{\frac{x_1}{n_1}} \\ \text{Sample proportion for group 2:} & \hat p_2 = \displaystyle{\frac{x_2}{n_2}} \end{array} \]
For a hypothesis test only, we use the following symbols:

\[ \begin{array}{lcl} \text{Overall sample proportion:} & \hat p = \displaystyle{\frac{x_1+x_2}{n_1+n_2}} \end{array} \]

Whenever zero is contained in the confidence interval of the difference of the true proportions we conclude that there is no significant difference between the two proportions.
You will use the Excel spreadsheet Math 221 Statistics Toolbox to perform hypothesis testing and calculate confidence intervals for problems involving two proportions.
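To see how the symbols above fit together, here is a minimal Python sketch of a two-proportion test and confidence interval. It is not the Statistics Toolbox. The standard error formulas are the usual textbook ones (pooled for the test, unpooled for the interval) and are assumptions of this sketch, as are the function names, the two-tailed alternative, \(z^* = 1.96\), and the example counts.

```python
import math

def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_proportion_test(x1, n1, x2, n2):
    """Return (z, two-tailed P-value) for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)       # overall sample proportion, used for the test only
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - normal_cdf(abs(z)))

def two_proportion_ci(x1, n1, x2, n2, z_star=1.96):
    """Confidence interval for p1 - p2 (z_star = 1.96 assumes 95% confidence)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # no pooling for the interval
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Hypothetical data: 45 of 100 successes in group 1, 30 of 100 in group 2
z, p_value = two_proportion_test(45, 100, 30, 100)
lower, upper = two_proportion_ci(45, 100, 30, 100)
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
print("zero in interval:", lower <= 0 <= upper)
```

In this made-up example zero is not in the interval, which agrees with the small P-value from the test.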

Lesson 19

Outcomes

By the end of this lesson, you should be able to do the following.

Recognize when a chi-square test for independence is appropriate
Create numerical and graphical summaries of the data
Perform a hypothesis test for the chi-square test for independence using the following steps:
1. State the null and alternative hypotheses
2. Calculate the test-statistic, degrees of freedom and P-value of the test using software
3. Assess statistical significance in order to state the appropriate conclusion for the hypothesis test
4. Check the requirements for the hypothesis test
State the properties of the chi-square distribution

Summary

Remember…

The \(\chi^2\) hypothesis test is a test of independence between two variables. These variables are either associated or they are not. Therefore, the null and alternative hypotheses are the same for every test: \[ \begin{array}{lcl} H_0: & \text{The (first variable) and the (second variable) are independent.} \\ H_a: & \text{The (first variable) and the (second variable) are not independent.} \end{array} \]
The degrees of freedom (\(df\)) for a \(\chi^2\) test of independence are calculated using the formula \(df=(\text{number of rows}-1)(\text{number of columns}-1)\).
In our hypothesis testing for \(\chi^2\) we never conclude that two variables are dependent. Instead, we say that two variables are not independent.
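The degrees of freedom formula above, together with the test statistic, can be sketched in Python. The statistic shown is the usual \(\sum (\text{observed}-\text{expected})^2/\text{expected}\), with each expected count equal to (row total)(column total)/(grand total). That formula is not given in this review, so treat it as an assumption of the sketch. The function name and the table of counts are hypothetical, and the P-value is left to software.

```python
def chi_square_independence(table):
    """Return (statistic, df, expected counts) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count in each cell if the two variables were independent
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)   # (rows - 1)(columns - 1)
    return statistic, df, expected

# Hypothetical 2-by-2 table of observed counts
observed = [[20, 30],
            [30, 20]]
statistic, df, expected = chi_square_independence(observed)
print(f"chi-square = {statistic:.2f}, df = {df}")
print("expected counts:", expected)
```

The expected counts are returned so that you can check them against the requirements for the test before interpreting the P-value from your software.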
