This is not designed to be a comprehensive review. There may be items on the exam that are not covered in this review. Similarly, there may be items in this review that are not tested on this exam. You are strongly encouraged to review the readings, homework exercises, and other activities from Unit 1 as you prepare for the exam.

Unit 1 Lesson Summaries

Lesson 01 Recap
• Each lesson follows the same schedule: Individual and Group Preparation, Class Meeting, and Homework Assignment and Quiz. Understanding this layout will help you successfully manage the workload of this class.

• In this class you will use the online textbook that has been written for you by your statistics teachers. All of the assignments and quizzes will be based on the readings, so study it well.

• By doing the work, staying on schedule, and living the Honor Code, you can succeed in this class!

Lesson 02 Recap
• The Statistical Process has five steps: Design the study, Collect the data, Describe the data, Make inferences, Take action.

• In a designed experiment, researchers control the conditions of the study. In an observational study, researchers don’t control the conditions but only observe what happens.

• There are many sampling methods used to obtain a sample from a population:

• A simple random sample (SRS) is a random selection taken from a population
• A systematic sample is every kth item in the population, beginning at a random starting point
• A cluster sample is all items in one or more randomly selected clusters, or blocks
• A stratified sample divides the population into similar groups (strata), and an SRS is taken from each group
• A convenience sample is one easily obtained in a less-than-systematic way and should be avoided whenever possible

• Quantitative variables represent things that are numeric in nature, such as the value of a car or the number of students in a classroom. Categorical variables represent nonnumerical data that can only be considered as labels, such as colors or brands of shoes.
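The sampling methods listed above can be sketched in a few lines of Python. This is only an illustration: the population of 100 numbered items, the cluster size of 20, the two strata, and all variable names are assumptions made up for the example, not part of the course materials.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population: items numbered 1..100

# Simple random sample (SRS): every item has the same chance of being chosen.
srs = rng.sample(population, 10)

# Systematic sample: every kth item, beginning at a random starting point.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Cluster sample: split the population into clusters, then keep ALL items
# in one or more randomly selected clusters.
clusters = [population[i:i + 20] for i in range(0, 100, 20)]
cluster_sample = [item for c in rng.sample(clusters, 2) for item in c]

# Stratified sample: take an SRS from within each group (stratum).
strata = {"low": population[:50], "high": population[50:]}
stratified = [item for group in strata.values() for item in rng.sample(group, 5)]
```

Note the difference between the last two: a cluster sample keeps everything in a few groups, while a stratified sample keeps a few items from every group.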

Lesson 03 Recap
• A histogram allows us to visually interpret data. Histograms can be left-skewed, right-skewed, or symmetrical and bell-shaped.

• The mean, median, and mode are measures of the center of a distribution. The mean is the most common measure of center and is computed by adding up the observed data and dividing by the number of observations in the data set.

• The standard deviation is a number that describes how spread out the data are. A larger standard deviation means the data are more spread out than data with a smaller standard deviation.
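As a small illustration of the measures of center and spread described above, the following sketch uses Python's standard `statistics` module on a made-up data set. The numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # hypothetical observations

mean = statistics.mean(data)      # sum of the observations / number of observations
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value
sd = statistics.stdev(data)       # sample standard deviation

# Doubling every value doubles the distance of each point from the mean,
# so the second data set is more spread out and has a larger standard deviation.
wider = [2 * x for x in data]
sd_wider = statistics.stdev(wider)

print(mean, median, mode, round(sd, 3), round(sd_wider, 3))
```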

• A parameter is a true (but usually unknown) number that describes a population. A statistic is an estimate of a parameter obtained from a sample.

• Quartiles/percentiles, Five-Number Summaries, and Boxplots are tools that help us understand data. The five-number summary of a data set contains the minimum value, the first quartile, the median, the third quartile, and the maximum value. A boxplot is a graphical representation of the five-number summary.
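A five-number summary can be computed with a short sketch like the one below. The data are hypothetical. Software packages use slightly different conventions for quartiles, so the `method="inclusive"` choice here is an assumption and may not match the course's software on every data set.

```python
import statistics

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical observations

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, med, q3, max(data))
print(five_number_summary)
```

A boxplot of these data would draw the box from Q1 to Q3, a line at the median, and whiskers out toward the minimum and maximum.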

Lesson 04 Recap
• The three rules of probability are:
1. A probability is a number between 0 and 1. $0 \leq P(X) \leq 1$

2. If you list all the outcomes of a probability experiment (such as rolling a die), the probability that one of these outcomes will occur is 1. In other words, the sum of the probabilities of all possible outcomes in any probability experiment is 1. $\sum P(X) = 1$

3. The probability that an outcome will not occur is 1 minus the probability that it will occur. $P(\text{not}~X) = 1 - P(X)$
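The three rules can be checked on a simple example. The sketch below assumes a fair six-sided die as the probability experiment and uses exact fractions so that no rounding is involved. The variable names are illustrative only.

```python
from fractions import Fraction

# Hypothetical probability experiment: one roll of a fair six-sided die.
probabilities = {face: Fraction(1, 6) for face in range(1, 7)}

# Rule 1: every probability is a number between 0 and 1.
rule_1_holds = all(0 <= p <= 1 for p in probabilities.values())

# Rule 2: the probabilities of all possible outcomes sum to 1.
total = sum(probabilities.values())

# Rule 3: P(not X) = 1 - P(X), here with X = "roll a six".
p_six = probabilities[6]
p_not_six = 1 - p_six

print(rule_1_holds, total, p_not_six)
```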

Lesson 05 Recap
• A normal density curve is symmetric and bell-shaped. The curve lies above the horizontal axis and the total area under the curve is equal to 1.

• A standard normal distribution has a mean of 0 and a standard deviation of 1. The 68-95-99.7% rule states that when data are normally distributed, approximately 68% of the data lie within 1 standard deviation of the mean, approximately 95% of the data lie within 2 standard deviations of the mean, and approximately 99.7% of the data lie within 3 standard deviations of the mean.
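The 68-95-99.7% rule can be verified numerically. This sketch uses `NormalDist` from Python's standard library in place of the probability applet. The helper name `area_within` is made up for the illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

areas = {k: area_within(k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} standard deviation(s): {area:.4f}")
```

The three areas come out close to 0.68, 0.95, and 0.997, which is where the rule gets its name.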

• A z-score tells us how many standard deviations away from the mean a given value is. It is calculated as: $$\displaystyle{z = \frac{\text{value}-\text{mean}}{\text{standard deviation}} = \frac{x-\mu}{\sigma}}$$

• The probability applet allows us to use z-scores to calculate proportions, probabilities, and percentiles.
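As a rough stand-in for the applet, the sketch below computes a z-score and then the area to its left under the standard normal curve. The value 130, the mean 100, and the standard deviation 15 are hypothetical numbers, and the function name `z_score` is an assumption for this example.

```python
from statistics import NormalDist

def z_score(value, mean, standard_deviation):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / standard_deviation

z = z_score(value=130, mean=100, standard_deviation=15)  # (130 - 100) / 15 = 2.0

# Area to the left of z: the proportion of values below 130,
# which is also the percentile of 130 when written as a percentage.
area_left = NormalDist().cdf(z)

print(z, round(area_left, 4))
```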

Lesson 06 Recap
• The distribution of sample means is a distribution of all possible sample means ($$\bar x$$) for a particular sample size. It has a mean of $$\mu$$ and a standard deviation of $$\sigma/\sqrt{n}$$.
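For a very small population, the distribution of sample means can be listed in full and its mean and standard deviation checked directly. The sketch below assumes a made-up population of four values and samples of size 2 drawn with replacement (sampling with replacement is what makes the $$\sigma/\sqrt{n}$$ formula hold exactly here).

```python
import itertools
import math
import statistics

population = [2, 4, 6, 8]  # hypothetical tiny population
n = 2                      # sample size

mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Every possible sample of size n (with replacement) and its sample mean.
sample_means = [statistics.mean(s) for s in itertools.product(population, repeat=n)]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(mean_of_means, mu)                     # the two means agree
print(sd_of_means, sigma / math.sqrt(n))     # the two standard deviations agree
```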

• The distribution of sample means is normal when the underlying population is normally distributed. Thanks to the Central Limit Theorem (CLT), it is also approximately normal when our sample size ($$n$$) is large, whatever the shape of the population.

• The Law of Large Numbers states that as the sample size ($$n$$) gets larger, the sample mean ($$\bar x$$) tends to get closer to the population mean ($$\mu$$).
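The Law of Large Numbers can be seen in a quick simulation. This sketch assumes rolls of a fair six-sided die, whose population mean is 3.5. The seed and the sample sizes are arbitrary choices for the illustration, and the exact sample means will differ with a different seed.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
population_mean = 3.5    # mean of one roll of a fair six-sided die

sample_means_by_n = {}
for n in (10, 1_000, 100_000):
    rolls = [rng.randint(1, 6) for _ in range(n)]
    sample_means_by_n[n] = statistics.mean(rolls)

for n, x_bar in sample_means_by_n.items():
    print(f"n = {n:>7}: sample mean = {x_bar:.3f}, "
          f"distance from mu = {abs(x_bar - population_mean):.3f}")
```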

Lesson 07 Recap
• When the distribution of sample means is normally distributed, we can use z-scores and the probability applet to calculate proportions and probabilities. A z-score is calculated as: $$\displaystyle{z = \frac{\text{value}-\text{mean}}{\text{standard deviation}} = \frac{\bar x-\mu}{\sigma/\sqrt{n}}}$$
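The z-score for a sample mean can be worked through numerically as follows. The values of $$\mu$$, $$\sigma$$, $$n$$, and $$\bar x$$ are hypothetical, and `NormalDist` stands in for the probability applet.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 15, 25  # hypothetical population mean, standard deviation, sample size
x_bar = 106                 # hypothetical observed sample mean

standard_error = sigma / math.sqrt(n)  # 15 / 5 = 3.0
z = (x_bar - mu) / standard_error      # (106 - 100) / 3.0 = 2.0

# Probability of a sample mean at least this large, if the
# distribution of sample means is normal.
prob_above = 1 - NormalDist().cdf(z)

print(standard_error, z, round(prob_above, 4))
```

The only change from the z-score for a single value is the denominator: the standard deviation of the sample means, $$\sigma/\sqrt{n}$$, replaces $$\sigma$$.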

• The $$P$$-value is the probability of getting a test statistic at least as extreme as the one you got, assuming $$H_0$$ is true. A $$P$$-value is calculated by finding the area under the normal distribution curve that is at least as extreme as (as far as or farther from the mean than) the observed z-score.
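A $$P$$-value for a z-score can be sketched as an area calculation. The function name `p_value` and its `alternative` argument are assumptions introduced for this illustration. Which tail (or tails) counts as "more extreme" depends on the alternative hypothesis of the test.

```python
from statistics import NormalDist

def p_value(z, alternative="two-sided"):
    """Area under the standard normal curve at least as extreme as z."""
    cdf = NormalDist().cdf
    if alternative == "less":
        return cdf(z)                 # area to the left of z
    if alternative == "greater":
        return 1 - cdf(z)             # area to the right of z
    return 2 * (1 - cdf(abs(z)))      # area in both tails beyond |z|

p_two_sided = p_value(2.0)
p_right_tail = p_value(2.0, alternative="greater")

print(round(p_two_sided, 4), round(p_right_tail, 4))
```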