Chapter 3 Probability and Sampling
Much of the material in this chapter was taken from the MATH 221 textbook.
3.1 Probability
Principles of probability are essential to statistics. It is through probability that we understand how likely events are, which then allows us to make data-driven decisions. This course and textbook aren’t sufficient to gain an in-depth understanding of probability, but a few of the basics will be covered.
3.1.1 Probability Notation
You may already have a good understanding of the basics of probability. It is worth noting that there is a special notation used to denote probabilities. The probability that an event, \(X\), will occur is written \(P(X)\). As an example, the probability that you will roll a 6 on a die can be written as
\[P(\text{Roll a 6 on a die}) = \frac{1}{6}\]
3.1.2 Rules of Probability
Probabilities follow patterns called probability distributions, or distributions for short. There are three rules that a probability distribution must follow.
The three rules of probability are:
- Rule 1: The probability of an event \(X\) is a number between 0 and 1.
\[0 \leq P(X) \leq 1\]
- Rule 2: If you list all the outcomes of an experiment (such as rolling a die) the probability that one of these outcomes will occur is 1. In other words, the sum of the probabilities of all the possible outcomes of any experiment is 1.
\[\sum P(X) = 1\]
- Rule 3: (Complement Rule) The probability that an event \(X\) will not occur is 1 minus the probability that it will occur.
\[P(\text{not}~X) = 1 - P(X)\]
You may have noticed that the Complement Rule is just a combination of the first two rules.
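These three rules can be checked directly for a small distribution. The sketch below (not part of the textbook; the die example is carried over from above) verifies each rule for a fair six-sided die using exact fractions:

```python
# Check the three rules of probability for a fair six-sided die,
# whose distribution assigns 1/6 to each face.
from fractions import Fraction

distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# Rule 1: every probability is a number between 0 and 1.
assert all(0 <= p <= 1 for p in distribution.values())

# Rule 2: the probabilities of all possible outcomes sum to 1.
assert sum(distribution.values()) == 1

# Rule 3 (Complement Rule): P(not rolling a 6) = 1 - P(rolling a 6).
p_not_six = sum(p for face, p in distribution.items() if face != 6)
assert p_not_six == 1 - distribution[6]

print(p_not_six)  # 5/6
```

Using `Fraction` instead of floating-point numbers keeps the arithmetic exact, so Rule 2 holds with equality rather than only approximately.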
3.2 Sampling from a Population
Very rarely do we have access to an entire population (it may be too large, or we may not have the resources to observe every member), so we take samples that should be representative of that population. If we had access to the entire population, we wouldn't need statistical tests or analysis; we would simply make observations on the whole population. The goal of statistical analysis is to determine whether what we see in a sample is likely to hold in the population. For example, if we observe a common trend among 100 BYU-Idaho students, can we assume that trend will hold for all BYU-Idaho students? Or, as a made-up larger-scale example, suppose 1,000 Toyota Camrys are tested and 1% of them are found to have a defect in the braking system. Can Toyota assume that 1% of all Toyota Camrys will have the same defect? Through statistical analysis we are able to obtain answers to these questions.
Hopefully this helps you see the importance of sampling. Even if we aren’t able to observe every member of a population, through proper sampling and statistical analysis we are able to gain insight into the population as a whole. For those insights to be valid, however, the sampling must be done in a statistically correct way. Not just any sample can be taken, so methods for sampling have been developed. Some of the more common methods are shown below.
There are many sampling methods used to obtain a sample from a population:
- A simple random sample (SRS) is a random selection taken from a population
- A systematic sample is every kth item in the population, beginning at a random starting point
- A cluster sample is all items in one or more randomly selected clusters, or blocks
- A stratified sample divides data into similar groups and an SRS is taken from each group
- A convenience sample is one easily obtained in a less-than-systematic way and should be avoided whenever possible
3.2.1 Randomness
A BYU-Idaho student was overheard saying, “I went shopping and bought some random items.” Did the person actually take a random sample of the items at the store? Did they write all the items down and randomly select the items for purchase? Of course not!
What did the student mean? That the items they bought seemed unrelated. When we consciously or subconsciously choose a sample, it is not random.
What does it mean to be random? Something random is not simply haphazard and without pattern. Every random process follows a very distinct pattern over time: the distribution of its outcomes. For example, if you roll a die thousands of times, about one-sixth of the time you will roll a four. This is a very clear pattern, or part of a pattern. The entire pattern (that is, the entire distribution) is that each number on the die is rolled about one-sixth of the time.
But the patterns followed by random processes differ from other kinds of patterns. Other patterns can be completely predictable, such as the color pattern red, yellow, blue, red, yellow, blue, and so on. If you're following this pattern and happen to see yellow, you know the next color will be blue. By contrast, you never know what you will get on the next roll of a six-sided die. You only know that in the long run you will roll fours about one-sixth of the time.
When something is random, we can be sure that it follows a long-term pattern, its distribution. We just never know what the outcome of the next experiment will be.
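The long-run pattern described above can be seen in a short simulation (an illustrative sketch, not part of the textbook; the seed and number of rolls are arbitrary choices):

```python
import random

random.seed(42)  # arbitrary seed, for reproducibility only

# Roll a fair six-sided die many times and count the fours.
# No single roll is predictable, but the long-run proportion of
# fours settles near the distribution's value of 1/6.
rolls = 60_000
fours = sum(1 for _ in range(rolls) if random.randint(1, 6) == 4)
proportion = fours / rolls

print(round(proportion, 3))  # close to 1/6, i.e. about 0.167
assert abs(proportion - 1/6) < 0.01
```

Increasing `rolls` pulls the observed proportion even closer to 1/6, which is exactly the distinction drawn above: the next outcome is unknown, but the distribution is certain.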