Lesson 2: The Statistical Process & Design of Studies

Lesson Outcomes

By the end of this lesson, you should be able to:

Describe the five steps of the Statistical Process
Distinguish between an observational study and an experiment
Differentiate between a population and a sample
Describe each of the following sampling schemes:
1. Simple random sampling
2. Stratified sampling
3. Systematic sampling
4. Cluster sampling
5. Convenience sampling
Explain the importance of using random sampling
Distinguish between a quantitative and a categorical variable

Introduction

Statistics are used in every aspect of society. Every statistical analysis follows a pattern we will call the Statistical Process. This process will be introduced in this lesson and will be used throughout the course.

The Statistical Process and Daniel’s Experiment

Stained-glass depiction of Daniel’s deliverance from the lions’ den. Found in the old Dominican priory church at Hawkesyard in Staffordshire, England. (Photo credit: Fr Lawrence Lew, O.P. Used by permission.)

The Old Testament prophet Daniel planned one of the earliest recorded scientific research studies. We will use his example to illustrate the following five steps of The Statistical Process.

The following icons can help you remember these steps. Notice that each icon has a letter and an image to help you remember the five steps of the Statistical Process.

The Statistical Process
	Design the Study
	Collect the Data
	Describe the Data
	Make Inference
	Take Action

Step 1: Design the Study

An important step in scientific inquiry or problem solving can be to state a research question such as:

Will internet advertising increase a company’s revenue?
Does expressing gratitude increase a person’s satisfaction with life in general?
Does a newly developed vaccine prevent the spread of disease?

Researchers also investigate the background of the situation. What have other people discovered about this situation? How can we find the answer to the research question? What do we need to do? What is the population (or total collection of all individuals) under consideration? What kind of data need to be collected?

Before collecting data, researchers make a hypothesis, or an educated guess about the outcome of their research. A hypothesis is a statement such as the following:

Using internet advertising will increase the company’s sales revenue.
People who express gratitude will be more satisfied with life than those who do not.
A newly-developed vaccine is effective at preventing tuberculosis.

Daniel’s Experiment

After taking Israel captive, Babylon’s King Nebuchadnezzar asked his chief officer to bring Israelite children who were “well favoured, and skilful in all wisdom, and cunning in knowledge, and understanding science” to stand in the king’s palaces (Daniel 1:4). To aid their preparation, Nebuchadnezzar planned to feed them his meat and wine for three years (Daniel 1:5).

Daniel did not want to defile himself by partaking of the king’s meat and wine. He asked permission to eat pulse¹ and drink water instead. His supervisor, Melzar, was afraid to displease the king. He thought that after eating pulse and drinking water, the selected Israelites would look worse than their peers, and he would be punished (Daniel 1:8-10).

With an understanding of the background of the situation, Daniel proposed an experiment. He said, “Prove thy servants, I beseech thee, ten days; and let them give us pulse to eat, and water to drink. Then let our countenances be looked upon before thee, and the countenance of the children that eat of the portion of the king’s meat: and as thou seest, deal with thy servants” (Daniel 1:12-13). In short, Daniel’s implied research question can be stated as: Will those who eat pulse and drink water appear healthier than those who eat the king’s meat and drink his wine? Melzar agreed to the experiment.

Answer the following question:

What is Daniel’s hypothesis?

Step 2: Collect Data

When designing a study, much attention is given to the process by which data are observed. When examining data, it is also important to understand the data collection procedures. A sample is a subset (a portion) of a population. How is this sample obtained? How are the observations made?

Daniel’s study design required that data be collected at the end of 10 days. Melzar would compare the appearances of two groups of people: (1) Israelites who ate pulse and drank water versus (2) Israelites who ate the king’s meat and drank his wine.

Step 3: Describe the Data

When we describe data, we use any tools appropriate to the situation. This can include creating graphs or calculating statistics to help understand or visualize the data.

For Daniel’s experiment, the data are described in Daniel 1:15: “And at the end of ten days [the] countenances [of those who ate pulse] appeared fairer and fatter in flesh than all the children which did eat the portion of the king’s meat.”

Step 4: Make Inferences

Inference is the process of using the information contained in a sample from a population to make a general statement (i.e. to infer something) about the entire population. Later in the course we will learn techniques that make this type of analysis possible.

Melzar made an inference. Based on the results of the sample, he determined that (in general) those who eat pulse and drink water will be healthier than those who eat the king’s meat and drink his wine (Daniel 1:15-16).

Step 5: Take Action

The goal of a statistical analysis is to determine which action to take in a particular situation. Actions can include many things: launching an internet ad campaign (or not), expressing gratitude (or not), getting vaccinated (or not), etc.

Melzar took action as described in Daniel 1:16: “Thus Melzar took away the portion of their meat, and the wine that they should drink; and gave [all the Israelite children] pulse.”

Was the experiment a success?

“Now at the end of the days that the king had said he should bring them in… the king communed with them; and among them all was found none like Daniel, Hananiah, Mishael, and Azariah… And in all matters of wisdom and understanding, that the king enquired of them, he found them ten times better than all the magicians and astrologers that were in all his realm” (Daniel 1:18-20).

Summary of the Statistical Process

Daniel’s experience can also help you learn the Statistical Process. Look at the first letter of each of the steps in the Statistical Process. You can use the phrase “Daniel Can Discern More Truth” to help you remember the five steps in the Statistical Process.

The Statistical Process

	Mnemonic	Actual Process Step
Step 1:	Daniel	Design the study
Step 2:	Can	Collect data
Step 3:	Discern	Describe the data
Step 4:	More	Make inferences
Step 5:	Truth	Take action

The Statistical Process will be used throughout the course. Take time to memorize the five steps.

The study designed by the Old Testament prophet Daniel provides an ancient example of a designed experiment. Daniel’s experiment included two groups of people: those who received the experimental treatment of eating pulse and drinking water (called the treatment group) and those who ate the standard food, the king’s meat (called the control group). The treatment group receives the experimental procedure. The control group is used for comparison.

Answer the following question:

Why was it important that Daniel’s experiment included a control group?

Design of Studies

Most research projects can be classified into one of two basic categories: observational studies or designed experiments. In an experiment, researchers control (to some extent) the conditions under which measurements are made. In an observational study, researchers simply observe what happens, without controlling the conditions under which measurements are made. Both types of study follow the five steps of the Statistical Process.

Designed Experiments

In a designed experiment, researchers manipulate the conditions that the participants experience. They often do this by randomly assigning subjects to one of two groups: a “treatment” group (sometimes called the experimental group) and a “control” group (though this could be a second treatment group instead of a control group). The experiment is typically conducted by applying some kind of treatment to the subjects in the treatment group and observing the effect of the treatment. Those in the control group do not receive the treatment and are also observed. In this way, researchers can determine the effects of the treatment by comparing the treatment group results to the control group results. The following example illustrates the use of these two groups.

Jonas Salk’s First Polio Vaccine Trial

Beginning around 1916 and through the 1950s, a mysterious plague attacked infants and children. Symptoms included excruciating muscle pain and a stiff neck. This illness, which became known as poliomyelitis or simply “polio,” left children disfigured, paralyzed, and sometimes even dead.

While working as a researcher at the University of Pittsburgh School of Medicine, Dr. Jonas E. Salk developed a vaccine that might help prevent the spread of this disease. He conducted what has become one of the most famous designed experiments in history.

The short video below provides a compelling summary of the famous Jonas Salk vaccine experiment. As you watch, notice each of the five steps of the Statistical Process in this study.

As explained in the video, in the first Salk trial almost 1.1 million children participated in the study. Even though the sample size was large, flaws in the study design rendered the results useless.

Undaunted, Dr. Salk fixed the problems with the design and enrolled hundreds of thousands of additional children for the second phase of his study. In all, over 1.8 million infants and children participated in this experiment, making it the largest drug trial to date.

Step 1: Design the study.

The participants in a study are commonly called subjects. Sometimes subjects are called experimental units or simply units. In the Salk trials, the children who participated were the subjects.

Subjects (the children) were randomly assigned to one of two groups. The first group was given the experimental vaccine, the treatment. The treatment is the new or experimental condition that is imposed on the subjects. The subjects who receive the treatment make up the treatment group.

The second group was given a control or placebo. In this study, the control was an injection that looked just like the vaccine, but contained a harmless saline solution. The control group or placebo group is made up of the subjects assigned to receive the control.

This study was double-blind. Neither the children’s parents nor their doctors knew whether a particular child received the treatment or the control. Both parties were blinded to this information.

Because the children were assigned to the groups randomly, the two groups should be similar. If the vaccine is not effective, the number of future cases of polio should be about the same in each group. However, if Salk’s vaccine helped to prevent the spread of polio, then fewer cases should occur in the vaccinated group.
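The random assignment described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the actual trial: the subject IDs, the group sizes, the seed, and the function name `randomly_assign` are all hypothetical.

```python
import random

def randomly_assign(subject_ids, seed=None):
    """Shuffle the subjects and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)     # a seed makes the illustration reproducible
    shuffled = list(subject_ids)  # copy so the caller's data is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "placebo": shuffled[half:]}

# Hypothetical example: ten children identified by the numbers 1 through 10.
groups = randomly_assign(range(1, 11), seed=2024)
print(groups)
```

Because chance alone decides who lands in each group, characteristics such as age or prior health should be spread roughly evenly across the two groups, which is what makes the later comparison fair.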

Answer the following questions:

Some children can be identified as having a higher risk of developing polio. Would it have been better if they were assigned to the treatment group so they could get the vaccine?

Why is it important for the subject and those who assess the health of the subject to be unaware of whether or not that child received the vaccine?

Subjects: Suppose a subject in the study thinks they’re being treated. It has been documented that subjects with such knowledge tend to show improvement whether they are receiving the treatment or not. To see why, consider how you might feel and act if you were told you had been vaccinated. You might have a more hopeful outlook, leading to healthier living habits such as better hygiene and nutrition. Such changes would tend to reduce your chance of contracting polio whether you’ve received the vaccine or not. This might make the vaccine look like it works better than it does. It also might make the vaccine look like it works, even if it doesn’t.
Now suppose subjects in the control group know they are not being treated. This can also change the way they feel and act, in ways that can make them more likely to contract polio than they would be if they weren’t in the study. This could make it look like the incidence of polio among unvaccinated persons is higher than it is, again making the vaccine look like it works better than it does.
To reduce bias caused by such errors, subjects should not know to which group they are assigned.
Researchers: Suppose a researcher assessing the health of a subject is told that the subject is in the control group. It has been documented that in such a case, the researcher is more likely to record that the subject has symptoms even if the subject is not actually in the control group. This makes it look like unvaccinated persons are more likely to get polio than they really are, which makes it look like the vaccine works better than it does.
There are other effects of knowing to which group the subject belongs, such as doctors treating or advising the patient differently than they would without such knowledge. Such differences can make it harder to tell whether the vaccine works, and how well.
To reduce bias caused by such effects, those assessing the health of the subjects should not be told to which group the subject belongs.

Step 2: Collect data.

The researchers followed up with each child to determine if they contracted polio. They recorded the number of children in each group that developed polio during the study period. Not all of Salk’s experiments were double-blind. Here is a summary of the results from the regions where a double-blind study was conducted (Francis et al., 1955; Brownlee, 1955):

Children Who Developed Polio
Group	Yes	No	Total
Treatment Group	57	200,688	200,745
Placebo Group	142	201,087	201,229

Step 3: Describe the data.

One way to summarize the data is to compute the proportion of children in each group that developed polio. The proportion of children in the treatment group that developed polio during the study period is:

\[ \frac{57}{200{,}745} \approx 0.0002839 \]
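The same arithmetic can be checked with a short Python sketch. The helper name `proportion` is hypothetical; the counts come from the table above.

```python
def proportion(cases, group_size):
    """Fraction of a group that developed the condition."""
    return cases / group_size

# Treatment group: 57 cases of polio among 200,745 children.
treatment_proportion = proportion(57, 200_745)
print(f"{treatment_proportion:.7f}")  # prints 0.0002839
```

The same function can be applied to any other row of the table by passing that row's count and total.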

Answer the following questions:

Calculate the proportion of children in the placebo group that developed polio during the study period.

Compare the two proportions. What do you observe?

Step 4: Make inferences

Careful statistical analysis of the records suggested that this difference was so great that it was attributable to the vaccine and not to chance. Assuming that the vaccine had no effect, the probability that the difference in the proportions between the two groups would be at least as extreme as the difference Dr. Salk observed was very low: 0.00000000093. Because this probability is so small, it is highly unlikely that these results are due to chance.
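The lesson does not say which method produced this probability. As a hedged illustration, the sketch below applies one common approach, a one-sided pooled two-proportion z-test, to the counts in the table above. The choice of test and the function name `two_proportion_z_test` are assumptions, not something stated in the source.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """One-sided pooled z-test that group 1's proportion is lower than group 2's."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # overall proportion if the groups do not differ
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    # Lower-tail area of the standard normal distribution, P(Z <= z).
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_value

# Treatment: 57 of 200,745; placebo: 142 of 201,229.
z, p_value = two_proportion_z_test(57, 200_745, 142, 201_229)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.2e}")
```

Under these assumptions the test statistic is about six standard errors below zero, and the resulting probability is of the same tiny order of magnitude as the value quoted above.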

Step 5: Take action

Once it was clear that the vaccine was effective, children who were unvaccinated or had received the placebo were given Salk’s vaccine. Since 1954, there has been a marked decrease in the number of polio cases worldwide (Offit, 2005). Public health researchers are striving to eradicate this disease entirely.

Observational Studies

In an observational study researchers observe the responses of the individuals, without controlling the conditions experienced by the individuals. Therefore, they do not assign the participants to treatment or control groups.

Observational studies commonly occur in business settings. One example is a financial audit. The purpose of a financial audit is to assess the accuracy of a company’s financial business practices. ImmunAvance Ltd., a non-government health care organization, hired the Accounting Office at Global Optimization Unlimited to perform an independent audit of their financial practices. ImmunAvance provides inoculation and other preventative health care services in rural African communities.

Step 1: Design the study

The volume of financial transactions conducted by ImmunAvance makes it impossible to conduct a census, that is, an examination of the entire collection of ImmunAvance’s financial documents. Instead, the auditors collect a manageable group of items (called the sample) from the entire collection of financial documents (called the population). A sample is a subset or a portion of a population. The information gained from the sample is used to make an inference (or generalization) about the population.

Auditors typically cannot consider every item in a population, because there are too many. When it is not possible to conduct a census, auditors face sampling risk. Sampling risk is the risk associated with not auditing every item in the population. It is the risk that the sample may not adequately reflect the population. The only way to eliminate sampling risk is to conduct a census, which is usually not practical. Auditors can reduce sampling risk by obtaining a sample randomly. This is called random selection. Another way to reduce sampling risk is to increase the sample size, the number of items sampled.
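A small simulation can illustrate how a larger sample size reduces sampling risk. Everything here is made up for illustration: the population of 10,000 documents, the 5% error rate, the two sample sizes, and the helper name `spread_of_sample_error_rates`.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the illustration is reproducible

# Hypothetical population: 10,000 documents, 5% of which contain an error (1 = error).
population = [1] * 500 + [0] * 9_500

def spread_of_sample_error_rates(sample_size, repetitions=500):
    """Standard deviation of the error rate across many simple random samples."""
    rates = [
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(repetitions)
    ]
    return statistics.stdev(rates)

small = spread_of_sample_error_rates(25)
large = spread_of_sample_error_rates(400)
print(f"n = 25:  spread of sample error rates = {small:.4f}")
print(f"n = 400: spread of sample error rates = {large:.4f}")
```

The error rates from the larger samples cluster more tightly around the true rate of 5%, so any single large sample is less likely to misrepresent the population.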

Sampling Methods

Step 2: Collect data

There are several procedures that can be used to select a sample from a population, including simple random sampling (SRS), stratified sampling, systematic sampling, cluster sampling, and convenience sampling (or haphazard sampling). These are examples of sampling methods. All of them except convenience sampling use random selection.

Random Sampling Methods

Taking a simple random sample (SRS) is the best method for obtaining a sample from a population. This method gives each possible sample of a given size an equal chance of being selected. A difficulty of this method is that a list of all of the items in the population must be accessible before the sample is taken. Often, we obtain an SRS by allowing a computer to randomly select a certain number of items from the full list of the population. It is akin to the idea of putting all of the names into a hat, shaking them up, and randomly drawing out a few.

For example, suppose there are 18,000 students in the population of a certain university. School officials can use a computer to randomly choose values between 1 and 18,000 to identify which students are to be selected to complete a survey. In Excel, the command to obtain a random number between 1 and 18,000 is =RANDBETWEEN(1,18000). A simple random sample can be obtained any time there is a complete list of the items to be sampled and they are all accessible. All the statistical procedures in this course assume that simple random sampling has been used. But in practice, the SRS is often difficult (or impossible) to implement.
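The university example can also be sketched in Python. The sample size of 50, the seed, and the function name `simple_random_sample` are hypothetical choices for illustration.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw distinct ID numbers from 1..population_size, each subset equally likely."""
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical survey: select 50 of the 18,000 students by ID number.
selected = simple_random_sample(18_000, 50, seed=7)
print(sorted(selected))
```

One design note: `sample` draws without replacement, so no student is chosen twice, whereas repeated calls to a single random-number command can return the same number more than once.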

In a stratified sample, the items to be sampled are organized into groups of homogeneous (similar) items called strata, and then a simple random sample is drawn from each stratum. Stratified sampling works well when the items are similar within each stratum and tend to differ from one stratum to another. We often use stratified sampling so that we can make comparisons between the groups (or strata).

For example, in obtaining a sample of students from a university, school officials could define the strata as: (1) freshmen, (2) sophomores, (3) juniors, and (4) seniors. A simple random sample could then be obtained from each of these strata. This would ensure that each class rank of students was represented in the sample. It would also allow the school officials to see how freshman, sophomore, junior, and senior level students compared in their answers to a survey.
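A minimal sketch of this stratified scheme follows. The student rosters, the stratum sizes, the five students drawn per stratum, and the function name `stratified_sample` are all invented for illustration.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }

# Hypothetical rosters: student labels grouped by class rank.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 501)],
    "sophomores": [f"So{i}" for i in range(1, 451)],
    "juniors": [f"J{i}" for i in range(1, 401)],
    "seniors": [f"Sr{i}" for i in range(1, 351)],
}

sample = stratified_sample(strata, per_stratum=5, seed=11)
for class_rank, students in sample.items():
    print(class_rank, students)
```

Every class rank is guaranteed to appear in the result, which is what allows the groups to be compared afterward.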

In a systematic sample, every \(k^{\text{th}}\) item in the population is selected to be part of the sample, beginning at a random starting point. Systematic sampling works well when the items are arranged sequentially but in random order. If the items are not arranged randomly, a systematic sample can miss important parts of the population.

For example, consider a fast food company where every 10th customer is given the opportunity to complete a satisfaction survey in exchange for a small discount coupon towards their next purchase. An airport security line also often implements a procedure where every 100th (or so) person is selected for a more “in depth” security examination. Similarly, factories that use assembly lines will pull, say, every 500th item from the assembly line to perform a quality control check on the item.
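The fast food example can be sketched as follows. The list of 100 customers, the seed, and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(items, k, seed=None):
    """Select every k-th item, beginning at a random position among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting index from 0 to k - 1
    return items[start::k]

# Hypothetical line of 100 customers, numbered in order of arrival.
customers = list(range(1, 101))
chosen = systematic_sample(customers, k=10, seed=3)
print(chosen)
```

Only the starting point is random. After that, the selection is fully determined by the order of the items, which is why the ordering matters so much for this method.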

A cluster sample (sometimes called a block sample) consists of taking all items in one or more randomly selected clusters, or blocks. When the variation from one block to another is relatively low, compared to the variation within the block, cluster sampling is a reasonable way to get a sample.

For example, ecologists could draw grids on a map of a forest to create small sampling regions, or sampling clusters. Then, by randomly selecting one or two of these clusters from the map, the ecologists could go to the areas marked on the map and document information on the health of every tree they find in those clusters. This is a practical way to get a sample in this case because the ecologists only have to go to a few areas of the forest, but are still able to obtain a random sample of the trees in the forest. It is also worth noting that the ecologists would not be interested in comparing the health of the trees from the selected clusters to each other, as they would be in a stratified sample. Instead, they are looking for a feasible way to obtain a single random sample of the trees in the forest while keeping their travel time to a minimum.

In contrast, to obtain a simple random sample of trees from the same forest, the ecologists would first have to go out and number every tree in the entire forest. Then they would need to use a computer to randomly pick which trees to collect data on. Finally, they would have to go back to the forest and collect data on the selected trees from across the entire forest. Such an approach just isn’t feasible in practice, so we are willing to settle instead for the cluster sample.
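The forest example can be sketched in Python. The 4-by-4 grid, the five trees per cell, the labels, and the function name `cluster_sample` are all hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and keep every item inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    sampled_items = [item for name in chosen for item in clusters[name]]
    return chosen, sampled_items

# Hypothetical forest map: a 4-by-4 grid of cells, each containing five trees.
clusters = {
    f"cell-{row}{col}": [f"tree-{row}{col}-{t}" for t in range(1, 6)]
    for row in "ABCD"
    for col in "1234"
}

chosen, trees = cluster_sample(clusters, num_clusters=2, seed=5)
print(chosen)
print(trees)
```

The randomness is applied to the clusters rather than to individual trees, so the ecologists only need a list of grid cells, not a list of every tree in the forest.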

A convenience sample involves selecting items that are relatively easy to obtain and does not use random selection to choose the sample. This method of sampling can be assumed to always bring bias into the sample.

As an example of a convenience sample, an auditor could haphazardly select items from a filing cabinet. This is frequently done when a quick and simple sample is needed, but may not yield a sample that represents the population well. When possible, convenience samples should be avoided.

Types of Data

Whenever we collect data, we record information about the things we are studying. There are two basic types of data that can be recorded: quantitative measurements and categorical labels. We will call these types of data simply “quantitative” or “categorical” variables. We use the word “variable” to denote the idea that the quantitative measurements or categorical labels can vary from person to person, or item to item, in our study.

Quantitative variables provide measurement information on each individual (or item) in our study. They represent things that are numeric in nature; things that are measured. They often include units of measurement along with the quantitative value of the measurement. Examples include the heights of children measured in inches (or centimeters) and their weights measured in pounds (or kilograms). For a quantitative variable, it makes sense to apply arithmetic operations to the data (such as adding values together, computing the average of the values, or comparing two values). If one child weighs 30 pounds (13.61 kg) and a second child weighs 60 pounds (27.22 kg), then the second child is twice as heavy as the first.

Categorical variables allow us to place each individual (or item) into a specific category. Categorical variables are labels, and it does not make sense to do arithmetic with them. For example, the gender of a newborn child, the ethnicity of an individual, a person’s job title, the brand of phone they own, and the area code of a telephone number are all categorical variables. Notice that although a telephone number consists of numbers, it is not a quantitative measurement. It does not make sense to double someone’s phone number, to average phone numbers together, or to say one phone number is half the size of another. But the area code of the phone number gives information about the region where the phone number was first issued, which is categorical information.
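The distinction shows up in how each type of variable is summarized. In the sketch below, the three records and the field names `weight_lb` and `area_code` are made up for illustration.

```python
import statistics
from collections import Counter

# Hypothetical records: one quantitative and one categorical variable per child.
children = [
    {"weight_lb": 30, "area_code": "208"},
    {"weight_lb": 60, "area_code": "801"},
    {"weight_lb": 45, "area_code": "208"},
]

# Quantitative: arithmetic such as averaging is meaningful.
weights = [child["weight_lb"] for child in children]
mean_weight = statistics.mean(weights)
print(mean_weight)  # prints 45

# Categorical: the values are labels, so we count them instead of averaging.
area_code_counts = Counter(child["area_code"] for child in children)
print(area_code_counts)
```

Storing the area codes as text rather than as numbers is a deliberate choice here: it makes it harder to average them by accident.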

In Unit 3 of this course we will learn more about categorical variables and proportions. Units 1 and 2 of this course focus on studying quantitative variables.

Returning to the sample accounts receivable record, we find this data to have information on both types of variables.

Answer the following question:

For each of the following variables taken from this accounts receivable record, indicate whether the variable is quantitative or categorical.

Terms
Account number
Invoice amount

Step 3: Describe the data

After auditors collect a sample and compile the data, they review the evidence. Auditors may use graphs or compute numbers (such as the average) to summarize the evidence they found.

Making Inferences: Hypothesis Testing

Step 4: Make inferences

Auditors use the information drawn from the sample to form an opinion about the population. Whenever sample data are used to infer a characteristic of a population, it is called making an inference. Inferential statistics is a collection of methods that can be used to make inferences about a population. Based on the documents reviewed, the auditors assess whether the company is conducting its business in a proper manner.

When conducting an audit, the implicit assumption is that transactions have been posted properly. As auditors sample the company’s records, they are looking to see if everything is consistent with the original assumption that all transactions have been posted properly. It would only be in the case of discovering suspicious activity or evidence of fraudulent reporting that the auditors would change their belief about the company and accuse the company ImmunAvance of falsely reporting on their financial statements.

“Piled Higher and Deeper” by Jorge Cham

There is a formal procedure for determining when enough evidence has been found to make accusations of fraud. Later this semester, after we establish some foundational principles of statistics, we will study these statistical methods in depth. Of course, these methods can be used for much more than just determining if a company has reported their financial statements fraudulently. So we will look at many different ways these statistical procedures can be applied to research and industry.

In ImmunAvance’s audit, the sampled financial statements contained a few errors, but the evidence was not strong enough to claim that the company had been fraudulent. So the company passed its audit.

Step 5: Take Action

The auditors prepare a report in which they give their opinion on the status of the company’s current operations.

Since there was not enough evidence to suggest that ImmunAvance’s financial statements were fraudulent, the auditors’ conclusion is that no adjustment is necessary. The few observed discrepancies were apparently just the result of random chance errors, not the deliberate falsifying of information.

Summary

Remember…

The Statistical Process has five steps: Design the study, Collect the data, Describe the data, Make inferences, Take action. These can be remembered by the mnemonic “Daniel Can Discern More Truth.”
In a designed experiment, researchers control the conditions of the study, typically with a treatment group and a control group, and then observe how the treatments impact the subjects. In a purely observational study, researchers don’t control the conditions but only observe what happens.
The population is the entire group of all possible subjects that could be included in the study. The sample is the subset of the population that is actually selected to participate in the study. Statistics use information from the sample to make claims about what is true about the entire population.
There are many sampling methods used to obtain a sample from a population. The best methods use some sort of randomness (like pulling names out of a hat, rolling dice, flipping coins, or using a computer generated list of random numbers) to avoid bias.

A simple random sample (SRS) is a random sample taken from the full list of the population. This is the least biased (best) sampling method, but can only be implemented when a full list of the population is accessible.
A stratified sample divides the population into similar groups and then takes an SRS from each group. The main reason to use this sampling method is to compare and contrast certain groups within the population, for example freshmen, sophomores, juniors, and seniors at a university.
A systematic sample selects every \(k^{\text{th}}\) item in the population, beginning at a random starting point. This is best applied when subjects are lined up in some way, like at a fast food restaurant, an airport security line, or an assembly line in a factory.
A cluster sample consists of taking all items in one or more randomly selected clusters, or blocks. For example, ecologists could draw grids on a map of a forest to create small sampling regions and then sample all trees they find in a few randomly selected regions. Note that this differs from a stratified sample in that only a few sub-groups (clusters) are selected and that all subjects within the selected clusters are included in the study.
A convenience sample involves selecting items that are relatively easy to obtain and does not use random selection to choose the sample. This method of sampling can be assumed to always bring bias into the sample.

The best way to avoid bias when trying to make conclusions about a population from a single sample of that population is to use a random sampling method to obtain the sample.
Quantitative variables represent things that are numeric in nature, such as the value of a car or the number of students in a classroom. Categorical variables represent non-numerical data that can only be considered as labels, such as colors or brands of shoes.

References

Bible Dictionary, “Pulse” at http://churchofjesuschrist.org/scriptures/bd/pulse.

Brownlee, K. A. (1955). Statistics of the 1954 polio vaccine trials. Journal of the American Statistical Association, 50(272), 1005-1013.

Francis, T., et al. (1955). An evaluation of the 1954 poliomyelitis vaccine trials. American Journal of Public Health and the Nation’s Health, 45(5).

Offit, P. A. (2005). Why are pharmaceutical companies gradually abandoning vaccines? Health Affairs, 24(3), 622-630. doi:10.1377/hlthaff.24.3.622

Additional Reading

Sampling Chapter.pdf

¹ churchofjesuschrist.org definition of pulse