By the end of this lesson, you should be able to do the following:
President Gordon B. Hinckley said, "My plea is that we stop seeking out the storms and enjoy more fully the sunlight. I am suggesting that as we go through life, we ‘accentuate the positive.’ I am asking that we look a little deeper for the good, that we still our voices of insult and sarcasm, that we more generously compliment and endorse virtue and effort" (Standing for Something, 2000, p.101).
Summarize the relevant background information
Robert Emmons and Michael McCullough investigated the effects of gratitude on people’s perception of life as a whole. In a study of \(n=192\) undergraduates, the participants were randomly assigned to one of three groups.
In addition to the weekly record of the five things, the participants recorded their level of satisfaction with life in general. (Higher values are more favorable.) Reports were collected for nine weeks, and the overall level of satisfaction with life as a whole was recorded for each individual. The researchers wanted to determine if there was a difference in the perception of life as a whole between the subjects assigned to each of the three groups. Stated differently, they wanted to determine if expressing gratitude affects a person’s view of life in general.
Here is an excerpt of data representing the results of this study:
Higher values indicate a greater level of satisfaction with life as a whole.
How might we analyze these data? One possible method would be to conduct separate t-tests for all the possible pairs of groups in the study. If we did this, we would need to conduct a separate t-test to compare groups 1 & 2, 1 & 3, and 2 & 3. If the probability of committing a Type I error is \(\alpha = 0.05\) on each of these tests, then the probability that we would commit a Type I error on at least one of the tests is greater than 0.05. We need a hypothesis test that we can use to compare all the groups at once. The procedure that allows us to do this is called Analysis of Variance (ANOVA).
ANOVA is a test for equality of several means. It allows us to compare the means for several groups in one hypothesis test. It might sound intimidating, but ANOVA is simply a way to analyze several means at once. It is based on a comparison of the spread of the data within each of the groups to the spread of the means of the groups.
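To see how quickly the error rate grows, here is a minimal Python sketch (Python is not part of this lesson, which uses Excel). It assumes, for simplicity, that the three pairwise tests are independent. That is not exactly true here because the tests share groups, so treat the result as a rough illustration only.

```python
from math import comb

alpha = 0.05                      # Type I error rate for each individual t-test
num_groups = 3                    # groups in the gratitude study
num_tests = comb(num_groups, 2)   # pairs: (1, 2), (1, 3), (2, 3)

# Simplifying assumption: the tests are treated as independent.
# P(at least one Type I error) = 1 - P(no Type I error on any test)
familywise_error = 1 - (1 - alpha) ** num_tests

print(num_tests)
print(round(familywise_error, 4))
```

Under that assumption, the chance of at least one Type I error across the three tests is roughly 0.14, nearly three times the nominal 0.05.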
In an ANOVA test, the null hypothesis is typically expressed in words: \[H_0: \text{All the means are equal.}\] The alternative hypothesis is given as: \[H_a: \text{One or more of the means differs from the others.}\]
If the means differ from each other in comparison to the variability in each group, then we conclude that the means are not all equal. If the means do not differ by much (when compared to the spread of the data in each group) then we do not reject the hypothesis that all the means are equal.
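The comparison described above can be sketched in Python. This is an illustrative implementation of the usual one-way ANOVA \(F\)-statistic, not the formula sheet used by the course software; the function name `f_statistic` and the toy data are made up for this example.

```python
def f_statistic(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Spread of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Spread of the observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up toy data (not from the study)
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df1, df2 = f_statistic(toy)
print(f, df1, df2)
```

A large \(F\) means the group means are spread out relative to the variability inside each group, which is evidence against the null hypothesis.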
We will use the level of significance, \(\alpha\), and the \(P\)-value just as we have in the other hypothesis tests.
The test statistic in ANOVA follows an \(F\)-distribution. This is the first time we have encountered this distribution. In previous tests, we have used the test statistics \(z\) and \(t\). For the ANOVA test, we use the \(F\)-statistic.
Here is a brief summary of the characteristics of the \(F\)-distribution:
There are three requirements of ANOVA that must be checked:
The data in each group used in the ANOVA test must be a simple random sample from its respective population.
The data are normally distributed in each group.
We can check this by creating a histogram for each group (separately). Unless one or more of the groups’ histograms provides strong evidence to the contrary, it is reasonable to conclude that the data are sufficiently normal to proceed with the analysis. We can take this approach because ANOVA is robust to violations of requirements. In other words, results from ANOVA tests are reasonably good even if there are mild to moderate violations of the requirements.
The population variances are equal for each group.
This requirement is checked by examining the sample variances. The rule we will use is: if the largest sample variance is less than or equal to four times the smallest sample variance, then we will conclude that the population variances are equal.
If done by hand, the calculations for one ANOVA problem can easily require an hour of hard work. We will use software to do these calculations quickly and accurately.
To conduct a test for several means (ANOVA) in Excel, do the following:
Open the file Math 221 Statistics Toolbox and do the following:
The following points will help you interpret the output.
The “Numerical Summary of the Data” (Step 3) section of the output gives the sample size, mean and standard deviation for each of the groups in your sample.
In the ANOVA table, you will find the test statistic (\(F\)), the \(P\)-value (Sig.), and the degrees of freedom (df) for the \(F\) statistic. Note that there are two numbers specifying the degrees of freedom. These are given as the between groups and within groups df, respectively. Do not worry about the total degrees of freedom. This number is the sum of the other two.
To check your requirements, do the following:
To determine if the data are normally distributed, we will look at a histogram plot for each group separately.
To determine if the population variances are equal, we will use a very simple check:
We will conduct a hypothesis test to determine if the mean responses of the individuals in the three groups differ.
State the null and alternative hypotheses and the level of significance \[ \begin{align} H_0: & ~ \textrm{All the means are equal} \\ H_a: & ~ \textrm{At least one of the means differs} \end{align} \]
We will use the \(\alpha = 0.05\) level of significance.
Describe the data collection procedures
The students were randomly assigned to one of the three treatments. They wrote in a weekly journal, according to their group assignment. At the end of the semester, they completed a questionnaire that asked about their attitude toward life. The responses on the survey were coded into a number, where higher numbers represent a more positive outlook.
Give the relevant summary statistics
Follow these instructions to apply the ANOVA procedure using Excel:
Data representative of the values reported by Emmons and McCullough are given in the file Gratitude.xlsx. The data are divided into three columns, which represent Grateful, Hassles, and Events.
The summary statistics can be obtained using the Math 221 Statistics Toolbox. Paste the three columns of data into the appropriate areas of columns A, B, and C. The summary statistics will appear in the table to the right labeled “Numerical Summary of the Data”.
The summary statistics are presented in the following table:
Group | N | Mean | Std. Deviation |
---|---|---|---|
Grateful | 64 | 5.050 | 0.9443 |
Hassles | 63 | 4.675 | 0.8320 |
Events | 65 | 4.660 | 0.8483 |
Please do not blindly cut-and-paste computer output. It can include a lot of information that we do not use. Identify the relevant parts and only report those pieces of information.
Make an appropriate graph to illustrate the data
This graph is from the Math 221 Statistics Toolbox, but a title and proper labels have been added so that the chart can be understood without additional text to explain it.
Verify the requirements have been met
A histogram for each of the three sample groups is created (not shown). The data from the Grateful group are slightly skewed right. However, the skew is small enough that we should still be able to get reasonable results. The distributions for the other two groups do not exhibit distinctly non-normal shapes. Therefore, we have not violated the assumption of normally distributed sample data.
The largest sample variance (0.8917, from the Grateful group) is less than four times the smallest (0.6922, from the Hassles group), so we conclude that the population variances are equal for the three groups.
We conclude that the requirements are satisfied and it is appropriate to use ANOVA.
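The equal-variance check can be reproduced from the standard deviations in the summary table. The following Python sketch is only an illustration of the rule used in this lesson; the variable names are chosen for this example.

```python
# Sample standard deviations from the summary table
std_devs = {"Grateful": 0.9443, "Hassles": 0.8320, "Events": 0.8483}

# A variance is the square of the standard deviation
variances = {group: sd ** 2 for group, sd in std_devs.items()}
largest = max(variances.values())
smallest = min(variances.values())

# Rule from this lesson: largest variance <= 4 * smallest variance
equal_variances_ok = largest <= 4 * smallest

print(round(largest, 4), round(smallest, 4), equal_variances_ok)
```

The largest variance is only about 1.3 times the smallest, well inside the factor-of-four rule.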
Give the test statistic and its value
This can be found in the output. Our test statistic, \(F\), is: \[F = 4.075\]
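As a rough cross-check, the \(F\)-statistic can be approximated from the rounded summary statistics alone. This Python sketch assumes the standard one-way ANOVA formulas; because the table values are rounded, the result is expected to agree with the software output only approximately.

```python
# (n, mean, standard deviation) for each group, from the summary table
summary = {
    "Grateful": (64, 5.050, 0.9443),
    "Hassles": (63, 4.675, 0.8320),
    "Events": (65, 4.660, 0.8483),
}

k = len(summary)
n_total = sum(n for n, _, _ in summary.values())
grand_mean = sum(n * mean for n, mean, _ in summary.values()) / n_total

# Between-groups and within-groups sums of squares
ss_between = sum(n * (mean - grand_mean) ** 2
                 for n, mean, _ in summary.values())
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in summary.values())

# F = mean square between / mean square within
f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
print(round(f_stat, 2))
```

The value comes out at about 4.07, close to the reported 4.075; the small gap is most likely due to rounding in the summary table.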
State the degrees of freedom
There are 2 and 189 degrees of freedom.
The order in which these are stated is important. For an F-test, it is not the same to have 2 and 189 degrees of freedom as it is to have 189 and 2 degrees of freedom.
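These two numbers follow from the number of groups and the total sample size. A short Python sketch, using the group sizes from the summary table:

```python
group_sizes = [64, 63, 65]   # Grateful, Hassles, Events

k = len(group_sizes)         # number of groups
n_total = sum(group_sizes)   # total number of subjects

df_between = k - 1           # numerator (between groups) df
df_within = n_total - k      # denominator (within groups) df
df_total = n_total - 1       # sum of the other two

print(df_between, df_within, df_total)
```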
Mark the test statistic and \(P\)-value on a graph of the sampling distribution
Find the \(P\)-value and compare it to the level of significance
\[P\textrm{-value}=0.019 < 0.05 = \alpha\]
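The \(P\)-value is the area to the right of the \(F\)-statistic. When the numerator degrees of freedom equal 2, as in this three-group example, that area has a simple closed form, \(\left(1 + 2F/\text{df}_2\right)^{-\text{df}_2/2}\). The Python sketch below uses it; this shortcut is an assumption that holds only for a numerator df of 2, and the course software is what the lesson actually relies on.

```python
f_stat = 4.075
df_between, df_within = 2, 189

# Right-tail area of the F-distribution.
# This closed form is valid ONLY when the numerator df equals 2.
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

alpha = 0.05
reject_null = p_value < alpha

print(p_value, reject_null)
```

The result is about 0.0185, consistent with the reported 0.019. For other degrees of freedom, a library routine such as `scipy.stats.f.sf(f_stat, df_between, df_within)` computes the same right-tail area, assuming SciPy is available.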
State your decision
Since \(P\)-value\(=0.019 < 0.05 = \alpha\), we reject the null hypothesis.
Present your conclusion in an English sentence, relating the result to the context of the problem
There is sufficient evidence to suggest that at least one of the three groups has a mean level of satisfaction with life that differs from the others. In short, the mean level of satisfaction with life in general is not the same for all three groups.
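If we take a closer look at the sample means, we see that the Hassles and Events groups had means that were fairly close together, while the Grateful group had a noticeably higher mean level of satisfaction than the other two groups. Note that the ANOVA test by itself only tells us that the means are not all equal; it does not identify which ones differ.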
Summarize the relevant background information
Nike, a company that makes sporting goods including shoes, funded a study to compare five soccer shoe designs. The objective of the research was to determine if there is a difference in the mean accuracy soccer players achieve using different Nike shoe designs.
State the null and alternative hypotheses and the level of significance \[ \begin{align} H_0: & \textrm{All the means are equal} \\ H_a: & \textrm{At least one of the means differs} \end{align} \]
We will use the \(\alpha = 0.10\) level of significance.
Describe the data collection procedures
As part of the research, they asked trained soccer players to kick a ball at a target. The target was placed 115 cm above the ground and at a distance of 10 m from the players. Using electronic equipment, the researchers recorded the distance from the center of the target to the point where the ball hit. The objective of the research was to assess if footwear could affect the accuracy of a soccer player.
The subjects wore five different soccer shoes, and for one treatment they kicked the ball in stocking feet. Due to the proprietary nature of the data, the shoes are only labeled “A,” “B,” “C,” “D,” and “E” in the article. Data representing the results of this study are given in the file SoccerShoes.xlsx.
Use the SoccerShoes.xlsx data to answer the following questions.
ANOVA is used to compare the means for several groups. The hypotheses for the test are always: \[ \begin{align} H_0: & ~ \textrm{All the means are equal} \\ H_a: & ~ \textrm{At least one of the means differs} \end{align} \]
For ANOVA testing we use an \(F\)-distribution, which is right-skewed. The \(P\)-value of an ANOVA test is always the area to the right of the \(F\)-statistic.
We can conduct ANOVA testing when the following three requirements are satisfied:
Copyright © 2020 Brigham Young University-Idaho. All rights reserved.