This document uses the Angell dataset from library(car) to test whether geographic mobility differed between cities in the eastern and western U.S. (around 1950).
First, because this file demonstrates the Wilcoxon Rank Sum Test, we need to reduce the data to two groups, East and West. We will do this by combining S and E (the Northeast, which the data code as E) into E and combining MW and W into W. We will use library(tidyverse) and the function recode to do this. Notice how the recode command in the code below modifies the dataset.
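library(car)        # provides the Angell dataset
library(tidyverse)  # provides %>%, mutate(), and recode()
library(pander)     # provides pander() for printing tables

# Collapse the four regions into two areas: S joins E, and MW joins W.
# The region factor has no "NE" level; the Northeast is already coded "E".
Angell2 <- Angell %>%
  mutate(area = recode(region, S = "E", MW = "W"))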
# alternatively we could have used mapvalues() from the plyr package:
# Angell2 <- Angell %>%
#   mutate(area = plyr::mapvalues(region,
#                                 from = c("S", "MW"),
#                                 to = c("E", "W")))
rownames(Angell2) <- rownames(Angell)
pander(Angell)
| | moral | hetero | mobility | region |
|---|---|---|---|---|
| Rochester | 19 | 20.6 | 15 | E |
| Syracuse | 17 | 15.6 | 20.2 | E |
| Worcester | 16.4 | 22.1 | 13.6 | E |
| Erie | 16.2 | 14 | 14.8 | E |
| Milwaukee | 15.8 | 17.4 | 17.6 | MW |
| Bridgeport | 15.3 | 27.9 | 17.5 | E |
| Buffalo | 15.2 | 22.3 | 14.7 | E |
| Dayton | 14.3 | 23.7 | 23.8 | MW |
| Reading | 14.2 | 10.6 | 19.4 | E |
| Des_Moines | 14.1 | 12.7 | 31.9 | MW |
| Cleveland | 14 | 39.7 | 18.6 | MW |
| Denver | 13.9 | 13 | 34.5 | W |
| Peoria | 13.8 | 10.7 | 35.1 | MW |
| Wichita | 13.6 | 11.9 | 42.7 | MW |
| Trenton | 13 | 32.5 | 15.8 | E |
| Grand_Rapids | 12.8 | 15.7 | 24.2 | MW |
| Toledo | 12.7 | 19.2 | 21.6 | MW |
| San_Diego | 12.5 | 15.9 | 49.8 | W |
| Baltimore | 12 | 45.8 | 12.1 | E |
| South_Bend | 11.8 | 17.9 | 27.4 | MW |
| Akron | 11.3 | 20.4 | 22.1 | MW |
| Detroit | 11.1 | 38.3 | 19.5 | MW |
| Tacoma | 10.9 | 17.8 | 31.2 | W |
| Flint | 9.8 | 19.3 | 32.2 | MW |
| Spokane | 9.6 | 12.3 | 38.9 | W |
| Seattle | 9 | 23.9 | 34.2 | W |
| Indianapolis | 8.8 | 29.2 | 23.1 | MW |
| Columbus | 8 | 27.4 | 25 | MW |
| Portland_Oregon | 7.2 | 16.4 | 35.8 | W |
| Richmond | 10.4 | 65.3 | 24.9 | S |
| Houston | 10.2 | 49 | 36.1 | S |
| Fort_Worth | 10.2 | 30.5 | 36.8 | S |
| Oklahoma_City | 9.7 | 20.7 | 47.2 | S |
| Chattanooga | 9.3 | 57.7 | 27.2 | S |
| Nashville | 8.6 | 57.4 | 25.4 | S |
| Birmingham | 8.2 | 83.1 | 25.9 | S |
| Dallas | 8 | 36.8 | 37.8 | S |
| Louisville | 7.7 | 31.5 | 19.4 | S |
| Jacksonville | 6 | 73.7 | 27.7 | S |
| Memphis | 5.4 | 84.5 | 26.7 | S |
| Tulsa | 5.3 | 23.8 | 44.9 | S |
| Miami | 5.1 | 50.2 | 41.8 | S |
| Atlanta | 4.2 | 70.6 | 32.6 | S |
pander(Angell2)
| | moral | hetero | mobility | region | area |
|---|---|---|---|---|---|
| Rochester | 19 | 20.6 | 15 | E | E |
| Syracuse | 17 | 15.6 | 20.2 | E | E |
| Worcester | 16.4 | 22.1 | 13.6 | E | E |
| Erie | 16.2 | 14 | 14.8 | E | E |
| Milwaukee | 15.8 | 17.4 | 17.6 | MW | W |
| Bridgeport | 15.3 | 27.9 | 17.5 | E | E |
| Buffalo | 15.2 | 22.3 | 14.7 | E | E |
| Dayton | 14.3 | 23.7 | 23.8 | MW | W |
| Reading | 14.2 | 10.6 | 19.4 | E | E |
| Des_Moines | 14.1 | 12.7 | 31.9 | MW | W |
| Cleveland | 14 | 39.7 | 18.6 | MW | W |
| Denver | 13.9 | 13 | 34.5 | W | W |
| Peoria | 13.8 | 10.7 | 35.1 | MW | W |
| Wichita | 13.6 | 11.9 | 42.7 | MW | W |
| Trenton | 13 | 32.5 | 15.8 | E | E |
| Grand_Rapids | 12.8 | 15.7 | 24.2 | MW | W |
| Toledo | 12.7 | 19.2 | 21.6 | MW | W |
| San_Diego | 12.5 | 15.9 | 49.8 | W | W |
| Baltimore | 12 | 45.8 | 12.1 | E | E |
| South_Bend | 11.8 | 17.9 | 27.4 | MW | W |
| Akron | 11.3 | 20.4 | 22.1 | MW | W |
| Detroit | 11.1 | 38.3 | 19.5 | MW | W |
| Tacoma | 10.9 | 17.8 | 31.2 | W | W |
| Flint | 9.8 | 19.3 | 32.2 | MW | W |
| Spokane | 9.6 | 12.3 | 38.9 | W | W |
| Seattle | 9 | 23.9 | 34.2 | W | W |
| Indianapolis | 8.8 | 29.2 | 23.1 | MW | W |
| Columbus | 8 | 27.4 | 25 | MW | W |
| Portland_Oregon | 7.2 | 16.4 | 35.8 | W | W |
| Richmond | 10.4 | 65.3 | 24.9 | S | E |
| Houston | 10.2 | 49 | 36.1 | S | E |
| Fort_Worth | 10.2 | 30.5 | 36.8 | S | E |
| Oklahoma_City | 9.7 | 20.7 | 47.2 | S | E |
| Chattanooga | 9.3 | 57.7 | 27.2 | S | E |
| Nashville | 8.6 | 57.4 | 25.4 | S | E |
| Birmingham | 8.2 | 83.1 | 25.9 | S | E |
| Dallas | 8 | 36.8 | 37.8 | S | E |
| Louisville | 7.7 | 31.5 | 19.4 | S | E |
| Jacksonville | 6 | 73.7 | 27.7 | S | E |
| Memphis | 5.4 | 84.5 | 26.7 | S | E |
| Tulsa | 5.3 | 23.8 | 44.9 | S | E |
| Miami | 5.1 | 50.2 | 41.8 | S | E |
| Atlanta | 4.2 | 70.6 | 32.6 | S | E |
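The same four-to-two collapse can also be sketched in base R, without recode. This is only an illustration of the idea, not the method used in this document: the short `region` vector below is a made-up toy example, and the `levels<-` assignment with a named list is the base-R way to merge factor levels.

```r
# Toy region factor (hypothetical values, not the full Angell data).
region <- factor(c("E", "MW", "W", "S", "E", "S"))

# Copy the factor, then merge levels: E and S become "E", MW and W become "W".
area <- region
levels(area) <- list(E = c("E", "S"), W = c("MW", "W"))

# Cross-tabulate old against new labels to confirm every region
# landed in the intended area.
table(region, area)
```

A cross-tabulation like this is a quick way to check any recoding step: each row should have counts in exactly one column.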
Now we can compare the East and West with respect to their mobility scores.
boxplot(mobility ~ area, data=Angell2, names=c("Eastern Cities","Western Cities"), ylab="Mobility Score", col='gray', boxwex=.25, main = "Geographic Mobility of U.S. Cities, 1950", xlab="Cities in the Western U.S. Show Higher Mobility")
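The medians drawn in the boxplot can also be computed directly. The sketch below assumes the mobility scores are typed in by hand from the table above rather than read from Angell2, so that it runs on its own; the names `east`, `west`, and `group_medians` are illustrative.

```r
# Mobility scores copied from the table above, one vector per recoded area.
east <- c(15, 20.2, 13.6, 14.8, 17.5, 14.7, 19.4, 15.8, 12.1,   # region E
          24.9, 36.1, 36.8, 47.2, 27.2, 25.4, 25.9, 37.8, 19.4,
          27.7, 26.7, 44.9, 41.8, 32.6)                           # region S
west <- c(17.6, 23.8, 31.9, 18.6, 35.1, 42.7, 24.2, 21.6, 27.4,
          22.1, 19.5, 32.2, 23.1, 25,                             # region MW
          34.5, 49.8, 31.2, 38.9, 34.2, 35.8)                     # region W

# Stack the scores and label each one with its area.
mobility <- c(east, west)
area <- factor(rep(c("E", "W"), times = c(length(east), length(west))))

# Median mobility score within each area.
group_medians <- tapply(mobility, area, median)
group_medians
```

With the values as typed here, the East median is 25.4 and the West median is 29.3.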
It appears there may be a slight shift in medians, with the West being higher. Since the distributions are similarly shaped (slightly right skewed), a formal test of the hypotheses \[
H_0: \text{difference in medians} = 0
\] \[
H_a: \text{difference in medians} \neq 0
\] can be performed. Using a Wilcoxon Rank Sum Test (with the normal approximation and continuity correction, because ties in the data rule out an exact p-value), we obtain a test statistic of \(W = 181\) and a p-value of \(0.2376\). There is insufficient evidence to reject the null hypothesis. The difference in medians shown in the boxplot above is consistent with random sampling variation, so these data give no evidence that median mobility scores differ between the East and West of the U.S.
The R code that produced the Wilcoxon test results reported above, along with its output, is as follows.
wilcox.test(mobility ~ area, data=Angell2)
## Warning in wilcox.test.default(x = c(15, 20.2, 13.6, 14.8, 17.5, 14.7,
## 19.4, : cannot compute exact p-value with ties
##
## Wilcoxon rank sum test with continuity correction
##
## data: mobility by area
## W = 181, p-value = 0.2376
## alternative hypothesis: true location shift is not equal to 0
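To make the output above less of a black box, the statistic and p-value can be rebuilt by hand from ranks. The sketch below is a worked illustration under stated assumptions, not the internals of wilcox.test verbatim: the scores are typed in from the table above, and the formulas are the standard rank-sum statistic, its tie-corrected variance, and a continuity correction of one half.

```r
# Mobility scores copied from the table above, one vector per recoded area.
east <- c(15, 20.2, 13.6, 14.8, 17.5, 14.7, 19.4, 15.8, 12.1,   # region E
          24.9, 36.1, 36.8, 47.2, 27.2, 25.4, 25.9, 37.8, 19.4,
          27.7, 26.7, 44.9, 41.8, 32.6)                           # region S
west <- c(17.6, 23.8, 31.9, 18.6, 35.1, 42.7, 24.2, 21.6, 27.4,
          22.1, 19.5, 32.2, 23.1, 25,                             # region MW
          34.5, 49.8, 31.2, 38.9, 34.2, 35.8)                     # region W

n1 <- length(east)
n2 <- length(west)
N  <- n1 + n2

# Rank the pooled sample; tied values share the average of their ranks.
r <- rank(c(east, west))

# W = rank sum of the first group minus its smallest possible value.
W <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Mean and tie-corrected standard deviation of W under the null hypothesis.
ties  <- table(r)
mu    <- n1 * n2 / 2
sigma <- sqrt(n1 * n2 / 12 *
              ((N + 1) - sum(ties^3 - ties) / (N * (N - 1))))

# Continuity correction: move W half a unit toward its null mean.
z <- (W - mu - sign(W - mu) * 0.5) / sigma

# Two-sided p-value from the normal approximation.
p_value <- 2 * pnorm(-abs(z))

c(W = W, p_value = round(p_value, 4))
```

With the values as typed here, this gives W = 181 and a p-value of 0.2376, matching the output above. The only tie is the value 19.4, which appears twice in the East group, which is why the tie correction barely changes the variance.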