Moral Integration of American Cities

This document uses the Angell dataset from library(car) to determine whether geographic mobility differed between cities in the Eastern and Western U.S. around 1950.

First, because this file demonstrates the Wilcoxon Rank Sum Test, we need to reduce the data to two groups, East and West. We do this by combining S and E into E and combining MW and W into W, using library(tidyverse) and the function recode. Notice how the dataset is modified by the recode command in the code below.

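library(car)        # provides the Angell data
library(tidyverse)  # provides %>%, mutate(), and recode()
library(pander)     # provides pander() for the tables below

# E and W already carry the right labels, so only S and MW need recoding
Angell2 <- Angell %>%
  mutate(area = recode(region, S = "E", MW = "W"))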

# alternatively, using mapvalues() from the plyr package:
# Angell2 <- Angell %>%
#    mutate(area = plyr::mapvalues(region,
#                                  from = c("S", "MW"),
#                                  to = c("E", "W")))

rownames(Angell2) <- rownames(Angell)
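
The same kind of recoding can be sketched in base R, which is a convenient way to check that every region lands in the intended group. This is not the approach used above, only an illustration: the `region` vector here is a small made-up sample of labels rather than the full Angell data.

```r
# Toy region labels (hypothetical, not the full Angell data)
region <- factor(c("E", "MW", "S", "W", "S", "MW"))

# Assigning a named list to levels() merges old levels into new ones
area <- region
levels(area) <- list(E = c("E", "S"), W = c("MW", "W"))

# Cross-tabulate to confirm each region landed in the intended area
print(table(region, area))
```

Running the analogous `table(Angell2$region, Angell2$area)` on the real data should show every S and E city under E and every MW and W city under W.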

Original Angell Data
Modified Angell Data

pander(Angell)

	moral	hetero	mobility	region
Rochester	19	20.6	15	E
Syracuse	17	15.6	20.2	E
Worcester	16.4	22.1	13.6	E
Erie	16.2	14	14.8	E
Milwaukee	15.8	17.4	17.6	MW
Bridgeport	15.3	27.9	17.5	E
Buffalo	15.2	22.3	14.7	E
Dayton	14.3	23.7	23.8	MW
Reading	14.2	10.6	19.4	E
Des_Moines	14.1	12.7	31.9	MW
Cleveland	14	39.7	18.6	MW
Denver	13.9	13	34.5	W
Peoria	13.8	10.7	35.1	MW
Wichita	13.6	11.9	42.7	MW
Trenton	13	32.5	15.8	E
Grand_Rapids	12.8	15.7	24.2	MW
Toledo	12.7	19.2	21.6	MW
San_Diego	12.5	15.9	49.8	W
Baltimore	12	45.8	12.1	E
South_Bend	11.8	17.9	27.4	MW
Akron	11.3	20.4	22.1	MW
Detroit	11.1	38.3	19.5	MW
Tacoma	10.9	17.8	31.2	W
Flint	9.8	19.3	32.2	MW
Spokane	9.6	12.3	38.9	W
Seattle	9	23.9	34.2	W
Indianapolis	8.8	29.2	23.1	MW
Columbus	8	27.4	25	MW
Portland_Oregon	7.2	16.4	35.8	W
Richmond	10.4	65.3	24.9	S
Houston	10.2	49	36.1	S
Fort_Worth	10.2	30.5	36.8	S
Oklahoma_City	9.7	20.7	47.2	S
Chattanooga	9.3	57.7	27.2	S
Nashville	8.6	57.4	25.4	S
Birmingham	8.2	83.1	25.9	S
Dallas	8	36.8	37.8	S
Louisville	7.7	31.5	19.4	S
Jacksonville	6	73.7	27.7	S
Memphis	5.4	84.5	26.7	S
Tulsa	5.3	23.8	44.9	S
Miami	5.1	50.2	41.8	S
Atlanta	4.2	70.6	32.6	S

pander(Angell2)

	moral	hetero	mobility	region	area
Rochester	19	20.6	15	E	E
Syracuse	17	15.6	20.2	E	E
Worcester	16.4	22.1	13.6	E	E
Erie	16.2	14	14.8	E	E
Milwaukee	15.8	17.4	17.6	MW	W
Bridgeport	15.3	27.9	17.5	E	E
Buffalo	15.2	22.3	14.7	E	E
Dayton	14.3	23.7	23.8	MW	W
Reading	14.2	10.6	19.4	E	E
Des_Moines	14.1	12.7	31.9	MW	W
Cleveland	14	39.7	18.6	MW	W
Denver	13.9	13	34.5	W	W
Peoria	13.8	10.7	35.1	MW	W
Wichita	13.6	11.9	42.7	MW	W
Trenton	13	32.5	15.8	E	E
Grand_Rapids	12.8	15.7	24.2	MW	W
Toledo	12.7	19.2	21.6	MW	W
San_Diego	12.5	15.9	49.8	W	W
Baltimore	12	45.8	12.1	E	E
South_Bend	11.8	17.9	27.4	MW	W
Akron	11.3	20.4	22.1	MW	W
Detroit	11.1	38.3	19.5	MW	W
Tacoma	10.9	17.8	31.2	W	W
Flint	9.8	19.3	32.2	MW	W
Spokane	9.6	12.3	38.9	W	W
Seattle	9	23.9	34.2	W	W
Indianapolis	8.8	29.2	23.1	MW	W
Columbus	8	27.4	25	MW	W
Portland_Oregon	7.2	16.4	35.8	W	W
Richmond	10.4	65.3	24.9	S	E
Houston	10.2	49	36.1	S	E
Fort_Worth	10.2	30.5	36.8	S	E
Oklahoma_City	9.7	20.7	47.2	S	E
Chattanooga	9.3	57.7	27.2	S	E
Nashville	8.6	57.4	25.4	S	E
Birmingham	8.2	83.1	25.9	S	E
Dallas	8	36.8	37.8	S	E
Louisville	7.7	31.5	19.4	S	E
Jacksonville	6	73.7	27.7	S	E
Memphis	5.4	84.5	26.7	S	E
Tulsa	5.3	23.8	44.9	S	E
Miami	5.1	50.2	41.8	S	E
Atlanta	4.2	70.6	32.6	S	E

Now we can compare the East and West with respect to their mobility scores.

boxplot(mobility ~ area, data = Angell2,
        names = c("Eastern Cities", "Western Cities"),
        ylab = "Mobility Score",
        xlab = "Cities in the Western U.S. Show Higher Mobility",
        main = "Geographic Mobility of U.S. Cities, 1950",
        col = "gray", boxwex = .25)

It appears there may be a slight shift in medians, with the West being higher. Since the distributions are similarly shaped (slightly right skewed), a formal test of the hypotheses

$$H_0: \text{difference in medians} = 0$$
$$H_a: \text{difference in medians} \neq 0$$

can be performed. Using a Wilcoxon Rank Sum Test (with the normal approximation and continuity correction, because of ties in the data), we obtain a test statistic of $W = 181$ and a p-value of $0.2376$. There is insufficient evidence to reject the null hypothesis. The difference in medians shown in the boxplot above is consistent with random sampling variation, so we have no evidence that median mobility scores differ between Eastern and Western U.S. cities.
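
For reference, here is a sketch of how the statistic and its normal approximation are usually defined. The notation is introduced here and is not from the original analysis: $R_E$ is the sum of the ranks of the Eastern cities in the pooled sample, $n_E$ and $n_W$ are the group sizes, $N = n_E + n_W$, and $t_k$ is the number of observations sharing the $k$-th tied value. To our understanding this is the form R's wilcox.test uses when ties rule out an exact p-value.

$$W = R_E - \frac{n_E(n_E+1)}{2}, \qquad \mu_W = \frac{n_E n_W}{2}$$

$$z = \frac{(W - \mu_W) - 0.5\,\operatorname{sign}(W - \mu_W)}{\sigma_W}, \qquad \sigma_W^2 = \frac{n_E n_W}{12}\left[(N+1) - \frac{\sum_k (t_k^3 - t_k)}{N(N-1)}\right]$$

Counting rows in the tables above gives $n_E = 23$ and $n_W = 20$, so under $H_0$ we would expect $W$ near $\mu_W = 23 \cdot 20 / 2 = 230$. The observed $W = 181$ falls below that, which matches the boxplot: Eastern cities tend to have the lower ranks.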

The R code that produced the Wilcoxon test results reported above is shown below, followed by its output.

wilcox.test(mobility ~ area, data=Angell2)

## Warning in wilcox.test.default(x = c(15, 20.2, 13.6, 14.8, 17.5, 14.7,
## 19.4, : cannot compute exact p-value with ties

## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  mobility by area
## W = 181, p-value = 0.2376
## alternative hypothesis: true location shift is not equal to 0
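
To make the output above less of a black box, the following sketch recomputes W and the two-sided p-value by hand and compares them with wilcox.test. It assumes the usual tie-corrected normal approximation with continuity correction. The `east` and `west` vectors are only a small illustrative subset of the mobility scores in the table, so the numbers it prints will not match the W = 181 reported for the full data.

```r
# Illustrative subset of mobility scores copied from the table above
east <- c(15.0, 20.2, 13.6, 14.8, 17.5, 19.4, 24.9, 36.1, 19.4)
west <- c(17.6, 23.8, 31.9, 18.6, 34.5, 35.1)

n1 <- length(east)
n2 <- length(west)
N  <- n1 + n2

# Rank the pooled sample; tied values receive the average of their ranks
r <- rank(c(east, west))

# W = rank sum of the first group minus its smallest possible value
W_hand <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2

# Normal approximation: tie-corrected standard deviation
ties  <- table(r)
sigma <- sqrt(n1 * n2 / 12 *
              ((N + 1) - sum(ties^3 - ties) / (N * (N - 1))))

# Continuity correction moves the centered statistic 0.5 toward zero
centered <- W_hand - n1 * n2 / 2
z <- (centered - sign(centered) * 0.5) / sigma
p_hand <- 2 * min(pnorm(z), pnorm(z, lower.tail = FALSE))

# Built-in test, forced to use the same approximation
fit <- wilcox.test(east, west, exact = FALSE, correct = TRUE)

print(c(W_by_hand = W_hand, W_wilcox = unname(fit$statistic)))
print(c(p_by_hand = p_hand, p_wilcox = fit$p.value))
```

Setting `exact = FALSE` explicitly avoids the "cannot compute exact p-value with ties" warning, since the normal approximation is requested up front.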