The following R functions allow us to use simple linear regression.
cor
: Calculates the correlation of an x and y variable.

cov
: Calculates the covariance of an x and y variable.

lm
: Stands for linear model. It is primarily used to fit regression models. We will use it for simple linear regression.

plot(lm.object)
: When you plot the object created by lm it will provide standardized diagnostic plots for evaluation.

predict
: Predicts a y-value for a given x-value.

In 1969 the U.S. government instituted the draft to support the Vietnam War. The draft lasted the next four years and the data related to the draft can be read into R with the code below. This video of the event shows how that first 1969 draft was performed.
Late in 1969 some people started to believe that the 1969 draft was not random, and there were even lawsuits over the non-randomness. We have the data for all four years of the draft.
Based on the regression concepts we learned in the reading, what could we do to evaluate the randomness or non-randomness of the 1969 draft?
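One way to think about this question: if draft numbers were assigned at random, day of the year should tell us nothing about the draft number, so the correlation and the regression slope should both be near zero. The sketch below illustrates that idea on a simulated lottery, not on the real draft data. The data frame `sim`, its column `Number`, and the seed are made up for illustration only.

```r
# Simulated lottery, NOT the real draft data: each of 366 birthdays
# gets a draft number from a random permutation of 1..366.
set.seed(1969)                      # arbitrary seed so the run is repeatable
sim <- data.frame(Day_Year = 1:366,
                  Number   = sample(1:366))

# Under a fair lottery the correlation should be close to zero.
sim.cor <- cor(sim$Day_Year, sim$Number)

# Fit the simple linear regression and pull out the p-value for the slope.
sim.lm <- lm(Number ~ Day_Year, data = sim)
sim.slope.p <- summary(sim.lm)$coefficients["Day_Year", "Pr(>|t|)"]

# cor.test() tests H0: correlation = 0 and reports a confidence interval.
sim.test <- cor.test(sim$Day_Year, sim$Number)

sim.cor
sim.slope.p
sim.test$conf.int
```

In simple linear regression the test of a zero slope and the test of a zero correlation are the same test, so `sim.slope.p` and `sim.test$p.value` agree. Running the same steps on the real 1969 numbers and comparing them with what a random lottery produces is one possible way to approach the question.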
# Two websites on confidence intervals for correlations
# http://faculty.washington.edu/gloftus/P317-318/Useful_Information/r_to_z/PearsonrCIs.pdf
# https://www.r-bloggers.com/how-to-calculate-confidence-intervals-of-correlations-with-r/
# install.packages(c("reshape","ggplot2","lubridate"))
library(reshape)
library(ggplot2)
library(lubridate)
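# Read in the four years of draft data (columns N69 to N72)
draft = read.csv("https://github.com/byuistats/data/raw/master/Draft_vietnam/Draft_vietnam.csv",stringsAsFactors = FALSE)
# Make sure the draft-number columns are numeric (non-numeric entries become NA)
draft$N69 = as.numeric(draft$N69)
draft$N70 = as.numeric(draft$N70)
draft$N71 = as.numeric(draft$N71)
draft$N72 = as.numeric(draft$N72)
# Reshape to long format (one row per day and draft year) for the faceted plots
draft.melt = melt(draft,measure.vars=c("N69","N70","N71","N72"))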
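# Regress draft number on day of the year for the 1969 and 1970 drafts
draft.lm69 = lm(N69~Day_Year,data=draft)
draft.lm70 = lm(N70~Day_Year,data=draft)
# Correlation between day of the year and draft number
# (use="complete.obs" drops rows with NA so the result is not NA)
draft.cor69 = cor(x=draft$Day_Year,y=draft$N69,use="complete.obs")
draft.cor70 = cor(x=draft$Day_Year,y=draft$N70,use="complete.obs")
# Check the slope estimate and its p-value for each year
summary(draft.lm69)
summary(draft.lm70)
# Mean 1969 draft number, with a confidence interval, for an October 6 birthday
predict(draft.lm69,data.frame(Day_Year=yday("1976-10-06")),interval="confidence")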
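# Scatterplot with fitted regression line for 1971
# (ggplot() is used because qplot() is deprecated in recent ggplot2 versions)
ggplot(data=draft,aes(x=Day_Year,y=N71))+geom_point()+geom_smooth(method="lm")
# The same plot for all four years, one panel per year
ggplot(data=draft.melt,aes(x=Day_Year,y=value))+geom_point()+facet_wrap(~variable)+geom_smooth(method="lm")
# Draft numbers by month, one panel per year
ggplot(data=draft.melt,aes(x=factor(Month),y=value))+geom_boxplot()+facet_wrap(~variable)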
# Optional: confidence interval for the 1969 correlation
#install.packages("psychometric")
#library(psychometric)
#CIr(r=draft.cor69,n=nrow(draft),level=.95)
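
The confidence interval for a correlation can also be computed in base R with the Fisher z-transformation, which is the standard approach for this kind of interval. The sketch below is a minimal version under that assumption. The helper name `cor_ci` and the example inputs are hypothetical and do not come from any package or from the draft data.

```r
# Hypothetical helper (not from any package): confidence interval for a
# Pearson correlation r from n pairs, via the Fisher z-transformation.
cor_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)                       # Fisher z-transform of r
  se   <- 1 / sqrt(n - 3)                # approximate standard error of z
  crit <- qnorm(1 - (1 - level) / 2)     # about 1.96 when level = 0.95
  tanh(z + c(-1, 1) * crit * se)         # back-transform to the r scale
}

# Made-up inputs for illustration; with the draft data one would pass
# draft.cor69 and the number of non-missing pairs instead.
example.ci <- cor_ci(r = -0.2, n = 366)
example.ci
```

If the interval excludes zero, the data are not consistent with a correlation of zero at the chosen level. Base R's `cor.test()` reports the same kind of interval in its `conf.int` element, so it can serve as a cross-check.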