For this assignment we will be tackling data obtained from this website. You can download the data by right clicking on the data
link and clicking save link as...
if loading the file directly from the web doesn’t work for you.
Completing this assignment well will prepare you for the final.
60 male undergraduates read an essay supposedly written by a female college freshman. They then evaluated the quality of the essay and the ability of its writer on several dimensions. By means of a photo attached to the essay, 20 subjects were led to believe that the writer was physically attractive and 20 that she was unattractive. The remaining subjects read the essay without any information about the writer’s appearance. 30 subjects read a version of the essay that was well written while the other subjects read a version that was poorly written. Significant main effects for essay quality and writer attractiveness were predicted and obtained. (15 ref) (PsycINFO Database Record (c) 2016 APA, all rights reserved)
For this Two-Way Design of Experiment we will need to complete the following tasks.
aov
,the equal variance test, and Shapiro-Wilkes Test
to validate if the assumptions are met.#libraries
library(car)
library(DT)
library(ggplot2)
library(plyr)
# Read in data
essay = read.table("http://www.stat.ufl.edu/~winner/data/halo1.dat")
# Format Data
colnames(essay) = c("essayQ","appearance","score")
#Essay Quality 8 /* 1=Good, 2=Poor */
#Student Attactiveness 16 /* 1= Attractive, 2=Control, 3=Unattractive */
essay$essayQ_name = factor(essay$essayQ,labels=c("Good","Poor"))
essay$appearance_name = factor(essay$appearance,labels=c("Attractive","None","Unattractive"))
# Plot Data
ggplot(data=essay,aes(y=score,x=appearance_name))+
geom_boxplot()+
geom_jitter(height=0,width=.25,alpha=.5)+
facet_wrap(~essayQ_name,labeller = "label_both")+
theme_bw()+labs(x="Student Picture Type",y="Male Graded Score")
essay.cellmeans = ddply(.data=essay,.variables = .(appearance_name,essayQ_name),.fun=function(x) mean(x$score))
ggplot(data=subset(essay.cellmeans,appearance_name!="None"),aes(x=appearance_name,y=essayQ_name))+
geom_label(aes(label=V1))+theme_bw()
# Next two lines are identical
essay.aov = aov(score~essayQ_name*appearance_name,data=essay)
essay.aov = aov(score~essayQ_name+appearance_name+essayQ_name:appearance_name,data=essay)
#
# Assumptions look like they are met
# # Constant Variance
anova(aov(resid(essay.aov)^2~essayQ_name*appearance_name,data=essay))
# Normality of residuals
shapiro.test(resid(essay.aov))
plot(essay.aov)
### Now evaluate effects
anova(essay.aov)
### So the interaction is not significant. Drop from model
essay.aov = aov(score~essayQ_name+appearance_name,data=essay)
datatable(anova(essay.aov))
essay.appearance.interval = TukeyHSD(essay.aov,"appearance_name")
essay.essayQ.interval = TukeyHSD(essay.aov,"essayQ_name")
plot(essay.appearance.interval)
plot(essay.essayQ.interval)
interaction.plot(essay$essayQ_name,essay$appearance_name,essay$score)
interaction.plot(essay$appearance_name,essay$essayQ_name,essay$score)
We will use R and the Excel Worksheet to conceptualize how we can perform this analysis.
expand.grid
function to build out the main effects run combinations for a 24 DOE (The code below is for a 23).abc = expand.grid(C=c(-1,1),B=c(-1,1),A=c(-1,1))
abc = abc[,c("A","B","C")]
We have a simplified data set from the Medical wire material described below. We have three factors with each at a high/low level. The factors are
We need to identify which factors and their interactions are significant.
aov
,the equal variance test, and Shapiro-Wilkes Test
to validate if the assumptions are met. If they are not discuss robustness.Medical wires are used in cardiovascular devices. Therefore, the wire drawing process is extremely critical in terms of creating high-quality products. However, with the possibility of hundreds of different process levels, implementing experimental design and analysis in the wire drawing process is very challenging. This article presents a case study of a Six Sigma based quality improvement framework for a wire drawing process at a U.S.-based medical wire manufacturing company1.
Hence, the difference between UTS and YS (i.e., expressed by their ratio) tells you how much the material work-hardens. If it is large, work-hardening is more and the material is more ductile. Also this gives an indication for the resistance to crack propagation1. It looks like a low number is better2
library(FrF2)
library(plyr)
library(ggplot2)
library(reshape)
library(DT)
library(lsmeans)
wire_example = read.csv(file="http://byuistats.github.io/M330/data/wire.csv")
wire_example_wide = read.csv(file = "http://byuistats.github.io/M330/data/wire_wide.csv")
# Now we need to make the variables factors
wire_example$Drawing_Machine = as.factor(wire_example$Drawing_Machine)
wire_example$Die_Reduction_Angle = as.factor(wire_example$Die_Reduction_Angle)
wire_example$Die_Bearing_Length = as.factor(wire_example$Die_Bearing_Length)
### Plot our Data
# First calculate means
wire.cellmeans = ddply(.data=wire_example,.variables = .(Drawing_Machine,Die_Reduction_Angle,Die_Bearing_Length),.fun=function(x){
data.frame(strength=mean(x$strength))
} )
wire.grandmean = mean(wire_example$strength)
wire.dmmeans = ddply(.data=wire_example,.variables = .(Drawing_Machine),.fun=function(x){
data.frame(strength=mean(x$strength))
} )
wire_plot = ggplot(data=wire_example,aes(x=Die_Reduction_Angle,y=strength))+
geom_boxplot()+
geom_jitter(height=0,width=.25)+
theme_bw()
wire_plot + facet_grid(Drawing_Machine~Die_Bearing_Length,labeller = "label_both")
wire_plot +
facet_wrap(~Die_Bearing_Length+Drawing_Machine,labeller = "label_both",nrow=1)+
geom_hline(yintercept=wire.grandmean,size=1.5,colour="blue",alpha=.5)+
geom_point(data=wire.cellmeans,size=4,col="darkgrey",shape=15,alpha=.75)+
geom_point(data=wire.cellmeans,size=2,col="black",shape=15,alpha=.75)
wire.aov.all=aov(strength~Drawing_Machine*Die_Reduction_Angle*Die_Bearing_Length,data=wire_example)
### These give us the same thing as the Excel Sheet.
### This only works for balanced designs. The library library(doBy) can help for unbalanced.
model.tables(wire.aov.all,type="means")
model.tables(wire.aov.all,type="effects")
# We can also get them out a different way
wire.ls = lsmeans(wire.aov.all,~Drawing_Machine*Die_Reduction_Angle*Die_Bearing_Length)
confint(wire.ls)
drawing.ls = lsmeans(wire.aov.all,~Drawing_Machine)
DRA.ls = lsmeans(wire.aov.all,~Die_Reduction_Angle)
DBL.ls = lsmeans(wire.aov.all,~Die_Bearing_Length)
pairs(DBL.ls)
pairs(drawing.ls)
plot(pairs(DRA.ls))
# Now check assumptions
anova(aov(resid(wire.aov.all)^2~Drawing_Machine*Die_Reduction_Angle*Die_Bearing_Length,
data=wire_example))
# Normality of residuals
shapiro.test(resid(wire.aov.all))
plot(wire.aov.all)
# Now let's check for significant 3-way interaction
datatable(anova(wire.aov.all))%>%formatRound(columns=1:5, digits=3)
# looks like there isn't. Let's remove and check two way
# see page 403 for code example and description
wire.aov.n3way=aov(strength~Drawing_Machine*Die_Reduction_Angle+ Drawing_Machine*Die_Bearing_Length+Die_Bearing_Length*Die_Reduction_Angle,data=wire_example)
datatable(anova(wire.aov.n3way))
wire.aov.signif=aov(strength~Drawing_Machine+Die_Bearing_Length+Die_Reduction_Angle+Drawing_Machine:Die_Bearing_Length,data=wire_example)
wire.aov.maineffects=aov(strength~Drawing_Machine+Die_Bearing_Length+Die_Reduction_Angle,data=wire_example)
wire.aov.maineffects$coefficients
confint(wire.aov.maineffects)
datatable(anova(wire.aov.maineffects))%>%formatRound(columns=1:5, digits=3)
TukeyHSD(wire.aov.signif)
plot(TukeyHSD(wire.aov))
cubePlot(wire.aov.maineffects, "Drawing_Machine", "Die_Reduction_Angle", "Die_Bearing_Length")
# Build Wire example data
#
wire = read.table("http://www.stat.ufl.edu/~winner/data/medical_wire.dat")
colnames(wire) = c("Spool","Block","Drawing_Machine", "Die_Reduction_Angle",
"Die_Bearing_Length","Supply_Diameter","level_id","strength")
### We will only use the following subsetted data for our example.
wire_example = wire[,c("level_id","Drawing_Machine","Die_Reduction_Angle","Die_Bearing_Length","strength")]
wire_example = wire_example[order(wire_example$level_id),]
wire_example$n_id = rep(1:15,max(wire_example$level_id))
wire_example_wide = cast(data=wire_example,Drawing_Machine+Die_Reduction_Angle+Die_Bearing_Length~n_id,value="strength",margins="grand_col",fun.aggregate = mean)
colnames(wire_example_wide) = gsub("\\(all\\)","mean",colnames(wire_example_wide))
write.csv(wire_example_wide,file=file.path(getwd(),"data/wire_wide.csv"),row.names=FALSE)
write.csv(wire_example[,c("level_id","n_id","Drawing_Machine","Die_Reduction_Angle","Die_Bearing_Length","strength")],file=file.path(getwd(),"data/wire.csv"),row.names=FALSE)
install.packages("FrF2")
install.packages("BsMD")
library(FrF2)
library(BsMD)
data(BM93.e3.data) #from BsMD
iMdat <- BM93.e3.data[1:16,2:10] #only original experiment
colnames(iMdat) <- c("MoldTemp","Moisture","HoldPress","CavityThick","BoostPress",
"CycleTime","GateSize","ScrewSpeed", "y")
iM.lm <- lm(y ~ (.)^2, data = iMdat)
cubePlot(iM.lm, "MoldTemp", "HoldPress", "BoostPress")