Description

For this assignment we will be tackling data obtained from this website. You can download the data by right clicking on the data link and clicking save link as... if loading the file directly from the web doesn’t work for you.

Completing this assignment well will prepare you for the final.

Beauty and Judgment - Halo Effect

60 male undergraduates read an essay supposedly written by a female college freshman. They then evaluated the quality of the essay and the ability of its writer on several dimensions. By means of a photo attached to the essay, 20 subjects were led to believe that the writer was physically attractive and 20 that she was unattractive. The remaining subjects read the essay without any information about the writer’s appearance. 30 subjects read a version of the essay that was well written while the other subjects read a version that was poorly written. Significant main effects for essay quality and writer attractiveness were predicted and obtained. (15 ref) (PsycINFO Database Record (c) 2016 APA, all rights reserved)

Our Job

For this Two-Way Design of Experiment we will need to complete the following tasks.

  • Make a clear plot showing the measurements by the two grouping variables.
  • Use the plots from the aov,the equal variance test, and Shapiro-Wilkes Test to validate if the assumptions are met.
  • Show the results from the ANOVA table and state your final conclusions using the p-values.
  • Provide the pairwise confidence intervals and draw conclusions from those intervals.
  • Make inference about how males grade essays.
  • Write a short paragraph that describes the results.
#libraries
library(car)
library(DT)
library(ggplot2)
library(plyr)

# Read in data
essay = read.table("http://www.stat.ufl.edu/~winner/data/halo1.dat")

# Format Data
colnames(essay) = c("essayQ","appearance","score")
#Essay Quality    8   /* 1=Good, 2=Poor  */
#Student Attactiveness   16  /*  1= Attractive, 2=Control, 3=Unattractive  */
essay$essayQ_name = factor(essay$essayQ,labels=c("Good","Poor"))
essay$appearance_name = factor(essay$appearance,labels=c("Attractive","None","Unattractive"))

# Plot Data
ggplot(data=essay,aes(y=score,x=appearance_name))+
  geom_boxplot()+
  geom_jitter(height=0,width=.25,alpha=.5)+
  facet_wrap(~essayQ_name,labeller = "label_both")+
  theme_bw()+labs(x="Student Picture Type",y="Male Graded Score")

essay.cellmeans = ddply(.data=essay,.variables = .(appearance_name,essayQ_name),.fun=function(x) mean(x$score))

ggplot(data=subset(essay.cellmeans,appearance_name!="None"),aes(x=appearance_name,y=essayQ_name))+
  geom_label(aes(label=V1))+theme_bw()

# Next two lines are identical
essay.aov = aov(score~essayQ_name*appearance_name,data=essay)
essay.aov = aov(score~essayQ_name+appearance_name+essayQ_name:appearance_name,data=essay)
# 
# Assumptions look like they are met
#    # Constant Variance
anova(aov(resid(essay.aov)^2~essayQ_name*appearance_name,data=essay))

# Normality of residuals
shapiro.test(resid(essay.aov))
plot(essay.aov)

### Now evaluate effects

anova(essay.aov)

### So the interaction is not significant.  Drop from model
essay.aov = aov(score~essayQ_name+appearance_name,data=essay)
datatable(anova(essay.aov))


essay.appearance.interval = TukeyHSD(essay.aov,"appearance_name")
essay.essayQ.interval = TukeyHSD(essay.aov,"essayQ_name")

plot(essay.appearance.interval)
plot(essay.essayQ.interval)

interaction.plot(essay$essayQ_name,essay$appearance_name,essay$score)
interaction.plot(essay$appearance_name,essay$essayQ_name,essay$score)

Multi-Factor Expirements

Our Job 1

We will use R and the Excel Worksheet to conceptualize how we can perform this analysis.

  • Use the expand.grid function to build out the main effects run combinations for a 24 DOE (The code below is for a 23).
  • Make a table with all the interaction contrasts for your 24 design.
abc = expand.grid(C=c(-1,1),B=c(-1,1),A=c(-1,1))
abc = abc[,c("A","B","C")]

Our Job 2

We have a simplified data set from the Medical wire material described below. We have three factors with each at a high/low level. The factors are

  • Drawing Machine
  • Die Reduction Angle
  • Die Bearing Length

We need to identify which factors and their interactions are significant.

  • Make a clear plot showing the measurements by the two grouping variables.
  • Use the plots from the aov,the equal variance test, and Shapiro-Wilkes Test to validate if the assumptions are met. If they are not discuss robustness.
  • Show the results from the ANOVA table and state your final conclusions using the p-values.
  • Tell the medical wire industry what combination of the three variables will provide the best wire.

Factors Affecting Strength of Medical Wire (DOE)

Medical wires are used in cardiovascular devices. Therefore, the wire drawing process is extremely critical in terms of creating high-quality products. However, with the possibility of hundreds of different process levels, implementing experimental design and analysis in the wire drawing process is very challenging. This article presents a case study of a Six Sigma based quality improvement framework for a wire drawing process at a U.S.-based medical wire manufacturing company1.

Some notes on UTS and YS

Hence, the difference between UTS and YS (i.e., expressed by their ratio) tells you how much the material work-hardens. If it is large, work-hardening is more and the material is more ductile. Also this gives an indication for the resistance to crack propagation1. It looks like a low number is better2

library(FrF2)
library(plyr)
library(ggplot2)
library(reshape)
library(DT)
library(lsmeans)


wire_example = read.csv(file="http://byuistats.github.io/M330/data/wire.csv")
wire_example_wide = read.csv(file = "http://byuistats.github.io/M330/data/wire_wide.csv")

# Now we need to make the variables factors
wire_example$Drawing_Machine = as.factor(wire_example$Drawing_Machine)
wire_example$Die_Reduction_Angle = as.factor(wire_example$Die_Reduction_Angle)
wire_example$Die_Bearing_Length = as.factor(wire_example$Die_Bearing_Length)


### Plot our Data

# First calculate means
wire.cellmeans = ddply(.data=wire_example,.variables = .(Drawing_Machine,Die_Reduction_Angle,Die_Bearing_Length),.fun=function(x){
  data.frame(strength=mean(x$strength))
} )

wire.grandmean = mean(wire_example$strength)

wire.dmmeans = ddply(.data=wire_example,.variables = .(Drawing_Machine),.fun=function(x){
  data.frame(strength=mean(x$strength))
} )

wire_plot = ggplot(data=wire_example,aes(x=Die_Reduction_Angle,y=strength))+
  geom_boxplot()+
  geom_jitter(height=0,width=.25)+
  theme_bw()

wire_plot +  facet_grid(Drawing_Machine~Die_Bearing_Length,labeller = "label_both")

wire_plot +  
  facet_wrap(~Die_Bearing_Length+Drawing_Machine,labeller = "label_both",nrow=1)+
  geom_hline(yintercept=wire.grandmean,size=1.5,colour="blue",alpha=.5)+
  geom_point(data=wire.cellmeans,size=4,col="darkgrey",shape=15,alpha=.75)+
  geom_point(data=wire.cellmeans,size=2,col="black",shape=15,alpha=.75)



wire.aov.all=aov(strength~Drawing_Machine*Die_Reduction_Angle*Die_Bearing_Length,data=wire_example)

### These give us the same thing as the Excel Sheet.
### This only works for balanced designs. The library library(doBy) can help for unbalanced.
model.tables(wire.aov.all,type="means")
model.tables(wire.aov.all,type="effects")


# We can also get them out a different way
wire.ls = lsmeans(wire.aov.all,~Drawing_Machine*Die_Reduction_Angle*Die_Bearing_Length)


confint(wire.ls)

drawing.ls = lsmeans(wire.aov.all,~Drawing_Machine)
DRA.ls = lsmeans(wire.aov.all,~Die_Reduction_Angle)
DBL.ls = lsmeans(wire.aov.all,~Die_Bearing_Length)


pairs(DBL.ls)
pairs(drawing.ls)


plot(pairs(DRA.ls))



# Now check assumptions
anova(aov(resid(wire.aov.all)^2~Drawing_Machine*Die_Reduction_Angle*Die_Bearing_Length,
             data=wire_example))

# Normality of residuals
shapiro.test(resid(wire.aov.all))
plot(wire.aov.all)

# Now let's check for significant 3-way interaction
datatable(anova(wire.aov.all))%>%formatRound(columns=1:5, digits=3)

# looks like there isn't.  Let's remove and check two way
# see page 403 for code example and description
wire.aov.n3way=aov(strength~Drawing_Machine*Die_Reduction_Angle+ Drawing_Machine*Die_Bearing_Length+Die_Bearing_Length*Die_Reduction_Angle,data=wire_example)
datatable(anova(wire.aov.n3way))

wire.aov.signif=aov(strength~Drawing_Machine+Die_Bearing_Length+Die_Reduction_Angle+Drawing_Machine:Die_Bearing_Length,data=wire_example)

wire.aov.maineffects=aov(strength~Drawing_Machine+Die_Bearing_Length+Die_Reduction_Angle,data=wire_example)

wire.aov.maineffects$coefficients
confint(wire.aov.maineffects)

datatable(anova(wire.aov.maineffects))%>%formatRound(columns=1:5, digits=3)

TukeyHSD(wire.aov.signif)
plot(TukeyHSD(wire.aov))


cubePlot(wire.aov.maineffects, "Drawing_Machine", "Die_Reduction_Angle", "Die_Bearing_Length")

Other Stuff

# Build Wire example data
# 
wire = read.table("http://www.stat.ufl.edu/~winner/data/medical_wire.dat")
colnames(wire) =  c("Spool","Block","Drawing_Machine", "Die_Reduction_Angle", 
                    "Die_Bearing_Length","Supply_Diameter","level_id","strength")

### We will only use the following subsetted data for our example.
wire_example = wire[,c("level_id","Drawing_Machine","Die_Reduction_Angle","Die_Bearing_Length","strength")]
wire_example = wire_example[order(wire_example$level_id),]
wire_example$n_id = rep(1:15,max(wire_example$level_id))



wire_example_wide = cast(data=wire_example,Drawing_Machine+Die_Reduction_Angle+Die_Bearing_Length~n_id,value="strength",margins="grand_col",fun.aggregate = mean)
colnames(wire_example_wide) = gsub("\\(all\\)","mean",colnames(wire_example_wide))

write.csv(wire_example_wide,file=file.path(getwd(),"data/wire_wide.csv"),row.names=FALSE)
write.csv(wire_example[,c("level_id","n_id","Drawing_Machine","Die_Reduction_Angle","Die_Bearing_Length","strength")],file=file.path(getwd(),"data/wire.csv"),row.names=FALSE)
install.packages("FrF2")
install.packages("BsMD")
library(FrF2)
library(BsMD)
data(BM93.e3.data)  #from BsMD
iMdat <- BM93.e3.data[1:16,2:10]  #only original experiment
colnames(iMdat) <- c("MoldTemp","Moisture","HoldPress","CavityThick","BoostPress",
                     "CycleTime","GateSize","ScrewSpeed", "y")
iM.lm <- lm(y ~ (.)^2, data = iMdat)
cubePlot(iM.lm, "MoldTemp", "HoldPress", "BoostPress")