# This script will flip the summary data table to be easily used with geom_col()
library(rio)
library(tidyverse)
library(mosaic)
library(car)
bp_alcohol <- import("https://byuistats.github.io/M221R/Data/quiz/R/bp_alcohol.csv")
## Note: the following code transforms a table into a tidy dataset
bp_alcohol_graph <- bp_alcohol %>%
  pivot_longer(cols = c("zero", "one_two", "three_four_five", "six_or_more"), values_to = "Counts", names_to = "Frequency_of_Drink") %>%
  mutate(Frequency_of_Drink = factor(Frequency_of_Drink, levels = c("zero", "one_two", "three_four_five", "six_or_more"))) %>%
  rename("Hypertension" = V1)
Chi-Square Practice
Instructions
Complete the following questions about testing for independence between two categorical variables. When you are finished, render the .qmd file and submit the resulting .html file.
Aussie Alcohol
A small study in Western Australia examined the association between daily alcohol intake and hypertension (high blood pressure). The data contain a hypertension indicator (0=no, 1=yes) and the number of alcoholic drinks per day. Researchers wanted to determine whether hypertension and daily alcohol intake are independent, using a significance level of 0.05.
Run the following commands to read the data into R:
Use the bp_alcohol_graph dataset to create a side-by-side bar chart (ggplot()) to illustrate the counts.
NOTE: When you use geom_bar() with raw data, you do not need to specify a y= variable because ggplot counts the rows automatically. When you have summarized data, specify y=Counts in aes() and use geom_bar(stat="identity") so that ggplot plots those values as the bar heights instead of counting rows.
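The summarized-data case in the note above can be sketched as follows. This is only an illustration on made-up counts, not the study data; the data frame toy_counts and its columns Group, Category, and Counts are hypothetical names chosen for the example.

```r
library(ggplot2)

# Made-up summarized counts (NOT the study data); all names are hypothetical.
toy_counts <- data.frame(
  Group    = rep(c("A", "B"), each = 3),
  Category = factor(rep(c("low", "medium", "high"), times = 2),
                    levels = c("low", "medium", "high")),
  Counts   = c(12, 30, 8, 20, 15, 15)
)

# y = Counts supplies the bar heights; stat = "identity" tells ggplot to plot
# them as given instead of counting rows. position = "dodge" places the bars
# for each Group side by side within a Category.
toy_plot <- ggplot(toy_counts, aes(x = Category, y = Counts, fill = Group)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(x = "Category", y = "Count", fill = "Group")

toy_plot
```

geom_col() is shorthand for geom_bar(stat = "identity"), so either form should produce the same chart.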
Question: Which group appears to be most likely to have hypertension?
Answer:
State the Null and Alternative Hypotheses:
H0:
Ha:
Hypothesis Test
NOTE: As imported, bp_alcohol contains a column, V1, that holds the row labels. That column should not be included in the table passed to the chisq.test() function. Create a new dataset that includes only the columns zero, one_two, three_four_five, and six_or_more.
alcohol_tbl <- bp_alcohol %>%
  select()
Once you’ve created the table with only the counts, perform the Chi-square test for independence:
# Run the Chi-square test:
# Check the requirements:
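As a rough sketch of the workflow described above (drop the label column, run the test, then inspect the expected counts), here is an example on made-up counts. The data frame toy_summary and its column names are hypothetical, and the requirement check assumes the usual rule of thumb that every expected count should be at least 5.

```r
library(dplyr)

# Made-up summary table (NOT the study data); all names are hypothetical.
toy_summary <- data.frame(
  label  = c("no", "yes"),   # row labels, analogous to V1
  low    = c(30, 10),
  medium = c(25, 15),
  high   = c(20, 25)
)

# Keep only the count columns before testing.
toy_tbl <- toy_summary %>%
  select(low, medium, high)

toy_test <- chisq.test(toy_tbl)

toy_test$statistic   # chi-square test statistic
toy_test$parameter   # degrees of freedom: (rows - 1) * (columns - 1)
toy_test$p.value     # P-value

# Requirement check (assumed rule of thumb): all expected counts >= 5.
toy_test$expected
all(toy_test$expected >= 5)
```

Storing the result in an object makes it possible to pull out the statistic, the P-value, and the expected counts separately rather than reading them off the printed summary.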
Question: Are the requirements satisfied for the Chi-square test for independence?
Answer:
Question: What is the value of the test statistic?
Answer:
Question: What is the P-value?
Answer:
Question: State your conclusion in context of this problem:
Answer:
Therapy
A psychologist is interested in whether the type of therapy a patient receives is related to the patient's level of improvement. Patients were randomly assigned to one of two therapy types: Cognitive Behavioral Therapy (CBT) or Psychodynamic Therapy (PDT). After six months of treatment, each patient's improvement level was categorized as “Improved,” “No Change,” or “Worsened.”
# Read in the data:
therapy <- read_csv('https://github.com/byuistats/Math221D_Course/raw/refs/heads/main/Data/CBT_vs_PDT_Treatment_Data.csv')
QUESTION: Create a side-by-side bar chart that groups the bars based on therapy type and colors them by improvement level.
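For raw data with one row per subject, a side-by-side chart needs no y variable. The sketch below uses made-up data, not the therapy file; the data frame toy_raw and its columns treatment and outcome are hypothetical and may not match the column names in the real dataset.

```r
library(ggplot2)

# Made-up raw data, one row per subject (NOT the therapy data);
# column names are hypothetical.
toy_raw <- data.frame(
  treatment = rep(c("A", "B"), times = c(40, 40)),
  outcome   = c(rep(c("better", "same", "worse"), times = c(22, 12, 6)),
                rep(c("better", "same", "worse"), times = c(14, 16, 10)))
)

# No y aesthetic: geom_bar() counts the rows in each treatment/outcome
# combination. x sets the grouping and fill sets the bar colors.
toy_raw_plot <- ggplot(toy_raw, aes(x = treatment, fill = outcome)) +
  geom_bar(position = "dodge") +
  labs(x = "Treatment", y = "Count", fill = "Outcome")

toy_raw_plot
```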
QUESTION: State your null and alternative hypothesis:
H0:
Ha:
QUESTION: Perform a Chi-square test of independence:
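With raw data, one possible approach is to tabulate the two categorical columns with table() and pass the resulting contingency table to chisq.test(). The sketch below again uses made-up data with hypothetical column names (treatment, outcome), not the therapy file.

```r
# Made-up raw data, one row per subject (NOT the therapy data);
# column names are hypothetical.
toy_raw <- data.frame(
  treatment = rep(c("A", "B"), times = c(40, 40)),
  outcome   = c(rep(c("better", "same", "worse"), times = c(22, 12, 6)),
                rep(c("better", "same", "worse"), times = c(14, 16, 10)))
)

# Cross-tabulate the two variables into a 2 x 3 table of counts.
toy_table <- table(toy_raw$treatment, toy_raw$outcome)
toy_table

# Test for independence on the table of counts.
toy_raw_test <- chisq.test(toy_table)
toy_raw_test$p.value

# Expected counts, used to check the test requirements.
toy_raw_test$expected
```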
QUESTION: What is your P-value?
ANSWER:
QUESTION: State your conclusion.
ANSWER:
QUESTION: Are the requirements for a Chi-square test of independence satisfied?
ANSWER:
Bird Populations
A wildlife manager would like to study the prevalence of Imperiled and Non-Imperiled bird species in different land designations (Protected, Multi-use, and Un-designated). She observes the number of species in different land designations.
# Read in the data
birds <- read_csv('https://github.com/byuistats/Math221D_Course/raw/refs/heads/main/Data/Imperiled_Bird_Habitats.csv')
QUESTION: Create a side-by-side bar chart that groups the bars based on Imperiled Status and colors them by Land Designation.
QUESTION: State your null and alternative hypothesis:
H0:
Ha:
QUESTION: Perform a Chi-square test of independence:
QUESTION: What is your P-value?
ANSWER:
QUESTION: State your conclusion.
ANSWER:
QUESTION: Are the requirements for a Chi-square test of independence satisfied?
ANSWER: