Chi-Square Practice

Instructions

Complete the following questions about testing for independence between 2 categorical variables. When completed, Render the qmd file and submit the html.

Aussie Alcohol

A small study in Western Australia was done to determine the association between daily alcohol intake and hypertension (high blood pressure). The data contains a hypertension indicator (0=no, 1=yes), and the number of alcoholic drinks per day. Researchers wanted to determine if hypertension and daily alcohol intake are independent with a level of significance of 0.05.

Run the following commands to read the data into R:

# This script will flip the summary data table to be easily used with geom_col()
library(rio)
library(tidyverse)
library(mosaic)
library(car)

bp_alcohol <- import("https://byuistats.github.io/M221R/Data/quiz/R/bp_alcohol.csv") 

## Note:  the following code transforms a table into a tidy dataset

bp_alcohol_graph <- bp_alcohol %>% 
  pivot_longer(cols = c( "zero", "one_two", "three_four_five", "six_or_more" ), values_to = "Counts", names_to = "Frequency_of_Drink") %>%
  mutate(Frequency_of_Drink = factor(Frequency_of_Drink, levels=c("zero", "one_two","three_four_five", "six_or_more"))) %>%
  rename("Hypertension"=V1)

Use the bp_alcohol_graph dataset to create a side-by-side bar chart (ggplot()) to illustrate the counts.

NOTE: When you use geom_bar() with raw data, you do not need to specify a y= variable because ggplot will create the counts automatically. When you have summarized data, you need to specify y=Counts in aes() then include geom_bar(stat="identity") to instruct ggplot to ignore counting the frequencies.

Question: Which group appears to be most likely to have hypertension?
Answer:

State the Null and Alternative Hypotheses:

H0:
Ha:

Hypothesis Test

NOTE: The way the data were imported into bp_alcohol, it contains a column, V1, which is the row label. It should not be included in the input table in the chisq.test() function. Create a new dataset that only includes the columns: zero, one_two, three_four_five and six_or_more.

alcohol_tbl <- bp_alcohol %>%
  select()

Once you’ve created the table with only the counts, perform the Chi-square test for independence:

# Run the Chi-square test:  


# Check the requirements:

Question: Are the requirements satisfied for the \(\chi\)-square test for independence?
Answer:

Question: What is the value of the test statistic?
Answer:

Question: What is the P-value?
Answer:

Question: State your conclusion in context of this problem:
Answer:

Therapy

A psychologist is interested in whether the type of therapy a patient receives is related to their level of improvement. Patients were randomly assigned to one of two therapy types (Cognitive Behavioral Therapy (CBT) or Psychodynamic Therapy (PDT)). Their improvement level was categorized as “Improved,” “No Change,” or “Worsened” after six months of treatment.

# Read in the data:  

therapy <- read_csv('https://github.com/byuistats/Math221D_Course/raw/refs/heads/main/Data/CBT_vs_PDT_Treatment_Data.csv')

QUESTION: Create a side-by-side bar chart that groups the bars based on therapy type and colors them by improvement level.

QUESTION: State your null and alternative hypothesis:

Ho:
Ha:

QUESTION: Perform a Chi-square test of independence:

QUESTION: What is your P-value?
ASNWER:

QUESTION: State your conclusion.
ANSWER:

QUESTION: Are the requirements for a Chi-square test of independence satisfied?
ANSWER:

Bird Populations

A wildlife manager would like to study the prevalence of Imperiled and Non-Imperiled bird species in different land designations (Protected, Multi-use, and Un-designated). She observes the number of species in different land designations.

# Read in the data

birds <- read_csv('https://github.com/byuistats/Math221D_Course/raw/refs/heads/main/Data/Imperiled_Bird_Habitats.csv')

QUESTION: Create a side-by-side bar chart that groups the bars based on Imperiled Status and colors them by Land Designation.

QUESTION: State your null and alternative hypothesis:

Ho:
Ha:

QUESTION: Perform a Chi-square test of independence:

QUESTION: What is your P-value?
ASNWER:

QUESTION: State your conclusion.
ANSWER:

QUESTION: Are the requirements for a Chi-square test of independence satisfied?
ANSWER: