Practice: Numerical Summaries (Quant.)

Introduction

This is an opportunity for you to practice creating numerical summaries of quantitative data.

We will explore two datasets. The first records the duration of geyser eruptions at Old Faithful in Yellowstone National Park and the waiting time between eruptions.

The second is data collected about expressions of gratitude and their impact on subjective well-being.

For each dataset, you will create numerical summaries of the data.

Load Libraries and Data

library(rio)
library(mosaic)
library(tidyverse)
library(car)


old_faithful <- rio::import("https://byuistats.github.io/M221R/Data/old_faithful.xlsx")

gratitude <- rio::import("https://byuistats.github.io/M221R/Data/gratitude.xlsx")
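Before summarizing, it can help to confirm what each column is actually called. The sketch below uses R's built-in faithful data frame as a stand-in. That is an assumption made for illustration: it is a separate copy of Old Faithful measurements, not the course file, and its column names may differ from those in old_faithful.

```r
# Stand-in data: R's built-in Old Faithful data frame (not the course file).
eruptions_data <- faithful

# List the column names so later calls reference columns that actually exist.
column_names <- names(eruptions_data)
print(column_names)

# str() shows each column's type and its first few values.
str(eruptions_data)

# Number of rows and columns.
dims <- dim(eruptions_data)
print(dims)
```

The same calls should work on the imported data, for example names(old_faithful) or str(gratitude).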

Old Faithful

Calculate the Summary Statistics for Duration

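# favstats() needs the variable to summarize as its first argument.
# The column name duration is an assumption; confirm it with names(old_faithful).
favstats(old_faithful$duration)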
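favstats() takes the data to summarize as its first argument and reports the minimum, quartiles, median, maximum, mean, standard deviation, sample size, and number of missing values. As a rough sketch of what those numbers are, the block below computes each one separately with base R. It uses the built-in faithful data frame as a stand-in (an assumption for illustration, not the course file), and the name duration_summary is made up for this example.

```r
# Stand-in data: eruption durations from R's built-in faithful data frame.
x <- faithful$eruptions

# Compute each summary one at a time; quantile() is called with its defaults.
duration_summary <- c(
  min     = min(x, na.rm = TRUE),
  Q1      = unname(quantile(x, 0.25, na.rm = TRUE)),
  median  = median(x, na.rm = TRUE),
  Q3      = unname(quantile(x, 0.75, na.rm = TRUE)),
  max     = max(x, na.rm = TRUE),
  mean    = mean(x, na.rm = TRUE),
  sd      = sd(x, na.rm = TRUE),
  n       = sum(!is.na(x)),   # number of non-missing values
  missing = sum(is.na(x))     # number of missing values
)

print(round(duration_summary, 3))
```

With mosaic loaded, favstats(~ eruptions, data = faithful) should report the same kinds of quantities in a single row.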

Question: What is the mean duration of Old Faithful eruptions?
Answer:

Question: What is the standard deviation of duration?
Answer:

Calculate the Summary Statistics for Wait Time

Question: What is the mean wait time between eruptions?
Answer:

Question: What is the maximum wait time between eruptions?
Answer:

Question: The middle 50% of wait times fall between which two numbers?
Answer:
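The middle 50% of a variable lies between its first quartile (Q1) and third quartile (Q3). A minimal sketch of that idea, using the built-in faithful data frame as a stand-in (an assumption for illustration; the values will not necessarily match the course file):

```r
# Stand-in data: waiting times from R's built-in faithful data frame.
waits <- faithful$waiting

# First and third quartiles: the bounds of the middle 50%.
quartiles <- quantile(waits, probs = c(0.25, 0.75))
print(quartiles)

# The interquartile range is the width of that middle 50%.
iqr_width <- IQR(waits)
print(iqr_width)

# Share of observations inside [Q1, Q3]; roughly half, and ties at the
# boundaries can push it somewhat above 50%.
share_inside <- mean(waits >= quartiles[1] & waits <= quartiles[2])
print(share_inside)
```

In favstats() output, these two bounds are the Q1 and Q3 columns.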

Gratitude

In this dataset, the treatment column gives each participant's group designation, and the happiness column gives that participant's self-reported happiness score.

Create a summary statistics table and answer the questions below:
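A summary table can describe the whole sample or each group separately. The sketch below builds both with base R on the built-in PlantGrowth data frame, used here only as a stand-in with one numeric column (weight) and one grouping column (group). It is not the gratitude data, and the names overall and summary_table are made up for this example.

```r
# Stand-in data: built-in PlantGrowth (weight is numeric, group is ctrl/trt1/trt2).
weights <- PlantGrowth$weight
groups  <- PlantGrowth$group

# Overall summaries, ignoring the grouping column.
overall <- c(max = max(weights), sd = sd(weights), n = length(weights))
print(round(overall, 3))

# Per-group summaries: apply each function within each level of the group.
group_means <- tapply(weights, groups, mean)
group_sds   <- tapply(weights, groups, sd)
group_ns    <- tapply(weights, groups, length)

# Assemble the pieces into one table, one row per group.
summary_table <- data.frame(
  group = names(group_means),
  mean  = as.numeric(group_means),
  sd    = as.numeric(group_sds),
  n     = as.integer(group_ns)
)
print(summary_table)
```

With mosaic loaded, the formula interface gives a comparable per-group table in one call; for the columns described above, that would be favstats(happiness ~ treatment, data = gratitude).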

Question: What is the maximum self-reported happiness score?
Answer:

Question: What is the standard deviation of happiness scores?
Answer:

Question: Explain what the standard deviation means in the context of this problem.
Answer:
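The sample standard deviation is the square root of the sum of squared deviations from the mean, divided by n - 1. It is usually read as a typical distance between an observation and the mean, in the same units as the data. A small sketch of that calculation, again using the built-in PlantGrowth weights as a stand-in (an assumption for illustration, not the gratitude data):

```r
# Stand-in data: plant weights from the built-in PlantGrowth data frame.
scores <- PlantGrowth$weight
n <- length(scores)

# How far each observation sits from the mean.
deviations <- scores - mean(scores)

# Square the deviations, sum them, divide by n - 1, then take the square root.
manual_sd <- sqrt(sum(deviations^2) / (n - 1))

# The built-in function performs the same calculation.
builtin_sd <- sd(scores)

print(c(manual = manual_sd, builtin = builtin_sd))
```

Both values should agree, which shows that sd() measures spread around the mean rather than the range of the data.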