Summarizing Data - One Group

Summarizing Quantitative Data

Introduction

In this document, we explore two datasets. The first records the duration of, and the time between, geyser eruptions at Old Faithful in Yellowstone National Park. The second records expressions of gratitude and their effect on subjective well-being.

For each dataset, you will create numerical and visual summaries of the data.

Load Libraries and Data

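library(rio)        # import() reads many file formats, including .xlsx
library(mosaic)     # favstats() and other summary helpers
library(tidyverse)  # ggplot2, dplyr, and related packages
library(car)        # companion functions for applied regression

# Eruption durations and wait times at Old Faithful
old_faithful <- rio::import("https://byuistats.github.io/M221R/Data/old_faithful.xlsx")

# Treatment group and self-reported happiness score for each participant
gratitude <- rio::import("https://byuistats.github.io/M221R/Data/gratitude.xlsx")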

Old Faithful

Calculate the Summary Statistics for Duration

Question: What is the mean duration of Old Faithful eruptions?
Answer:

Question: What is the standard deviation of duration?
Answer:
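
The two calculations above can be sketched as follows. This is a minimal example that uses R's built-in `faithful` data frame as a stand-in for the course file. The column name `eruptions` belongs to that built-in dataset. The column names in `old_faithful` may differ, so check them with `names(old_faithful)` before adapting the code.

```r
# Stand-in data: R's built-in `faithful` data frame (not the course file).
# Its eruption-duration column is named `eruptions`.
durations <- faithful$eruptions

# Mean and standard deviation of eruption duration
mean_duration <- mean(durations)
sd_duration <- sd(durations)

print(c(mean = mean_duration, sd = sd_duration))
```

With mosaic loaded, `favstats(~ eruptions, data = faithful)` reports the same mean and standard deviation along with the minimum, quartiles, and maximum in one call.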

Create a Histogram for Duration

Create a histogram and describe the shape of the distribution of duration:
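
One way to draw the histogram is with base R's `hist()`. The sketch below again uses the built-in `faithful` data frame as a stand-in, so the column name `eruptions` and the axis label are assumptions to adjust for `old_faithful`. When describing the shape, look at the number of peaks, whether the distribution is symmetric or skewed, and whether any values sit far from the rest.

```r
# Stand-in data: R's built-in `faithful` data frame (not the course file).
h <- hist(
  faithful$eruptions,
  breaks = 20,                               # suggested number of bins; adjust to taste
  main   = "Old Faithful eruption duration",
  xlab   = "Duration (assumed to be in minutes)",
  col    = "steelblue"
)
```

Trying a few values of `breaks` is worthwhile, because too few bins can hide features of the shape and too many can make it look noisy.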

Calculate Summary Statistics for Wait Time

Question: What is the mean wait time between eruptions?
Answer:

Question: What is the maximum wait time between eruptions?
Answer:

Question: The middle 50% of wait times will be between what 2 numbers?
Answer:
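
The three questions above map onto three base R calls. The middle 50% of a variable lies between its first quartile (Q1) and third quartile (Q3). The sketch uses the built-in `faithful` data frame, whose wait-time column is named `waiting`. That name is an assumption relative to the course file, so substitute the matching column from `old_faithful`.

```r
# Stand-in data: R's built-in `faithful` data frame (not the course file).
waits <- faithful$waiting

mean_wait <- mean(waits)   # average wait between eruptions
max_wait  <- max(waits)    # longest wait between eruptions

# Q1 and Q3 bound the middle 50% of the wait times
middle_50 <- quantile(waits, probs = c(0.25, 0.75))

print(mean_wait)
print(max_wait)
print(middle_50)
```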

Gratitude

This dataset has a column called treatment, which gives each participant's group designation, and a column called happiness, which gives each participant's self-reported happiness score.

Data are often organized this way. In the next lesson, we will learn how to break down data summaries for different groups. For now, let's get a sense of the distribution of self-reported happiness scores for everyone in the study, regardless of treatment group.

Create a summary statistics table and answer the questions below:

Question: What is the maximum self-reported happiness score?
Answer:

Question: What is the standard deviation of happiness scores?
Answer:

Question: In your own words, explain what the standard deviation means in the context of this problem:
Answer:
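
A summary table that ignores the grouping column can be sketched as below. The data frame `toy` and its values are hypothetical, invented only to show the mechanics. Only the column names `treatment` and `happiness` come from the description above. As a reminder, the standard deviation is roughly the typical distance between an individual score and the mean score.

```r
# Hypothetical toy data for illustration only (not the study's values).
toy <- data.frame(
  treatment = c("A", "A", "B", "B", "C", "C"),  # made-up group labels
  happiness = c(4, 6, 5, 7, 3, 5)               # made-up scores
)

# Summarize happiness for everyone, regardless of treatment group
stats <- c(
  min    = min(toy$happiness),
  median = median(toy$happiness),
  max    = max(toy$happiness),
  mean   = mean(toy$happiness),
  sd     = sd(toy$happiness),
  n      = nrow(toy)
)

print(round(stats, 2))
```

To apply this to the real data, replace `toy` with `gratitude`. With mosaic loaded, `favstats(~ happiness, data = gratitude)` produces a similar table in one call.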

Histogram

Create a histogram of happiness scores:

Question: What is the general shape of the distribution?
Answer: