12  Combining Height Files

Readings

Scan the readings to learn which types of files each of these packages can read.

  1. haven package
  2. foreign R package and read.dbf()
  3. cheatsheet
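
As a rough orientation before scanning the readings, the sketch below round-trips a small made-up table through temporary files to show which reader pairs with which file extension. The `toy` data frame and the temporary paths are invented for illustration and are not the course datasets; it assumes the haven and foreign packages are installed.

```r
# Toy round trip: write a small table in three formats, then read each back.
# The data and file paths are hypothetical, not the course files.
library(haven)    # Stata (.dta), SPSS (.sav), and SAS files
library(foreign)  # dBase (.dbf) and other legacy formats

toy <- data.frame(id = 1:3, height = c(170.2, 165.5, 180.1))

dta_path <- tempfile(fileext = ".dta")
sav_path <- tempfile(fileext = ".sav")
dbf_path <- tempfile(fileext = ".dbf")

write_dta(toy, dta_path)   # haven writer for Stata files
write_sav(toy, sav_path)   # haven writer for SPSS files
write.dbf(toy, dbf_path)   # foreign writer for dBase files

from_dta <- read_dta(dta_path)   # returns a tibble
from_sav <- read_sav(sav_path)   # returns a tibble
from_dbf <- read.dbf(dbf_path)   # returns a base data.frame
```

Note that the haven readers return tibbles while read.dbf() returns a plain data frame, so the imported objects may need slightly different handling before they are combined.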

Be sure to do the Readings and Guided Instruction for the previous task if you have not done so yet.

Guided Instruction

Complete the following if you have not yet: Data structures and importing data

Scientific American argues that humans have been getting taller over the years. As data scientists in training, we would like to find data that supports or refutes this claim. Our challenge is to show how male heights differ across the centuries.

This time, instead of looking at the mean height per country over time as we did in the previous task, we have a few files that contain the heights of individuals. Each file represents a different time and/or place from which the individuals were sampled. We will combine the data from these files into one dataset to facilitate our visualization.

  1. Work with these datasets where each row represents an individual. Import these five datasets into R.

  2. Wrangle each dataset so that it contains the following columns: birth_year, height.in, height.cm, and study.

    • You may have to rename columns and convert between inches and centimeters (1 in = 2.54 cm).
    • You need to create the “study” column yourself to identify which dataset the rows came from.
    • For each dataset, finish with select(birth_year, height.in, height.cm, study).
  3. Use the bind_rows() function to combine your five individual datasets into one dataset. bind_rows() matches columns by name, so each dataset must use the same column names and compatible column types for this to work.

  4. Write a short paragraph summarizing the data wrangling process you had to go through to create your tidy dataset. Include in that discussion any decisions you had to make about what data to exclude.

  5. Make a plot of the five studies containing individual heights to examine the question of height distribution across centuries.

  6. Write at least two paragraphs to address the following:

    • How does the story told by this data compare to the story told by the data in the previous task? Do they agree or do they contradict? If they contradict, reason through the contradiction and try to make sense of it.
    • How would you respond to the assertion that humans are getting taller over time based on the datasets in these two tasks involving height?
    • Be sure to provide an overall conclusion about where you stand on the question.
  7. Render the .qmd file. Push all the files created in the rendering process into your GitHub repository.
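
Steps 2 and 3 above can be sketched on toy data as follows. This is a minimal illustration, not the expected solution: the two raw tables, their original column names (`birth_yr`, `height_inches`, `year_born`, `cm`), and the study labels are all hypothetical stand-ins for whatever the imported files actually contain. It assumes dplyr and tibble are installed.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for two imported files; real column names will differ.
study_a_raw <- tibble(birth_yr = c(1750, 1760), height_inches = c(66, 68))
study_b_raw <- tibble(year_born = c(1950, 1960), cm = c(175, 180))

# Heights recorded in inches: rename, derive centimeters, label the source.
study_a <- study_a_raw %>%
  rename(birth_year = birth_yr, height.in = height_inches) %>%
  mutate(height.cm = height.in * 2.54, study = "study_a") %>%
  select(birth_year, height.in, height.cm, study)

# Heights recorded in centimeters: rename, derive inches, label the source.
study_b <- study_b_raw %>%
  rename(birth_year = year_born, height.cm = cm) %>%
  mutate(height.in = height.cm / 2.54, study = "study_b") %>%
  select(birth_year, height.in, height.cm, study)

# Stack the wrangled tables; bind_rows() lines columns up by name.
heights <- bind_rows(study_a, study_b)
```

Ending each pipeline with the same select() call keeps only the four shared columns, so the combined table has one row per individual and a study column that records where each row came from.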

Submit

In I-learn, submit a link to the .md file on GitHub.