Getting control of Factors

J. Hathaway

Thought for the day

What I see: Greatness

What you may think is happening.

Team Discussion

Case Study 6: The collapse of construction in Idaho

Case Study 7: Counting names in scripture

Task 13: Controlling categorical variables (factors)

Discussing Use Cases

Call Center all

Who can tell the story of the call center data visualizations?

Of course, I was told the mean time to closure was some number of minutes, either 2 or 20 or 200 or something, I forget; it really doesn’t matter for this discussion. They told me the mean, so naturally I asked for the raw, atomic-level data.

The data dive

  • They gave me the data: a printout from an SQL routine that told me, accurate to twenty decimal places (I am not making this up!), the mean time to closure.
    • No, I need the data that you used to get these means; do you have that data?
  • After several weeks, I was given a data set with hundreds of call durations.
    • Do you have the start and stop times from which you calculated these durations, the actual times the calls came in and when the cases were opened and closed?
  • After several more weeks, I finally got the data: among other things, start and stop times for each of the calls.

The call center data graphics

Factoring in control

Using Factors to improve communication

Now that we have learned about factors, let’s take some time to fix our Case Study 6 work.

  1. Let’s correctly sort our x-axes and then include both bars when we are making bar plots.
  2. Let’s fix our axis labels and legends.
  3. Is there something better we can do than bar plots?
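
As a starting point for item 1, here is a minimal base R sketch of how factor levels control sort order. The month data is a made-up toy example, not the Case Study 6 data.

```r
# Hypothetical toy data: month names stored as character strings.
month <- c("Mar", "Jan", "Feb", "Jan", "Mar", "Mar")

# By default, factor() sorts the levels alphabetically.
f_default <- factor(month)
levels(f_default)

# Supplying the levels explicitly puts them in calendar order.
f_ordered <- factor(month, levels = c("Jan", "Feb", "Mar"))
levels(f_ordered)

# table() (and any bar plot built from the factor) follows the level order.
month_counts <- table(f_ordered)
month_counts
```

The same idea carries over to plots: ggplot2 draws discrete axes in level order, so fixing the levels fixes the axis.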

Your code or My code

If your code from Case Study 6 is enough to address these questions, you can use your own code. If not, let’s use mine.

Hathaway code

Sorting and including factor levels

with tidyr and dplyr
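
One possible approach, sketched on made-up data: count with dplyr, then use tidyr's complete() to add back the combinations that have no rows, so that every group shows up with an explicit zero. The permits table and its column names are hypothetical and only stand in for the Case Study 6 data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: there are no "Commercial" rows in 2009.
permits <- tibble(
  year = c(2008, 2008, 2008, 2009, 2009),
  type = factor(
    c("Residential", "Commercial", "Commercial", "Residential", "Residential"),
    levels = c("Residential", "Commercial")
  )
)

counts <- permits %>%
  count(year, type) %>%                        # only observed combinations
  complete(year, type, fill = list(n = 0L))    # add missing ones with n = 0

counts
```

Because type is a factor, complete() expands over all of its levels and keeps them in level order.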

with ggplot2

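Sometimes setting the drop argument of a discrete scale to FALSE, for example scale_x_discrete(drop = FALSE), can fix our problem: it keeps factor levels that have no observations on the axis.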
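
A small sketch of that option, using a made-up factor with one unused level (the data frame and its level names are assumptions for illustration):

```r
library(ggplot2)

# Hypothetical toy data: "Medium" is a declared level with no observations.
df <- data.frame(
  size = factor(c("Small", "Large", "Small"),
                levels = c("Small", "Medium", "Large"))
)

# By default the unused "Medium" level is dropped from the x-axis;
# drop = FALSE keeps a slot for it between "Small" and "Large".
p <- ggplot(df, aes(x = size)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE)

p
```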

Fixing axes
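
A minimal sketch of one way to relabel axes and legends in ggplot2, again on made-up data. The column names (yr, tp, n) and the label text are hypothetical; labs() sets the axis and legend titles, and the labels argument of the fill scale renames the legend entries.

```r
library(ggplot2)

# Hypothetical toy data with terse column names and codes.
df <- data.frame(
  yr = factor(c("2008", "2009", "2008", "2009")),
  tp = factor(c("res", "res", "com", "com")),
  n  = c(10, 6, 4, 3)
)

p <- ggplot(df, aes(x = yr, y = n, fill = tp)) +
  geom_col(position = "dodge") +
  # Readable titles for both axes and the legend.
  labs(x = "Year", y = "Number of permits", fill = "Permit type") +
  # A named vector maps each factor code to a readable legend entry.
  scale_fill_discrete(labels = c(com = "Commercial", res = "Residential"))

p
```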

Making new R package friends

The challenge

  1. Pick one of the R packages on the following slide
  2. Read material on the R package
  3. Build a working script that demonstrates the use of the R package.
  4. Write up a short presentation on the package.

The packages