Take 10 minutes to brainstorm with your table: what are the data inputs, and what visualizations would you like to create?
There were three Democratic candidates. We are going to use the following plots to help us understand the constituency that helped the winner in the primaries.
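#install.packages("nycflights13")
library(tidyverse)  # provides %>%, filter(), and ggplot()
library(nycflights13)

# Base plot for boxplots: departure delay by carrier
fl_bp <- flights %>%
  ggplot(aes(x = carrier, y = dep_delay))

# Base plot for scatterplots: flights that departed between 8:00 and 9:00
# (dep_time is stored in HHMM format)
fl_sc <- flights %>%
  filter(dep_time > 800, dep_time < 900) %>%
  ggplot(aes(x = dep_time, y = dep_delay))

fl_bp + geom_boxplot()
fl_sc + geom_point()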
Get the above code working. We will be tweaking it using ggplot for the next part.
Complete the following: create clean labels for the x and y axes and zoom in on the y-axis from 50 to 100 minutes. Also set breaks every 15 minutes.
scale_x_, scale_y_, coord_ - setting breaks and changing labels, or transforming the scale
Complete the following: color the points of fl_sc by origin using the brewer scale. Color the points of fl_sc by arr_delay.
scale_color_, scale_fill_, scale_gradient_
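One possible way to approach the axis task, sketched here on the boxplot fl_bp. The choice of plot and the label wording are assumptions made for illustration, not part of the exercise. labs() sets the labels, scale_y_continuous() sets the breaks, and coord_cartesian() zooms the view.

```r
library(tidyverse)
library(nycflights13)

# Rebuild the base plot so this sketch runs on its own
fl_bp <- flights %>%
  ggplot(aes(x = carrier, y = dep_delay))

fl_bp_zoom <- fl_bp +
  geom_boxplot(na.rm = TRUE) +
  # Label text is illustrative
  labs(x = "Carrier", y = "Departure delay (minutes)") +
  # Breaks every 15 minutes, starting at 50
  scale_y_continuous(breaks = seq(50, 100, by = 15)) +
  # Zoom the view only; the boxplot statistics still use all rows
  coord_cartesian(ylim = c(50, 100))

fl_bp_zoom
```

Setting limits inside scale_y_continuous() instead would drop the observations outside 50 to 100 before the boxplot statistics are computed, which changes the boxes.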
Complete the following: 1) Color the points of fl_sc by origin using the brewer scale, and use the directlabels package to move the labels into the plotting region.
library(directlabels): geom_dl() and direct.label()
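A minimal sketch of how these pieces can fit together, assuming the directlabels package is installed. The "Set1" palette and the "smart.grid" positioning method are illustrative choices, not requirements of the exercise.

```r
library(tidyverse)
library(nycflights13)
library(directlabels)

# Rebuild the base plot so this sketch runs on its own
fl_sc <- flights %>%
  filter(dep_time > 800, dep_time < 900) %>%
  ggplot(aes(x = dep_time, y = dep_delay))

fl_sc_lab <- fl_sc +
  geom_point(aes(color = origin)) +
  # "Set1" is one of the qualitative ColorBrewer palettes
  scale_color_brewer(palette = "Set1") +
  # smart.grid is a directlabels positioning method meant for scatterplots
  geom_dl(aes(label = origin, color = origin), method = "smart.grid") +
  # The in-plot labels replace the legend
  theme(legend.position = "none")

fl_sc_lab
```

direct.label() is the shortcut version: it takes a finished plot that already maps color and adds the labels while hiding the legend.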
Complete the following: use a theme_() function to create a different look for your graphic and change the orientation of the x-axis text to 35 degrees.
library(ggthemes), ggsave()
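A sketch of the theme and save steps, using theme_bw() from ggplot2 so that it runs without ggthemes. The output path, file type, and plot size are placeholders.

```r
library(tidyverse)
library(nycflights13)

# Rebuild the base plot so this sketch runs on its own
fl_bp <- flights %>%
  ggplot(aes(x = carrier, y = dep_delay))

fl_bp_themed <- fl_bp +
  geom_boxplot(na.rm = TRUE) +
  # Any complete theme works here; ggthemes adds more, e.g. theme_economist()
  theme_bw() +
  # Tilt the x-axis text by 35 degrees; hjust = 1 right-aligns the labels
  theme(axis.text.x = element_text(angle = 35, hjust = 1))

# Write to a temporary file; swap in a real path when saving for a report
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = fl_bp_themed, width = 8, height = 5)
```

The theme() call comes after theme_bw() on purpose: a complete theme added later would overwrite the axis text setting.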
Each of the aesthetics has a paired scale function: x, y, size, color, fill, linetype, shape, alpha. All of the scales start with scale_ and then the respective aesthetic. Most of the aesthetic scales come in _continuous, _discrete, and _manual variants (the position scales x and y have no _manual variant).
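The naming pattern can be seen in a small sketch on the mpg data that ships with ggplot2. The axis title, legend title, and colors are arbitrary choices for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, color = drv)) +
  geom_point() +
  # scale_ + x + _continuous: a continuous position scale
  scale_x_continuous(name = "Engine displacement (L)", breaks = 2:7) +
  # scale_ + color + _manual: hand-picked colors, one per level of drv
  scale_color_manual(
    name = "Drive train",
    values = c("4" = "darkorange", "f" = "steelblue", "r" = "forestgreen")
  )

p
```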
scale_x_ & scale_y_ are the two scales I use most often.
scale_fill_ & scale_color_ are the next most often used.
The library(ggrepel) package is a must for our work. library(directlabels) can also be helpful.
Here is the book’s graphic.
Use the code from 28.3 and update their graphic to match mine.
library(tidyverse)  # provides %>%, the dplyr verbs, ggplot(), and the mpg data
library(ggrepel)
library(viridis)

# Keep the single most fuel-efficient car (highest hwy) in each class
best_in_class <- mpg %>%
  group_by(class) %>%
  filter(row_number(desc(hwy)) == 1)

ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class), size = 3) +
  # A small white dot on top marks the best-in-class points
  geom_point(size = 1.5, data = best_in_class, color = "white") +
  geom_text_repel(aes(label = model, colour = class),
                  data = best_in_class, show.legend = FALSE,
                  nudge_x = -1, nudge_y = -2) +
  theme_bw() + theme(panel.grid.minor = element_blank()) +
  scale_color_viridis(discrete = TRUE) +
  labs(x = "Engine displacement", y = "Miles per gallon (highway)",
       color = "Vehicle type")
Data can get complicated very fast. How do we convey the depth of variability in the data without overwhelming the person reading the visualization?
- geom_violin()
- ggbeeswarm::geom_quasirandom()
- lvplot::geom_lv(). Here is a description.
Another package, ggstance, makes flipping (rotating) the axes easier in ggplot.
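As a rough illustration of the first option, here is a violin plot on the mpg data that ships with ggplot2. The dataset, fill color, and labels are assumptions chosen to keep the sketch small.

```r
library(ggplot2)

p_violin <- ggplot(mpg, aes(x = class, y = hwy)) +
  # The violin outline shows the shape of each distribution, not just its quartiles
  geom_violin(fill = "grey85") +
  labs(x = "Vehicle type", y = "Miles per gallon (highway)")

p_violin
```

Swapping geom_violin() for ggbeeswarm::geom_quasirandom() draws the individual points spread out by density instead of an outline, which can work better for small groups.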
Remember, data can get complicated very fast.
What don’t we like about this plot?
Use geom_violin() and geom_quasirandom() with the nycflights13::flights data to show some variable distributions.