Part 1: Leaflet Basics

Practice Using Leaflet for Interactive Maps

  • We will use leaflet to create interactive maps of the cumulative confirmed cases of COVID-19.
  • First, we will plot markers for the counties with the most confirmed cases.

Note, rather than doing copying and pasting, you may learn more by actually typing things out. This will force your mind to actually read over and digest the syntax.

First, let’s read in the COVID data using the code below. Replace the stars with the last date in the dataset or a date of your choice using the format m/d/y. (For example, March 9th, 2023 would be 3/9/23). This will give us a column called “confirmed” that contains the cumulative confirmed cases up to that date.

# Use this R-Chunk to import all your datasets!
covid <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv") %>% 
  rename(confirmed = `3/9/23`)

Take a minute to look at the data and understand it’s structure.

Plotting the Top Cities

We will plot the counties that had the most COVID cases (cumulative). You get to decide how many counties to plot. (I’d recommend somewhere between 20 and 40). The code in the next chunk selects just those counties and stores it in the a new dataset.

Fill in the blank in the code below so that the code will run.

# Use this R-Chunk to clean & wrangle your data!
covid_top<- covid %>% 
  slice_max(order_by = confirmed, n= 30) %>% 
  select(confirmed, everything())#This line reorders the columns so that `confirmed` is first, followed by everything else

Now create a plot with a marker placed in the location of each of the top cities.

Note the use of the ~ in order to reference a column name.

leaflet(data=covid_top) %>% addTiles() %>% 
  addMarkers(lng = ~Long_, lat = ~Lat)

Alternatively, you can change the underlying tiles of the base map to something provided by a third-party. Type the first three letters of the desired tiles in the blank in the code chunk below to use the auto-fill feature to find an option you like (they don’t all work because some require registration with the third-party). You can preview the tile maps and their names: http://leaflet-extras.github.io/leaflet-providers/preview/index.html.

Take a couple of minutes to play with different third-party tiles and find one that you like or that is unique.

#multiple solutions will work to fill in the blank
leaflet(data=covid_top) %>% addProviderTiles(providers$Esri.NatGeoWorldMap) %>% 
  addMarkers(lng = ~Long_, lat = ~Lat)

Consider the next chunk of code where we set some defaults for the map. What is this setView() doing and what values should you put in the blanks?

leaflet(data=covid_top) %>% addTiles() %>% 
  setView(lng = -99, lat = 40, zoom = 4) %>% 
  addMarkers(lng = ~Long_, lat = ~Lat)

Let’s add labels to the markers so we know the name of the county and the cumulative # of confirmed cases. In addition to blanks, this code has an error in it. Can you find and correct it? (Hopefully the error message provides a hint).

# Use this R-Chunk to plot & visualize your data!
leaflet(data=covid_top_cities) %>% 
  addTiles() %>% 
  setView(lng = -99, lat = 40, zoom = 4)  %>%  
  addMarkers(lng = ~Long_, 
             lat = ~Lat, 
             label = ~paste(Admin2, "=", confirmed, "cases"))

Circles on the map

Instead of markers we would like to use circles on the map. Let’s encode the number of cases into the size of the circles. (If the circles are too large, try dividing the number of cases by 10).

leaflet(data=covid_top_cities) %>% addTiles() %>% 
  setView(lng = -99, lat = 40, zoom = 4) %>% 
  addCircles(lng = ~Long_, lat = ~Lat, 
        radius = confirmed/10, #Divide by 10 so the circles aren't so large
        label = ~paste(Admin2, sep = "=", confirmed , "cases"))

In addition to circle size, we will encode the cumulative confirmed case count into the color of the circle. This is not as simple and straight forward as one would like. It is explained here: https://rstudio.github.io/leaflet/colors.html

First the set-up

pal<-colorNumeric(palette = c("yellow", "red"), 
                  domain = covid_top$confirmed/10)
pal #this is to see what pal contains/is

Take the time to answer the following questions about the above line of code:

  • What is the domain argument providing? What is it used for?
    • domain argument is providing the range of values that colors need to be assigned to
  • Why are we taking the log do you think (try the plot without logging it first)
    • If a log isn’t taken, the highest values are too high and you can’t get visibility into the lower values because their differences are dwarfed by the large values
  • What is pal?
    • pal is a function that you will call later to assign colors to specific values

Now to actually create the plot. Can you find the 2 errors?

leaflet(data=covid_top) %>% addTiles() %>% 
  setView(lng = -99, lat = 40, zoom = 4) %>% 
  addCircles(lng = ~Long_, lat = ~Lat, 
             radius = ~confirmed/10,
             label = ~paste0(Admin2," County = ", confirmed , " confirmed cases"), 
             fillOpacity = .1,
             color = ~pal(confirmed/10),  #color is the border color
             fillColor =  ~pal(confirmed/10)) #notice the domain above must match exactly what is fed to pal

Big Data and Spatial Data

Pause to read the following article: Spatial Data Often Becomes Big Data https://blogs.esri.com/esri/arcgis/2017/10/17/strategies-to-effectively-display-large-amounts-of-data-in-web-apps/

Identify the 4 stratagies for displaying map data. Consider recording them in your Readme.md

  1. Filter to only display the data that is important
  2. Aggregate the data into equal-sized hexagons (hexbins) to visualize patterns
  3. Use scale visibility and/or multiscale rendering to show information when it is visually helpful.
  4. Use clustering to group points into one symbol and apply Smart Mapping Styles

Time permitting, read through the Leaflet guide and find 1-2 additional map features you would like to play with.

Part 2: Leaflet with Layers

# Use this R-Chunk to load all your libraries!
library(tidyverse)
library(USAboundaries)
library(sf)
library(leaflet)

Our goal is to create a choropleth of the confirmed cases by state. We would like to use layers to show the progression over time. So we will have a different layer for each point in time. Specifically, we will want 4 layers, one for June 1st, August 1st, October 1st, and December 1st (all of the year 2020).

We will also want to add a layer that allows us to toggle on/off plotting of county borders.

Take 3-5 minutes to identify what major data wrangling steps need to take place in order for this to work.

If you get stuck, visit the leaflet guide https://rstudio.github.io/leaflet/. For help with a specific function use the ? function to read the help documentation.

# Use this R-Chunk to import all your datasets!
#covid now contains the cumulative confirmed case count by county for every day since the start of the pandemic
covid <- read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv")


#These two files read in sf objects that contain the geometry of the counties and states respectively
counties <- USAboundaries::us_counties() %>%  
  filter(name != "Alaska" & name != "Hawaii" & name != "Puerto Rico")
state48 <- USAboundaries::us_states() %>%  
  filter(name != "Alaska" & name != "Hawaii" & name != "Puerto Rico")

Data Wrangling

Step through each data verb/line of code to understand what it is doing and why. Fill in the blanks so that the code will run correctly.

covid_states_long <- covid %>% filter(Province_State %in% state.name) %>% 
  #recall that state.name is a vector of the names of the United States that comes built in to base R.
  select(Province_State,  `6/1/20`, `8/1/20`, `10/1/20`, `12/1/20`) %>% 
  group_by(Province_State) %>% 
  summarise(Dec = sum(`12/1/20`),
            Jun = sum(`6/1/20`),
            Aug = sum(`8/1/20`),
            Oct = sum(`10/1/20`)) %>% 
  pivot_longer(cols = c(Dec, Jun, Aug, Oct), values_to = "cases", names_to = "when")

Note the use of pivot_longer() above. The data needs to be in long form in order for the layering to work easily

The goal of this next line of code is to get the state boundaries geometry contained in state48 into the same data set as the long-form covid data.

covid_lay_long <- inner_join(covid_states_long, state48, by = c("Province_State" = "name"))

Plotting the data

Our dataset is ready for plotting. We will now set the stage/prepare/define the colors to be used in the choropleth. Ensure you understand each piece of the code and fill in the blanks so that it runs correctly.

#Create a color palette to assign colors to the various values
pal <- colorNumeric(palette = c("white", "orange", "red"),
                    domain =
 min(covid_lay_long$cases):max(covid_lay_long$cases))

We will now create the plot and add a control box to allow the viewer to control which layer they want to see.

leaflet() %>% 
  setView(lng = -99, lat = 40, zoom = 4) %>% 
  addProviderTiles(providers$Esri.NatGeoWorldMap) %>%
  
  addPolygons(data = st_as_sf(filter(covid_lay_long, when=="Jun")),
              group = "Jun",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addPolygons(data = st_as_sf(filter(covid_lay_long, when == "Aug")),
              group = "Aug",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addPolygons(data = st_as_sf(filter(covid_lay_long, when == "Oct")),
              group = "Oct",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addPolygons(data = st_as_sf(filter(covid_lay_long, when == "Dec")),
              group = "Dec",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addLayersControl(
    baseGroups = c("Jun", "Aug", "Oct", "Dec"),
    options = layersControlOptions(collapsed = FALSE)) 

After getting the above code running by filling in the blanks. Answer these questions:

  • What does st_as_sf() do, and why is it necessary?
    • If you run class(covid_lay_long) you will find it is not an sf object. However, addPolygons() needs an sf object. st_as_sf() converts covid_lay_long to a simple features (i.e. sf) object.
  • What happens when you change the collapsed = FALSE aregument to true?
    • The options to control the layers is not expanded or displayed by default.

Adding toggle on/off layer(s)

Now we want to add the option to toggle on/off a layer of county borders.

myleaflet <- leaflet() %>% 
  setView(lng = -99, lat = 40, zoom = 4) %>% 
  addProviderTiles(providers$Esri.NatGeoWorldMap) %>%
  
  addPolygons(data = st_as_sf(filter(covid_lay_long, when=="Jun")),
              group = "Jun",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addPolygons(data = st_as_sf(filter(covid_lay_long, when == "Aug")),
              group = "Aug",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addPolygons(data = st_as_sf(filter(covid_lay_long, when == "Oct")),
              group = "Oct",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addPolygons(data = st_as_sf(filter(covid_lay_long, when == "Dec")),
              group = "Dec",
              fillOpacity = .5,
              fillColor = ~pal(cases)) %>% 
  addPolygons(data = counties, weight = 1,
              group = "counties",
              fill = FALSE, 
              color = "black") %>% 
  addLayersControl(
    baseGroups = c("Jun", "Aug", "Oct", "Dec"),
    overlayGroups = "counties",
    options = layersControlOptions(collapsed = FALSE))  

myleaflet

Think about it. If you wanted to add another layer to toggle on/off (say for example, state capitals), do you know how to do it? Describe the necessary steps / change in code to your group to a group member or friend.

Adding a legend

We start with the plot created with above code, stored in myleaflet. Fill in the blanks to add a legend that will show the meaning/scale of the color coding.

myleaflet %>% 
  addLegend(position = "topright", pal = pal, values = covid_lay_long$cases,
            title = "Four months of comparison",
            #labFormat = labelFormat(suffix = "%"),
            opacity = 1)