Authors

Andree Valle Campos

Laure Vancauwenberghe

Kene David Nwosu

Published

December 1, 2024

1 Prerequisites

This lesson requires the following packages:

Code
if(!require('pacman')) install.packages('pacman')
pacman::p_load_gh('afrimapr/afrilearndata')
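pacman::p_load(rnaturalearth,
               ggspatial,
               ggplot2,
               spData,   # provides the `world` dataset used in the practice question
               mdsr,
               tidyverse,
               dplyr,
               here,     # provides here() for building file paths
               sf)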

2 Learning objectives

  1. Identify two types of Thematic maps (choropleth and dot maps) used by epidemiologists to visualize Geospatial data.

  2. Create Thematic maps using {ggplot2} and the geom_sf() function.

  3. Relate each Thematic map to a Geometry type.


3 Choropleth map

What is it?

Geographic regions are colored or shaded according to the value of a variable.

• E.g. Larger values could be indicated by a darker color, and smaller values by a lighter color.

How to plot it?

• Geospatial data can be plotted using ggplot2::geom_sf().

Colors and sizes can be mapped to variables with the aes() function,

• Using the fill, color and size arguments.
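As a minimal sketch of how these arguments behave, the example below contrasts mapping fill to a variable inside aes() with setting fixed values outside aes(). It assumes the North Carolina shapefile that ships with {sf} and its BIR74 column (births in 1974), which are not part of this lesson's datasets and are used only for illustration.

```r
library(sf)
library(ggplot2)

# Assumption: example shapefile bundled with {sf}, used here only for illustration
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Inside aes(): the fill of each polygon depends on the value of BIR74
p_mapped <- ggplot(data = nc) +
  geom_sf(mapping = aes(fill = BIR74))

# Outside aes(): every polygon gets the same fixed fill and border color
p_fixed <- ggplot(data = nc) +
  geom_sf(fill = "grey90", color = "black")

p_mapped
p_fixed
```

Only arguments placed inside aes() produce a legend, because only those are tied to a variable in the data.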

With Quantitative data

• A Quantitative Choropleth map requires the fill argument.

• Let’s create one with the africountries dataset from {afrilearndata}!

• It contains the administrative boundaries of all the countries in the African continent.

  1. Use ggplot2::geom_sf() to plot African countries, only.
  2. fill each of them according to their population (pop_est):
Code
# WRITE YOUR CODE HERE
library(sf)
library(afrilearndata)

africountries %>%
  mutate(geometry = st_as_text(geometry)) %>% 
  as_tibble()
# A tibble: 51 × 12
   name       name_long pop_est gdp_md_est lastcensus income_grp iso_a3 geometry
   <chr>      <chr>       <dbl>      <dbl>      <dbl> <chr>      <chr>  <chr>   
 1 Angola     Angola     1.28e7     110300       1970 3. Upper … AGO    MULTIPO…
 2 Burundi    Burundi    8.99e6       3102       2008 5. Low in… BDI    MULTIPO…
 3 Benin      Benin      8.79e6      12830       2002 5. Low in… BEN    MULTIPO…
 4 Burkina F… Burkina …  1.57e7      17820       2006 5. Low in… BFA    MULTIPO…
 5 Botswana   Botswana   1.99e6      27060       2011 3. Upper … BWA    MULTIPO…
 6 Central A… Central …  4.51e6       3198       2003 5. Low in… CAF    MULTIPO…
 7 Côte d'Iv… Côte d'I…  2.06e7      33850       1998 4. Lower … CIV    MULTIPO…
 8 Cameroon   Cameroon   1.89e7      42750       2005 4. Lower … CMR    MULTIPO…
 9 Dem. Rep.… Democrat…  6.87e7      20640       1984 5. Low in… COD    MULTIPO…
10 Congo      Republic…  4.01e6      15350       2007 4. Lower … COG    MULTIPO…
# ℹ 41 more rows
# ℹ 4 more variables: name_fr <chr>, name_pt <chr>, name_af <chr>,
#   name_sw <chr>
Code
ggplot(data = africountries) +
  geom_sf(mapping = aes(fill = pop_est))

sf stands for “simple features”, an open standard used to represent a wide range of geometric shapes.
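To make the idea of "simple features" more concrete, here is a small sketch that builds two geometries by hand with {sf}. The coordinates and labels are made up for illustration; real datasets such as africountries already come with their geometries.

```r
library(sf)

# A single point, defined by one pair of coordinates (hypothetical values)
pt <- st_point(c(7, 9))

# A polygon is a closed ring: the first and last coordinates must match
ring <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))
poly <- st_polygon(list(ring))

# Combine the geometries into a geometry column with a coordinate reference system
geoms <- st_sfc(pt, poly, crs = 4326)

# An sf object is a data frame with an attached geometry column
shapes <- st_sf(label = c("a point", "a polygon"), geometry = geoms)

class(shapes)
st_geometry_type(shapes)
```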

Create a Choropleth map with the world data from the {spData} package, using geom_sf(), to portray its countries and fill them according to their population, available in the pop variable.

Code
library(spData)

# Write and visualize your answer:
q1 <- world %>% 
  ggplot() +
  geom_sf(mapping = aes(fill = pop))

With Categorical data

• Let’s keep using fill.

• Create a map with countries colored by their economic classification (economy):

Code
countries <- rnaturalearth::ne_countries(returnclass = "sf")

ggplot(data = countries) +
  geom_sf(mapping = aes(fill = economy))

• Before using geom_sf(), verify that your Spatial data is an "sf" R object:

Code
library(sf)
library(afrilearndata)

class(africountries)
[1] "sf"         "data.frame"

• More about sf in the following lessons!
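If the check returns only "data.frame", the object needs to be converted before geom_sf() can draw it. The sketch below shows one possible way, assuming a hypothetical table of clinic locations with longitude and latitude columns; the names and values are invented for illustration.

```r
library(sf)

# Hypothetical table: a plain data frame with coordinate columns
clinics <- data.frame(
  clinic = c("Clinic A", "Clinic B"),
  lon    = c(-13.2, -11.7),
  lat    = c(8.5, 7.9)
)

inherits(clinics, "sf")     # FALSE: geom_sf() cannot use this object yet

# Convert by declaring which columns hold the coordinates and which CRS they use
clinics_sf <- st_as_sf(clinics, coords = c("lon", "lat"), crs = 4326)

inherits(clinics_sf, "sf")  # TRUE: ready for geom_sf()
```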

How to use it?

• Visualize how one variable changes across defined regions.

Figure 2. Choropleth map with the number of cases in a simulated Ebola epidemic in Sierra Leone.

• Region of interest (Sierra Leone)

• Partitioned into a finite number of subregions (districts)

• Number of cases aggregated at that level.

Choropleth maps visualize a Geometry type called Polygon.

• Polygon data are collected from an enclosed region.

• Partitioned into a finite number of areal units with defined boundaries.

• E.g., data collected by ZIP code, census tract, or the administrative boundary levels of a country (Figure 2).
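The aggregation step described above can be sketched as follows. Everything here is hypothetical: two square "districts" are built by hand, and a five-row line list stands in for case data. The point is only to show cases being counted per areal unit and then joined to the polygons.

```r
library(sf)
library(dplyr)
library(ggplot2)

# Hypothetical helper: a 1 x 1 square polygon with its lower-left corner at (xmin, ymin)
square <- function(xmin, ymin) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmin + 1, ymin), c(xmin + 1, ymin + 1),
    c(xmin, ymin + 1), c(xmin, ymin)
  )))
}

# Hypothetical districts (polygons)
districts <- st_sf(
  district = c("North", "South"),
  geometry = st_sfc(square(0, 1), square(0, 0))
)

# Hypothetical line list: one row per case
cases <- data.frame(district = c("North", "North", "South", "North", "South"))

# Aggregate to one row per district, then attach the counts to the polygons
case_counts <- count(cases, district, name = "n_cases")
districts_cases <- left_join(districts, case_counts, by = "district")

ggplot(data = districts_cases) +
  geom_sf(mapping = aes(fill = n_cases))
```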


4 Dot map

What is it?

• Thematic map type that uses dots to represent attribute values.

How to plot it?

• The Dot map could use the size or color argument.

With Quantitative data

• A Quantitative Dot map requires the size argument.

• Let’s create a Dot map!

• Use the africapitals dataset, which contains the location of capital cities in the African continent.

  1. Use ggplot2::geom_sf() to plot these locations,
  2. and size them according to their number of inhabitants:
Code
library(sf)
library(afrilearndata)

africapitals %>% as_tibble()
# A tibble: 51 × 5
   capitalname  countryname                  pop iso3c       geometry
   <chr>        <chr>                      <int> <chr>    <POINT [°]>
 1 Abuja        Nigeria                   178462 NGA      (7.17 9.18)
 2 Accra        Ghana                    2029143 GHA      (-0.2 5.56)
 3 Addis Abeba  Ethiopia                 2823167 ETH     (38.74 9.03)
 4 Algiers      Algeria                  2029936 DZA     (3.04 36.77)
 5 Antananarivo Madagascar               1463754 MDG   (47.51 -18.89)
 6 Asmara       Eritrea                   578860 ERI    (38.94 15.33)
 7 Bamako       Mali                     1342519 MLI    (-7.99 12.65)
 8 Bangui       Central African Republic  547668 CAF     (18.56 4.36)
 9 Banjul       Gambia                     34388 GMB    (-16.6 13.46)
10 Bissau       Guinea-Bissau             404119 GNB    (-15.6 11.87)
# ℹ 41 more rows
Code
ggplot(data = africapitals) +
  geom_sf(mapping = aes(size = pop))
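The same kind of map can be sketched without {afrilearndata} by typing in three rows from the africapitals output shown above. Adding scale_size_area() is an optional choice made for this illustration, not part of the lesson's code: it makes the area of each dot proportional to the value.

```r
library(sf)
library(ggplot2)

# Three rows copied from the africapitals output printed in this lesson
capitals <- data.frame(
  capitalname = c("Abuja", "Accra", "Addis Abeba"),
  pop         = c(178462, 2029143, 2823167),
  lon         = c(7.17, -0.2, 38.74),
  lat         = c(9.18, 5.56, 9.03)
)

# Turn the coordinate columns into a Point geometry column
capitals_sf <- st_as_sf(capitals, coords = c("lon", "lat"), crs = 4326)

# size is mapped to pop; alpha is fixed so overlapping dots stay visible
p_capitals <- ggplot(data = capitals_sf) +
  geom_sf(mapping = aes(size = pop), alpha = 0.7) +
  scale_size_area(max_size = 8)

p_capitals
```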

We can replicate John Snow’s Dot map with the number of deaths per household from the 1854 London cholera outbreak:

Code
cholera_deaths <- 
  read_rds(here::here("data/cholera_deaths.rds"))

ggplot(data = cholera_deaths) +
  geom_sf(mapping = aes(size = Count), alpha = 0.7)

With Categorical data

• Visualize airports classified by type using the color argument:

Code
airports <- rnaturalearth::ne_download(scale = 10, type = "airports", returnclass = "sf")

ggplot(data = airports) +
  geom_sf(mapping = aes(color = type))

Create a Thematic map with the afriairports object to portray all its airport locations, using geom_sf(), and color them according to the type variable.

Code
# Write and visualize your answer:
q2 <- ggplot(data = afriairports) +
  geom_sf(mapping = aes(color = type))

How to use it?

• To visualize the scatter of your data and visually scan for clusters.

Figure 3. Dot map. Location of simulated Ebola cases in Sierra Leone, colored by each case outcome.

Dot maps visualize a Geometry type called Point.

• Point data register the locations of random events.

• E.g., geographical coordinates of individuals with a given diagnosis (Figure 3).

• Bothered by having just dots and no geographical context?

• That’s good! We will see how to add it using roads and rivers very soon.

• Thematic maps visualize specific Geometry types:

• Choropleth maps visualize Polygons.

• Dot maps visualize Points.

Figure 4. Geometry types for Choropleth and Dot maps.
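One way to apply this relationship in practice is to inspect the geometry type of an object before choosing a map. The helper below, suggest_map(), is a hypothetical function written for this illustration (it is not part of {sf}); it relies on sf::st_geometry_type(), and the example data are the North Carolina shapefile bundled with {sf} plus two made-up points.

```r
library(sf)

# Hypothetical helper: suggest a Thematic map type from the geometry type
suggest_map <- function(x) {
  types <- unique(as.character(st_geometry_type(x)))
  if (all(types %in% c("POLYGON", "MULTIPOLYGON"))) {
    "choropleth map"
  } else if (all(types %in% c("POINT", "MULTIPOINT"))) {
    "dot map"
  } else {
    "other"
  }
}

# Polygons: example shapefile bundled with {sf}
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Points: two made-up locations
pts <- st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)

suggest_map(nc)
suggest_map(pts)
```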

Which of the following options of Thematic map types:

  1. "choropleth_map"
  2. "dot_distribution_map"

…corresponds to each of these Epidemic map figures seen below?

Code
# Write your answers here as comments:
# Malaria cases in Africa : a
# COVID-19 cases in the world: b
  1. Malaria cases in Africa:

  2. COVID-19 cases in the world:


5 Wrap up

• We learned about Thematic maps,

• How to create them using ggplot2::geom_sf(),

• Which type of Geometry they visualize.

Figure 5. Concept map #1.

• But, how can we complement Thematic maps with geographic context?

• We are going to learn about how to add Physical features to our maps.


Contributors

The following team members contributed to this lesson:


References

Some material in this lesson was adapted from the following sources:

This work is licensed under the Creative Commons Attribution Share Alike license.


Answer Key

Q1

Code
q1 <- ggplot(data = world) + 
  geom_sf(aes(fill = pop))
q1

Q2

Code
q2 <- ggplot(data = afriairports) + 
  geom_sf(aes(color = type))
q2

Q3 & Q4

Which of the following options of Thematic map types corresponds to each of these Epidemic map figures? Your answer should be either “choropleth_map” or “dot_distribution_map”.

  1. [Malaria cases in Africa] is a CHOROPLETH MAP
  2. [COVID-19 cases in the world] is a DOT DISTRIBUTION MAP