Plot Labels with ggplot2

Author

The Graph Network

Published

January 1, 2025

1 Introduction

Bar plots are one of the most common chart types and come in several varieties. In the previous lesson, we learned how to make bar plots and their circular counterparts with {ggplot2}.

In this lesson, we’ll delve into the intricacies of labeling bar plots, focusing on the geom_label() and geom_text() functions from {ggplot2}.

2 Learning Objectives

After this lesson, you will be able to:

  1. Use two different text geoms to label ggplots:
    • geom_text() for simple labels
    • geom_label() for emphasized labels
  2. Transform and summarize data into the appropriate format for different chart types.
  3. Adjust text placement to position labels on stacked, dodged, and percent-stacked bar plots.
  4. Adjust text placement to position labels on pie charts and donut plots.

2.1 Packages

Load packages for the lesson using the pacman::p_load() function.

‣ Packages: tidyverse, here, patchwork, medicaldata.

Code
# Load required packages: tidyverse, here, patchwork, medicaldata
pacman::p_load(tidyverse, here, patchwork, medicaldata)

2.2 Introduction to text geoms in {ggplot2}

‣ Focus on geom_text() for simple labeling and geom_label() for emphasized labels.

‣ Apply these geoms on bar plots, stacked bars, dodged bars, normalized stacked bars, and circular plots.

‣ Start with a simple bar plot using fake data for practice.

Code
# Create example data frame with categories A, B, C and counts 10, 20, 15
data <- data.frame(
  category = c("A", "B", "C"),
  count = c(10, 20, 15)
)

# Generate a bar plot using ggplot2, mapping category to x and count to y, 
# Use geom_col() and fill with "steelblue"
# and geom_text() to add labels
ggplot(data, mapping = aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = count))

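‣ As you can see, however, the placement of the text is odd: each label is centered on the top edge of its bar. We can fix this with nudge_y, which shifts the labels vertically in data units. Let’s also increase the text size.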

Code
# Use geom_text in ggplot to add labels
# Address label placement and size
ggplot(data, mapping = aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = count), nudge_y = 2, size = 5)

‣ Explore negative values of nudge_y to move text down.

Code
# Experiment with negative nudge_y values to move text down
ggplot(data, mapping = aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = count), nudge_y = -1, size = 5)

‣ For horizontal bar plots, use nudge_x to adjust text horizontally.

Code
# Adjust horizontal position of text using nudge_x in horizontal bar plot
ggplot(data, mapping = aes(y = category, x = count)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = count), nudge_x = 1, size = 5)
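
The nudge arguments are expressed in the units of the axis, so nudge_y = 2 moves each label two count units up. One way to check this, sketched below, is to inspect the computed label positions with layer_data() from {ggplot2}. The object names p_nudged and label_positions are illustrative only.

```r
library(ggplot2)

# Same example data as above
data <- data.frame(
  category = c("A", "B", "C"),
  count = c(10, 20, 15)
)

p_nudged <- ggplot(data, mapping = aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = count), nudge_y = 2, size = 5)

# layer_data() returns the data computed for one layer;
# layer 2 is geom_text(), so y holds the final label heights
label_positions <- layer_data(p_nudged, 2)
label_positions[, c("label", "y")]
```

Each y value should equal the corresponding count plus 2. This is also why a nudge that looks right for small counts can look far too small when the counts run into the thousands.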

‣ We can also use the geom_label() function for text labels with background rectangles.

Code
# Replace geom_text with geom_label for labels with background
ggplot(data, mapping = aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_label(aes(label = count), nudge_y = -2, size = 5)

‣ We can set the fill and color aesthetics in geom_label() to change the background and text color.

Code
# Adjust fill and color in geom_label for background and text color changes
ggplot(data, mapping = aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_label(aes(label = count),
             fill = "lightblue",
             color = "darkblue",
             nudge_y = -2, size = 5)

Q: Simple labeling

Consider the following sample data frame:

Code
# Create example data frame
district_cases <- data.frame(
  district = c("A", "B", "C"),
  cases = c(10, 20, 15)
)

district_cases
  district cases
1        A    10
2        B    20
3        C    15

Create a labeled bar plot of the data frame above, where the x-axis is the district and the y-axis is the number of cases. The labels should be the number of cases, and should be placed above the bars. The labels should have “darkblue” text with a “lightblue” background. The bar color should be “steelblue”.

Code
ggplot(district_cases, mapping = aes(x = district, y = cases)) +
  geom_col(fill = "steelblue") +
  geom_label(aes(label = cases),
             fill = "lightblue",
             color = "darkblue",
             nudge_y = 2, size = 5)

PRO-TIP: Setting a custom theme to streamline your plotting process.

‣ Combine an existing theme with your own customizations: create a custom theme combining theme_dark() with large, bold axis titles:

Code
# Define a custom theme with theme_dark and large bold axis titles
theme_dark_custom <- 
  theme_dark() +
  theme(axis.title = element_text(size = 16, face = "bold"))

‣ Apply it to a single plot as follows:

Code
# Apply custom theme to a single plot
ggplot(data, mapping = aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_label(aes(label = count),
             fill = "lightblue",
             color = "darkblue",
             nudge_y = -2, size = 5) +
  theme_dark_custom

‣ Set the custom theme as the default for all plots:

Code
# Set the custom theme as the default for all plots
theme_set(theme_dark_custom)

‣ Observe the effect: theme_dark_custom automatically applies to every new plot.

‣ Example: Redraw an earlier plot with the new default theme.

Code
# Redraw an earlier plot using the new global theme
ggplot(data, aes(x = category, y = count)) +
  geom_col(fill = "steelblue") +
  geom_label(aes(label = count), 
            nudge_y = -3,
            fill = "royalblue4",
            color = "white")

‣ To set the default theme back to the original, use theme_set(theme_gray()).

Code
theme_set(theme_gray())
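
theme_set() invisibly returns the theme that was active before the call, so the previous default can be stored and restored later without assuming it was theme_gray(). A minimal sketch, with old_theme as an illustrative name:

```r
library(ggplot2)

theme_dark_custom <-
  theme_dark() +
  theme(axis.title = element_text(size = 16, face = "bold"))

# theme_set() returns the previously active theme invisibly
old_theme <- theme_set(theme_dark_custom)

# ... plots created here use theme_dark_custom ...

# Restore whatever theme was active before
theme_set(old_theme)
```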

2.3 The vjust and hjust arguments

‣ Explore vjust and hjust in ggplot2 for adjusting text position.

‣ These arguments control vertical and horizontal justification of text.

2.4 Understanding hjust (horizontal justification)

hjust modifies horizontal position of text relative to anchor points.

hjust values range from 0 to 1:

‣ `hjust = 0`: Left edge aligns with anchor point.

‣ `hjust = 0.5`: Centered on anchor point.

‣ `hjust = 1`: Right edge aligns with anchor point.

Code
# Example to show hjust in action
# Use geom_text to display different hjust values on a plot

# Example data
df <- data.frame(x = 1, y = 1)

# Base plot with a point
base_p <- ggplot(df, aes(x, y)) + geom_point() + theme_void()

base_p + geom_text(aes(label = "text"))

Code
p_hjust_0 <- base_p + geom_text(aes(label = "hjust=0"), hjust = 0)
p_hjust_0.25 <- base_p + geom_text(aes(label = "hjust=0.25"), hjust = 0.25)
p_hjust_0.5 <- base_p + geom_text(aes(label = "hjust=0.5"), hjust = 0.5)
p_hjust_0.75 <- base_p + geom_text(aes(label = "hjust=0.75"), hjust = 0.75)
p_hjust_1 <- base_p + geom_text(aes(label = "hjust=1"), hjust = 1)

# Combine plots with patchwork
p_hjust_0 / p_hjust_0.25 / p_hjust_0.5 / p_hjust_0.75 / p_hjust_1

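Values of hjust below 0 or above 1 push the text beyond the anchor point: hjust < 0 moves it further right and hjust > 1 moves it further left, which adds padding.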

Code
p_hjust_neg0.5 <- base_p + geom_text(aes(label = "hjust=-0.5"), hjust = -0.5)
p_hjust_neg0.2 <- base_p + geom_text(aes(label = "hjust=-0.2"), hjust = -0.2)
p_hjust_1.2 <- base_p + geom_text(aes(label = "hjust=1.2"), hjust = 1.2)
p_hjust_1.5 <- base_p + geom_text(aes(label = "hjust=1.5"), hjust = 1.5)

# Combine plots with patchwork
p_hjust_neg0.5 / p_hjust_neg0.2 / p_hjust_0 / p_hjust_0.25 / p_hjust_0.5 / p_hjust_0.75 / p_hjust_1 / p_hjust_1.2 / p_hjust_1.5

Q: Horizontal adjustment practice

Consider the following horizontal bar plot with text labels added:

Code
# sample data
sample_gender <- 
  data.frame(gender = c("F", "M"), 
             n = c(10, 20))

ggplot(sample_gender, aes(x = n, y = gender)) + 
  geom_col() + 
  geom_text(aes(label = n), hjust = 1.2)

Use the hjust or vjust arguments to adjust the position of the text label so that it is inside the bar, with some padding on the right side.

‣ Be cautious with hjust values outside the 0-1 range. The offset is proportional to the width of the label, so labels of different lengths end up at different distances from the anchor point:

Code
# Different text labels with varying lengths
p_xx <- base_p + geom_text(aes(label = "xxx"), hjust = 1.5)
p_xxxx <- base_p + geom_text(aes(label = "xxxxxx"), hjust = 1.5)
p_xxxxxx <- base_p + geom_text(aes(label = "xxxxxxxxx"), hjust = 1.5)

# Combine plots with patchwork
p_xx / p_xxxx / p_xxxxxx

‣ When this matters, you can use nudge_x and nudge_y as an alternative for consistent label positioning.
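
As a sketch of that alternative, the example below keeps hjust = 1 so the right edge of every label sits on the anchor point, then uses nudge_x to add the same padding, in data units, to every label. The data frame sample_counts and the padding of 15 are made up for illustration.

```r
library(ggplot2)

# Hypothetical counts whose labels have different lengths
sample_counts <- data.frame(
  group = c("F", "M"),
  n = c(120, 1250)
)

p_padded <- ggplot(sample_counts, aes(x = n, y = group)) +
  geom_col() +
  geom_text(aes(label = n),
            hjust = 1,      # right edge of the label on the anchor point
            nudge_x = -15,  # then shift every label 15 units to the left
            color = "white")

p_padded
```

Because the padding is set in data units, it stays the same whether the label has three digits or four.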

2.5 Understanding vjust (vertical justification)

vjust adjusts vertical position of text labels.

vjust values also range from 0 to 1:

‣ `vjust = 0`: Bottom edge aligns with anchor point.

‣ `vjust = 0.5`: Centered vertically.

‣ `vjust = 1`: Top edge aligns with anchor point.

Code
# Base plot with a point
p <- ggplot(df, aes(x, y)) + geom_point() + theme_void()

p + geom_text(aes(label = "text"))

Code
p_vjust_0 <- p + geom_text(aes(label = "vjust=0"), vjust = 0)
p_vjust_0.25 <- p + geom_text(aes(label = "vjust=0.25"), vjust = 0.25)
p_vjust_0.5 <- p + geom_text(aes(label = "vjust=0.5"), vjust = 0.5)
p_vjust_0.75 <- p + geom_text(aes(label = "vjust=0.75"), vjust = 0.75)
p_vjust_1 <- p + geom_text(aes(label = "vjust=1"), vjust = 1)

# Combine plots with patchwork
p_vjust_0 / p_vjust_0.25 / p_vjust_0.5 / p_vjust_0.75 / p_vjust_1

‣ Like hjust, vjust can go beyond 0-1 to add padding:

Code
p_vjust_neg0.5 <- p + geom_text(aes(label = "vjust=-0.5"), vjust = -0.5)
p_vjust_1.5 <- p + geom_text(aes(label = "vjust=1.5"), vjust = 1.5)

# Combine plots with patchwork
p_vjust_neg0.5 / p_vjust_0 / p_vjust_0.25 / p_vjust_0.5 / p_vjust_0.75 / p_vjust_1 / p_vjust_1.5

Q: Vertical adjustment practice

Consider the following bar plot with text labels added:

Code
# sample data
sample_gender <- 
  data.frame(gender = c("F", "M"), 
             n = c(10, 20))


ggplot(sample_gender, 
       aes(x = gender, y = n)) + 
  geom_col() + 
  geom_text(aes(label = n), color = "white", vjust = 2)

Use the hjust or vjust arguments to adjust the position of the text label so that it is inside the bar, with some padding on the top.

2.6 Data Example: TB Treatment Outcomes in Benin

‣ We will apply our learned skills to the tb_outcomes dataset.

‣ This dataset was also used in a previous lesson.

Code
tb_outcomes <- read_csv(here::here('data/benin_tb.csv'))
tb_outcomes
# A tibble: 711 × 6
   period period_date hospital        outcome     cases diagnosis_type 
   <chr>  <date>      <chr>           <chr>       <dbl> <chr>          
 1 2015Q4 2015-10-01  St Jean De Dieu failed          0 bacteriological
 2 2015Q4 2015-10-01  St Jean De Dieu unevaluated     0 bacteriological
 3 2015Q4 2015-10-01  St Jean De Dieu died            0 bacteriological
 4 2015Q4 2015-10-01  St Jean De Dieu lost            0 bacteriological
 5 2015Q4 2015-10-01  St Jean De Dieu completed       0 bacteriological
 6 2015Q4 2015-10-01  St Jean De Dieu cured          11 bacteriological
 7 2015Q3 2015-07-01  St Jean De Dieu failed          0 bacteriological
 8 2015Q3 2015-07-01  St Jean De Dieu unevaluated     0 bacteriological
 9 2015Q3 2015-07-01  St Jean De Dieu died            0 bacteriological
10 2015Q3 2015-07-01  St Jean De Dieu lost            0 bacteriological
# ℹ 701 more rows

‣ Our goal: to plot the number of TB cases per hospital.

‣ The total number of cases per hospital isn’t directly available.

‣ We need to first summarize the cases data.

Code
hospital_sums <- tb_outcomes %>% 
  group_by(hospital) %>% 
  summarise(cases = sum(cases))
# Calculate total number of cases per hospital
# Group by hospital, then summarize cases = sum(cases)

‣ We’ll visualize the total number of cases per hospital using hospital_sums.

geom_text() will be used to annotate the bars in the plot.

Code
ggplot(hospital_sums, aes(x=hospital, y = cases)) +
  geom_col(fill="steelblue") +
  geom_text(aes(label = cases), vjust = -0.2)

Code
# Plot using ggplot
# Use geom_col for bars and geom_text for annotations
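
To see what the summarizing step does without the Benin file at hand, here is a small sketch on made-up data. The data frame toy_outcomes and its values are hypothetical; only the group_by() and summarise() pattern mirrors the lesson.

```r
library(dplyr)
library(ggplot2)

toy_outcomes <- data.frame(
  hospital = c("H1", "H1", "H2", "H2", "H2"),
  cases = c(3, 5, 2, 0, 7)
)

# One row per hospital, with the cases added up
toy_sums <- toy_outcomes %>%
  group_by(hospital) %>%
  summarise(cases = sum(cases))

toy_sums

# Same plot recipe as above: bars plus labels just above each bar
ggplot(toy_sums, aes(x = hospital, y = cases)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = cases), vjust = -0.2)
```

If the cases column contained missing values, sum() would return NA for that hospital; sum(cases, na.rm = TRUE) is one way to ignore them.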

Q: Summarize then plot

Consider the aus_tb_notifs dataset imported below, which shows the number of TB cases in urban and rural areas per quarter:

Code
aus_tb_notifs <- read_csv(here::here('data/aus_tb_notifs_modified.csv'))
aus_tb_notifs 
# A tibble: 52 × 4
    year quarter rural urban
   <dbl> <chr>   <dbl> <dbl>
 1  2010 Q1          4    87
 2  2010 Q2          4    98
 3  2010 Q3          5   101
 4  2010 Q4         10   124
 5  2011 Q1          5    81
 6  2011 Q2          4    52
 7  2011 Q3          9   102
 8  2011 Q4          5   100
 9  2012 Q1          9    80
10  2012 Q2          4    63
# ℹ 42 more rows

Create a simple bar plot to visualize the total number of TB cases in urban areas for each year. Label each bar with the total number of cases using geom_text(), placed just above the bar.

Hint: First, aggregate the data by year and sum up the urban cases. Then use ggplot() with geom_col() for the bar plot and geom_text() for the labels.

Code
urban_case <- aus_tb_notifs %>% 
  group_by(year) %>% 
  summarise(urban_cases = sum(urban))
 
ggplot(urban_case, aes(x=year, y=urban_cases)) +
  geom_col(fill="steelblue") +
  geom_text(aes(label = urban_cases), vjust=-0.2)

Further Aesthetic Modifications

‣ Additional aesthetics include:

  • size: the size of the text, in mm
  • angle: the angle of the text, from 0 to 360
  • alpha: the transparency of the text, from 0 to 1
  • color: the color of the text
  • family: the font family of the text, such as “sans”, “serif”, “mono”
  • fontface: the font face of the text, including “plain”, “bold”, “italic”, “bold.italic”
  • group: a grouping variable for the text
  • hjust: horizontal justification of the text
  • vjust: vertical justification of the text
  • lineheight: the line height of the text

‣ Experiment with these aesthetics to understand their impact on the plot.

Code
ggplot(hospital_sums, aes(x = hospital, y = cases)) +
  geom_col(fill = "steelblue") +
  geom_text(aes(label = paste(cases, "\ncases")), 
            size = 5,
            angle = 0,
            alpha = 0.5,
            color = "black",
            family = "mono",
            fontface = "bold",
            hjust = 0.5,
            vjust = 1,
            nudge_y = -10,
            lineheight = 0.8) + 
  theme(axis.text.x = element_text(angle = 90))

2.7 Labeling Stacked Bar Plots

‣ We explore labeling in stacked bar plots with two categorical variables.

‣ We use the tb_outcomes dataset, summarizing by period_date and diagnosis_type.

Code
tb_sum <- tb_outcomes %>% 
  group_by(period_date, diagnosis_type) %>% 
  summarise(cases = sum(cases))
# Summarize the data by period and diagnosis type
# Group by period_date and diagnosis_type
# Summarize cases = sum(cases)

‣ Create a simple stacked bar plot using ggplot2.

Code
quarter_dx_bar <- ggplot(tb_sum, aes(x=period_date, y=cases, fill= diagnosis_type)) +
  geom_col() +
  labs(title = "New and relapse TB cases per quarter",
       subtitle = "Data from six health facilities in Benin, 2015-2017")
# Basic stacked bar plot called quarter_dx_bar
  # Use aes(x = period_date, y = cases, fill = diagnosis_type)
  # geom_col() for bar plot
  # labs(title = "New and relapse TB cases per quarter",
  #     subtitle = "Data from six health facilities in Benin, 2015-2017")

‣ We aim to add text labels to each bar segment using cases.

Code
quarter_dx_bar +
  geom_text(aes(label = cases))

Code
# Add text labels to the bar plot
  # geom_text() with aes(label = cases)

‣ Notice the misalignment of labels - they don’t match bar heights.

‣ To correct this, set position = "stack" in geom_text().

Code
quarter_dx_bar +
  geom_text(aes(label = cases), position = "stack")

Code
# Correct label position
  # Add position = "stack" to geom_text()

‣ For vertical alignment inside bars, we can use vjust in geom_text().

Code
quarter_dx_bar +
  geom_text(aes(label = cases), position = "stack", vjust = 1.5)

Code
# Reposition labels inside the bars
# Add vjust = 1.5 

‣ But what if we want to center the labels in each bar segment? We can use position_stack(vjust = 0.5).

Code
quarter_dx_bar +
  geom_text(aes(label = cases), position = position_stack(vjust = 0.5))

Code
# Center labels vertically in each bar segment
# Adjust position_stack() parameter vjust to 0.5
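
The vjust inside position_stack() is a fraction of each segment’s height, so 0.5 places the label at the midpoint of its segment. The sketch below checks this on a made-up data frame (toy_sum and its values are hypothetical) by inspecting the computed label heights with layer_data().

```r
library(ggplot2)

toy_sum <- data.frame(
  quarter = c("Q1", "Q1", "Q2", "Q2"),
  diagnosis_type = c("bacteriological", "clinical",
                     "bacteriological", "clinical"),
  cases = c(30, 10, 20, 20)
)

p_centered <- ggplot(toy_sum, aes(x = quarter, y = cases, fill = diagnosis_type)) +
  geom_col() +
  geom_text(aes(label = cases), position = position_stack(vjust = 0.5))

# Layer 2 is geom_text(); y is the height at which each label is drawn
stacked_labels <- layer_data(p_centered, 2)
stacked_labels[, c("x", "label", "y")]
```

In Q1, for example, the clinical segment spans 0 to 10 and the bacteriological segment spans 10 to 40, so the labels should sit at 5 and 25.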

‣ This approach also works for horizontal bar plots: flip the axes with coord_flip().

Code
quarter_dx_bar +
  geom_text(aes(label = cases), position = position_stack(vjust = 0.5),
            color = "white",
            fontface = "bold") +
  coord_flip()

Code
# Horizontal bar plot with flipped axes and label modifications
  # Add color and fontface adjustments to geom_text()
  # Use coord_flip() for horizontal orientation

Q: Practice with labeling stacked plots

Create a stacked bar plot showing the distribution per year of TB cases in rural and urban areas using the aus_tb_notifs dataset. Use geom_text() and adjust the position of the labels for clarity.

Hint: Pivot the data so that area_type is a column, then summarize the data by year and area_type, calculating the sum of cases (cases) for each group. The pivoting is done for you in the code below.

Code
# Pivot the data
tb_longer <- aus_tb_notifs %>%
  pivot_longer(cols = c(rural, urban),
               names_to = "area_type", 
               values_to = "cases")

# Summarize the data by year and area type
sum_tb_long <- tb_longer %>% 
  group_by(year, area_type) %>% 
  summarise(cases = sum(cases))

# Create the stacked bar plot
ggplot(sum_tb_long, aes(x = year, y=cases, fill=area_type)) +
  geom_col() +
  geom_text(aes(label = cases), position = position_stack(vjust = 0.5))
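
The pivot-then-summarize pattern from the hint can be tried on a tiny made-up table. In this sketch, toy_notifs and its values are hypothetical, and .groups = "drop" is added only so that the result comes back ungrouped.

```r
library(dplyr)
library(tidyr)

toy_notifs <- data.frame(
  year = c(2010, 2010, 2011),
  quarter = c("Q1", "Q2", "Q1"),
  rural = c(4, 6, 5),
  urban = c(80, 90, 100)
)

# Wide to long: one row per year, quarter and area type
toy_longer <- toy_notifs %>%
  pivot_longer(cols = c(rural, urban),
               names_to = "area_type",
               values_to = "cases")

# One row per year and area type
toy_sum_long <- toy_longer %>%
  group_by(year, area_type) %>%
  summarise(cases = sum(cases), .groups = "drop")

toy_sum_long
```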

2.8 Labeling Dodged Bar Plots

‣ Exploring dodged bar charts: multiple categories displayed side by side.

‣ Start by grouping tb_outcomes dataset by hospital and diagnosis_type.

‣ Calculate the sum of cases (cases) for each group.

Code
hospital_dx_cases <- tb_outcomes %>% 
  group_by(hospital, diagnosis_type) %>% summarise(cases = sum(cases))
# Group tb_outcomes by hospital and diagnosis_type
# Calculate cases = sum(cases)
# Store in hospital_dx_cases

‣ Create a dodged bar chart, setting position = "dodge" in geom_col()

Code
hospital_dx_bar <- hospital_dx_cases %>% ggplot(aes(x = hospital, y = cases, fill = diagnosis_type)) +
  geom_col(position = "dodge")
# Create dodged bar chart with hospital_dx_cases
  # Use ggplot, setting x to hospital, y to cases, and fill to diagnosis_type
  # Add geom_col with position set to dodge
  # Assign to hospital_dx_bar

‣ Annotate chart with geom_text() to display the labels.

Code
hospital_dx_bar +
  geom_text(aes(label = cases))

Code
# Add geom_text to hospital_dx_bar
  # Set aes label to cases

‣ Not quite right! We need to add position = position_dodge() to geom_text() so that the labels align with the bars.

Code
hospital_dx_bar +
  geom_text(aes(label = cases), position = position_dodge())

Code
# Adjust geom_text in hospital_dx_bar
  # Use position_dodge()

Important: Specify a width argument for position_dodge().

‣ Use width = 0.9 for geom_text(), as this is the default width for geom_col().

Code
hospital_dx_bar +
  geom_text(aes(label = cases), position = position_dodge(width = 0.9))

Code
# Adjust geom_text in hospital_dx_bar
# Use position_dodge() with width = 0.9

‣ Finally, shift labels up slightly with vjust for better visualization.

Code
hospital_dx_bar +
  geom_text(aes(label = cases), position = position_dodge(width = 0.9),
            vjust = -0.2)

Code
# Final adjustment to geom_text
# Use position_dodge with width = 0.9 and adjust vjust
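
To see why the width has to match, the sketch below compares the horizontal centers of the dodged bars with the x positions of the labels, using layer_data() on made-up data (toy_dx and its values are hypothetical).

```r
library(ggplot2)

toy_dx <- data.frame(
  hospital = c("H1", "H1", "H2", "H2"),
  diagnosis_type = c("bacteriological", "clinical",
                     "bacteriological", "clinical"),
  cases = c(40, 10, 25, 15)
)

p_dodged <- ggplot(toy_dx, aes(x = hospital, y = cases, fill = diagnosis_type)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = cases),
            position = position_dodge(width = 0.9),
            vjust = -0.2)

bar_data <- layer_data(p_dodged, 1)    # geom_col()
label_data <- layer_data(p_dodged, 2)  # geom_text()

# Center of each bar versus x position of each label
bar_centers <- sort((as.numeric(bar_data$xmin) + as.numeric(bar_data$xmax)) / 2)
label_x <- sort(as.numeric(label_data$x))
data.frame(bar_centers, label_x)
```

With width = 0.9 the two columns should agree. Trying another width, such as 0.5, should pull the labels toward the middle of each hospital’s group and away from the bar centers.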

Q: Practice with labeling dodged bar plots

Generate a dodged bar plot that displays rural and urban TB cases side by side for each year using the aus_tb_notifs dataset. Label each bar using geom_text(), ensuring the labels are correctly aligned.

You can use the code and comments below as a guide:

Code
# Pivot the data
aus_tb_notifs %>%
  pivot_longer(cols = c(rural, urban),
               names_to = "area_type",
               values_to = "cases")
# A tibble: 104 × 4
    year quarter area_type cases
   <dbl> <chr>   <chr>     <dbl>
 1  2010 Q1      rural         4
 2  2010 Q1      urban        87
 3  2010 Q2      rural         4
 4  2010 Q2      urban        98
 5  2010 Q3      rural         5
 6  2010 Q3      urban       101
 7  2010 Q4      rural        10
 8  2010 Q4      urban       124
 9  2011 Q1      rural         5
10  2011 Q1      urban        81
# ℹ 94 more rows

Code
# then summarize the data by year and area type
sum_tb_long <- tb_longer %>% 
  group_by(year, area_type) %>% 
  summarise(cases = sum(cases))

# then create the dodged bar plot
# for the text, use position = position_dodge(width = 0.9)
ggplot(sum_tb_long, aes(x=year, y=cases, fill=area_type)) +
  geom_col(position = "dodge") +
  geom_text(aes(label = cases), position = position_dodge(width = 0.9),
            vjust = -0.2)

3 Labeling Percent-Stacked Bar Plots

‣ For a percent-stacked bar plot, we need to format the labels as percentages so that they match the segments on the chart.

‣ By the end of this section, you’ll have recreated the example graph.

‣ Let’s start with a stacked bar plot showing raw counts.

Code
hosp_dx_sum <- tb_outcomes %>%
  group_by(hospital, diagnosis_type) %>%
  summarise(total_cases = sum(cases))

hosp_dx_sum
# A tibble: 12 × 3
# Groups:   hospital [6]
   hospital         diagnosis_type  total_cases
   <chr>            <chr>                 <dbl>
 1 CHPP Akron       bacteriological         695
 2 CHPP Akron       clinical                180
 3 CS Abomey-Calavi bacteriological         671
 4 CS Abomey-Calavi clinical                120
 5 Hopital Bethesda bacteriological         139
 6 Hopital Bethesda clinical                117
 7 Hopital Savalou  bacteriological          70
 8 Hopital Savalou  clinical                 10
 9 Hopital St Luc   bacteriological         149
10 Hopital St Luc   clinical                 19
11 St Jean De Dieu  bacteriological         100
12 St Jean De Dieu  clinical                 71

Code
hosp_dx_sum %>%
  ggplot(aes(x = hospital, y = total_cases, fill = diagnosis_type)) +
  geom_col() +
  geom_text(aes(label = total_cases),
            position = "stack")

‣ Now we can normalize the bars by setting the position to “fill” in both geom_col() and geom_text(). Here we use the equivalent function form, position_fill().

Code
hosp_dx_sum %>%
  ggplot(aes(x = hospital, y = total_cases, fill = diagnosis_type)) +
  geom_col(position = position_fill()) +
  geom_text(aes(label = total_cases),
            position = position_fill())

Code
# use position_fill() for geom_col() and geom_text()

‣ Next, replace raw values with percentages.

Code
hosp_dx_sum <- tb_outcomes %>%
  group_by(hospital, diagnosis_type) %>%
  summarise(total_cases = sum(cases)) %>% 
  group_by(hospital) %>% 
  mutate(prop = total_cases/sum(total_cases))
# Calculate proportion of cases for each hospital and diagnosis type
# Create a column 'prop' for proportions

‣ Create a bar chart with the prop column.

Code
hosp_dx_fill <- hosp_dx_sum %>%
  ggplot(aes(x = hospital, y = prop, fill = diagnosis_type)) +
  geom_col()
# Use prop for the y-axis in ggplot
# Create a bar chart with geom_col. Don't need position_fill any longer
# assign to hosp_dx_fill

‣ Add labels using geom_text().

Code
hosp_dx_fill +
  geom_text(aes(label = prop), position = position_fill())

Code
# Add labels to the plot using geom_text and position_fill

‣ Improve label readability by formatting decimals and percentages.

Code
hosp_dx_fill +
  geom_text(aes(label = scales::percent(prop)),
            position = position_fill()) 
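
scales::percent() turns proportions into character labels, and its accuracy argument controls the rounding (accuracy = 1 rounds to whole percentages). A small sketch on made-up values, where toy_dx_totals and its numbers are hypothetical:

```r
library(dplyr)

toy_dx_totals <- data.frame(
  hospital = c("H1", "H1", "H2", "H2"),
  diagnosis_type = c("bacteriological", "clinical",
                     "bacteriological", "clinical"),
  total_cases = c(30, 10, 12, 48)
)

toy_props <- toy_dx_totals %>%
  group_by(hospital) %>%
  # Proportion within each hospital, then a formatted label
  mutate(prop = total_cases / sum(total_cases),
         prop_label = scales::percent(prop, accuracy = 1)) %>%
  ungroup()

toy_props
```

The prop_label column holds text such as "75%", which can be mapped directly to the label aesthetic in geom_text().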

‣ Center labels using vjust in position_fill().

Code
hosp_dx_fill +
  geom_text(aes(label = scales::percent(prop)),
            position = position_fill(vjust = 0.5)) 

Code
# Center labels in the bar plot using vjust in position_fill()

‣ Enhance readability with flipped coordinates and aesthetic tweaks.

Code
hosp_dx_fill +
  geom_text(aes(label = scales::percent(prop, accuracy = 1)),
            position = position_fill(vjust = 0.5),
            color = "white",
            fontface = "bold",
            size = 4.5) +
  theme_light() +
  coord_flip()

Code
# Flip coordinates for better readability
# Add aesthetic elements like color, fontface and size to labels

Q: Creating Percent-Stacked Bar Plots with Labels

Transform the aus_tb_notifs data into a percent-stacked bar plot, with a bar for each year, and the fill aesthetic mapped to the area type (rural vs urban).

Label each segment with the percentage of cases using geom_text(). Format the labels as percentages.

You can use the code and comments below as a guide:

Code
# Pivot the data 
tb_longer <- aus_tb_notifs %>%
  pivot_longer(cols = c(rural, urban),
               names_to = "area_type",
               values_to = "cases")

# Then summarize and calculate proportions
sum_tb_long <- tb_longer %>% 
  group_by(year, area_type) %>% 
  summarise(cases = sum(cases)) %>% 
  group_by(year) %>% 
  mutate(prop = cases/sum(cases))


# Next create the percent-stacked bar plot
# For the label, use the scales::percent() function with an accuracy of 1
# Use position_fill(vjust = 0.5) to center the labels
sum_tb_long %>% 
  ggplot(aes(x = year, y = prop, fill = area_type)) +
  geom_col() +
  geom_text(aes(label = scales::percent(prop, accuracy = 1)),
            position = position_fill(vjust = 0.5),
            color = "white",
            fontface = "bold",
            size = 4.5) +
  theme_light() +
  coord_flip()

3.1 Labeling Circular Plots

‣ Begin with summarizing the data by calculating total cases for each hospital.

Code
total_results <- tb_outcomes %>% 
  group_by(hospital) %>% 
  summarise(total_cases = sum(cases))
# Use tb_outcomes data
# Group by hospital
# Summarise to get total_cases
# Save as total_results

‣ Create a simple bar chart to visualize the data before moving to circular plots.

Code
total_results %>% 
  ggplot(aes(x = 4, y = total_cases,
             fill = hospital)) +
  geom_col()

Code
# plot with ggplot
# Set x to 4 for bar chart
# y is total_cases
# fill based on hospital
# Use geom_col()

‣ Transform the bar chart into a basic pie chart using coord_polar().

Code
outcome_pie <- total_results %>% 
  ggplot(aes(x = 4, y = total_cases,
             fill = hospital)) +
  geom_col() +
  coord_polar(theta = "y")
# Take the bar chart results
# Add coord_polar with theta set to "y"
# Assign to outcome_pie

‣ Create a base donut chart with xlim().

Code
outcome_donut <- total_results %>% 
  ggplot(aes(x = 4, y = total_cases,
             fill = hospital)) +
  geom_col() +
  xlim(c(0, 4.5)) +
  coord_polar(theta = "y")

# Start with the pie chart
# Apply xlim set to c(0, 4.5)
# Store as outcome_donut
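
The hole in the donut comes from the x-axis limits: with x = 4, the bar occupies a narrow band around 4, and the unused range between the lower limit and the bar becomes empty space once the plot is bent into polar coordinates. The sketch below, on made-up totals (toy_totals is hypothetical), checks where that band sits with layer_data().

```r
library(ggplot2)

toy_totals <- data.frame(
  hospital = c("H1", "H2", "H3"),
  total_cases = c(50, 30, 20)
)

toy_donut <- ggplot(toy_totals, aes(x = 4, y = total_cases, fill = hospital)) +
  geom_col() +
  xlim(c(0, 4.5)) +
  coord_polar(theta = "y")

# Horizontal extent of the stacked bar before the polar transformation
donut_bars <- layer_data(toy_donut, 1)
range(donut_bars$xmin)
range(donut_bars$xmax)

toy_donut
```

Raising the lower limit, for example with xlim(c(2, 4.5)), should shrink the hole because less empty range is left inside the band.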

‣ Next, focus on adding labels to the pie chart.

Code
outcome_pie +
  geom_text(aes(label = total_cases))

Code
# Add labels using geom_text
  # Set label to total_cases
  # Note: labels will be misplaced initially

‣ Adjust label positions using position_stack() and vjust.

Code
outcome_pie +
  geom_text(aes(label = total_cases), position = position_stack(vjust = 0.5)) 

Code
# Update geom_text
  # Add position argument with position_stack
  # vjust set to 0.5 to center labels

‣ Apply similar adjustments using geom_label().

Code
outcome_pie +
  geom_label(aes(label = total_cases), position = position_stack(vjust = 0.5),
             show.legend = FALSE)

Code
# Use geom_label instead of geom_text
# Same position and x adjustments
# Address legend issue with show.legend = FALSE

‣ Label the donut chart similarly, centering labels in sections.

Code
outcome_donut +
  geom_text(aes(label = total_cases), position = position_stack(vjust = 0.5)) 

Code
# Apply geom_text to donut chart

‣ Enhance the chart’s aesthetics with theme_void(), color adjustments, and label styling.

Code
outcome_donut +
  geom_text(aes(label = total_cases), position = position_stack(vjust = 0.5),
            color = "white",
            fontface = "bold") +
  scale_fill_viridis_d() +
  theme_void()

Code
# Final aesthetic adjustments
# Use theme_void for cleaner look 
# Apply new color palette with scale_fill_viridis_d
# Update geom_text for label visibility and style

Congratulations! The charts should now look great!

Q: Labeling Pie Charts

Plot total TB cases in all rural vs urban areas in the aus_tb_notifs dataset as a pie chart. Use geom_text() to place labels correctly, indicating the number of cases in that area.

You can use the code and comments below as a guide:

Code
# Pivot then summarize the total cases per area type
pie_table <-  aus_tb_notifs %>%
  pivot_longer(cols = c(rural, urban),
               names_to = "area_type",
               values_to = "cases") %>% 
  group_by(area_type) %>%
  summarise(total_cases = sum(cases)) 

# Now, create the pie chart
# For the text labels, use geom_text() and position_stack(vjust = 0.5)
ggplot(pie_table, aes(x = 4, y = total_cases,
             fill = area_type)) +
  geom_col() +
  coord_polar(theta = "y") +
  geom_text(aes(label = total_cases), position = position_stack(vjust = 0.5)) +
  theme_void()

Pro-Tip: Enhancing Text Labels with ggtext

For advanced plotters seeking even more sophisticated control over text formatting in ggplot2, the {ggtext} package may come in handy.

Consider the example below, which uses {ggtext} for the plot title, subtitle and bar labels:

Code
pacman::p_load(tidyverse, ggtext, medicaldata)

# Data and Plot
medicaldata::strep_tb %>% 
  count(gender) %>% 
  mutate(gender_label = paste0("**<span style='font-size:16pt'>", n, "</span>**", 
                               if_else(gender == "M", " men", " women"))) %>% 
  ggplot(aes(x = gender, fill = gender, y = n)) +
  geom_col() +
  scale_fill_manual(values = c("M" = "#ee6c4d", "F" = "#424874")) +
  labs(
    title = "<b><span style='color:#424874; font-size:19pt'>Female</span> vs
    <span style='color:#ee6c4d; font-size:19pt'>Male</span> 
    Patients in Strep Study</b>",
    subtitle = "<span style='color:gray60'>A demonstration of custom text labels with </span>**{ggtext}**") +
  theme_classic() +
  theme(plot.title = element_textbox_simple(), 
        plot.subtitle = element_textbox_simple(),
        legend.position = "none", 
        axis.text.x = element_blank()) +
  geom_richtext(aes(label = gender_label, y = n/2), 
                label.r = grid::unit(5, "pt"), fill = "white")

To learn more about {ggtext}, visit the package website.

4 Learning Objectives

After this lesson, you will be able to:

  1. Use two different text geoms to label ggplots:
    • geom_text() for simple labels
    • geom_label() for emphasized labels
  2. Transform and summarize data into the appropriate format for different chart types.
  3. Adjust text placement to position labels on stacked, dodged, and percent-stacked bar plots.
  4. Position labels on pie charts and donut plots.