Crafting Publication-Ready Graphics in R

Facets, coordinates, annotations, themes, and professional export in ggplot2

Bogdan G. Popescu

John Cabot University

Introduction

Today we move beyond geoms/aesthetics to compose, clarify, and polish plots.

  • Facets vs. color/fill: when to split panels vs. overlay groups.
  • Coordinates for honesty: coord_cartesian() (zoom w/o dropping), coord_flip() (long labels). Beware truncated axes.
  • Annotations that guide the eye: geom_text(), geom_label(), ggrepel, arrows/callouts.
  • Themes that professionalize: built-ins (gray, bw, minimal, classic, void) + ggthemes (e.g., economist, wsj); quick tweaks.
  • Combining, motion, export: ggarrange(), gganimate/ggiraph, ggsave() (formats, aspect, DPI).

Simple Geoms

Aesthetics & the Layer Cake

  • Geoms are the marks (points, lines, bars) your data takes on a plot.
  • Aesthetics: color, fill, size, shape, alpha, linetype.
  • Today’s focus:

    • Facets (small multiples)
    • Coordinates (zoom/flip)
    • Annotations (guide the eye)
    • Themes (professional polish).

Facets

Facets

What are they?

Facets split the data into multiple small plots (panels).

  • Each panel shows only the subset of data for that category.

When to use facets?

  • When you want to give each group its own space (avoid overlap/clutter).
  • When the comparison you care about is between panels, not within the same axes.

When to use color/fill aesthetics?

  • When you want to directly compare groups side by side on the same scale.
  • When the overlap is not a problem, or even desirable.

Facets

geom_line

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

This is our dataset:

library(ggplot2)
set.seed(123)

# Toy dataset with US and UK
eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(
    # US presidential elections
    54, 60, 62, 58, 56, 65,
    # UK general elections (closest years aligned to US election years for teaching)
    59, 61, 65, 66, 68, 67
  ),
  country = rep(c("United States", "United Kingdom"), each = 6)
)

Facets

geom_line

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

This is what it looks like:

year turnout country
2000 54 United States
2004 60 United States
2008 62 United States
2012 58 United States
2016 56 United States
2020 65 United States
2000 59 United Kingdom

Facets

geom_line

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

This is what we had previously:

ggplot(eg5, aes(x = year, y = turnout, color = country)) +
  geom_line()

Facets

geom_line

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

This is the alternative using facets:

ggplot(eg5, aes(x = year, y = turnout)) +
  geom_line() +
  facet_wrap(~ country, ncol = 1)
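
By default every panel shares the same axes, which keeps the two countries directly comparable. A minimal sketch, assuming we also want to see each country's own variation, uses the scales argument of facet_wrap(). The object names base, p_fixed and p_free are only illustrative.

```r
library(ggplot2)

# Same toy data as above
eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(54, 60, 62, 58, 56, 65,
              59, 61, 65, 66, 68, 67),
  country = rep(c("United States", "United Kingdom"), each = 6)
)

base <- ggplot(eg5, aes(x = year, y = turnout)) +
  geom_line()

# Shared y-axis (the default): levels are comparable across panels
p_fixed <- base + facet_wrap(~ country, ncol = 1)

# Free y-axis: each panel zooms to its own range, so compare shapes, not levels
p_free <- base + facet_wrap(~ country, ncol = 1, scales = "free_y")

p_fixed
p_free
```

Free scales make small movements easier to see, but readers can no longer compare heights across panels, so the shared default is usually the safer choice.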

Facets

geom_histogram

Suppose we surveyed people about their trust in government on a 1–10 scale (1 = no trust, 10 = complete trust). We want to compare typical values and how spread out the answers are for men and women.

This is our dataset:

library(ggplot2)
set.seed(123)
# Toy survey dataset
eg6 <- data.frame(
  gender = rep(c("Men", "Women"), each = 20),
  trust = c(
    rnorm(20, mean = 5, sd = 2),
    rnorm(20, mean = 8, sd = 1)
  )
)

Facets

geom_histogram

Suppose we surveyed people about their trust in government on a 1–10 scale (1 = no trust, 10 = complete trust). We want to compare typical values and how spread out the answers are for men and women.

This is what it looks like:

gender trust
Men 3.879049
Men 4.539645
Men 8.117417
Men 5.141017
Men 5.258576

Facets

geom_histogram

Suppose we surveyed people about their trust in government on a 1–10 scale (1 = no trust, 10 = complete trust). We want to compare typical values and how spread out the answers are for men and women.

This is what we had previously:

ggplot(eg6, aes(x = trust, fill = gender)) +
  geom_histogram(bins = 10, color = "white", alpha = 0.5)

Facets

geom_histogram

Suppose we surveyed people about their trust in government on a 1–10 scale (1 = no trust, 10 = complete trust). We want to compare typical values and how spread out the answers are for men and women.

This is the alternative using facets:

ggplot(eg6, aes(x = trust)) +
  geom_histogram(bins = 10, color = "white") +
  facet_wrap(~ gender, ncol = 1)               # one panel per gender
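
To make the "typical value" in each panel explicit, one option is to overlay the group mean. The sketch below is an illustration rather than part of the original example: the helper data frame means and the dashed red line are assumptions.

```r
library(ggplot2)
set.seed(123)

# Same toy survey data as above
eg6 <- data.frame(
  gender = rep(c("Men", "Women"), each = 20),
  trust = c(rnorm(20, mean = 5, sd = 2),
            rnorm(20, mean = 8, sd = 1))
)

# One mean per gender; aggregate() keeps the column names gender and trust
means <- aggregate(trust ~ gender, data = eg6, FUN = mean)

p_means <- ggplot(eg6, aes(x = trust)) +
  geom_histogram(bins = 10, color = "white") +
  # Because `means` has a gender column, each line lands in its own panel
  geom_vline(data = means, aes(xintercept = trust),
             color = "red", linetype = "dashed") +
  facet_wrap(~ gender, ncol = 1)

p_means
```

Stacking the panels in one column keeps the x-axes aligned, so the two mean lines can be compared by eye.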

Putting Plots Side by Side

ggarrange()

Sometimes you don’t want facets (splitting one dataset).

Instead, you might want to combine different plots — for example:

library(ggpubr)

# Plot 1: voter turnout
p1 <- ggplot(eg5, aes(x = year, y = turnout, color = country, group = country)) +
  geom_line(linewidth = 1.2) +
  labs(title = "Voter Turnout")

# Plot 2: trust in government
p2 <- ggplot(eg6, aes(x = trust, fill = gender)) +
  geom_histogram(bins = 10, color = "white", alpha = 0.7) +
  labs(title = "Trust in Government")

# Arrange them side by side
ggarrange(p1, p2, ncol = 2)
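
ggarrange() has a few more arguments that help when the plots should not get equal space. The sketch below assumes the ggpubr package is installed; the panel tags "A"/"B" and the 2:1 height ratio are illustrative choices.

```r
library(ggplot2)
library(ggpubr)
set.seed(123)

eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(54, 60, 62, 58, 56, 65,
              59, 61, 65, 66, 68, 67),
  country = rep(c("United States", "United Kingdom"), each = 6)
)
eg6 <- data.frame(
  gender = rep(c("Men", "Women"), each = 20),
  trust = c(rnorm(20, mean = 5, sd = 2),
            rnorm(20, mean = 8, sd = 1))
)

p1 <- ggplot(eg5, aes(x = year, y = turnout, color = country)) +
  geom_line(linewidth = 1.2) +
  labs(title = "Voter Turnout")

p2 <- ggplot(eg6, aes(x = trust, fill = gender)) +
  geom_histogram(bins = 10, color = "white", alpha = 0.7) +
  labs(title = "Trust in Government")

# Stack the plots, tag them A and B, and give the first plot twice the height
combined <- ggarrange(p1, p2, ncol = 1, nrow = 2,
                      labels = c("A", "B"), heights = c(2, 1))
combined
```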

Coordinates

Coordinates

Coordinates in ggplot2

  • Control how data space is mapped to the plot space
  • Can zoom in/out or flip axes
  • Crucial for clarity and honesty in visualizations

Common functions:

  • coord_cartesian() – zoom without dropping data
  • coord_flip() – swap x & y axes (useful for long labels)

Coordinates

coord_cartesian

Let us check out another toy example inspired by real data.

The data looks like this:

brexit_data <- data.frame(
  side = c("Leave", "Remain"),
  support = c(52, 48)
)

side support
Leave 52
Remain 48

Coordinates

coord_cartesian

This is how we can plot our data:

ggplot(brexit_data, aes(x = side, y = support, fill = side)) +
  geom_col()

Coordinates

coord_cartesian

This is how we can truncate the axis (zoom the view) without dropping data:

ggplot(brexit_data, aes(x = side, y = support, fill = side)) +
  geom_col() +
  coord_cartesian(ylim = c(45, 55))  # Truncates axis and exaggerates gap

Coordinates

coord_cartesian

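Moral: Truncated axes exaggerate differences.

coord_cartesian() only zooms the view and keeps every observation, so use it when you must zoom, and tell readers that the axis does not start at zero.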
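
"Zoom without dropping" is easiest to see by contrast with setting scale limits. The sketch below is an illustration under the assumption that we try the same 45 to 55 window both ways; the object names zoomed and limited are hypothetical.

```r
library(ggplot2)

brexit_data <- data.frame(
  side = c("Leave", "Remain"),
  support = c(52, 48)
)

base <- ggplot(brexit_data, aes(x = side, y = support, fill = side)) +
  geom_col()

# Zoom the view only: both bars are still computed and drawn,
# they are just clipped at the edge of the panel
zoomed <- base + coord_cartesian(ylim = c(45, 55))

# Change the scale limits instead: bars start at 0, which lies outside
# 45-55, so the out-of-range values become NA and the bars are removed
limited <- base + scale_y_continuous(limits = c(45, 55))

layer_data(zoomed)[, c("ymin", "ymax")]
layer_data(limited)[, c("ymin", "ymax")]
```

Printing limited typically produces a warning about removed rows and an empty panel, which is why coord_cartesian() is the safer tool for zooming.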

Flipping

coord_flip

  • coord_flip() swaps x and y axes, rotating the entire plot.

  • Useful for bar charts with long category labels (improves readability).

  • Often makes rankings and comparisons easier to interpret.

Flipping

coord_flip

Suppose we have survey data recording each respondent’s education level. Here we don’t want to plot individual points. Instead, we compare the number of respondents in each category (categories on x, counts on y).

# Raw rows: one row per respondent; geom_bar() counts them for us
eg2 <- data.frame(
  education = c(
    "Primary", "Primary", "High School", "High School", "High School",
    "College", "College", "College", "College"
  )
)

education
Primary
Primary
High School
High School

Flipping

coord_flip

Suppose we have survey data recording each respondent’s education level. Here we don’t want to plot individual points. Instead, we compare the number of respondents in each category (categories on x, counts on y).

ggplot(eg2, aes(x = education)) +
  geom_bar()

Flipping

coord_flip

Suppose we have survey data recording each respondent’s education level. Here we don’t want to plot individual points. Instead, we compare the number of respondents in each category (categories on x, counts on y).

This is how we can use coord_flip():

ggplot(eg2, aes(x = education)) +
  geom_bar() +
  coord_flip()
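
An alternative route to horizontal bars, assuming ggplot2 3.3.0 or later, is to map the category to y directly instead of flipping the coordinates. The object name p_horizontal is illustrative.

```r
library(ggplot2)

eg2 <- data.frame(
  education = c(
    "Primary", "Primary", "High School", "High School", "High School",
    "College", "College", "College", "College"
  )
)

# Mapping the category to y makes geom_bar() count along the x-axis,
# which draws horizontal bars without coord_flip()
p_horizontal <- ggplot(eg2, aes(y = education)) +
  geom_bar()

p_horizontal
```

Both versions show the same counts; the difference is only in which axis names you use later in labs() and theme().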

Annotations

Annotations

Why Annotations?

Raw plots ≠ finished plots.

Annotations direct the reader’s eye and add context.

Labels vs Annotations

  • Labels: Titles, axis labels, captions (labs()).
  • Annotations: Extra text or marks placed on the plot itself.

Annotations

geom_text

We learned to use geom_text in this example:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),   # democracy score
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27),  # GDP per capita in $1,000s
  country = c(
    "North Korea",   # very autocratic, very poor
    "Saudi Arabia",  # autocratic, but richer due to oil
    "Zimbabwe",      # authoritarian, low GDP
    "Russia",        # hybrid regime, middle income
    "Nigeria",       # similar position
    "India",         # low–mid democracy, growing GDP
    "Brazil",        # democracy, mid GDP
    "Poland",        # consolidated democracy, higher GDP
    "South Korea"))   # rich democracy

Annotations

geom_text

We learned to use geom_text in this example:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

democracy gdp country
-8 2 North Korea
-7 9 Saudi Arabia
-5 4 Zimbabwe
-3 7 Russia
0 8 Nigeria

Annotations

geom_text

We learned to use geom_text in this example:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# Scatterplot with geom_point
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5)

Annotations

geom_label

We could also use geom_label in this example:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# Scatterplot with geom_point
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_label(aes(label = country), vjust = -1, size = 5) 

Annotations

geom_text_repel

We could also use geom_text_repel in this example to avoid clutter:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

library(ggrepel)

# Scatterplot with geom_point
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text_repel(aes(label = country), size = 5)

Annotations

geom_label_repel

We could also use geom_label_repel in this example to avoid clutter:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

library(ggrepel)
# Scatterplot with geom_point
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_label_repel(aes(label = country), size = 5)

Annotations

annotate

We can add a custom note and an arrow on top of the labelled scatterplot with annotate():

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# Scatterplot with geom_point
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  annotate("text", x = -7, y = 11, label = "Oil-rich outlier", 
           color = "red", size = 6, fontface = "bold") +
  annotate("segment", x = -7, xend = -7, y = 9, yend = 10.5, 
           arrow = arrow(length = unit(0.2, "cm")), color = "red")
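
When labelling every point becomes cluttered, another way to guide the eye is to label only the cases that matter for the story. This is a sketch under the assumption that Saudi Arabia and South Korea are the points of interest; the highlight data frame and the colors are illustrative.

```r
library(ggplot2)

eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27),
  country = c("North Korea", "Saudi Arabia", "Zimbabwe", "Russia", "Nigeria",
              "India", "Brazil", "Poland", "South Korea")
)

# Hypothetical choice of which countries to call out
highlight <- subset(eg1, country %in% c("Saudi Arabia", "South Korea"))

p_highlight <- ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5, color = "grey60") +                 # all countries, muted
  geom_point(data = highlight, size = 5, color = "red") +  # selected countries
  geom_text(data = highlight, aes(label = country), vjust = -1, size = 5)

p_highlight
```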

Labs

Labs

Why labs()?

Every good plot needs context. labs() lets you set:

  • Title / Subtitle — what the figure is about
  • Axis labels — what each dimension represents
  • Caption — notes, sources, disclaimers
  • Legend title — clarify what colors/shapes mean

Labs

Example with Democracy & GDP

Previously, we plotted democracy score vs GDP with labels:

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point() +
  geom_text(aes(label = country), vjust = -1, size = 5)

Labs

Adding Context with labs()

Now, we can improve readability:

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) +
  labs(
    title = "Democracy and Wealth",
    subtitle = "Higher democracy scores often align with higher GDP",
    x = "Democracy Score (-10 = Autocracy, +10 = Democracy)",
    y = "GDP per Capita (in $1,000s)",
    caption = "Toy dataset, inspired by real-world patterns"
  )
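
The scatterplot has no legend, so the legend-title use of labs() is easier to see on the turnout data. In this sketch the axis wording and the legend title "Country" are illustrative choices.

```r
library(ggplot2)

eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(54, 60, 62, 58, 56, 65,
              59, 61, 65, 66, 68, 67),
  country = rep(c("United States", "United Kingdom"), each = 6)
)

p_legend <- ggplot(eg5, aes(x = year, y = turnout, color = country)) +
  geom_line() +
  labs(
    title = "Voter Turnout",
    x = "Election year",
    y = "Turnout (%)",
    color = "Country"   # legend title: named after the aesthetic it describes
  )

p_legend
```

The rule of thumb is that the argument name in labs() matches the aesthetic in aes(): color = for a color legend, fill = for a fill legend, and so on.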

Themes

Themes

What are Themes?

Control non-data elements of the plot.

Fonts, background, gridlines, legend placement, margins.

Do not affect the data → only the presentation.

Good themes = clarity + professionalism.

Themes

Built-in Themes: theme_grey()

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# Scatterplot with geom_point
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_grey()

Themes

Built-in Themes: theme_bw()

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_bw()

Themes

Built-in Themes: theme_minimal()

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_minimal()

Themes

Built-in Themes: theme_classic()

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_classic()

Themes

Built-in Themes: theme_void()

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_void()

Themes

Supplementary Themes: ggthemes - theme_economist

To access additional themes, install the ggthemes package.

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

library(ggthemes)
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_economist()

Themes

Supplementary Themes: ggthemes - theme_wsj

To access additional themes, install the ggthemes package.

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

library(ggthemes)
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_wsj()

Themes

Tweaking a Theme

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# Scatterplot with geom_point
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  geom_text(aes(label = country), vjust = -1, size = 5) + 
  theme_minimal() +
  theme(panel.background = element_rect(fill = "white"), # Set background color
        axis.title.x = element_text(color = "#BF0404", face = "bold"), # Set x-axis title color
        axis.title.y = element_text(color = "#BF0404", face = "bold"), # Set y-axis title color
        axis.line = element_line(color = "#BF0404", linewidth = 1.5), # linewidth replaces the deprecated size
        panel.grid = element_blank(),
        panel.border = element_blank(),
        panel.grid.major.y = element_line(color = "#40403E", linewidth = 0.5, linetype = "dotted"))
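
If the same tweaks are needed on several figures, one possible pattern is to store them once and reuse them. The name theme_course and the specific settings below are hypothetical; this is a sketch, not a recommended house style.

```r
library(ggplot2)

eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27)
)

# Hypothetical reusable style: a built-in theme plus a few tweaks
theme_course <- theme_minimal(base_size = 14) +
  theme(
    panel.grid.minor = element_blank(),
    plot.title = element_text(face = "bold"),
    legend.position = "bottom"
  )

p_styled <- ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(size = 5) +
  labs(title = "Democracy and Wealth") +
  theme_course

p_styled

# Optional: make it the default for every later plot in the session
# theme_set(theme_course)
```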

Interactive Plots

Interactive Plots

Time Series

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

set.seed(123)
library(gganimate)

# Toy dataset with US and UK
eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(
    # US presidential elections
    54, 60, 62, 58, 56, 65,
    # UK general elections (closest years aligned to US election years for teaching)
    59, 61, 65, 66, 68, 67
  ),
  country = rep(c("United States", "United Kingdom"), each = 6)
)

Interactive Plots

Time Series

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

year turnout country
2000 54 United States
2004 60 United States
2008 62 United States
2012 58 United States
2016 56 United States
2020 65 United States
2000 59 United Kingdom

Interactive Plots

Time Series

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

ggplot(eg5, aes(x = year, y = turnout, color = country, group = country)) +
  geom_line() +
  transition_reveal(year) +
  labs(
    subtitle = "Year: {round(frame_along)}"
  )
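
To share an animation it has to be rendered and written to a file. The sketch below assumes the gifski package is available as the renderer; the frame count, frame rate, pixel size and output file name are illustrative.

```r
library(ggplot2)
library(gganimate)

eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(54, 60, 62, 58, 56, 65,
              59, 61, 65, 66, 68, 67),
  country = rep(c("United States", "United Kingdom"), each = 6)
)

anim <- ggplot(eg5, aes(x = year, y = turnout, color = country, group = country)) +
  geom_line() +
  transition_reveal(year) +
  labs(subtitle = "Year: {round(frame_along)}")

# Rendering needs a renderer; this sketch only renders if gifski is installed
if (requireNamespace("gifski", quietly = TRUE)) {
  rendered <- animate(anim, nframes = 40, fps = 10,
                      width = 800, height = 500,   # size in pixels
                      renderer = gifski_renderer())
  # Hypothetical output location
  anim_save(file.path(tempdir(), "turnout.gif"), animation = rendered)
}
```

More frames give smoother motion but a larger file, so it is worth starting small and increasing nframes only if the result looks jerky.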

Interactive Plots

Time Scatterplots

Let us first create a panel dataset.

library(dplyr)
library(tidyr)

# Base cross-section
eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),   # democracy score
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27),  # GDP per capita ($1,000s)
  country = c(
    "North Korea",
    "Saudi Arabia",
    "Zimbabwe",
    "Russia",
    "Nigeria",
    "India",
    "Brazil",
    "Poland",
    "South Korea"
  )
)

# Add a fake panel (2000–2020 every 5 years)
set.seed(123)  # reproducible "wiggles"

eg_panel <- eg1 %>%
  slice(rep(1:n(), each = 5)) %>%           # repeat each country 5 times
  mutate(year = rep(seq(2000, 2020, 5), times = nrow(eg1))) %>%
  group_by(country) %>%
  mutate(
    # let democracy scores drift a bit
    democracy = democracy + cumsum(runif(5, -0.6, 0.3)),
    # let GDP grow with some noise
    gdp = gdp + cumsum(runif(5, 0, 2))
  )

Interactive Plots

Time Scatterplots

Let us first examine the data:

democracy gdp country year
-7.790158 3.330230 North Korea 2000
-8.168679 3.519912 North Korea 2005
-8.730825 4.287851 North Korea 2010
-9.035696 4.836618 North Korea 2015
-8.776643 6.465898 North Korea 2020
-6.733278 9.440238 Saudi Arabia 2000
-6.521209 10.199871 Saudi Arabia 2005

Interactive Plots

Time Scatterplots

We can now animate the labelled scatterplot over time:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# Scatterplot with geom_point
ggplot(eg_panel, aes(x = democracy, y = gdp, label = country)) +
  geom_point(size = 5) +
  geom_text(vjust = -1, size = 5, show.legend = FALSE) +
  labs(
    subtitle = "Year: {round(frame_time)}"
  ) +
  transition_time(round(year)) +
  ease_aes('linear')

Interactive Plots

Interactive Maps

library(ggplot2)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
library(ggiraph)
library(glue)
library(scales)


world <- ne_countries(scale = "medium", 
                      returnclass = "sf")
europe_bounds <- list(x = c(-10, 40),
                      y = c(35, 70))

# Interactive ggplot: fill by log population, show name and population on hover
p <- ggplot() +
  geom_sf_interactive(
    data = world,
    aes(
      fill = log(pop_est),
      tooltip = glue("{admin}: {comma(pop_est)}")  # hover text
    ),
    color = "white", linewidth = 0.1
  ) +
  coord_sf(
    xlim = europe_bounds$x,
    ylim = europe_bounds$y
  ) +
  theme_grey(base_size = 25) +
  scale_fill_viridis_c(option = "viridis")

# Render interactive plot with tooltips
girafe(ggobj = p)
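
The same idea works without map data. A lighter sketch, assuming only ggplot2 and ggiraph are installed, adds hover tooltips to the democracy and GDP scatterplot; the object names p_int and widget are illustrative.

```r
library(ggplot2)
library(ggiraph)

eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27),
  country = c("North Korea", "Saudi Arabia", "Zimbabwe", "Russia", "Nigeria",
              "India", "Brazil", "Poland", "South Korea")
)

p_int <- ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point_interactive(
    aes(tooltip = country, data_id = country),  # hover text and hover key
    size = 5
  )

# Turn the ggplot into an interactive HTML widget
widget <- girafe(ggobj = p_int)
widget

# Optional, assuming the htmlwidgets package is installed:
# htmlwidgets::saveWidget(widget, "scatter.html")
```

The pattern is the same as in the map: swap a geom for its _interactive twin, add a tooltip aesthetic, and pass the plot to girafe().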

Saving Figures

This was the first example we had:

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),   # democracy score
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27)  # GDP per capita in $1,000s
)

Saving Figures

We save the plot to an object and then print it:

# Scatterplot with geom_point
pic <- ggplot(data = eg1,
              aes(x = democracy, y = gdp)) +
  geom_point()
print(pic)

Saving Figures

The command to save a picture is ggsave().

We can save it as:

  • PDF → small file, best for documents and printing
# With no width/height given, ggsave() uses the size of the current graphics device
ggsave(pic, filename = "pic.pdf")
  • PNG → crisp, good for websites or presentations
ggsave(pic, filename = "pic.png")
  • JPG → also common, but usually larger file size
ggsave(pic, filename = "pic.jpg")

Saving Figures

Controlling Shape

When you save, you can tell R how wide and tall the picture should be.

ggsave(pic, filename = "pic_square.png", width = 20, height = 20, units = "cm")

Saving Figures

Controlling Shape

When you save, you can tell R how wide and tall the picture should be.

ggsave(pic, filename = "pic_height.png", width = 10, height = 20, units = "cm")

Saving Figures

Controlling Shape

When you save, you can tell R how wide and tall the picture should be.

ggsave(pic, filename = "pic_width.png", width = 20, height = 10, units = "cm")

Saving Figures

Image Quality

Here’s a new word: dpi = dots per inch.

  • Higher dpi = sharper image, but larger file.
  • Lower dpi = fuzzier image, but smaller file.
ggsave(pic, filename = "pic_hi.png", width = 20, height = 20, units = "cm", dpi = 300)

Saving Figures

Image Quality

Here’s a new word: dpi = dots per inch.

  • Higher dpi = sharper image, but larger file.
  • Lower dpi = fuzzier image, but smaller file.
ggsave(pic, filename = "pic_lo.png", width = 20, height = 20, units = "cm", dpi = 30)
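
For raster formats such as PNG, the physical size and the dpi together fix the number of pixels: pixels = inches × dpi, with 1 inch = 2.54 cm. The sketch below works through that arithmetic for the two files above; the helper cm_to_px() and the temporary file names are hypothetical.

```r
library(ggplot2)

eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27)
)
pic <- ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point()

# Pixels per side = (centimetres / 2.54) * dpi
cm_to_px <- function(cm, dpi) round(cm / 2.54 * dpi)
cm_to_px(20, 300)  # 2362 pixels per side
cm_to_px(20, 30)   # 236 pixels per side

# Save both versions to temporary files and compare their sizes on disk
hi_file <- tempfile(fileext = ".png")
lo_file <- tempfile(fileext = ".png")
ggsave(hi_file, plot = pic, width = 20, height = 20, units = "cm", dpi = 300)
ggsave(lo_file, plot = pic, width = 20, height = 20, units = "cm", dpi = 30)
file.size(c(hi_file, lo_file))
```

A common rule of thumb is about 300 dpi for print and a lower value for screens, but the right number depends on where the figure will be shown.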

Conclusion

Big Takeaways

  • Compose wisely:

    • Use facets when comparisons are between groups;
    • Use color/fill when comparisons are within the same axes.
  • Be honest with coordinates:

    • Zoom with coord_cartesian();
    • Flip long labels with coord_flip();
    • Avoid misleading truncated axes.
  • Guide the eye:

    • Annotations (geom_text[_repel], geom_label[_repel], arrows) for readable stories.
  • Context matters: labs() for titles, subtitles, axes, captions, and clear legends.
  • Polish with themes: Built-ins or ggthemes to professionalize—data first, décor second.
  • Combine & share:

    • ggarrange() to curate narratives;
    • gganimate/ggiraph to add motion/interaction;
    • ggsave to export with the right size & DPI.