Designing with Aesthetics

Color, Size, Shape, Transparency, and Linetype

Bogdan G. Popescu

bogdan.popescu@johncabot.edu

John Cabot University

Learning Outcomes

By the end of this lecture, you will be able to:

Identify and apply key aesthetics in ggplot2 (color, size, shape, transparency, linetype)
Distinguish between mapping (data-driven) and setting (fixed) aesthetics
Enhance plots by layering aesthetics onto common geoms (points, bars, lines, text)
Evaluate when aesthetics improve clarity vs. when they create clutter

Simple Geoms

Aesthetics

Geoms are the “shapes” your data takes on a graph.

Aesthetics bring plots to life: color, size, shape, transparency, linetype

aes() maps variables to these aesthetics, linking data to visual features.

Aesthetics highlight comparisons and guide attention

Simple Geoms

Examples

These are the simple geoms that we covered previously.

Simple Geoms

Reference

This is what they are used for:

geom	Use for
`geom_point()`	Relationships between two variables; at least 10 obs.
`geom_col()`	Totals/percent per category; ordered bars. Uses pre-computed values for bar height. You must supply both `x` and `y`
`geom_bar()`	Counts the observations in each category. You must supply `x`, not `y`
`geom_text()`	Direct labels for small N; annotate outliers
`geom_line()`	Trends across an ordered variable, usually time; connects observations in order of `x`
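
To make the difference between `geom_bar()` and `geom_col()` concrete, here is a minimal sketch. The `raw` and `counts` data frames are hypothetical and not part of the lecture data; `layer_data()` is used only to inspect the bar heights that ggplot2 computes.

```r
library(ggplot2)

# Hypothetical raw data: one row per respondent (not from the lecture)
raw <- data.frame(answer = c("Yes", "No", "Yes", "Maybe", "Yes", "No"))

# Pre-computed counts: one row per category, with the count in column n
counts <- as.data.frame(table(answer = raw$answer), responseName = "n")

# geom_bar() counts the rows itself, so it only needs x
p_bar <- ggplot(raw, aes(x = answer)) +
  geom_bar()

# geom_col() draws the heights it is given, so it needs x and y
p_col <- ggplot(counts, aes(x = answer, y = n)) +
  geom_col()

# Extract the bar heights each layer ends up drawing
bar_heights <- layer_data(p_bar)$count
col_heights <- layer_data(p_col)$y
```

Both plots should show the same bars; the only difference is who does the counting.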

Point Aesthetics

geom_point

Aesthetics

Let’s explore how aesthetics change meaning and clarity.

geom_point

Aesthetics

This is what they are used for:

Aesthetic	Use for
`color` (discrete)	Differentiate categories with distinct hues
`color` (continuous)	Show gradual change or intensity across a numeric scale
`size` (discrete + continuous)	Represent magnitude, frequency, or importance; best with moderate differences
`fill` (discrete + continuous)	Similar to `color` but applies to the interior of filled shapes (bars, areas, points with `shape = 21–25`)
`shape` (discrete)	Distinguish categories when colors alone aren’t enough; limited shapes available
`alpha` (discrete + continuous)	Control transparency to show overlap/density; reduce clutter in crowded plots

geom_point

Simple Geom

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

eg1 <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),   # democracy score
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27)  # GDP per capita in $1,000s
)

geom_point

Simple Geom

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

democracy	gdp
-8	2
-7	9
-5	4
-3	7
0	8

geom_point

Simple Geom

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# Scatterplot with geom_point
library(ggplot2)

ggplot(data = eg1,
       aes(x = democracy, y = gdp)) +
  geom_point()

geom_point

color(discrete)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

Let’s imagine that we have a new dataset with more variables:

# Reusable dataset
eg1 <- data.frame(
  country = c(
    "North Korea",   # very autocratic, very poor
    "Saudi Arabia",  # autocratic, but richer due to oil
    "Zimbabwe",      # authoritarian, low GDP
    "Russia",        # hybrid regime, middle income
    "Nigeria",       # similar position
    "India",         # low–mid democracy, growing GDP
    "Brazil",        # democracy, mid GDP
    "Poland",        # consolidated democracy, higher GDP
    "South Korea"   # rich democracy
  ), 
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),          # democracy score
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27),          # GDP per capita ($1,000s)
  region = c("Asia", "Asia", "Africa",
             "Europe", "Africa", "Asia",
             "Americas", "Europe", "Asia"),        # categorical
  population = c(5, 50, 30, 12, 80, 60, 40, 100, 70), # continuous (millions)
  income_group = factor(
    c("Low", "Low", "Low",
      "Middle", "Middle", "Middle",
      "High", "High", "High"),
    levels = c("Low", "Middle", "High")),
  corruption = c(80, 65, 50, 40, 35, 30, 25, 20, 15))

geom_point

color(discrete)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

Let’s imagine that we have a new dataset with more variables:

country	democracy	gdp	region	population	income_group	corruption
North Korea	-8	2	Asia	5	Low	80
Saudi Arabia	-7	9	Asia	50	Low	65
Zimbabwe	-5	4	Africa	30	Low	50
Russia	-3	7	Europe	12	Middle	40
Nigeria	0	8	Africa	80	Middle	35

geom_point

color(discrete)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

This is how we can emphasize the region (discrete):

# 1) color (discrete): distinguish categories with distinct hues
ggplot(eg1, aes(x = democracy, y = gdp, color = region)) +
  geom_point()

geom_point

color(continuous)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

This is how we can emphasize the corruption levels (continuous):

# 2) color (continuous): encode a numeric gradient (low → high)
ggplot(eg1, aes(x = democracy, y = gdp, color = corruption)) +
  geom_point()

geom_point

size(discrete)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

This is how we can emphasize the income groups (discrete):

# 3) size (discrete): High income = large points
ggplot(eg1, aes(x = democracy, y = gdp, size = income_group)) +
  geom_point()
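
ggplot2 warns that using `size` for a discrete variable is not advised, since point size suggests an ordering or magnitude. When the categories are ordered, as income groups are, one option is to choose the sizes explicitly with `scale_size_manual()`. A minimal sketch, assuming sizes 2, 4, and 6 (arbitrary values) and a reduced copy of `eg1` called `eg1_small` (a name made up here):

```r
library(ggplot2)

# Reduced copy of eg1 (only the columns needed), so the sketch runs on its own
eg1_small <- data.frame(
  democracy = c(-8, -7, -5, -3, 0, 2, 5, 8, 9),
  gdp = c(2, 9, 4, 7, 8, 20, 15, 25, 27),
  income_group = factor(
    rep(c("Low", "Middle", "High"), each = 3),
    levels = c("Low", "Middle", "High")
  )
)

# Explicit sizes per level; the values 2, 4, 6 are an assumption for illustration
p_sizes <- ggplot(eg1_small, aes(x = democracy, y = gdp, size = income_group)) +
  geom_point() +
  scale_size_manual(values = c(Low = 2, Middle = 4, High = 6))

# Sizes that ggplot2 assigns to the nine points
point_sizes <- layer_data(p_sizes)$size
```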

geom_point

size(continuous)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

This is how we can emphasize population size (continuous):

# 4) size (continuous): encode magnitude (e.g., population)
ggplot(eg1, aes(x = democracy, y = gdp, size = population)) +
  geom_point()

geom_point

fill(discrete)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

This is how we can emphasize income groups (discrete).

# 5) fill (discrete): categories on filled shapes (shape 21–25, bars, areas)
ggplot(eg1, aes(x = democracy, y = gdp, fill = income_group)) +
  geom_point(shape = 21, color = "black")

geom_point

fill(continuous)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

This is how we can emphasize the level of corruption (continuous).

# 6) fill (continuous): numeric gradient on filled shapes
ggplot(eg1, aes(x = democracy, y = gdp, fill = corruption)) +
  geom_point(shape = 21, color = "black")

geom_point

shape(discrete)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 7) shape (discrete): differentiate categories; limited distinct shapes
ggplot(eg1, aes(x = democracy, y = gdp, shape = region)) +
  geom_point()

geom_point

alpha(discrete)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 8) alpha (discrete): subtle group differences; use sparingly in legends
ggplot(eg1, aes(x = democracy, y = gdp, alpha = region)) +
  geom_point()

geom_point

alpha(continuous)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 9) alpha (continuous): opacity encodes intensity (higher → more opaque)
ggplot(eg1, aes(x = democracy, y = gdp, alpha = corruption)) +
  geom_point()

geom_point

multiple aesthetics (readable)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 10) Combine judiciously: color = region (discrete), size = population (continuous)
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(aes(color = region, size = population))

geom_point

too many aesthetics (overloaded)

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

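# Overloaded example (hard to read): too many mapped aesthetics at once
# Prefer ≤2 mappings; prioritize the question your plot answers
# Note: fill has no visible effect on the points here, because the default
# shapes chosen for `shape = region` are not fillable (only shapes 21–25 are)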
ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point(
    aes(color = region,
        fill  = income_group,
        shape = region,
        size  = population,
        alpha = corruption)
  )

Mapping vs. Setting

Key Idea

Mapping: aesthetic varies with data → put inside aes()
Setting: aesthetic is fixed → put outside aes()

This is how we create the data:

eg2 <- data.frame(
  x = 1:5,
  y = c(2,4,6,8,10),
  group = c("A","B","A","B","A")
)

Mapping vs. Setting

Key Idea

Mapping: aesthetic varies with data → put inside aes()
Setting: aesthetic is fixed → put outside aes()

This is what it looks like:

x	y	group
1	2	A
2	4	B
3	6	A
4	8	B
5	10	A

Mapping vs. Setting

Key Idea

Mapping: aesthetic varies with data → put inside aes()
Setting: aesthetic is fixed → put outside aes()

This is the difference:

# Mapping: color is inside aes(), so its value comes from the data
ggplot(eg2, aes(x = x, y = y, color = group)) +
  geom_point() +
  labs(title = "Mapping: color depends on `group`")

# Setting: color is outside aes(), so it is one constant value
ggplot(eg2, aes(x = x, y = y)) +
  geom_point(color = "red") +
  labs(title = "Setting: all points red")

Mapping vs. Setting

Common Mistake

# Wrong: constant inside aes()
ggplot(eg2, aes(x = x, y = y, color = "red")) +
  geom_point() +
  labs(title = "Mistake: constant mapped")

# Correct: constant outside aes()
ggplot(eg2, aes(x = x, y = y)) +
  geom_point(color = "red") +
  labs(title = "Correct: constant set")

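Why it’s wrong:

Wrong version: "red" is treated as a one-level category, so ggplot2 colors the points with the first color of its default palette (usually a salmon pink, not the red you asked for) and adds a useless legend.
Correct version: "red" is just a style, so the points are actually red and there is no legend clutter.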
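
One way to check this is to look at the colors ggplot2 actually draws with. The sketch below rebuilds `eg2` so that it runs on its own; the object names `p_mapped` and `p_set` are made up for illustration.

```r
library(ggplot2)

# Same toy data as eg2
eg2 <- data.frame(
  x = 1:5,
  y = c(2, 4, 6, 8, 10),
  group = c("A", "B", "A", "B", "A")
)

# Constant inside aes(): "red" becomes a one-level category
p_mapped <- ggplot(eg2, aes(x = x, y = y, color = "red")) +
  geom_point()

# Constant outside aes(): "red" is used as the literal color
p_set <- ggplot(eg2, aes(x = x, y = y)) +
  geom_point(color = "red")

# layer_data() returns the values the layer is drawn with
mapped_colours <- unique(layer_data(p_mapped)$colour)
set_colours <- unique(layer_data(p_set)$colour)
```

`set_colours` holds the literal "red", while `mapped_colours` holds whatever the default discrete palette assigns to the single category.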

Column Aesthetics

geom_col

Aesthetics

Aesthetic	Use for
`color` (discrete)	Differentiate categories with distinct outline hues (rare for bars)
`color` (continuous)	Show gradual change or intensity in outline scale (rare for bars)
`fill` (discrete)	Fill bars by category with distinct hues (common)
`fill` + `color` (outline)	Combine interior fill with a contrasting border for clarity
`fill` (continuous)	Show gradual change or intensity across a numeric fill scale (rare for bars)
`alpha` (discrete + continuous)	Control transparency to show emphasis or reduce clutter (rare for bars)

geom_col

Simple Geom

Suppose we collected a small survey and just recorded the education level of each respondent. We want to visualize how many respondents fall into each category.

# Raw survey rows: one row per respondent (not yet aggregated)
eg3 <- data.frame(
  education = c(
    "Primary", "Primary", "High School", "High School", "High School",
    "College", "College", "College", "College"
  )
)

education
Primary
Primary
High School
High School

geom_col

Simple Geom

Suppose we collected a small survey and just recorded the education level of each respondent. We want to visualize how many respondents fall into each category.

library(dplyr)
# Count occurrences of each education level
eg3b <- eg3 %>%
  count(education)

education	n
College	4
High School	3
Primary	2

geom_col

Simple Geom

Suppose we collected a small survey and just recorded the education level of each respondent. We want to visualize how many respondents fall into each category.

ggplot(eg3b, aes(x = education, y = n)) +
  geom_col()
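
The bars in this plot appear in alphabetical order because `education` is stored as text. A possible sketch, assuming we want the order Primary, High School, College, converts the column to a factor with explicit levels. The counts are re-created by hand so the example runs on its own.

```r
library(ggplot2)

# Counts from the survey example, re-created so the sketch runs alone
eg3b <- data.frame(
  education = c("College", "High School", "Primary"),
  n = c(4, 3, 2)
)

# Assumed ordering: lowest to highest level of schooling
eg3b$education <- factor(
  eg3b$education,
  levels = c("Primary", "High School", "College")
)

# The x-axis now follows the factor levels instead of the alphabet
p_ordered <- ggplot(eg3b, aes(x = education, y = n)) +
  geom_col()
```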

geom_col

fill(discrete)

Suppose we collected a small survey and just recorded the education level of each respondent. We want to visualize how many respondents fall into each category.

# fill = education (discrete)
ggplot(eg3b, aes(x = education, y = n, fill = education)) +
  geom_col()

geom_col

fill(discrete)

Suppose we collected a small survey and just recorded the education level of each respondent. We want to visualize how many respondents fall into each category.

# fill = education, color = black border
ggplot(eg3b, aes(x = education, y = n, fill = education)) +
  geom_col(color = "black", linewidth = 1)

Line Aesthetics

geom_line

Aesthetics

Aesthetic	Use for
`color` (discrete)	Distinguish groups with different line colors
`color` (continuous)	Show a gradient along x or y values; possible but rare for lines
`size`	Deprecated for lines since ggplot2 3.4.0; use `linewidth`
`linewidth` (discrete + continuous)	Vary line thickness to emphasize magnitude or importance (occasionally useful)
`linetype` (discrete)	Differentiate groups with solid/dashed/dotted styles
`alpha` (discrete + continuous)	Control transparency to reduce clutter when many lines overlap (rare for lines)

geom_line

Simple Geom

Suppose we have data on average voter turnout (%) in national elections over several years. We want to see the trend in participation.

# Toy dataset
eg4 <- data.frame(
  year = c(2000, 2004, 2008, 2012, 2016, 2020),
  turnout = c(55, 58, 62, 60, 59, 65)
)

year	turnout
2000	55
2004	58
2008	62
2012	60
2016	59
2020	65

geom_line

Simple Geom

Suppose we have data on average voter turnout (%) in national elections over several years. We want to see the trend in participation.

ggplot(eg4, aes(x = year, y = turnout)) +
  geom_line()

geom_line

color(discrete)

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

# Toy dataset with US and UK
eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(
    # US presidential elections
    54, 60, 62, 58, 56, 65,
    # UK general elections (closest years aligned to US election years for teaching)
    59, 61, 65, 66, 68, 67
  ),
  country = rep(c("United States", "United Kingdom"), each = 6)
)

geom_line

color(discrete)

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

year	turnout	country
2000	54	United States
2004	60	United States
2008	62	United States
2012	58	United States
2016	56	United States
2020	65	United States
2000	59	United Kingdom
2004	61	United Kingdom

geom_line

color(discrete)

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

ggplot(eg5, aes(x = year, y = turnout, color = country)) +
  geom_line()

geom_line

linewidth(discrete)

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

ggplot(eg5, aes(x = year, y = turnout, linewidth = country)) +
  geom_line()

geom_line

linetype(discrete)

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

ggplot(eg5, aes(x = year, y = turnout, linetype = country)) +
  geom_line()
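
By default ggplot2 decides which country gets which line style. To choose ourselves, one option is `scale_linetype_manual()`. The sketch below rebuilds `eg5` so that it runs on its own; giving the United States the solid line and the United Kingdom the dashed one is an arbitrary choice for illustration.

```r
library(ggplot2)

# Same toy data as eg5
eg5 <- data.frame(
  year = rep(c(2000, 2004, 2008, 2012, 2016, 2020), times = 2),
  turnout = c(
    54, 60, 62, 58, 56, 65,  # United States
    59, 61, 65, 66, 68, 67   # United Kingdom
  ),
  country = rep(c("United States", "United Kingdom"), each = 6)
)

# Named vector: each country is tied to a specific line style
p_linetype <- ggplot(eg5, aes(x = year, y = turnout, linetype = country)) +
  geom_line() +
  scale_linetype_manual(
    values = c("United States" = "solid", "United Kingdom" = "dashed")
  )

# Line styles that end up in the drawn layer
used_linetypes <- unique(layer_data(p_linetype)$linetype)
```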

geom_line

Line Types and Color

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

ggplot(eg5, aes(x = year, y = turnout, 
      linetype = country, color = country)) +
  geom_line()

geom_line

Line Types, Color, and Points

Suppose we have data on average voter turnout (%) in national elections over several years for the US and the UK. We want to see the trend in participation.

ggplot(eg5, aes(x = year, y = turnout, 
  linetype = country, color = country)) +
  geom_line() +
  geom_point()

Complex Geom Aesthetics

Geom	Useful aesthetics	Avoid
Boxplot	fill, color	size, shape
Histogram	fill, alpha	shape
Density	color, linetype	size
Violin	fill, color	shape
Smooth	color, linetype	size
SF (maps)	fill, color	shape

Complex Geoms

`geom_boxplot`

Suppose we surveyed people about their trust in government on a 1–10 scale (1 = no trust, 10 = complete trust). We want to compare typical values and how spread out the answers are for men and women.

set.seed(123)

# Toy survey dataset
eg6 <- data.frame(
  gender = rep(c("Men", "Women"), each = 20),
  trust = c(
    rnorm(20, mean = 5, sd = 2),
    rnorm(20, mean = 8, sd = 1)
  )
)

Complex Geoms

`geom_boxplot`

gender	trust
Men	3.879049
Men	4.539645
Men	8.117417
Men	5.141017
Men	5.258576
Men	8.430130
Men	5.921832
Men	2.469877

Complex Geoms

`geom_boxplot`

ggplot(eg6, aes(x = gender, y = trust)) +
  geom_boxplot()

Complex Geoms

`geom_boxplot`: fill(color)

ggplot(eg6, aes(x = gender, y = trust, fill = gender)) +
  geom_boxplot()

Complex Geoms

`geom_histogram`: fill + alpha

ggplot(eg6, aes(x = trust, fill = gender)) +
  geom_histogram(bins = 10, color = "white", alpha = 0.5, position = "identity")
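
In this histogram, `position = "identity"` is what makes the two groups overlap instead of stacking, and `alpha` keeps both visible. A small sketch comparing the two behaviors, rebuilding `eg6` so that it runs on its own (the names `p_overlay` and `p_stacked` are illustrative):

```r
library(ggplot2)

set.seed(123)

# Same toy survey data as eg6
eg6 <- data.frame(
  gender = rep(c("Men", "Women"), each = 20),
  trust = c(
    rnorm(20, mean = 5, sd = 2),
    rnorm(20, mean = 8, sd = 1)
  )
)

# Overlaid: every bar starts at zero, so the groups sit on top of each other
p_overlay <- ggplot(eg6, aes(x = trust, fill = gender)) +
  geom_histogram(bins = 10, color = "white", alpha = 0.5, position = "identity")

# Default ("stack"): one group's bars are placed on top of the other's
p_stacked <- ggplot(eg6, aes(x = trust, fill = gender)) +
  geom_histogram(bins = 10, color = "white")

# Bar coordinates computed for each version
overlay_bars <- layer_data(p_overlay)
stacked_bars <- layer_data(p_stacked)
```

In `overlay_bars` every `ymin` is zero, which is why transparency is needed to see both distributions.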

Complex Geoms

`geom_sf` fill(color)

library(ggplot2)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)

world <- ne_countries(scale = "medium", 
                      returnclass = "sf")
europe_bounds <- list(x = c(-10, 40),
                      y = c(35, 70))

# Mapping it
ggplot() +
  geom_sf(data = world, aes(fill=log(pop_est))) +
  coord_sf(xlim = europe_bounds$x, 
           ylim = europe_bounds$y)

Color and Accessibility

Colors

At the most fundamental level, a visualization needs well-chosen colors.

This matters for:

Clarity and Readability: users can distinguish among different categories

Accessibility: readers with color vision deficiency can also tell the categories in your visualization apart

Emphasis: the right colors can be used to emphasize specific aspects of the data or analysis

Consistency: using a consistent color palette for the same project is helpful

Color Statistics

About 8% of men and 0.5% of women have some form of color blindness.

Colors should therefore be distinguishable by people with different forms of color blindness.

Color Contrasts

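The viridis palettes, available in ggplot2 through the `scale_*_viridis_d()` (discrete) and `scale_*_viridis_c()` (continuous) functions, let us create color-blind-friendly graphs.

They are predefined, widely used palettes with options such as "viridis", "magma", "inferno", and "cividis".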
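
One reason viridis works well is that its colors get steadily lighter from one end to the other, so categories stay distinguishable even when hue is hard to perceive. A rough sketch of that idea, assuming four categories (as in `region`) and a simple brightness approximation that is not part of ggplot2:

```r
library(ggplot2)  # scales is installed alongside ggplot2

# Four colors from the viridis palette, as a discrete scale would pick them
region_colours <- scales::viridis_pal(option = "viridis")(4)

# Convert hex codes to red/green/blue values (0 to 255)
rgb_values <- grDevices::col2rgb(region_colours)

# Rough perceived brightness using standard luma weights (an approximation)
brightness <- as.vector(c(0.299, 0.587, 0.114) %*% rgb_values)
```

If `brightness` rises from the first color to the last, the palette would still read as an ordered sequence in grayscale.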

geom_point

color(discrete) - No Viridis

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 1) color (discrete): differentiate categories with distinct hues
ggplot(eg1, aes(x = democracy, y = gdp, color = region)) +
  geom_point()

geom_point

color(discrete) - Viridis

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 1) color (discrete): differentiate categories with distinct hues
ggplot(eg1, aes(x = democracy, y = gdp, color = region)) +
  geom_point()+
  scale_color_viridis_d(option = "viridis")  # you can also try "magma", "inferno", "cividis", etc.

geom_point

color(continuous) - No Viridis

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 2) color (continuous): encode a numeric gradient (low → high)
ggplot(eg1, aes(x = democracy, y = gdp, color = corruption)) +
  geom_point()

geom_point

color(continuous) - Viridis

Imagine we have data about 9 countries that record their level of democracy from -10 to +10 (x-axis) and their GDP per capita in $1,000s (y-axis).

# 2) color (continuous): encode a numeric gradient (low → high)
ggplot(eg1, aes(x = democracy, y = gdp, color = corruption)) +
  geom_point()+
  scale_color_viridis_c(option = "viridis")  # you can also try "magma", "inferno", "cividis", etc.

Complex Geoms

`geom_sf` fill(color) - No Viridis

library(ggplot2)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)

world <- ne_countries(scale = "medium", 
                      returnclass = "sf")
europe_bounds <- list(x = c(-10, 40),
                      y = c(35, 70))

# Mapping it
ggplot() +
  geom_sf(data = world, aes(fill=log(pop_est))) +
  coord_sf(xlim = europe_bounds$x, 
           ylim = europe_bounds$y) +
  labs(title = "European Countries")

Complex Geoms

`geom_sf` fill(color) - Viridis

library(ggplot2)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)

world <- ne_countries(scale = "medium", 
                      returnclass = "sf")
europe_bounds <- list(x = c(-10, 40),
                      y = c(35, 70))

# Mapping it
ggplot() +
  geom_sf(data = world, aes(fill=log(pop_est))) +
  coord_sf(xlim = europe_bounds$x, 
           ylim = europe_bounds$y) +
  labs(title = "European Countries")+
  scale_fill_viridis_c(option = "viridis")  # you can also try "magma", "inferno", "cividis", etc.

Conclusion

Geoms provide the structure; aesthetics add meaning
Use aesthetics (color, size, shape, alpha, linetype) to highlight patterns
Remember: mapping = variable-driven | setting = fixed value
Aesthetics can clarify or confuse — use them thoughtfully
Goal: craft plots that are clear, engaging, and persuasive

Exercises

geom_point

Simple Geom

What do you see?

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point()

geom_point

color(continuous)

How does the story change?

# 2) color (continuous): encode a numeric gradient (low → high)
ggplot(eg1, aes(x = democracy, y = gdp, color=corruption)) +
  geom_point() +
  scale_color_viridis_c(option = "viridis")  # you can also try "magma", "inferno", "cividis", etc.

geom_point

size(continuous)

What does adding size tell us?

# color (continuous): encode a numeric gradient (low → high)
# size (continuous): encode magnitude with point size
ggplot(eg1, aes(x = democracy, y = gdp, color=corruption, size=population)) +
  geom_point() +
  scale_color_viridis_c(option = "viridis")  # you can also try "magma", "inferno", "cividis", etc.

Exercise Answers

geom_point

Simple Geom

What do you see?

ggplot(eg1, aes(x = democracy, y = gdp)) +
  geom_point()

Democracy and GDP per capita rise together: the more democratic countries in this data tend to be richer.

geom_point

color(continuous)

How does the story change?

# 2) color (continuous): encode a numeric gradient (low → high)
ggplot(eg1, aes(x = democracy, y = gdp, color=corruption)) +
  geom_point() +
  scale_color_viridis_c(option = "viridis")  # you can also try "magma", "inferno", "cividis", etc.

Corruption and institutions matter: the most corrupt countries are also the least democratic and the poorest.

geom_point

size(continuous)

What does adding size tell us?

# color (continuous): encode a numeric gradient (low → high)
# size (continuous): encode magnitude with point size
ggplot(eg1, aes(x = democracy, y = gdp, color=corruption, size=population)) +
  geom_point() +
  scale_color_viridis_c(option = "viridis")  # you can also try "magma", "inferno", "cividis", etc.

Population adds weight to the story: in this toy data, the more populous countries also tend to be richer and more democratic.