Regression Discontinuity Design

Causality at the Cutoff

Bogdan G. Popescu

John Cabot University

Intro

How can we identify causal effects using observational data?

To conduct causal analysis with observational data, we can use methods such as

  • differences in differences
  • regression discontinuity design

RDD

RDDs exploit arbitrary rules and thresholds that determine access to policy programs

  • subjects are in the program if they have scores above a specific threshold
  • subjects are not in the program if they have scores below a specific threshold

Or the other way around

Running or forcing variable - index that measures eligibility for the program (e.g. scores)
Cutoff / cutpoint / threshold - the number that determines access to the program
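The assignment rule above can be sketched in a few lines of R. This is only an illustration: `assign_treatment` is a hypothetical helper, not something from the slides, and treating units that score exactly at the cutoff is an assumption made here for simplicity.

```r
# Hypothetical helper (not from the slides): sharp assignment based on
# a running variable and a cutoff.
assign_treatment <- function(score, cutoff, treat_above = TRUE) {
  # Assumption: units scoring exactly at the cutoff are treated in both cases.
  if (treat_above) score >= cutoff else score <= cutoff
}

scores <- c(60, 69.9, 70, 70.1, 85)

# Program for those at or below the cutoff
assign_treatment(scores, cutoff = 70, treat_above = FALSE)

# "Or the other way around": program for those at or above the cutoff
assign_treatment(scores, cutoff = 70, treat_above = TRUE)
```

The `treat_above` switch only flips the direction of the rule; the running variable and the cutoff play the same role either way.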

RDD

Student ID   Exam Score   Scholarship Awarded?   First-Year GPA
101          83           No                     3.2
102          84           No                     3.3
103          85           Yes                    3.6
104          86           Yes                    3.5
105          87           Yes                    3.7
106          84.5         No                     3.4
107          85.1         Yes                    3.6
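As a small check on the table, the sketch below enters the seven rows in R and tests whether a single score threshold reproduces the award column. The cutoff of 85 is an assumption inferred from the rows (84.5 is "No", 85 is "Yes"); the slides do not state it. The column names are also made up for this example.

```r
# Rows copied from the table; column names are illustrative.
scholarship <- data.frame(
  student_id     = 101:107,
  exam_score     = c(83, 84, 85, 86, 87, 84.5, 85.1),
  awarded        = c("No", "No", "Yes", "Yes", "Yes", "No", "Yes"),
  first_year_gpa = c(3.2, 3.3, 3.6, 3.5, 3.7, 3.4, 3.6)
)

# Assumption: the cutoff is 85, inferred from the table rather than stated.
cutoff <- 85
implied_award <- ifelse(scholarship$exam_score >= cutoff, "Yes", "No")

# Does the threshold rule reproduce the award column exactly?
all(implied_award == scholarship$awarded)

# Mean first-year GPA on each side of the cutoff
gpa_by_award <- tapply(scholarship$first_year_gpa, scholarship$awarded, mean)
gpa_by_award
```

With only seven students this comparison is purely descriptive; it shows the structure of the data, not an estimate of the scholarship's effect.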

RDD

Hypothetical tutoring program

Students take an entrance exam

Those who score 70 or lower get a free tutor for the year

Students then take an exit exam at the end of the year

RDD

Hypothetical tutoring program

library(tidyverse)  # tibble(), mutate(), ggplot()

# Fake program data
set.seed(1234)
num_students <- 1000
tutoring <- tibble(
  id = 1:num_students,
  entrance_exam = rbeta(num_students, shape1 = 7, shape2 = 2),
  exit_exam = rbeta(num_students, shape1 = 5, shape2 = 3)
) |>
  # Rescale entrance scores to 0-100; sharp rule: 70 or lower gets a tutor
  mutate(entrance_exam = entrance_exam * 100,
         tutoring = entrance_exam <= 70) |>
  # Exit score = noise + a built-in tutoring effect of 10 points + half the entrance score
  mutate(exit_exam = exit_exam * 40 + 10 * tutoring + entrance_exam / 2) |>
  # Fuzzy version: between 60 and 80, tutoring is assigned at random
  mutate(tutoring_fuzzy = ifelse(entrance_exam > 60 & entrance_exam < 80,
                                 sample(c(TRUE, FALSE), n(), replace = TRUE),
                                 tutoring)) |>
  # Readable labels for plotting
  mutate(tutoring_text = factor(tutoring, levels = c(FALSE, TRUE),
                                labels = c("No tutor", "Tutor")),
         tutoring_fuzzy_text = factor(tutoring_fuzzy, levels = c(FALSE, TRUE),
                                      labels = c("No tutor", "Tutor"))) |>
  # Center the running variable at the cutoff
  mutate(entrance_centered = entrance_exam - 70)
# Plot tutoring status against the entrance exam score, with the cutoff at 70
ggplot(tutoring, aes(x = entrance_exam, y = tutoring_text, fill = tutoring_text)) +
  geom_vline(xintercept = 70, linewidth = 2, color = "#FFDC00") + 
  geom_point(size = 3, pch = 21, color = "white", alpha = 0.4,
             position = position_jitter(width = 0, height = 0.15, seed = 1234)) + 
  labs(x = "Entrance exam score", y = NULL) + 
  guides(fill = "none") +
  scale_fill_manual(values = c("black", "red"), name = NULL) +
  theme_bw(base_size = 28)

Intuition

Hypothetical tutoring program

Students just below and just above the threshold are very similar

The idea is that we can measure the effect of the program by examining students just around the threshold

# Same plot, shading the windows 5 and 2 points on either side of the cutoff
ggplot(tutoring, aes(x = entrance_exam, y = tutoring_text, fill = tutoring_text)) +
  geom_vline(xintercept = 70, linewidth = 2, color = "#FFDC00") +
  annotate(geom = "rect", fill = "grey50", alpha = 0.25, ymin = -Inf, ymax = Inf,
           xmin = 70 - 5, xmax = 70 + 5) +
  annotate(geom = "rect", fill = "grey50", alpha = 0.5, ymin = -Inf, ymax = Inf,
           xmin = 70 - 2, xmax = 70 + 2) +
  geom_point(size = 3, pch = 21, color = "white", alpha = 0.4,
             position = position_jitter(width = 0, height = 0.15, seed = 1234)) + 
  labs(x = "Entrance exam score", y = NULL) + 
  guides(fill = "none") +
  scale_fill_manual(values = c("black", "red"), name = NULL) +
  theme_bw(base_size = 28)
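One rough way to act on this intuition is to compare exit exam scores only for students inside the shaded windows. The sketch below re-creates the sharp version of the simulated data in base R (assuming the same seed and distributions as the simulation code in these slides) and summarizes each window. `window_summary` is a hypothetical helper introduced for this example.

```r
# Re-create the sharp version of the simulated data in base R
set.seed(1234)
n <- 1000
entrance_exam <- rbeta(n, shape1 = 7, shape2 = 2) * 100
tutoring <- entrance_exam <= 70
exit_exam <- rbeta(n, shape1 = 5, shape2 = 3) * 40 + 10 * tutoring + entrance_exam / 2

# Hypothetical helper: group sizes and the raw difference in mean exit
# scores among students within `bandwidth` points of the cutoff
window_summary <- function(bandwidth, cutoff = 70) {
  in_window <- abs(entrance_exam - cutoff) <= bandwidth
  data.frame(
    bandwidth     = bandwidth,
    n_tutor       = sum(in_window & tutoring),
    n_no_tutor    = sum(in_window & !tutoring),
    diff_in_means = mean(exit_exam[in_window & tutoring]) -
      mean(exit_exam[in_window & !tutoring])
  )
}

windows <- rbind(window_summary(5), window_summary(2))
windows
```

A raw difference in means still mixes the program effect with the underlying relationship between entrance and exit scores inside the window. Narrowing the window should reduce that contamination, at the cost of fewer students to compare.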

Intuition

Hypothetical tutoring program

Students just below and just above the threshold are very similar

The idea is that we can measure the effect of the program by examining students just around the threshold.

# Zoom in on the window 5 points on either side of the cutoff
ggplot(tutoring, aes(x = entrance_exam, y = tutoring_text, fill = tutoring_text)) +
  geom_vline(xintercept = 70, linewidth = 2, color = "#FFDC00") +
  annotate(geom = "rect", fill = "grey50", alpha = 0.25, ymin = -Inf, ymax = Inf,
           xmin = 70 - 5, xmax = 70 + 5) +
  annotate(geom = "rect", fill = "grey50", alpha = 0.5, ymin = -Inf, ymax = Inf,
           xmin = 70 - 2, xmax = 70 + 2) +
  geom_point(size = 3, pch = 21, color = "white", alpha = 0.4,
             position = position_jitter(width = 0, height = 0.15, seed = 1234)) +
  labs(x = "Entrance exam score", y = NULL) +
  guides(fill = "none") +
  scale_fill_manual(values = c("black", "red"), name = NULL) +
  theme_bw(base_size = 28) +
  coord_cartesian(xlim = c(70 - 5, 70 + 5))

Intuition

Hypothetical tutoring program

# Exit exam score against entrance exam score, colored by tutoring status
ggplot(tutoring, aes(x = entrance_exam, y = exit_exam, fill = tutoring_text)) +
  geom_point(size = 3, pch = 21, color = "white", alpha = 0.4) +
  geom_vline(xintercept = 70, linewidth = 2, color = "#FFDC00") +
  labs(x = "Entrance exam score", y = "Exit exam score") +
  guides(fill = guide_legend(reverse = TRUE)) +
  scale_fill_manual(values = c("black", "red"), name = NULL) +
  theme_bw(base_size = 28) +
  # coord_cartesian() zooms without dropping data; coord_sf() is meant for maps
  coord_cartesian(xlim = c(30, 100),
                  ylim = c(40, 90))
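The jump at the cutoff in this plot can be put into a number with a simple regression on the centered running variable. The sketch below is one possible way to do it, not necessarily the approach the slides go on to use. It re-creates the sharp version of the simulated data in base R, and the bandwidth of 10 points is an arbitrary choice for illustration. By construction, the simulated data include a tutoring effect of 10 points.

```r
# Re-create the sharp version of the simulated data in base R
set.seed(1234)
n <- 1000
entrance_exam <- rbeta(n, shape1 = 7, shape2 = 2) * 100
tutoring <- entrance_exam <= 70
exit_exam <- rbeta(n, shape1 = 5, shape2 = 3) * 40 + 10 * tutoring + entrance_exam / 2

sim <- data.frame(
  entrance_centered = entrance_exam - 70,  # running variable centered at the cutoff
  tutoring = tutoring,
  exit_exam = exit_exam
)

# Assumption: a bandwidth of 10 points, chosen only for illustration
bandwidth <- 10

# Linear fit near the cutoff; the tutoring coefficient is the estimated jump
fit <- lm(exit_exam ~ entrance_centered + tutoring,
          data = sim,
          subset = abs(entrance_centered) <= bandwidth)

rdd_estimate <- unname(coef(fit)["tutoringTRUE"])
rdd_estimate
```

This model uses a common slope on both sides of the cutoff, which matches how the data were simulated. With real data one would often allow separate slopes, for example with `entrance_centered * tutoring`, and check how sensitive the estimate is to the bandwidth.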