Assignment 4

Author

Your Name

Published

November 20, 2024

Instructions

Please submit your assignment as an HTML file to me via email. Please name your file “yourname_assignment4.html”. For example, I would submit a file called “popescu_assignment4.html”. Please make sure that the name is all in lower case.

To match the appearance of this template, ensure that your assignment preamble (YAML) is set up as follows:

---
title: "Assignment 4"
author: "Your Name"
date: "November 20, 2024"
format:
  html:
    toc: true
    number-sections: true
    colorlinks: true
    smooth-scroll: true
    embed-resources: true
---

Important: Headers like “Question” should be marked with a # (for example, # Question). This will format the text as a bold, larger header when the document is rendered, and it will also appear in the Table of Contents. For regular text, simply write it on the line below the header.

Please answer all the questions below.

1 Introduction

In this hypothetical situation (the data are made up), Carrefour Rome is planning to launch a coupon program designed to increase revenue by providing discount coupons for purchases over 40 euros. They have funding to run a randomized controlled trial (RCT) in some stores. They selected about 70% of their stores in Rome, and they want to investigate whether offering coupons has any effect on revenue.

Thus, we want to calculate \(E(\text{Post-Treatment revenue | Coupon})\)

As a first step, we encode the assumed relationships among the variables in a directed acyclic graph (DAG):

library(ggdag)    # dagify(), tidy_dagitty(), geom_dag_point(), geom_dag_edges()
library(ggplot2)  # ggplot(), coord_sf(), scale_colour_manual()
library(ggrepel)  # geom_label_repel()

# Encode the assumed causal structure: coupons and store characteristics drive revenue
revenue_dag <- dagify(post_treatment_revenue ~ treat + traffic + store_size + variety_products,
                      variety_products ~ traffic + store_size,
                      exposure = "treat",
                      outcome = "post_treatment_revenue",
                      labels = c(post_treatment_revenue = "Post Treatment revenue",
                                 treat = "Coupons",
                                 traffic = "Traffic",
                                 store_size = "Store Size",
                                 variety_products = "Variety of Product"),
                      # Manual node positions for the plot
                      coords = list(x = c(treat = 4,
                                          post_treatment_revenue = 7,
                                          store_size = 5, variety_products = 2,
                                          traffic = 4),
                                    y = c(treat = 3,
                                          post_treatment_revenue = 2,
                                          store_size = 5, variety_products = 2,
                                          traffic = 4)))

# Convert the DAG to a data frame and tag each node with its role
bigger_dag <- data.frame(tidy_dagitty(revenue_dag))
bigger_dag$type <- "Confounder"
bigger_dag$type[bigger_dag$name == "post_treatment_revenue"] <- "Outcome"
bigger_dag$type[bigger_dag$name == "treat"] <- "Intervention"

# Extent of the node coordinates, used to pad the plot limits
min_lon_x <- min(bigger_dag$x)
max_lon_x <- max(bigger_dag$x)
min_lat_y <- min(bigger_dag$y)
max_lat_y <- max(bigger_dag$y)

# Node colours and legend order
col <- c("Outcome" = "green3",
         "Intervention" = "red",
         "Confounder" = "grey60")

order_col <- c("Outcome", "Intervention", "Confounder")


ggplot(data = bigger_dag, aes(x = x, y = y, xend = xend, yend = yend, color = type)) +
  geom_dag_point() +
  geom_dag_edges() +
  coord_sf(xlim = c(min_lon_x - 1, max_lon_x + 1),
           ylim = c(min_lat_y - 1, max_lat_y + 1), expand = FALSE) +
  scale_colour_manual(values = col,
                      name = "Group",
                      breaks = order_col) +
  geom_label_repel(data = subset(bigger_dag, !duplicated(bigger_dag$label)),
                   aes(label = label), fill = alpha("white", 0.8)) +
  theme_bw() +
  labs(x = "", y = "")

2 Inspecting the Area

Let’s assume that the experiment has been conducted and the data have come in. Let us first take a look at the area where Carrefour ran the experiment.

Question 1

Produce the same map in leaflet.

Question 2

Download the data and map the stores and the roads. Use different colors for the three different types of stores in the data.

Question 3

Create a different static map that displays the stores that got treated vs. not.

Question 4

Create summary statistics for the data and comment on the skewness or normality of the variables.

Question 5

What is the percentage of stores that received the treatment?
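
As a hint, here is a minimal sketch of the calculation on simulated data. The data frame `stores` and the 0/1 indicator column `treat` are assumed names for illustration; the real dataset may use different ones. The idea is that the mean of a 0/1 indicator equals the share of treated units.

```r
# Hypothetical data: `stores` and `treat` are illustrative names, not the real dataset
set.seed(123)
stores <- data.frame(treat = rbinom(200, size = 1, prob = 0.5))

# The mean of a 0/1 indicator is the share of treated stores
pct_treated <- 100 * mean(stores$treat)
round(pct_treated, 1)

# The same information as a percentage table for both groups
round(100 * prop.table(table(stores$treat)), 1)
```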

Question 6

Create graphs that compare the pointrange for the treatment and control groups for the following variables:

  • variety of products
  • store size
  • traffic
  • pre-treatment income
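
One possible way to draw such a graph for a single variable is sketched below on simulated data. The data frame `stores` and the columns `treat` and `store_size` are assumptions for illustration. The sketch uses `stat_summary()` with `mean_se()` from ggplot2, scaled by 1.96 to approximate a 95% confidence interval around each group mean.

```r
library(ggplot2)

# Hypothetical data: names and values are made up for illustration
set.seed(123)
stores <- data.frame(
  treat      = factor(rbinom(200, 1, 0.5), levels = c(0, 1),
                      labels = c("Control", "Treatment")),
  store_size = rnorm(200, mean = 500, sd = 100)
)

# Group mean with an approximate 95% interval (mean +/- 1.96 standard errors)
p <- ggplot(stores, aes(x = treat, y = store_size, color = treat)) +
  stat_summary(fun.data = mean_se, fun.args = list(mult = 1.96),
               geom = "pointrange") +
  theme_bw() +
  labs(x = NULL, y = "Store size", color = "Group")
p
```

Overlapping intervals across the two groups would be consistent with successful randomization on that variable.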

Question 7

Create graphs that compare the distributions for the treatment and control groups for the following variables:

  • variety of products
  • store size
  • traffic
  • pre-treatment income

Question 8

Are the treatment and control groups the same on pre-treatment covariates? Perform appropriate tests and interpret them.
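
A balance check of this kind could be sketched as follows, assuming simulated data and hypothetical column names (`treat`, `traffic`, `store_size`). The sketch runs a Welch two-sample t-test for each covariate, which is one common choice and not the only valid test.

```r
# Hypothetical data: names and values are made up for illustration
set.seed(123)
n <- 200
stores <- data.frame(
  treat      = rbinom(n, 1, 0.5),
  traffic    = rnorm(n, mean = 1000, sd = 200),
  store_size = rnorm(n, mean = 500, sd = 100)
)

covariates <- c("traffic", "store_size")

# Welch t-test of each covariate against the treatment indicator
balance <- sapply(covariates, function(v) {
  tt <- t.test(reformulate("treat", response = v), data = stores)
  c(mean_control = unname(tt$estimate[1]),
    mean_treated = unname(tt$estimate[2]),
    p_value      = tt$p.value)
})

round(t(balance), 3)
```

Large p-values indicate no detectable difference in group means, which is what randomization should produce on average.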

Question 9

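Estimate the difference in expected post-treatment revenue between the two groups: \(E(\text{Post-Treatment revenue | Coupon} = 1) - E(\text{Post-Treatment revenue | Coupon} = 0)\). What is the ATE? Calculate the difference between the treatment and control groups using means.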
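
A difference in means can be sketched as below. Everything here is simulated: the column names `treat` and `post_revenue` and the built-in effect of 5 are assumptions for illustration, not properties of the assignment data.

```r
# Hypothetical data with a made-up true effect of 5
set.seed(123)
n <- 200
treat <- rbinom(n, 1, 0.5)
stores <- data.frame(
  treat        = treat,
  post_revenue = 100 + 5 * treat + rnorm(n, sd = 10)
)

# Mean outcome in each group, then treated minus control
group_means <- tapply(stores$post_revenue, stores$treat, mean)
ate_hat <- unname(group_means["1"] - group_means["0"])
ate_hat
```

Under random assignment, this simple difference is an unbiased estimate of the ATE.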

Question 10

Calculate the difference between treatment and control groups using a regression. Make sure that the variables are labelled appropriately.
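
The regression version can be sketched with `lm()`, again on simulated data with hypothetical column names (`treat`, `post_revenue`). With a single 0/1 regressor, the slope equals the difference in group means, and the intercept equals the control group mean. The row labels below are one illustrative way to label the output.

```r
# Hypothetical data with a made-up true effect of 5
set.seed(123)
n <- 200
treat <- rbinom(n, 1, 0.5)
stores <- data.frame(
  treat        = treat,
  post_revenue = 100 + 5 * treat + rnorm(n, sd = 10)
)

# Regress the outcome on the treatment indicator
fit <- lm(post_revenue ~ treat, data = stores)

# Coefficient table with readable row labels
coefs <- summary(fit)$coefficients
rownames(coefs) <- c("Control group mean (intercept)", "Coupon (treatment effect)")
round(coefs, 3)
```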

Question 11

Plot the difference in the distribution of income after the experiment between the treatment and control groups.

Question 12

Plot the difference in income after the experiment between the treatment and control groups using a pointrange.

Question 13

What do you tell the Carrefour manager? Do the coupons work? Should Carrefour expand the coupon system?