# Sample dataframe
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(20, 21, 22),
  grade = c(80, 75, 100)
)
Assignment 2
Instructions
Please submit your assignment as an HTML file to me via email. Please call your assignment: “yourname_assignment2.html”. For example, I would submit something called “popescu_assignment2.html”. Please make sure that the file name is all in lower case. Please answer all the questions below.
1 Question
Suppose you have a dataframe called students with three columns: name, age, and grade. Write a function called increase_grade that takes the grade column and a percentage increase as input, and returns the grades increased by the specified percentage for all students. Store the result in a new column called new_grade.
Assume that the percentage increase is 10. Your output should look like the one below:
# Test the function
percentage_increase <- 10 # 10% increase in grades
students$new_grade <- increase_grade(students$grade, percentage_increase)
students

     name age grade new_grade
1   Alice  20    80      88.0
2     Bob  21    75      82.5
3 Charlie  22   100     110.0
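One possible approach is sketched below. It follows the call shown in the test code, so it assumes the function receives the grade column as a numeric vector (not the whole dataframe) and returns the increased values. The function body is only an illustration, not the required solution.

```r
# Sketch: increase a numeric vector of grades by a given percentage.
# Assumption: `grade` is a numeric vector, `percentage_increase` is in percent (10 means 10%).
increase_grade <- function(grade, percentage_increase) {
  grade * (1 + percentage_increase / 100)
}

# Sample dataframe from the assignment
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(20, 21, 22),
  grade = c(80, 75, 100)
)

# Store the increased grades in a new column
students$new_grade <- increase_grade(students$grade, 10)
students
```

Because the arithmetic is vectorized, the same function works for any number of students without a loop.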
2 Question
Suppose you have two datasets:
# Sample data for employee_data
employee_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  department = c("HR", "Finance", "Marketing", "IT", "Operations")
)
# Sample data for performance_data
performance_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  rating = c(4.5, 3.2, 2.9, 4.8, 3.7),
  bonus = c(1000, 800, 500, 1200, 900)
)
Instructions:
Create a function employee_performance_analysis that takes the two datasets as inputs and performs the following tasks:
Step 1: Merge employee_data with performance_data based on the employee_id.
Step 2: Calculate a new column performance_grade based on the following criteria:
- If rating is greater than or equal to 4, assign “High”.
- If rating is at least 3 but less than 4, assign “Medium”.
- If rating is less than 3, assign “Low”.
Step 3: Return the merged dataset with the added performance_grade column.
This is what your output should look like:
# Test the function
employee_performance <- employee_performance_analysis(employee_data, performance_data)
employee_performance

  employee_id    name department rating bonus performance_grade
1         101   Alice         HR    4.5  1000              High
2         102     Bob    Finance    3.2   800            Medium
3         103 Charlie  Marketing    2.9   500               Low
4         104   David         IT    4.8  1200              High
5         105     Eve Operations    3.7   900            Medium
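The three steps can be sketched with base R as follows. This is one possible approach, not the required one. It assumes that a rating of exactly 3 counts as “Medium”, and it uses merge() and nested ifelse() calls, although other tools (for example dplyr) would work equally well.

```r
# Sketch: merge the two datasets and grade each employee's rating.
employee_performance_analysis <- function(employee_data, performance_data) {
  # Step 1: join on the shared key
  merged <- merge(employee_data, performance_data, by = "employee_id")
  # Step 2: assign a grade (assumption: a rating of exactly 3 is "Medium")
  merged$performance_grade <- ifelse(
    merged$rating >= 4, "High",
    ifelse(merged$rating >= 3, "Medium", "Low")
  )
  # Step 3: return the merged dataset
  merged
}

# Sample data from the assignment
employee_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  department = c("HR", "Finance", "Marketing", "IT", "Operations")
)
performance_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  rating = c(4.5, 3.2, 2.9, 4.8, 3.7),
  bonus = c(1000, 800, 500, 1200, 900)
)

employee_performance <- employee_performance_analysis(employee_data, performance_data)
employee_performance
```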
3 Question
Create a function in R that calculates the number of days until the next holiday based on the current date. Here’s the exercise:
Step 1: You have the following dataframe:
dates_df <- data.frame(
  date_interest = c("2024-02-07",
                    "2024-03-10",
                    "2024-04-01")
)
Step 2: Define a function called days_until_holiday that takes one argument - a character string representing the current date in the format “YYYY-MM-DD”.
Step 3: Inside the function, create the following vector of holidays
holidays <- c("New Year's Day" = "2024-01-01",
              "Easter" = "2024-04-21",
              "Christmas" = "2024-12-25")
Step 4: Calculate the difference in days between the date of interest and the next holiday and return the number of days until the next holiday.
Your output should look like:
dates_df$days_until_next_holiday <- days_until_holiday(dates_df$date_interest)
dates_df

  date_interest days_until_next_holiday
1    2024-02-07                      74
2    2024-03-10                      42
3    2024-04-01                      20
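A minimal sketch of the steps above is shown here. It rests on two assumptions that the exercise does not state: a date that falls exactly on a holiday returns 0, and a date after the last holiday in the vector returns NA. The function accepts a vector of dates so that it can be applied to a whole column, as in the call above.

```r
# Sketch: number of days from each date to the next holiday in the vector.
days_until_holiday <- function(current_date) {
  holidays <- c("New Year's Day" = "2024-01-01",
                "Easter" = "2024-04-21",
                "Christmas" = "2024-12-25")
  holiday_dates <- as.Date(unname(holidays))
  current_date <- as.Date(current_date)

  vapply(seq_along(current_date), function(i) {
    # Difference in days between every holiday and this date
    diffs <- as.numeric(holiday_dates - current_date[i])
    # Keep only holidays that have not passed yet (assumption: same day counts as 0)
    upcoming <- diffs[diffs >= 0]
    # Assumption: return NA when no holiday is left in the vector
    if (length(upcoming) == 0) NA_real_ else min(upcoming)
  }, numeric(1))
}

dates_df <- data.frame(
  date_interest = c("2024-02-07", "2024-03-10", "2024-04-01")
)
dates_df$days_until_next_holiday <- days_until_holiday(dates_df$date_interest)
dates_df
```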
4 Question
Now add a new column that lists which holiday that is, using a function called name_holiday:
dates_df$holiday <- name_holiday(dates_df$date_interest)
dates_df

  date_interest days_until_next_holiday holiday
1    2024-02-07                      74  Easter
2    2024-03-10                      42  Easter
3    2024-04-01                      20  Easter
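One way to get the holiday name is sketched below. It reuses the same holidays vector and looks up the name of the nearest upcoming date. As an assumption of this sketch, a date after the last holiday returns NA, and a date that falls on a holiday returns that holiday.

```r
# Sketch: name of the next holiday for each date.
name_holiday <- function(current_date) {
  holidays <- c("New Year's Day" = "2024-01-01",
                "Easter" = "2024-04-21",
                "Christmas" = "2024-12-25")
  holiday_names <- names(holidays)
  holiday_dates <- as.Date(unname(holidays))
  current_date <- as.Date(current_date)

  vapply(seq_along(current_date), function(i) {
    diffs <- as.numeric(holiday_dates - current_date[i])
    # Mark holidays that have already passed so they are ignored
    diffs[diffs < 0] <- NA
    # Assumption: return NA when no holiday is left in the vector
    if (all(is.na(diffs))) NA_character_ else holiday_names[which.min(diffs)]
  }, character(1))
}

name_holiday(c("2024-02-07", "2024-03-10", "2024-04-01"))
```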
5 Question
Load the life expectancy and urbanization data.
Create a scatterplot showing the relationship between life expectancy and urbanization in 1970.
Your output should look like this:
6 Question
Now do the same for 2000.
7 Question
Now animate the graph to produce something like below.
8 Question
What can you say about the relationship between urbanization and life expectancy over time? Write five sentences.
9 Question
Load the life_exp_urb.csv and produce the following graph:
10 Question
And now the next graph:
11 Question
And now the next graph: