# Sample dataframe
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(20, 21, 22),
  grade = c(80, 75, 100)
)
Assignment 2
Instructions
Please submit your assignment as an HTML file to me via email. Please name your file “yourname_assignment2.html”. For example, I would submit a file called “popescu_assignment2.html”. Please make sure that the file name is all in lower case. Please answer all the questions below.
1 Question
Suppose you have a dataframe called students with three columns: name, age, and grade. Write a function called increase_grade that takes a dataframe and a percentage increase as input, and returns a new dataframe where the grade column has been increased by the specified percentage for all students.
Assume that the percentage increase is 10. Your output should look like the example below:
# Test the function
percentage_increase <- 10 # 10% increase in grades
students$new_grade <- increase_grade(students$grade, percentage_increase)
students
name age grade new_grade
1 Alice 20 80 88.0
2 Bob 21 75 82.5
3 Charlie 22 100 110.0
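One possible approach is sketched below. It follows the test call above, which passes the grade column (students$grade) rather than the whole dataframe, so the function here is assumed to take a numeric vector of grades and return the increased values. The argument name grades is an illustrative choice, not part of the exercise.

```r
# Sample data from the exercise
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(20, 21, 22),
  grade = c(80, 75, 100)
)

# Increase a numeric vector of grades by a given percentage.
# Assumption: the first argument is the grade column, as in the test call.
increase_grade <- function(grades, percentage_increase) {
  grades * (1 + percentage_increase / 100)
}

# Test the function
percentage_increase <- 10 # 10% increase in grades
students$new_grade <- increase_grade(students$grade, percentage_increase)
students
```

Because the multiplication is vectorised, no loop over students is needed.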
2 Question
Suppose you have two datasets:
# Sample data for employee_data
employee_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  department = c("HR", "Finance", "Marketing", "IT", "Operations")
)
# Sample data for performance_data
performance_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  rating = c(4.5, 3.2, 2.9, 4.8, 3.7),
  bonus = c(1000, 800, 500, 1200, 900)
)
Instructions:
Create a function employee_performance_analysis that takes the two datasets as inputs and performs the following tasks:
Step1: Merge employee_data with performance_data based on the employee_id.
Step2: Calculate a new column performance_grade based on the following criteria:
- If rating is greater than or equal to 4, assign “High”.
- If rating is at least 3 but less than 4, assign “Medium”.
- If rating is less than 3, assign “Low”.
Step3: Return the merged dataset with the added performance_grade column.
This is what your output should look like:
# Test the function
employee_performance <- employee_performance_analysis(employee_data, performance_data)
employee_performance
employee_id name department rating bonus performance_grade
1 101 Alice HR 4.5 1000 High
2 102 Bob Finance 3.2 800 Medium
3 103 Charlie Marketing 2.9 500 Low
4 104 David IT 4.8 1200 High
5 105 Eve Operations 3.7 900 Medium
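The three steps can be sketched with base R as follows. This is one way to do it, not the only one: it assumes base merge() for the join and nested ifelse() for the grading, and the argument names employees and performance are illustrative.

```r
# Sample data from the exercise
employee_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  name = c("Alice", "Bob", "Charlie", "David", "Eve"),
  department = c("HR", "Finance", "Marketing", "IT", "Operations")
)
performance_data <- data.frame(
  employee_id = c(101, 102, 103, 104, 105),
  rating = c(4.5, 3.2, 2.9, 4.8, 3.7),
  bonus = c(1000, 800, 500, 1200, 900)
)

employee_performance_analysis <- function(employees, performance) {
  # Step 1: merge the two datasets on the shared employee_id column
  merged <- merge(employees, performance, by = "employee_id")
  # Step 2: grade each rating; the checks run from highest to lowest,
  # so "Medium" covers ratings of at least 3 but below 4
  merged$performance_grade <- ifelse(
    merged$rating >= 4, "High",
    ifelse(merged$rating >= 3, "Medium", "Low")
  )
  # Step 3: return the merged dataset with the new column
  merged
}

# Test the function
employee_performance <- employee_performance_analysis(employee_data, performance_data)
employee_performance
```

Note that merge() sorts the result by the key column by default, which matches the row order in the expected output.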
3 Question
Create a function in R that calculates the number of days until the next holiday based on the current date. Here’s the exercise:
Step1: You have the following dataframe:
dates_df <- data.frame(
  date_interest = c("2024-02-07",
                    "2024-03-10",
                    "2024-04-01")
)
Step2: Define a function called days_until_holiday that takes one argument - a character string representing the current date in the format “YYYY-MM-DD”.
Step3: Inside the function, create the following vector of holidays:
holidays <- c("New Year's Day" = "2024-01-01",
              "Easter" = "2024-04-21",
              "Christmas" = "2024-12-25")
Step4: Calculate the difference in days between the date of interest and the next holiday and return the number of days until the next holiday.
Your output should look like:
dates_df$days_until_next_holiday <- days_until_holiday(dates_df$date_interest)
dates_df
date_interest days_until_next_holiday
1 2024-02-07 74
2 2024-03-10 42
3 2024-04-01 20
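A minimal sketch of Steps 2 to 4 is shown below. It rests on two assumptions that the exercise does not spell out: a date that falls exactly on a holiday counts as 0 days away, and a date after the last listed holiday returns NA. The function is applied to each date string in turn with vapply() so that it also works on a whole column, as in the call above.

```r
# Dataframe from Step 1
dates_df <- data.frame(
  date_interest = c("2024-02-07", "2024-03-10", "2024-04-01")
)

days_until_holiday <- function(current_date) {
  # Step 3: the holiday vector given in the exercise
  holidays <- c("New Year's Day" = "2024-01-01",
                "Easter" = "2024-04-21",
                "Christmas" = "2024-12-25")
  holiday_dates <- as.Date(holidays)

  # Step 4: for each input date, find the smallest non-negative gap in days
  vapply(current_date, function(d) {
    diffs <- as.numeric(holiday_dates - as.Date(d))
    upcoming <- diffs[diffs >= 0] # assumption: a holiday today counts as 0
    if (length(upcoming) == 0) NA_real_ else min(upcoming)
  }, numeric(1), USE.NAMES = FALSE)
}

dates_df$days_until_next_holiday <- days_until_holiday(dates_df$date_interest)
dates_df
```

Subtracting two Date objects gives the difference in days, which is why no manual calendar arithmetic is needed.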
4 Question
Now add a new column that lists which holiday that is.
dates_df$holiday <- name_holiday(dates_df$date_interest)
dates_df
date_interest days_until_next_holiday holiday
1 2024-02-07 74 Easter
2 2024-03-10 42 Easter
3 2024-04-01 20 Easter
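The exercise names the function name_holiday but does not describe its body, so the sketch below is only one plausible reading: it reuses the holiday vector from Question 3 and returns the name of the nearest upcoming holiday for each date. As before, it assumes that a date on a holiday matches that holiday and that a date after the last holiday yields NA.

```r
# Dataframe from Question 3
dates_df <- data.frame(
  date_interest = c("2024-02-07", "2024-03-10", "2024-04-01")
)

name_holiday <- function(current_date) {
  holidays <- c("New Year's Day" = "2024-01-01",
                "Easter" = "2024-04-21",
                "Christmas" = "2024-12-25")
  holiday_dates <- as.Date(holidays)

  vapply(current_date, function(d) {
    diffs <- as.numeric(holiday_dates - as.Date(d))
    diffs[diffs < 0] <- NA # ignore holidays that have already passed
    if (all(is.na(diffs))) {
      NA_character_
    } else {
      # which.min() skips NA values and returns the position of the nearest holiday
      names(holidays)[which.min(diffs)]
    }
  }, character(1), USE.NAMES = FALSE)
}

dates_df$holiday <- name_holiday(dates_df$date_interest)
dates_df
```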
5 Question
Load the Life Expectancy and Urbanization data.
Create a scatterplot showing the relationship between life expectancy and urbanization in 1970.
Your output should look like this:
6 Question
Now do the same for 2000.
7 Question
Now animate the graph to produce something like below.
8 Question
What can you say about the relationship between urbanization and life expectancy over time? Write 5 sentences.
9 Question
Load the life_exp_urb.csv file and produce the following graph:
10 Question
And now the next graph:
11 Question
And now the next graph: