Assignment 2
Instructions
Please submit your assignment as an HTML file to me via email. Name the file “yourname_assignment2.html”; for example, I would submit “popescu_assignment2.html”. Make sure the file name is all in lower case.
To match the appearance of this template, set up your assignment preamble (YAML) as follows:
---
title: "Assignment 2"
author: "Your Name"
date: "September 13, 2024"
format:
  html:
    toc: true
    number-sections: true
    colorlinks: true
    smooth-scroll: true
    embed-resources: true
---
Important: Headers like “Question” should be marked with a # (for example, # Question). This will format the text as a bold, larger header when the document is rendered, and it will also appear in the Table of Contents. For regular text, simply write it on the line below the header.
Please answer all the questions below.
1 Question
Open the urbanization and life expectancy datasets.
2 Question
Print only the first 4 entries of both datasets.
3 Question
Create a subset of Latin American countries that has the two variables: life expectancy and urbanization. Calculate the averages by country and then perform a left join based on country. The Latin American country list should include: Belize, Costa Rica, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Paraguay, Peru, Suriname, Uruguay, Venezuela, Cuba, Dominican Republic, Haiti.
Note: make sure you calculate the country averages before you perform the join. In other words, each country should appear only once; for example, Afghanistan should appear once, as opposed to Afghanistan in 1990, Afghanistan in 1991, Afghanistan in 1992, etc.
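To illustrate the order of operations in the note above, here is a minimal sketch on toy data. The data frames, the column names (country, year, life_exp, urb), and the two-country list are assumptions made for illustration; the real datasets may be named and structured differently.

```r
library(dplyr)

# Toy country-year data (hypothetical values and column names).
life <- data.frame(
  country  = c("Chile", "Chile", "Peru", "Peru", "France"),
  year     = c(1990, 1991, 1990, 1991, 1990),
  life_exp = c(73, 74, 66, 67, 77)
)
urban <- data.frame(
  country = c("Chile", "Chile", "Peru", "Peru"),
  year    = c(1990, 1991, 1990, 1991),
  urb     = c(83, 84, 69, 70)
)

keep <- c("Chile", "Peru")  # stand-in for the full country list

# Collapse each dataset to one row per country BEFORE joining.
life_avg <- life %>%
  filter(country %in% keep) %>%
  group_by(country) %>%
  summarise(life_exp = mean(life_exp, na.rm = TRUE))

urban_avg <- urban %>%
  filter(country %in% keep) %>%
  group_by(country) %>%
  summarise(urb = mean(urb, na.rm = TRUE))

# One row per country, with both averaged variables side by side.
latam <- left_join(life_avg, urban_avg, by = "country")
```

Averaging first matters because joining two country-year tables on country alone would pair every year in one table with every year in the other.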
4 Question
Create a histogram of life expectancy for the list of Latin American countries with 10 bins. Use the geom_histogram function.
5 Question
Create a histogram of urbanization for the list of Latin American countries with 10 bins.
6 Question
Put the two histograms side by side using the grid.arrange function.
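As a rough sketch of how grid.arrange (from the gridExtra package) places two ggplot2 histograms next to each other, consider the following. The toy data frame and its columns x and y are invented for illustration and are not the assignment data.

```r
library(ggplot2)
library(gridExtra)

# Toy data (hypothetical): two numeric variables.
set.seed(1)
toy <- data.frame(x = rnorm(50, mean = 70, sd = 5),
                  y = rnorm(50, mean = 60, sd = 10))

# One histogram per variable, each with 10 bins.
p1 <- ggplot(toy, aes(x = x)) + geom_histogram(bins = 10)
p2 <- ggplot(toy, aes(x = y)) + geom_histogram(bins = 10)

# ncol = 2 puts the two plots in a single row.
both <- grid.arrange(p1, p2, ncol = 2)
```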
7 Question
What are the mean, median, standard deviation, min, and max of life expectancy for the Latin American sample? Just write down the numbers. There is no need for a graph.
8 Question
What are the mean, median, standard deviation, min, and max of urbanization for the Latin American sample? Just write down the numbers. There is no need for a graph.
9 Question
Draw a scatterplot in which you examine the relationship between life expectancy and urbanization for the list of Latin American countries. Make sure that the graph meets the following specifications:
- the X and Y axes have to be fixed between 0 and 100
- the X axis should be called “Urbanization” and Y axis should be called “Life Expectancy”
- the title should be “Latin America”
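One possible way to fix axis limits and set labels in ggplot2 is sketched below on toy data. The data frame and its columns urb and life_exp are assumptions; the sketch uses coord_cartesian, which zooms the view without dropping observations, whereas setting limits on the scales removes points outside the range.

```r
library(ggplot2)

# Toy data (hypothetical values and column names).
toy <- data.frame(urb = c(55, 70, 85), life_exp = c(62, 68, 74))

p <- ggplot(toy, aes(x = urb, y = life_exp)) +
  geom_point() +
  # Fix both axes to the same range without discarding data.
  coord_cartesian(xlim = c(0, 100), ylim = c(0, 100)) +
  labs(x = "Urbanization", y = "Life Expectancy", title = "Latin America")
```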
10 Question
Draw a scatterplot in which you examine the relationship between life expectancy and urbanization for the list of Latin American countries. Make sure that the graph meets the following specifications:
- the X and Y axes have to be fixed between 50 and 80
- the X axis should be called “Urbanization” and Y axis should be called “Life Expectancy”
- the title should be “Latin America”
11 Question
Fit a line to the original scatterplot (the one where the limits of both X and Y are 0 and 100) and interpret it: is there a relationship between urbanization and life expectancy or not?
12 Question
Label all the countries in the graph.
13 Question
Label only Brazil and Mexico.
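Labeling a subset of points usually comes down to giving the text layer its own, filtered data. A minimal sketch with placeholder names follows; the data frame, its columns, and the names "A", "B", "C" are hypothetical.

```r
library(ggplot2)

# Toy data with placeholder country names (hypothetical).
toy <- data.frame(
  country  = c("A", "B", "C"),
  urb      = c(55, 70, 85),
  life_exp = c(62, 68, 74)
)

p <- ggplot(toy, aes(x = urb, y = life_exp)) +
  geom_point() +
  # Only the rows passed to this layer receive a label.
  geom_text(data = subset(toy, country %in% c("A", "C")),
            aes(label = country), vjust = -1)
```

Dropping the data argument from geom_text would label every point instead.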
14 Question
Select Italy and the US from the original dataset and draw a time trend depicting life expectancy from 1900 until the present day. Make the US blue and Italy red.
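Assigning a specific color to each line can be done with scale_color_manual, as in the sketch below. The groups "Group A" and "Group B", the years, and the values are invented toy data, not real life expectancy figures.

```r
library(ggplot2)

# Toy yearly series for two placeholder groups (hypothetical values).
toy <- data.frame(
  group = rep(c("Group A", "Group B"), each = 3),
  year  = rep(c(1900, 1950, 2000), times = 2),
  value = c(40, 60, 80, 45, 65, 75)
)

p <- ggplot(toy, aes(x = year, y = value, color = group)) +
  geom_line() +
  # A named vector maps each group to its color explicitly.
  scale_color_manual(values = c("Group A" = "red", "Group B" = "blue"))
```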
15 Question
When does the dip in life expectancy occur, and what could be the cause? Write a maximum of 5 sentences.