Assignment 2

Author

Your Name

Published

September 13, 2024

Instructions

Please submit your assignment as an HTML file to me via email. Name the file “yourname_assignment2.html”; for example, I would submit “popescu_assignment2.html”. Make sure the file name is all lowercase.

To match the appearance of this template, ensure that your assignment preamble (YAML) is set up as follows:

---
title: "Assignment 2"
author: "Your Name"
date: "September 13, 2024"
format:
  html:
    toc: true
    number-sections: true
    colorlinks: true
    smooth-scroll: true
    embed-resources: true
---

Important: Headers like “Question” should be marked with a # (for example, # Question). This will format the text as a bold, larger header when the document is rendered, and it will also appear in the Table of Contents. For regular text, simply write it on the line below the header.

Please answer all the questions below.

1 Question

Open the urbanization and life expectancy datasets.

2 Question

Print only the first 4 entries of both datasets.
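As a rough illustration of Questions 1 and 2, the sketch below reads a CSV file and prints its first 4 rows. The file, the column names, and the values are all made up for the example; the actual course datasets may be stored in a different format and use different names.

```r
# Hypothetical stand-in for one of the course datasets: write a tiny CSV to a
# temporary file so the example is self-contained.
toy <- data.frame(
  country = c("Mexico", "Mexico", "Brazil", "Brazil", "Chile"),
  year = c(1990, 1991, 1990, 1991, 1990),
  life_expectancy = c(70, 71, 66, 67, 73)  # made-up values
)
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

# Read the file back in, then keep and print only the first 4 rows.
life <- read.csv(path)
first_four <- head(life, n = 4)
print(first_four)
```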

3 Question

Create a subset of Latin American countries that contains two variables: life expectancy and urbanization. Calculate the averages by country and then perform a left join on country. The Latin American country list should include: Belize, Costa Rica, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Paraguay, Peru, Suriname, Uruguay, Venezuela, Cuba, Dominican Republic, Haiti.
Note: make sure you calculate the country averages before you perform the join. In other words, each country should appear only once in each dataset (for example, Afghanistan once, rather than Afghanistan in 1990, Afghanistan in 1991, Afghanistan in 1992, etc.).
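One possible way to structure this step with dplyr is sketched below. The two toy data frames, their column names (`country`, `year`, `life_expectancy`, `urbanization`), and the shortened country list are assumptions for illustration only; adapt them to the real datasets.

```r
library(dplyr)

# Hypothetical toy data with made-up values; real column names may differ.
life <- data.frame(
  country = c("Mexico", "Mexico", "Brazil", "Brazil", "Italy"),
  year = c(1990, 1991, 1990, 1991, 1990),
  life_expectancy = c(70, 72, 66, 68, 77)
)
urban <- data.frame(
  country = c("Mexico", "Mexico", "Brazil", "Brazil", "Italy"),
  year = c(1990, 1991, 1990, 1991, 1990),
  urbanization = c(71, 73, 74, 76, 67)
)

# Shortened list for the example; the assignment lists 23 countries.
latam <- c("Mexico", "Brazil")

# Average each variable by country first, so every country appears once.
life_avg <- life %>%
  filter(country %in% latam) %>%
  group_by(country) %>%
  summarise(life_expectancy = mean(life_expectancy, na.rm = TRUE))

urban_avg <- urban %>%
  filter(country %in% latam) %>%
  group_by(country) %>%
  summarise(urbanization = mean(urbanization, na.rm = TRUE))

# Left join on country: keeps every row of life_avg.
latam_avg <- left_join(life_avg, urban_avg, by = "country")
print(latam_avg)
```

Averaging before the join avoids matching every country-year row in one dataset against the other, which is why the note above asks for one row per country.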

4 Question

Create a histogram of life expectancy for the list of Latin American countries with 10 bins. Use geom_histogram.

5 Question

Create a histogram of urbanization for the list of Latin American countries with 10 bins.

6 Question

Put the two histograms side by side using the grid.arrange function.
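The sketch below shows one way the two histograms and the side-by-side layout might be put together, assuming ggplot2 and gridExtra are installed. The data frame `latam_avg` and its values are hypothetical placeholders for the country averages.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical country averages with made-up values.
latam_avg <- data.frame(
  country = c("A", "B", "C", "D"),
  life_expectancy = c(62, 68, 71, 75),
  urbanization = c(45, 58, 66, 80)
)

# Histogram of life expectancy with 10 bins.
p_life <- ggplot(latam_avg, aes(x = life_expectancy)) +
  geom_histogram(bins = 10) +
  labs(x = "Life Expectancy", y = "Count")

# Histogram of urbanization with 10 bins.
p_urban <- ggplot(latam_avg, aes(x = urbanization)) +
  geom_histogram(bins = 10) +
  labs(x = "Urbanization", y = "Count")

# ncol = 2 places the two plots next to each other.
side_by_side <- grid.arrange(p_life, p_urban, ncol = 2)
```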

7 Question

What are the mean, median, standard deviation, min, and max of life expectancy for the Latin American sample? Just write down the numbers. There is no need for a graph.

8 Question

What are the mean, median, standard deviation, min, and max of urbanization for the Latin American sample? Just write down the numbers. There is no need for a graph.
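The five summary statistics can be computed with base R functions. The helper `describe()` and the input vector below are hypothetical; in the assignment the input would be the relevant column of your Latin American averages.

```r
# Hypothetical helper that returns the five requested statistics.
describe <- function(x) {
  c(
    mean = mean(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE),
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE)
  )
}

# Made-up values standing in for a column of country averages.
life_expectancy <- c(62, 68, 71, 75)
stats <- describe(life_expectancy)
print(round(stats, 2))
```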

9 Question

Draw a scatterplot in which you examine the relationship between life expectancy and urbanization for the list of Latin American countries. Make sure that the graph meets the following specifications:

  • the X and Y axes must be fixed between 0 and 100
  • the X axis should be called “Urbanization” and Y axis should be called “Life Expectancy”
  • the title should be “Latin America”
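A scatterplot meeting these specifications could look roughly like the sketch below. The data frame and its values are hypothetical. The axis range is set here with `coord_cartesian()`, which zooms without dropping points; `xlim()` and `ylim()` are an alternative, but they remove observations that fall outside the range.

```r
library(ggplot2)

# Hypothetical country averages with made-up values.
latam_avg <- data.frame(
  country = c("A", "B", "C", "D"),
  urbanization = c(45, 58, 66, 80),
  life_expectancy = c(62, 68, 71, 75)
)

p_scatter <- ggplot(latam_avg, aes(x = urbanization, y = life_expectancy)) +
  geom_point() +
  # Fix both axes between 0 and 100.
  coord_cartesian(xlim = c(0, 100), ylim = c(0, 100)) +
  labs(title = "Latin America", x = "Urbanization", y = "Life Expectancy")

print(p_scatter)
```

Changing the two limit vectors (for example to `c(50, 80)`) is enough to produce a plot with a different fixed range.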

10 Question

Draw a scatterplot in which you examine the relationship between life expectancy and urbanization for the list of Latin American countries. Make sure that the graph meets the following specifications:

  • the X and Y axes must be fixed between 50 and 80
  • the X axis should be called “Urbanization” and Y axis should be called “Life Expectancy”
  • the title should be “Latin America”

11 Question

Fit a line to the original scatterplot, where the limits of both X and Y are 0 and 100, and interpret it: is there a relationship between urbanization and life expectancy or not?
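One common way to add a fitted line is `geom_smooth(method = "lm")`, which draws an ordinary least squares line; the same line can be inspected numerically with `lm()`. This is a sketch on made-up data, so the sign and size of the slope here say nothing about the real result.

```r
library(ggplot2)

# Hypothetical country averages with made-up values.
latam_avg <- data.frame(
  country = c("A", "B", "C", "D"),
  urbanization = c(45, 58, 66, 80),
  life_expectancy = c(62, 68, 71, 75)
)

p_fit <- ggplot(latam_avg, aes(x = urbanization, y = life_expectancy)) +
  geom_point() +
  # Linear fit without the confidence band.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  coord_cartesian(xlim = c(0, 100), ylim = c(0, 100)) +
  labs(title = "Latin America", x = "Urbanization", y = "Life Expectancy")

# The slope of the same regression: its sign gives the direction of the relationship.
fit <- lm(life_expectancy ~ urbanization, data = latam_avg)
slope <- coef(fit)[["urbanization"]]
print(slope)
```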

12 Question

Label all the countries in the graph.

13 Question

Label only Brazil and Mexico.
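Labels can be added with `geom_text()`. To label only some points, one option is to pass that layer a filtered data frame. The data below are hypothetical, and packages such as ggrepel (not used here) offer an alternative when labels overlap.

```r
library(ggplot2)

# Hypothetical country averages with made-up values.
latam_avg <- data.frame(
  country = c("Brazil", "Mexico", "Chile", "Peru"),
  urbanization = c(75, 72, 84, 70),
  life_expectancy = c(67, 71, 74, 66)
)

base_plot <- ggplot(latam_avg, aes(x = urbanization, y = life_expectancy)) +
  geom_point()

# Label every country; vjust = -1 moves the text just above each point.
p_all <- base_plot + geom_text(aes(label = country), vjust = -1)

# Label only Brazil and Mexico by giving the text layer a subset of the data.
two <- subset(latam_avg, country %in% c("Brazil", "Mexico"))
p_two <- base_plot + geom_text(data = two, aes(label = country), vjust = -1)
```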

14 Question

Select Italy and the US from the original dataset and draw a time trend that depicts life expectancy from 1900 until the present day. Make the US blue and Italy red.
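A time trend with one colored line per country might be sketched as follows. The values are made up, and the country labels ("Italy", "United States") are an assumption; check how the two countries are actually spelled in the dataset before filtering.

```r
library(ggplot2)

# Hypothetical long-format data with made-up values.
trend <- data.frame(
  country = rep(c("Italy", "United States", "France"), each = 4),
  year = rep(c(1890, 1900, 1950, 2000), times = 3),
  life_expectancy = c(38, 43, 66, 80, 44, 48, 68, 77, 42, 45, 66, 79)
)

# Keep only the two countries of interest, from 1900 onward.
sel <- subset(trend, country %in% c("Italy", "United States") & year >= 1900)

p_trend <- ggplot(sel, aes(x = year, y = life_expectancy, color = country)) +
  geom_line() +
  # Map each country to the requested color by name.
  scale_color_manual(values = c("United States" = "blue", "Italy" = "red")) +
  labs(x = "Year", y = "Life Expectancy", color = "Country")

print(p_trend)
```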

15 Question

When does the dip in life expectancy occur, and what could be the cause? Write at most 5 sentences.