Upon successful completion of this course the students will be able to:
Data Scientist/Data Analyst
GIS Analyst/GIS Specialist
Environmental Scientist
Market Research Analyst
Remote Sensing Specialist
Transportation Planner
You will be graded on four problem sets during the semester (each 12.5% of your grade) and a final report and presentation (each 25% of your grade).
You will undertake a GIS project that emphasizes practical application of spatial analysis techniques using the R programming language
The project entails a few steps:
In the first part of the course, we will learn about the R programming language and its capabilities with respect to spatial data
The first lectures will be dedicated to acquiring the basic knowledge needed to work with spatial data, but also with R
We will then move on to working with spatial data in R, including how to process vectors and rasters and how to combine the two
In the final part, we will also learn how to deal with spatio-temporal data and point pattern analysis
R is a programming language originally designed for statistical computing
It is an open-source ecosystem (i.e. everyone can contribute and it’s free)
It is compatible with Windows, Mac, and Linux
A variety of libraries already exist that make it easy to do things like:
This is how R compares to other programming languages
R is used in a variety of fields:
Examples of companies which use R include
GIS stands for Geographic Information Systems
GIS is a system that creates, manages, analyzes, and maps all types of data
It helps us understand patterns, relationships, and geographic context
It can be used to:
Mapping focuses on the visual representation of data
Spatial analysis focuses on a variety of aspects:
GIS comprises both mapping (visualization) and geographic data manipulation and analysis
GIS Software
#Loading dplyr, which provides left_join
library(dplyr)
#Step1: Data Cleaning (dropping the rows whose Code is listed in weird_labels)
clean_countries<-subset(life_expectancy2, !(Code %in% weird_labels))
clean_countries_urbanization<-subset(urbanization2, !(Code %in% weird_labels))
#Step2: Left Join (keeping every row of clean_countries and matching on Code)
new_data<-left_join(clean_countries, clean_countries_urbanization, by = "Code")
R is good for:
Reading and writing spatial data into R is done through external libraries
sf
sf will be the main library that we will work with
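As a minimal sketch of reading and writing spatial data with sf, the example below builds a two-point layer, writes it to a temporary GeoJSON file, and reads it back. The object names (`pts`, `pts_back`) and the temporary file are assumptions made for illustration; the two coordinates are taken from the restaurant data printed later in this lecture.

```r
library(sf)

# Two points in longitude/latitude (WGS 84, EPSG:4326)
pts <- st_sf(
  name = c("Pizzeria ai Marmi", "Sichuan Haozi"),
  geometry = st_sfc(
    st_point(c(12.47379, 41.88826)),
    st_point(c(12.49948, 41.8958)),
    crs = 4326
  )
)

# Write the layer to a temporary GeoJSON file (hypothetical location)
path <- tempfile(fileext = ".geojson")
write_sf(pts, path)

# Read the file back into R as an sf object
pts_back <- read_sf(path)
nrow(pts_back)
```

The file format is inferred from the `.geojson` extension, so the same two calls can be used for other formats that sf supports.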
It will help us deal with:
sf: buffer
stars
We can perform geometric operations on rasters (pictures) with the stars package
stars: Temperature in 1901
stars: Temperature in 2022
stars: Temperature difference between 2022 and 1901 > 4
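The logic behind the temperature comparison can be sketched with plain matrices standing in for rasters. This is only an illustration: the grid values are made up, and base R matrices are used instead of stars objects so that the example runs without extra packages. The stars package supports the same kind of cell-by-cell arithmetic on georeferenced rasters.

```r
# Hypothetical temperatures (degrees Celsius) on a 2 x 3 grid of cells
temp_1901 <- matrix(c(10, 12, 14,
                       9, 11, 13), nrow = 2, byrow = TRUE)
temp_2022 <- matrix(c(15, 13, 19,
                      10, 16, 14), nrow = 2, byrow = TRUE)

# Cell-by-cell difference between the two years
temp_diff <- temp_2022 - temp_1901

# Logical grid: TRUE where the temperature rose by more than 4 degrees
warmed <- temp_diff > 4

# Number of cells that warmed by more than 4 degrees
sum(warmed)
```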
ggplot2 is the library that will allow us to visualize data analysis results, but also to make maps
leaflet is a library that allows us to make interactive maps
mapview is a wrapper around leaflet automating the addition of: labels, popups, color scales, and common basemaps
A programming language is a machine-readable artificial language designed to express computations that can be performed by a computer.
Programming allows us to edit code, re-use it in the future, and obtain the same results each time
In object-oriented programming, the interaction with the computer takes place through objects
Each object belongs to a class: an abstract structure that has specific properties
Example:
All cars in the parking lot are instances of the “car” class
The “car” class has specific properties (make, color, year) and methods (start, drive, stop)
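The car example can be sketched in R using S3, one of R's class systems. The constructor `new_car`, the generic `drive`, and the example values are hypothetical names chosen for illustration. Only the `drive` method is shown, because `start` and `stop` are already the names of existing R functions.

```r
# Constructor: creates an instance of the "car" class with its properties
new_car <- function(make, color, year) {
  structure(
    list(make = make, color = color, year = year),
    class = "car"
  )
}

# Generic function: picks the right method based on the object's class
drive <- function(x, ...) UseMethod("drive")

# Method that is used when drive() is called on a "car" object
drive.car <- function(x, ...) {
  paste("Driving the", x$color, x$make)
}

# One car in the parking lot is an instance of the class
my_car <- new_car(make = "Fiat", color = "red", year = 2015)
class(my_car)
drive(my_car)
```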
We will see that everything that we work with in R is an object
For example, we can load up a geojson file in R.
#Step1: Loading the geojson file
library(sf)
restaurants <- read_sf("/Users/bgpopescu/Dropbox/john_cabot/teaching/big_data/week7/data/restaurant.geojson")
#Step2: Selecting only the relevant variables
restaurants<-subset(restaurants, select = c(name, `addr:street`))
#Step3: Keeping the restaurants that have a name or an address (rows missing both are removed)
restaurants2<-subset(restaurants, !is.na(name) | !is.na(`addr:street`))
R transforms the geojson file into an object of a class named sf data.frame
This type of object has numerous properties such as:
Once imported, the sf data.frame is saved in the computer memory
Printing the object will display a summary of its properties and the first few rows of data
Simple feature collection with 2811 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 12.21167 ymin: 41.70574 xmax: 12.77428 ymax: 42.06974
Geodetic CRS:  WGS 84
# A tibble: 2,811 × 3
   name                          `addr:street`                    geometry
   <chr>                         <chr>                         <POINT [°]>
 1 Pizzeria ai Marmi             Viale di Trastevere   (12.47379 41.88826)
 2 Sichuan Haozi                 Via di San Martino a…  (12.49948 41.8958)
 3 Dar filettaro a Santa Barbara Largo dei Librari      (12.4737 41.89467)
 4 Al Peperoncino                Via Ostiense          (12.47698 41.85343)
 5 Ai Tre Scalini                Via Panisperna        (12.49044 41.89628)
 6 Trattoria Ada e Mario         Circonvallazione App… (12.51433 41.87532)
 7 Gustosando                    <NA>                  (12.42743 41.89954)
 8 Sa Posada                     Via Elvia Recina       (12.5079 41.87995)
 9 Pizzeria Formula 1            Via degli Equi        (12.51268 41.89702)
10 Da Francesco                  Piazza del Fico        (12.4704 41.89932)
# ℹ 2,801 more rows
By printing the object, we can see some of its properties including:
One of the characteristics of object oriented programming is inheritance
Inheritance is what makes it possible for one class to extend another class by adding new properties
Example:
A “taxi” is an extension of a “car” class, inheriting all of its properties and methods.
A taxi could have new properties like taxi company name
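The taxi example can be sketched with S3 classes, where inheritance is expressed by the order of the class vector. All names here (`new_car`, `new_taxi`, `describe`, and the example values) are hypothetical and chosen for illustration.

```r
# Base class: a car with three properties
new_car <- function(make, color, year) {
  structure(list(make = make, color = color, year = year), class = "car")
}

# Derived class: a taxi is a car with one extra property
new_taxi <- function(make, color, year, company) {
  obj <- new_car(make, color, year)
  obj$company <- company
  class(obj) <- c("taxi", class(obj))  # "taxi" first, then "car"
  obj
}

# A method defined only for the "car" class
describe <- function(x, ...) UseMethod("describe")
describe.car <- function(x, ...) paste(x$year, x$color, x$make)

my_taxi <- new_taxi("Toyota", "white", 2020, company = "Roma Taxi")

class(my_taxi)            # the taxi class comes first, followed by car
inherits(my_taxi, "car")  # a taxi is also a car
describe(my_taxi)         # no taxi method exists, so the car method is used
my_taxi$company           # the property added by the taxi class
```

The same pattern appears in the `str` output for the restaurant data, where the class is listed as sf/tbl_df/tbl/data.frame: an sf object is also a data.frame.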
In R, every complex object is a collection of smaller components such as properties
We can use str to examine the properties of the object
sf [2,811 × 3] (S3: sf/tbl_df/tbl/data.frame)
 $ name       : chr [1:2811] "Pizzeria ai Marmi" "Sichuan Haozi" "Dar filettaro a Santa Barbara" "Al Peperoncino" ...
 $ addr:street: chr [1:2811] "Viale di Trastevere" "Via di San Martino ai Monti" "Largo dei Librari" "Via Ostiense" ...
 $ geometry   :sfc_POINT of length 2811; first list element:  'XY' num [1:2] 12.5 41.9
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA
  ..- attr(*, "names")= chr [1:2] "name" "addr:street"
For example, the names of the restaurants are stored as a string variable called name (second line of output)
The addresses of the restaurants are stored as a string variable called addr:street (third line of output)
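As a minimal sketch of how these components can be accessed, the example below builds a small sf object by hand and extracts some of its parts. The object `toy` is a hypothetical stand-in for the restaurant data; it uses the first two names and coordinates from the printed output above.

```r
library(sf)

# A two-row stand-in for the restaurant data (WGS 84, EPSG:4326)
toy <- st_sf(
  name = c("Pizzeria ai Marmi", "Sichuan Haozi"),
  geometry = st_sfc(
    st_point(c(12.47379, 41.88826)),
    st_point(c(12.49948, 41.8958)),
    crs = 4326
  )
)

# Overview of all the components, as in the str output above
str(toy)

# A single column is extracted with $
toy$name

# Attributes store metadata, such as which column holds the geometry
attr(toy, "sf_column")

# sf helper functions expose spatial properties
st_crs(toy)$epsg  # EPSG code of the coordinate reference system
st_bbox(toy)      # bounding box of all the points
```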
We will now familiarize ourselves with the R environment
We first need to install R: R-project
We will then install an R interface that allows us to interact with R in a more user-friendly manner: R-studio
Popescu (JCU): Lecture 1