Reading Rasters, Attributes, Raster Subsetting, Vectorizing, Cropping
Today’s agenda includes:
We will use sf
, stars
, and units
packages.
Spatial entries in GIS can be represented as a vector or as a raster
In the previous lecture, we examine vectors and geometric operations with vectors.
Rasters are matrices or an array representing a rectangular area on the surface of the earth.
They contain elements such as :
.tif
).img
).hdf
, .he5
).nc
)There is one important package that we will work with: stars
terra
is also another important package developed by the same authors
Both contain classes, and extensive functions for working with rasters in R
We will mostly work with stars
stars
packageUnlike terra
and raster
, stars
is very well integrated with sf
sf
is the package that we have used for vector analysis.
A typical example of raster is satellite luminosity
Within economics, researchers use satellite luminosity as a proxy for economic activity
You can download satellite luminosity by going to https://www.ncei.noaa.gov/products/dmsp-operational-linescan-system
The next step is to scroll down to products
The next step is to scroll down to products
You can download these files by clicking on the links
Note that these are large files: Every zip file will range from 355 MB to 411.1 MB
The files are cloud-free composites
In cases where two satellites were collecting data - two composites were produced.
Note that these are large files: Every zip file will range from 355 MB to 411.1 MB
The files are cloud-free composites
In cases where two satellites were collecting data - two composites were produced.
The products are 30 arc second grids, spanning -180 to 180 degrees longitude and -65 to 75 degrees latitude.
Each composite set is named with the satellite and the year (F121995 is from DMSP satellite number F12 for the year 1995)
Three types of images are available:
The DMSP/OLS resolution is approximately 1000m
It comprises six different DMSP satellites:
After downloading, your folder should look like below:
You download all the relevant files for this exercise below:
These are the steps to have the data ready for plotting in ggplot
This what the shape file looks like:
These are the steps to have the data ready for plotting in ggplot
This is what our folder looked like before running the code.
These are the steps to have the data ready for plotting in ggplot
This is what our folder looks like after running the code.
This is what our folder looks like before running the code.
This is what our folder looks like after running the code.
This is what our folder looks like before running the code.
This is what our folder looks like after running the code.
Note: do not try to plot it yet with ggplot
.
The raster is too detailed.
It will freeze your computer.
This is how we can reduce the resolution of a raster
Let us now compare the two
We can finally add a more friendly name to the raster
Just reviewing the steps:
#Step1: Read country borders
borders <- st_read(dsn="./data/boundaries/all-country-boundaries.shp", quiet = TRUE)
#Step2: Creating a folder called "F101992"
target_dir <- "./data/luminosity/F101992"
#Step3: Create the target directory if it doesn't exist
if (!dir.exists(target_dir)) {
dir.create(target_dir)
}
#Step4: Unzip the file
untar("./data/luminosity/F101992.v4.tar", exdir = target_dir)
#Step5: Unzip the file further
R.utils::gunzip("./data/luminosity/F101992/F101992.v4b_web.avg_vis.tif.gz", remove = FALSE, overwrite=TRUE)
#Step6: Read the raster
f1992<-read_stars("./data/luminosity/F101992/F101992.v4b_web.avg_vis.tif")
#Step7: Make sure the two files have the same CRS
st_crs(f1992)<-st_crs(borders)
#Step8: Reduce the resolution of the raster: Higher dx/day values mean lower resolution
f1992b<-st_warp(f1992, st_as_stars(st_bbox(f1992), dx = 0.7, dy = 0.7))
#Step9: Rename the raster
names(f1992b)<-"lights_1992"
fig<-ggplot()+
geom_stars(data=f1992b)+
scale_fill_gradientn(colours=c("black","white"))+
geom_sf(data=borders, linewidth = 0.1, fill = NA, color = "white", alpha=0.5)+
theme_bw()+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
norway<-subset(borders, SOVEREIGNT == "Norway")
max_lat_y_eu<-st_bbox(norway)["ymax"]
greece<-subset(borders, SOVEREIGNT == "Greece")
min_lat_y_eu<-st_bbox(greece)["ymin"]
ukraine<-subset(borders, SOVEREIGNT == "Ukraine")
max_lon_x_eu<-st_bbox(ukraine)["xmax"]
portugal<-subset(borders, SOVEREIGNT == "Portugal")
min_lon_x_eu<-st_bbox(portugal)["xmin"]
ggplot()+
geom_stars(data=f1992b)+
scale_fill_gradientn(colours=c("black","white"))+
geom_sf(data=borders, linewidth = 0.1, fill = NA, color = "white", alpha=0.5)+
coord_sf(xlim = c(min_lon_x_eu-3, max_lon_x_eu+3), ylim = c(min_lat_y_eu-1, max_lat_y_eu-6))+
theme_bw()+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
The print method gives a summary of the raster’s properties.
Fundamentally, a stars
object is a collection of matrix
or array
objects.
This is along with additional properties of the dimensions, such as dimension names, Coordinate Reference Systems (CRS).
We can display the structure of a stars
object using str
.
List of 1
$ lights_1992: num [1:515, 1:201] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimensions")=List of 2
..$ x:List of 7
.. ..$ from : num 1
.. ..$ to : num 515
.. ..$ offset: num -180
.. ..$ delta : num 0.7
.. ..$ refsys:List of 2
.. .. ..$ input: chr "WGS 84"
.. .. ..$ wkt : chr "GEOGCRS[\"WGS 84\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223"| __truncated__
.. .. ..- attr(*, "class")= chr "crs"
.. ..$ point : logi NA
.. ..$ values: NULL
.. ..- attr(*, "class")= chr "dimension"
..$ y:List of 7
.. ..$ from : num 1
.. ..$ to : num 201
.. ..$ offset: num 75
.. ..$ delta : num -0.7
.. ..$ refsys:List of 2
.. .. ..$ input: chr "WGS 84"
.. .. ..$ wkt : chr "GEOGCRS[\"WGS 84\",\n DATUM[\"World Geodetic System 1984\",\n ELLIPSOID[\"WGS 84\",6378137,298.257223"| __truncated__
.. .. ..- attr(*, "class")= chr "crs"
.. ..$ point : logi NA
.. ..$ values: NULL
.. ..- attr(*, "class")= chr "dimension"
..- attr(*, "raster")=List of 4
.. ..$ affine : num [1:2] 0 0
.. ..$ dimensions : chr [1:2] "x" "y"
.. ..$ curvilinear: logi FALSE
.. ..$ blocksizes : NULL
.. ..- attr(*, "class")= chr "stars_raster"
..- attr(*, "class")= chr "dimensions"
- attr(*, "class")= chr "stars"
From this, we see that we have the f1992b
object of length 1.
The 2nd line in the str
printout specifies the name and contents of the first (and only) item in the list, namely its type, dimensions, and a sample of the first few values.
In our case, the first item is lights_1992
, and it is a matrix with 515 rows and 201 columns
To access some of the attributes of our objects, we can type:
We can access our raster’s attributes by using the list
access methods
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 0 0 10 7 9
[2,] 0 0 0 0 0 0 0 6 7 6
[3,] 0 0 0 0 0 0 0 5 6 9
[4,] 0 0 0 0 0 0 0 9 6 9
[5,] 0 0 0 0 0 0 0 8 8 8
[6,] 0 0 0 0 0 0 0 9 9 7
[7,] 0 0 0 0 0 0 0 8 7 8
[8,] 0 0 0 0 0 0 0 4 10 9
[9,] 0 0 0 0 0 0 0 6 7 9
[10,] 0 0 0 0 0 0 0 0 8 9
And that it has 512 rows and 201 columns.
Syntax | Objects | Returns |
---|---|---|
x[i] |
vector, list |
Subset i |
x[i, j] |
data.frame , matrix |
Subset i , j |
x[i, j, k] |
array |
Subset i , j , k |
x[[i]] |
vectors, lists |
Single element i |
x$i |
data.frame , list |
Single column or element i |
x@n |
S4 objects | Slot n |
We can use the following function to get at important raster properties
The other properties of rasters can be extracted in the following way:
[1] -180.0042
[1] 75.00417
[1] 0.7
[1] -0.7
Note that the resolution is separate for x
and y
.
If the (absolute) resolution is equal for both, raster pixels are square.
When the x
and y
resolutions are unequal, raster pixels are non-square rectangles.
We can find out the crs of our raster using the st_crs
function
Coordinate Reference System:
User input: WGS 84
wkt:
GEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["latitude",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["longitude",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]]
As indicated, raster values can be extracted directly, as a matrix
or as an array
.
For example:
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 0 0 10 7 9
[2,] 0 0 0 0 0 0 0 6 7 6
[3,] 0 0 0 0 0 0 0 5 6 9
[4,] 0 0 0 0 0 0 0 9 6 9
[5,] 0 0 0 0 0 0 0 8 8 8
[6,] 0 0 0 0 0 0 0 9 9 7
[7,] 0 0 0 0 0 0 0 8 7 8
[8,] 0 0 0 0 0 0 0 4 10 9
[9,] 0 0 0 0 0 0 0 6 7 9
[10,] 0 0 0 0 0 0 0 0 8 9
A histogram will help us identify the raster values distribution
Here is the difference between the original raster and the dataframe
A histogram will help us identify the raster values distribution
And here is how we can plot the raster vs. the dataframe
And this is what the data looks like:
We thus see that most of values around the world are 0.
This is because 71 percent of the Earth’s surface is water-covered.
The important part is that by transforming the stars object into a dataframe (grid), we can perform the usual operations dedicated to dataframes.
We can also calculate the mean, sd, min, and max
This is what luminosity around the world looks like
This is what the data looks like:
p2<- ggplot(df, aes(lum_1992)) +
guides(color = "none") +
geom_histogram(binwidth = 2,
color = "white",
alpha = 0.5)+
#Adding density and multiplying by a manually defined value
#for visibility
geom_density(aes(y = after_stat(density)* 80000),
fill=NA, color=alpha("blue", 0.5))+
#Adding a secondary axis
scale_y_continuous(name = "count",
sec.axis =
sec_axis(~.x/80000,
name = "density"))+
theme_bw()
p2
Let’s focus on Italy.
You can download the Rome central area polygon
You can also download the main streets for the central area
Make sure you place them in the relevant folder.
Let’s focus on Italy
#Step1: Reading the central polygon
relev_poly<-st_read(dsn="./data/rome_center/central_polygon.shp", quiet = T)
#Step2: Reading roads
gis_osm_roads <- st_read(dsn="./data/selection.gdb", layer="gis_osm_roads", quiet = T)
#Step3: Extracting the bix from the central polygon
ext2<-st_bbox(relev_poly)
#Step4: Ensuring the same crs for the polygon as for the raster
st_crs(relev_poly)<-st_crs(f1992)
#Step5: Cropping the raster
rc <- st_crop(f1992, relev_poly)
This is how the original vs. the cropped raster compare
We can finally add the roads in Rome to get more context.
Let’s focus on Italy
Every grid has a value:
#Turning the object into a dataframe
rc2<-st_as_sf(rc)
names(rc2)[1]<-"lum_1992"
#Step6: Mapping everything
ggplot()+
geom_sf(data = rc2, aes(fill=lum_1992), linewidth=0)+
geom_sf(data = relev_poly, color="white", fill=NA)+
geom_sf(data = gis_osm_roads, color="grey50", linewidth=0.05)+
theme_bw()+
scale_fill_gradientn(colours=c("black","white"), na.value=NA)+
geom_sf_text(data = rc2,
aes(label = lum_1992, geometry = geometry),
color = ifelse(rc2$lum_1992 < 61, "white", "black"), size = 1.5)
Every grid has a value:
Let’s zoom in:
#Turning the object into a dataframe
rc2<-st_as_sf(rc)
names(rc2)[1]<-"lum_1992"
#Step6: Mapping everything
ggplot()+
geom_sf(data = rc2, aes(fill=lum_1992), linewidth=0)+
geom_sf(data = relev_poly, color="white", fill=NA)+
geom_sf(data = gis_osm_roads, color="grey50", linewidth=0.05)+
theme_bw()+
scale_fill_gradientn(colours=c("black","white"), na.value=NA)+
geom_sf_text(data = rc2,
aes(label = lum_1992, geometry = geometry),
color = ifelse(rc2$lum_1992 < 61, "white", "black"),
#aes(label = lum_1992, geometry = geometry, color = ifelse(lum_1992 < 61, "white", "black")),
size = 4)+
coord_sf(xlim = c(12.45, 12.60),
ylim = c(41.87, 41.79))
Let’s zoom in:
As you might imagine, we can use different colors for these values:
As you might imagine, we can use different colors for these values:
mapview
We can easily replace the values inside the raster (grid) by using simple dataframe operations.
For example, let’s replace all the values that have 63 with NA:
rc3<-rc2
rc3$lum_1992[rc3$lum_1992==63]<-NA
pic<-ggplot()+
geom_sf(data = rc3, aes(fill=lum_1992), linewidth=0)+
geom_sf(data = relev_poly, color="white", fill=NA)+
geom_sf(data = gis_osm_roads, color="grey50", linewidth=0.05)+
theme_bw()+
scale_fill_viridis_c(option = "magma",begin = 0.1)+
#scale_fill_gradientn(colours=c("black","white"))+
geom_sf_text(data = rc2,
aes(label = lum_1992, geometry = geometry),
color = ifelse(rc3$lum_1992 < 61, "white", "black"),
#aes(label = lum_1992, geometry = geometry, color = ifelse(lum_1992 < 61, "white", "black")),
size = 4)+
coord_sf(xlim = c(12.45, 12.60),
ylim = c(41.87, 41.79))
pic
We can easily replace the values inside the raster (grid) by using simple dataframe operations.
For example, let’s replace all the values that have 63 with NA:
We can easily replace the values inside the raster (grid) by using simple dataframe operations.
For example, let’s replace all the values that have 63 with NA:
Let us now make the NA grids transparent
rc3<-rc2
rc3$lum_1992[rc3$lum_1992==63]<-NA
pic<-ggplot()+
geom_sf(data = rc3, aes(fill=lum_1992), linewidth=0)+
geom_sf(data = relev_poly, color="white", fill=NA)+
geom_sf(data = gis_osm_roads, color="grey50", linewidth=0.05)+
theme_bw()+
scale_fill_viridis_c(option = "magma",begin = 0.1,
na.value = "white")+
#scale_fill_gradientn(colours=c("black","white"))+
geom_sf_text(data = rc2,
aes(label = lum_1992, geometry = geometry),
color = ifelse(rc3$lum_1992 < 61, "white", "black"),
#aes(label = lum_1992, geometry = geometry, color = ifelse(lum_1992 < 61, "white", "black")),
size = 4)+
coord_sf(xlim = c(12.45, 12.60),
ylim = c(41.87, 41.79))
pic
We can easily replace the values inside the raster (grid) by using simple dataframe operations.
For example, let’s replace all the values that have 63 with NA:
Let us now make the NA grids transparent
We can of course write the grid as a shapefile for we can turn it back to a raster.
The raster
The grid
We can write the raster in the following way:
We can write the grid in the following way:
#Step1: Creating a folder called
target_dir <- "./data/output"
#Step2: Create the target directory if it doesn't exist
if (!dir.exists(target_dir)) {
dir.create(target_dir)
}
#Step4: Writing the shape files
st_write(rc3, "./data/output/rc3_grid.shp", update = TRUE)
Updating layer `rc3_grid' to data source `./data/output/rc3_grid.shp' using driver `ESRI Shapefile'
Updating existing layer rc3_grid
Writing 535 features with 1 fields and geometry type Polygon.
We have covered a lot of operations related to rasters
We learned about:
Popescu (JCU): Lecture 15