L17: Rasters and Raster Operations 3

Complex Rasters, Temperature NC Data

Bogdan G. Popescu

John Cabot University

Today

Today’s agenda includes:

  • working with more complex rasters: nc files
  • examining temperature data over time
  • performing raster algebra

Working with More Complex Rasters

Working with lights was simple

In many other contexts, we can have more complex rasters

Temperature data from CEDA is an example of a raster that contains time series

Temperature

Temperature in 2019

Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download temperature data from by going to https://archive.ceda.ac.uk


Temperature

You can download the CRU file (3.04 GB)

Place it in a folder called “temp”

Reading Temperature

Here is how we can read the file

library(terra)
library(sf)
library(stars)
library(ggplot2)
#Step1: Read the raster
b <- terra::rast("./datal17/temp/cru_ts4.07.1901.2022.tmp.dat.nc")

This is the type of file we deal with

class(b)
[1] "SpatRaster"
attr(,"package")
[1] "terra"

Reading Temperature

We then we extract the time component

#Step2: Extract time information
time_values <- time(b)

Let’s print the first 10 entries

time_values[1:10]
 [1] "1901-01-16" "1901-02-15" "1901-03-16" "1901-04-16" "1901-05-16"
 [6] "1901-06-16" "1901-07-16" "1901-08-16" "1901-09-16" "1901-10-16"
min(time_values)
[1] "1901-01-16"
max(time_values)
[1] "2022-12-16"

Reading Temperature

We now turn those numbers into Date objects

#Step3: Turning strings into time
date_objects <- as.Date(time_values)
date_objects[1:10]
 [1] "1901-01-16" "1901-02-15" "1901-03-16" "1901-04-16" "1901-05-16"
 [6] "1901-06-16" "1901-07-16" "1901-08-16" "1901-09-16" "1901-10-16"
min(date_objects)
[1] "1901-01-16"
max(date_objects)
[1] "2022-12-16"

Reading Temperature

We can select all the dates from 1901 and 2022

#Step4: Select only the dates from the year 1901 and 2022
dates_1901 <- date_objects[format(date_objects, "%Y") == "1901"]
dates_2022 <- date_objects[format(date_objects, "%Y") == "2022"]
print(dates_1901)
 [1] "1901-01-16" "1901-02-15" "1901-03-16" "1901-04-16" "1901-05-16"
 [6] "1901-06-16" "1901-07-16" "1901-08-16" "1901-09-16" "1901-10-16"
[11] "1901-11-16" "1901-12-16" "1901-01-16" "1901-02-15" "1901-03-16"
[16] "1901-04-16" "1901-05-16" "1901-06-16" "1901-07-16" "1901-08-16"
[21] "1901-09-16" "1901-10-16" "1901-11-16" "1901-12-16"
print(dates_2022)
 [1] "2022-01-16" "2022-02-15" "2022-03-16" "2022-04-16" "2022-05-16"
 [6] "2022-06-16" "2022-07-16" "2022-08-16" "2022-09-16" "2022-10-16"
[11] "2022-11-16" "2022-12-16" "2022-01-16" "2022-02-15" "2022-03-16"
[16] "2022-04-16" "2022-05-16" "2022-06-16" "2022-07-16" "2022-08-16"
[21] "2022-09-16" "2022-10-16" "2022-11-16" "2022-12-16"

Reading Temperature

We next need to identify the index of those dates

#Step5: Find the index of the selected dates in the time_values
selected_indices_1901 <- which(time_values %in% as.Date(dates_1901))
selected_indices_2022 <- which(time_values %in% as.Date(dates_2022))
print(selected_indices_1901)
 [1]    1    2    3    4    5    6    7    8    9   10   11   12 1465 1466 1467
[16] 1468 1469 1470 1471 1472 1473 1474 1475 1476
print(selected_indices_2022)
 [1] 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 2917 2918 2919
[16] 2920 2921 2922 2923 2924 2925 2926 2927 2928

Reading Temperature

We can now extract those rasters from the spatraster b using those indices

#Step6: Extract raster values for the selected dates
selected_rasters_1901 <- b[[selected_indices_1901]]
selected_rasters_2022 <- b[[selected_indices_2022]]

Here is for example what the 1901 spatraster looks like:

class(selected_rasters_1901)
[1] "SpatRaster"
attr(,"package")
[1] "terra"
print(selected_rasters_1901)
class       : SpatRaster 
dimensions  : 360, 720, 24  (nrow, ncol, nlyr)
resolution  : 0.5, 0.5  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 
sources     : cru_ts4.07.1901.2022.tmp.dat.nc:tmp  (12 layers) 
              cru_ts4.07.1901.2022.tmp.dat.nc:stn  (12 layers) 
varnames    : tmp (near-surface temperature) 
              stn 
names       :           tmp_1,           tmp_2,           tmp_3,           tmp_4,           tmp_5,           tmp_6, ... 
unit        : degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, ... 
time (days) : 1901-01-16 to 1901-12-16 

Reading Temperature

We can then save the names of the rasters:

#Step7: Extract layer names
layer_names_1901 <- names(selected_rasters_1901)
layer_names_2022 <- names(selected_rasters_2022)

Here is what they look like:

print(layer_names_1901)
 [1] "tmp_1"  "tmp_2"  "tmp_3"  "tmp_4"  "tmp_5"  "tmp_6"  "tmp_7"  "tmp_8" 
 [9] "tmp_9"  "tmp_10" "tmp_11" "tmp_12" "stn_1"  "stn_2"  "stn_3"  "stn_4" 
[17] "stn_5"  "stn_6"  "stn_7"  "stn_8"  "stn_9"  "stn_10" "stn_11" "stn_12"

Temperature

This is what we have covered so far:

library(terra)
library(sf)
library(stars)
#Step1: Read the raster
b <- rast("./datal17/temp/cru_ts4.07.1901.2022.tmp.dat.nc")

#Step2: Extract time information
time_values <- time(b)

#Step3: Turning strings into time
date_objects <- as.Date(time_values)

#Step4: Select only the dates from the year 1901 and 2022
dates_1901 <- unique(date_objects[format(date_objects, "%Y") == "1901"])
dates_2022 <- unique(date_objects[format(date_objects, "%Y") == "2022"])

#Step5: Find the index of the selected dates in the time_values
selected_indices_1901 <- which(time_values %in% as.Date(dates_1901))
selected_indices_2022 <- which(time_values %in% as.Date(dates_2022))

#Step6: Extract raster values for the selected dates
selected_rasters_1901 <- b[[selected_indices_1901]]
selected_rasters_2022 <- b[[selected_indices_2022]]

#Step7: Extract layer names
layer_names_1901 <- names(selected_rasters_1901)
layer_names_2022 <- names(selected_rasters_2022)

Temperature

We then extract the the index of the layers that start with tmp

We need to do this because there are other types of laters inside this .nc file: stn - number of stations contributing to each interpolation

#Step8: Find indices of layers that start with "tmp"
#tmp stands for temperature. There are other rasters within these files.
indices_tmp_1901 <- grep("^tmp", layer_names_1901)

This is what indices_tmp_1901 looks like

print(indices_tmp_1901)
 [1]  1  2  3  4  5  6  7  8  9 10 11 12

Temperature

We then select only the temperature rasters

tmp_layers_1901 <- selected_rasters_1901[[indices_tmp_1901]]

Here is what tmp_layers_1901 looks like:

print(tmp_layers_1901)
class       : SpatRaster 
dimensions  : 360, 720, 12  (nrow, ncol, nlyr)
resolution  : 0.5, 0.5  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 
source      : cru_ts4.07.1901.2022.tmp.dat.nc:tmp 
varname     : tmp (near-surface temperature) 
names       :           tmp_1,           tmp_2,           tmp_3,           tmp_4,           tmp_5,           tmp_6, ... 
unit        : degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, ... 
time (days) : 1901-01-16 to 1901-12-16 

Temperature

We then do the same for 2022

#Step8: Find indices of layers that start with "tmp"
#tmp stands for temperature. There are other rasters within these files.
indices_tmp_2022 <- grep("^tmp", layer_names_2022)
tmp_layers_2022 <- selected_rasters_2022[[indices_tmp_2022]]

Here is what tmp_layers_2022 looks like

print(tmp_layers_2022)
class       : SpatRaster 
dimensions  : 360, 720, 12  (nrow, ncol, nlyr)
resolution  : 0.5, 0.5  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 
source      : cru_ts4.07.1901.2022.tmp.dat.nc:tmp 
varname     : tmp (near-surface temperature) 
names       :        tmp_1453,        tmp_1454,        tmp_1455,        tmp_1456,        tmp_1457,        tmp_1458, ... 
unit        : degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, degrees Celsius, ... 
time (days) : 2022-01-16 to 2022-12-16 

Temperature

We can then rename the layers

names(tmp_layers_1901)
 [1] "tmp_1"  "tmp_2"  "tmp_3"  "tmp_4"  "tmp_5"  "tmp_6"  "tmp_7"  "tmp_8" 
 [9] "tmp_9"  "tmp_10" "tmp_11" "tmp_12"

We should rename those using:

dates_1901
 [1] "1901-01-16" "1901-02-15" "1901-03-16" "1901-04-16" "1901-05-16"
 [6] "1901-06-16" "1901-07-16" "1901-08-16" "1901-09-16" "1901-10-16"
[11] "1901-11-16" "1901-12-16"

Finally, this is how we rename them:

#Step9: Changing raster names to temperature
names(tmp_layers_1901)<-dates_1901
names(tmp_layers_2022)<-dates_2022

Let us look again at the output:

names(tmp_layers_1901)
 [1] "1901-01-16" "1901-02-15" "1901-03-16" "1901-04-16" "1901-05-16"
 [6] "1901-06-16" "1901-07-16" "1901-08-16" "1901-09-16" "1901-10-16"
[11] "1901-11-16" "1901-12-16"

Temperature

We then need to turn our spatraster object into a raster object

#Step10: Turning the spatraster into a list of raster
stars_list_1901 <- lapply(tmp_layers_1901, function(raster_obj) st_as_stars(raster::raster(raster_obj)))
stars_list_2022 <- lapply(tmp_layers_2022, function(raster_obj) st_as_stars(raster::raster(raster_obj)))

Notice the difference

class(tmp_layers_1901)
[1] "SpatRaster"
attr(,"package")
[1] "terra"
class(stars_list_1901)
[1] "list"

Temperature lapply()

lapply() is a fundamental function in R used to apply a given function to each element of a list or vector.

The name “lapply” stands for “list apply,” highlighting its primary usage with lists.

Syntax:
lapply(X, FUN, …)

# Create a list
my_df <- data.frame(a = 1:3, b = 4:6)

# Define a function to square each element
square <- function(x) {
  return(x^2)
}

# Apply the square function to each element of the list
result <- lapply(my_df, square)

Temperature lapply()

This what that final dataframe looks like:

# Create a list
# Convert the list result to a dataframe
result <- as.data.frame(result)

# Output the result
print(result)
  a  b
1 1 16
2 4 25
3 9 36

Temperature: One-line functions

In R, it’s common to use anonymous or in-line functions for concise code

These functions are defined on-the-fly and typically used when the function is simple and doesn’t require reusability.

Thus, we can write:

# Example of creating stars objects from raster layers using lapply and an inline function
stars_list_1901 <- lapply(tmp_layers_1901, function(raster_obj) st_as_stars(raster::raster(raster_obj)))

or

# Define the function separately
convert_to_stars <- function(raster_obj) {
  return(st_as_stars(raster::raster(raster_obj)))
}

# Apply the function using lapply
stars_list_1901 <- lapply(tmp_layers_1901, convert_to_stars)

Temperature

We then take the cell mean per raster for the whole year

#Step11: Use lapply to apply st_apply to each stars object to calculate the mean
mean_list_1901 <- lapply(stars_list_1901, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))
mean_list_2022 <- lapply(stars_list_2022, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))

Notice that the input is stars_list_1901

stars_list_1901 is a list of stars object containing 12 objects

Temperature

You can see below the first object in the stars_list_1901

ggplot()+
  geom_stars(data=stars_list_1901[[1]])

Temperature

You can see below the second object in the stars_list_1901

ggplot()+
  geom_stars(data=stars_list_1901[[2]])

Temperature

Thus the following function takes the average temperature per cell the whole year

#Step11: Use lapply to apply st_apply to each stars object to calculate the mean
mean_list_1901 <- lapply(stars_list_1901, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))
mean_list_2022 <- lapply(stars_list_2022, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))

Also note that st_apply is a function similar to lapply, but only applies to sf objects.

Here is what one item looks like:

class(mean_list_1901[[1]])
[1] "stars"
class(stars_list_1901[[1]])
[1] "stars"

Here is what the object looks like

class(mean_list_1901)
[1] "list"

Temperature

We then combine the list of results into a single stars object

#Step12: Combine the list of results into a single stars object
mean_stars_1901 <- do.call(stars::st_as_stars, mean_list_1901)
mean_stars_2022 <- do.call(stars::st_as_stars, mean_list_2022)

do.call() Function calls a function with a list of arguments

# Example usage of do.call
my_list <- list(a = 1, b = 2)
do.call(sum, my_list)
[1] 3

Temperature

We then combine the list of results into a single stars object

#Step12: Combine the list of results into a single stars object
mean_stars_1901 <- do.call(stars::st_as_stars, mean_list_1901)
mean_stars_2022 <- do.call(stars::st_as_stars, mean_list_2022)

This is the result:

ggplot()+
  geom_stars(data=mean_stars_1901)

Temperature

These are the steps that we covered in part 2.

#Step8: Find indices of layers that start with "tmp"
#tmp stands for temperature. There are other rasters within these files.
indices_tmp_1901 <- grep("^tmp", layer_names_1901)
tmp_layers_1901 <- selected_rasters_1901[[indices_tmp_1901]]
indices_tmp_2022 <- grep("^tmp", layer_names_2022)
tmp_layers_2022 <- selected_rasters_2022[[indices_tmp_2022]]

#Step9: Changing raster names to temperature
names(tmp_layers_1901)<-dates_1901
names(tmp_layers_2022)<-dates_2022

#Step10: Turning the spatraster into raster
stars_list_1901 <- lapply(tmp_layers_1901, function(obj) st_as_stars(raster::raster(obj)))
stars_list_2022 <- lapply(tmp_layers_2022, function(obj) st_as_stars(raster::raster(obj)))

#Step11: Use lapply to apply st_apply to each stars object to calculate the mean
mean_list_1901 <- lapply(stars_list_1901, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))
mean_list_2022 <- lapply(stars_list_2022, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))

#Step12: Combine the list of results into a single stars object
mean_stars_1901 <- do.call(stars::st_as_stars, mean_list_1901)
mean_stars_2022 <- do.call(stars::st_as_stars, mean_list_2022)

#Step13: Changing the name of the rasters
names(mean_stars_1901)<-"avg_tmp_1901"
names(mean_stars_2022)<-"avg_tmp_2022"

Temperature

Temperature in 1901

library(viridis)
fig<-ggplot()+
  geom_stars(data=mean_stars_1901)+
  scale_fill_viridis(option="turbo", na.value=NA)+
  theme_bw()+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Temperature

Temperature in 1901

Temperature

Temperature in 2022

fig<-ggplot()+
  geom_stars(data=mean_stars_2022)+
  scale_fill_viridis(option="turbo", na.value=NA)+
  theme_bw()+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())

Temperature

Temperature in 2022

Working with Temperature

Now plotting was to some extent straightforward.

What if we wanted to extract the average temperature for every single year and make a time plot?

Working with Temperature

This is where our programming skills come in handy from earlier lectures

library(terra)

# Step 1: Read the raster
b <- rast("./datal17/temp/cru_ts4.07.1901.2022.tmp.dat.nc")

# Step 2: Extract time information
time_values <- time(b)

# Step 3: Turning strings into time
date_objects <- as.Date(time_values)

# Step 4: Get unique years
unique_years <- unique(format(date_objects, "%Y"))
#All years take too long, so we will only choose two years
unique_years <-c("2021", "2022")

# Step 5: Create an empty list to store mean stars for each year
mean_stars_list <- list()

Working with Temperature

This is where our programming skills come in handy from earlier lectures

# Step 6: Iterate over each unique year
for (year in unique_years) {
  cat("Calculating average for year", year, "\n")
  #Step1: Select dates for the current year
  selected_dates <- unique(date_objects[format(date_objects, "%Y") == year])
  #Step2: Find the index of the selected dates in the time_values
  selected_indices <- which(time_values %in% as.Date(selected_dates))
  #Step3: Extract raster values for the selected dates
  selected_rasters <- b[[selected_indices]]
  #Step4: Extract layer names
  layer_names <- names(selected_rasters)
  #Step5: Find indices of layers that start with "tmp"
  indices_tmp <- grep("^tmp", layer_names)
  #Step6: Extract temperature layers
  tmp_layers <- selected_rasters[[indices_tmp]]
  #Step7: Change raster names to temperature
  names(tmp_layers) <- selected_dates
  #Step8: Convert spatraster into raster
  stars_list <- lapply(tmp_layers, function(raster_obj) st_as_stars(raster::raster(raster_obj)))
  #Step9: Use lapply to apply st_apply to each stars object to calculate the mean
  mean_list <- lapply(stars_list, function(stars_obj) st_apply(stars_obj, c("x", "y"), mean))
  #Step10: Combine the list of results into a single stars object
  mean_stars <- do.call(stars::st_as_stars, mean_list)
  #Step11: Change the name of the raster
  names(mean_stars) <- paste("avg_tmp", year, sep = "_")
  #Step12: Add the mean stars to the list
  mean_stars_list[[year]] <- mean_stars
  cat("Average calculation for year", year, "completed\n\n")
}
Calculating average for year 2021 
Average calculation for year 2021 completed

Calculating average for year 2022 
Average calculation for year 2022 completed

Working with Temperature

This is where our programming skills come in handy from earlier lectures

# Combine all the mean stars into a single stars object
# Initialize an empty data frame
result_df <- data.frame()

# Loop through each year and calculate the mean
for (year in names(mean_stars_list)) {
  #Creating an object where we look through the years
  avg_tmp_col_name <- paste0("avg_tmp_", year)
  #Calculating mean and keeping track of the year
  mean_value <- mean(mean_stars_list[[year]][[avg_tmp_col_name]], na.rm = TRUE)
  #Saving everything into a dataframe
  result_df <- rbind(result_df, data.frame(year = as.numeric(year), average_temperature = mean_value))
}

# Writing the df to a dataframe
write.csv(result_df, "./datal17/output/average_temp_2021_2022.csv", row.names = FALSE)

Working with Temperature

So, we wrote our dataframe to a csv file.

Note that we did this dataframe for two years only: 2021 and 2022.

To complete the loop for all years, put a # in front of
unique_years <-c("2021", "2022")

This means that unique_years is an object made out of
unique(format(date_objects, "%Y"))

In other words, all years from 1901 until 2022. But this takes a long time (i.e. a few hours): multiply how long it took for 2021 and 2022 by 50.

To save you time, I have already done all the years on my machine. You can download the csv file with all the years.

Put it in the relevant folder

Working with Temperature

#write.csv(result_df, "./datal17/output/average_temp_1901_2022.csv", row.names = FALSE)
#Reading the file
average_temp<-read.csv("./datal17/output/average_temp_1901_2022.csv")
#Subsetting to below 2001
average_temp_1901_2000<-subset(average_temp, year<2001)
#Calculating the 1901-2000 average
average_temp_1901_2000<-mean(average_temp_1901_2000$average_temperature)
#Calculation the deviations from the 1901-2000 average
average_temp$dev_1901_2000_avg<-average_temp$average_temperature-average_temp_1901_2000

Working with Temperature

Creating a time trend

#Mapping the Temperature Deviaitonsover time
ggplot(average_temp, aes(x = year, y = dev_1901_2000_avg)) +
  geom_line() +
  geom_point() +
  theme_bw()+
  labs(title = "Time Trend of Average Temperature",
       x = "Year",
       y = "Average Temperature")

Working with Temperature

Creating a time trend

Working with Temperature

Creating a bar plot with the values

# Create a new column for bar color based on the condition
average_temp$bar_color <- ifelse(average_temp$dev_1901_2000_avg > 0, "red", "blue")

# Create the bar plot
ggplot(average_temp, aes(x = year, y = dev_1901_2000_avg, fill = bar_color)) +
  geom_bar(stat = "identity", color = "black") +
  scale_fill_manual(values = c("blue", "red"), guide="none") +
  theme_bw() +
  labs(title = "Time Trend of Average Temperature",
       x = "Year",
       y = "Difference from 1901-2000 avearge (degrees C)")

Working with Temperature

Creating a bar plot with the values

Working with Temperature

This is very close to the official figures produced by NOAA (National Ceneter for Environmental Information)

See: https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/global/time-series/globe/land/1/11/1901-2022.

Conclusion

We have covered a lot of operations related to rasters

We learned about:

  • nc temperature rasters
  • different type of raster format: spatraster, raster, and stars
  • accessing raster values
  • doing raster algebra: calculating average cell temperature
  • examining temperature over time