Quarto Notebooks, Operations and Objects
By the end, you’ll be able to write simple R scripts and explore your own data.
We will install two programs: R and RStudio.
We can do that by going to: https://posit.co/download/rstudio-desktop/
Install the version of RStudio relevant for your OS.
Note that there are different files for Apple silicon (M1/M2) Macs and for Intel Macs.
The RStudio interface looks like this:
Let us now use R to understand how it works.
Let’s create a new Quarto document and work with it.
Quarto is the successor to R Markdown, developed by Posit (formerly RStudio), and it allows us to run code and write text in the same document.
Quarto files have the .qmd extension.
You can now start typing.
To use Quarto with R, you should install the rmarkdown R package by running install.packages("rmarkdown") in the console.
Let us save the Quarto file in a work folder called “example”.
Press Cmd + A (or Ctrl + A) and then press Delete.
Then type:
[1] 2
[1] 5
After we type 2+3 and press Enter, 2+3 is sent to your computer’s processor.
The returned value 5 is then printed in the console.
Note that the result is not stored in memory (RAM): 5 is simply printed in the console.
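The typed input behind the two outputs above does not appear in the text. A minimal sketch, assuming the expressions were 2 and 2 + 3:

```r
# Typing a number returns that number
2        # prints [1] 2

# Typing an arithmetic expression returns its result
2 + 3    # prints [1] 5
```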
This is what the output looks like in your working directory.
[1] 2
[1] "Hello"
[1] "Hello"
Note that when x is 1 or 3, we say that x is a numeric object.
When x is "Hello" or 'Hello', we say that x is a character (string) object.
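A short sketch of the distinction, assuming values are stored in x with the assignment operator <- (the specific values are illustrative):

```r
x <- 1                 # x now holds a number
x_is_number <- is.numeric(x)     # TRUE

x <- "Hello"           # x is overwritten with text
x_is_text <- is.character(x)     # TRUE

# Single and double quotes create the same string
same_string <- identical("Hello", 'Hello')   # TRUE
```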
| Operator | Meaning |
|---|---|
| + | Addition |
| - | Subtraction |
| * | Multiplication |
| / | Division |
| ^ | Exponent |
Here are examples of this:
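The original examples are not preserved in the text; the following sketch uses the illustrative numbers 7 and 2 with each arithmetic operator:

```r
add_result <- 7 + 2    # 9
sub_result <- 7 - 2    # 5
mul_result <- 7 * 2    # 14
div_result <- 7 / 2    # 3.5
pow_result <- 7 ^ 2    # 49
```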
Note that for numbers that are either very large or very small, R uses scientific notation.
Note that 1000000 has 6 zeros.
A number such as 1/1000000 can therefore also be written as \(1 \times 10^{-6}\) or \(\left(\frac{1}{10}\right)^6\).
Thus, the output produced by R, 1e-06, makes sense.
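A small sketch of this behaviour, assuming the input was 1/1000000 (the variable names are illustrative):

```r
small <- 1 / 1000000
print(small)                        # [1] 1e-06

large <- 1000000
print(large)                        # [1] 1e+06

# format() can switch scientific notation off for display
plain <- format(large, scientific = FALSE)   # "1000000"
```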
Infinity is treated as a special type of number: Inf or -Inf.
Dividing a positive number by zero, as in 1/0, results in Inf, even though division by zero is mathematically undefined.
The reason is that, as a positive number approaches zero, its reciprocal becomes larger and larger without bound, so R returns the limit, Inf.
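A brief sketch of how R handles these cases (variable names are illustrative):

```r
pos_inf <- 1 / 0       # Inf
neg_inf <- -1 / 0      # -Inf
undefined <- 0 / 0     # NaN: there is no meaningful limit here

is.infinite(pos_inf)   # TRUE
is.nan(undefined)      # TRUE
```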
Conditions are expressions that use conditional operators and evaluate to TRUE or FALSE.
The conditional operators available in R are listed below:
| Operator | Meaning |
|---|---|
| == | Equal |
| > | Greater than |
| >= | Greater than or equal |
| < | Less than |
| <= | Less than or equal |
| != | Not equal |
| &, \| | And, Or |
We can use conditional operators in the following examples:
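The original examples are missing from the text; this sketch applies each operator to an illustrative value x = 5:

```r
x <- 5

is_equal     <- x == 5            # TRUE
is_not_equal <- x != 5            # FALSE
is_greater   <- x > 3             # TRUE
is_at_most_4 <- x <= 4            # FALSE

# & requires both conditions to hold, | requires at least one
both_hold   <- x > 3 & x < 10     # TRUE
either_hold <- x < 3 | x > 10     # FALSE
```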
| Value | Meaning |
|---|---|
| Inf | Infinity |
| NA | Not Available |
| NaN | Not a Number |
| NULL | Empty object |
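Each special value has a matching test function in base R. A minimal sketch:

```r
na_check   <- is.na(NA)         # TRUE
na_spreads <- NA + 1            # NA: a missing value stays missing
nan_check  <- is.nan(0 / 0)     # TRUE
null_check <- is.null(NULL)     # TRUE
null_len   <- length(NULL)      # 0: NULL contains nothing
```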
R is an object-oriented language where each object belongs to a class.
The class function accepts an object and returns its class name.
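For instance, with a few illustrative values:

```r
class_number  <- class(600)       # "numeric"
class_integer <- class(600L)      # "integer" (the L suffix marks an integer)
class_text    <- class("Hello")   # "character"
class_logical <- class(TRUE)      # "logical"
```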
The vector is the simplest data structure in R.
It is an ordered collection of values of the same type:
- numeric (numbers with a decimal point) or integer (whole numbers)
- character
- logical

A vector of length 1 can be created by typing 600, "Hello" or TRUE.
c function
A vector of length greater than 1 can be created by using the c function with several inputs.
For example, we can create a vector of length 2:
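A minimal sketch with illustrative values, which also shows that a single value is already a vector of length 1:

```r
single <- 600            # a vector of length 1
pair   <- c(600, 750)    # c() combines values into a longer vector

length(single)           # 1
length(pair)             # 2
is.vector(single)        # TRUE
```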
- Descriptive names: ages, income_levels, country_codes
- average_income (snake_case)
- averageIncome (camelCase)
- Invalid names: 2nd_vector, #income
- Names to avoid because R already uses them: mean, sum, data
- Names are case-sensitive: Income ≠ income

We can also create a vector made out of different elements
We can also create a vector made out of character elements:
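A sketch with illustrative country names. Because all values in a vector share one type, mixing numbers and text converts everything to character:

```r
countries <- c("Italy", "France", "Spain")
class(countries)                 # "character"
length(countries)                # 3

# Mixed inputs are coerced to a single common type
mixed <- c(1, "a", TRUE)
mixed                            # "1" "a" "TRUE"
```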
This is how we remove an element from the vector:
Notice the difference in length.
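One common approach, shown here as a sketch with illustrative data, is negative indexing, which drops the element at the given position:

```r
countries <- c("Italy", "France", "Spain")

# A negative index removes the element at that position
fewer <- countries[-2]           # "Italy" "Spain"

length(countries)                # 3
length(fewer)                    # 2
```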
This is how we can do logical subsetting
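A minimal sketch with an illustrative ages vector: a condition produces a TRUE/FALSE vector, and indexing with it keeps only the TRUE positions.

```r
ages <- c(25, 41, 33, 19)

over_30 <- ages > 30             # FALSE TRUE TRUE FALSE
selected <- ages[over_30]        # 41 33

# The condition can also be written directly inside the brackets
young <- ages[ages <= 25]        # 25 19
```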
This is how we can do logical subsetting with strings
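The same idea applies to character vectors. A sketch with illustrative names:

```r
first_names <- c("Ana", "Ben", "Ana", "Cleo")

only_ana <- first_names[first_names == "Ana"]   # "Ana" "Ana"
not_ana  <- first_names[first_names != "Ana"]   # "Ben" "Cleo"
```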
This is how we can check which string is “greater” alphabetically and how to count the number of characters in a string
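A sketch with illustrative words. Comparison operators order strings alphabetically (the exact ordering can depend on the locale), and nchar counts characters:

```r
comes_first <- "apple" < "banana"        # TRUE: "apple" sorts before "banana"

n_chars <- nchar("banana")               # 6
n_each  <- nchar(c("Italy", "Spain"))    # 5 5
```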
This is how we can handle missing data
Many functions, such as sum, mean and median, have an na.rm parameter to exclude NA values from the calculation.
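A minimal sketch with an illustrative income vector containing one missing value:

```r
income <- c(1200, NA, 1800)

missing_flags <- is.na(income)           # FALSE TRUE FALSE
n_missing <- sum(missing_flags)          # 1

mean_raw   <- mean(income)               # NA: the missing value propagates
mean_clean <- mean(income, na.rm = TRUE) # 1500

# Logical subsetting can also drop the missing values
complete <- income[!is.na(income)]       # 1200 1800
```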
Here is how we can perform basic mathematical operations with vectors
# Weights in kilograms
weight <- c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75)
# Heights in centimetres
height <- c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
# BMI = weight (kg) / height (m)^2, computed element by element
bmi <- weight / ((height / 100)^2)
print(bmi)
 [1] 31.17914 22.98190 27.13141 16.59804 22.40879 23.32342 22.40588 20.86112
 [9] 16.09645 25.95156
Other than the c function there are three additional operations for creating vectors:
- : operator
- seq function
- rep function

: operator
We can create a vector with consecutive numbers:
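For example (the ranges are illustrative):

```r
up   <- 1:5        # 1 2 3 4 5
down <- 5:1        # 5 4 3 2 1, counting down also works

length(1:10)       # 10
```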
seq function
The seq function gives more flexibility by introducing three additional parameters.
- from - where to start
- to - where to end
- by - step size
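A sketch of the three parameters with illustrative values:

```r
# Start at 1, stop at 10, move in steps of 3
steps <- seq(from = 1, to = 10, by = 3)      # 1 4 7 10

# The step does not have to be a whole number
quarters <- seq(from = 0, to = 1, by = 0.25) # 0.00 0.25 0.50 0.75 1.00
```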
The rep function replicates its argument to create a repetitive vector:
- x - the object to replicate
- times - how many times to replicate the vector
- each - how many times to replicate each element
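The difference between times and each is easiest to see side by side. A sketch with illustrative input:

```r
letters_ab <- c("a", "b")

whole_repeated <- rep(x = letters_ab, times = 2)   # "a" "b" "a" "b"
each_repeated  <- rep(x = letters_ab, each = 2)    # "a" "a" "b" "b"
```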
sort function
The sort function returns the values of a vector in sorted order; the related order function returns the corresponding indices.
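A sketch contrasting sort with the base R function order, using an illustrative vector:

```r
x <- c(3, 1, 2)

ascending  <- sort(x)                      # 1 2 3
descending <- sort(x, decreasing = TRUE)   # 3 2 1

# order() gives the positions that would sort x
positions <- order(x)                      # 2 3 1
x[positions]                               # 1 2 3, same as sort(x)
```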
paste and paste0 functions
The paste function is used to “paste” text values together.
The sep argument determines the separating character, with the default being sep=" " (a single space).
We can use paste to build file names.
paste() concatenates its inputs using the separator given by sep.
paste0() concatenates its inputs with no separator.
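A short sketch of both functions; the file names are hypothetical and only illustrate the pattern:

```r
# Default separator is a single space
label <- paste("Lab", 1)                             # "Lab 1"

# A custom separator
tagged <- paste("data", 2020:2022, sep = "_")        # "data_2020" "data_2021" "data_2022"

# paste0 uses no separator, which is handy for file names
files <- paste0("data_", 2020:2022, ".csv")          # "data_2020.csv" ...
```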
Defining vectors:
Visualizing the matrix:
      height weight      bmi
 [1,]    168     88 31.17914
 [2,]    177     72 22.98190
 [3,]    177     85 27.13141
 [4,]    177     52 16.59804
 [5,]    178     71 22.40879
 [6,]    172     69 23.32342
 [7,]    165     61 22.40588
 [8,]    171     61 20.86112
 [9,]    178     51 16.09645
[10,]    170     75 25.95156
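The code that produced this matrix does not appear in the text. One way to obtain it, assuming the three vectors were bound column-wise with cbind, is:

```r
weight <- c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75)
height <- c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
bmi <- weight / ((height / 100)^2)

# cbind() places each vector in its own column, named after the vector
m <- cbind(height, weight, bmi)

dim(m)         # 10 rows, 3 columns
colnames(m)    # "height" "weight" "bmi"
print(m)
```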
The matrix we just created can be turned into a dataframe.
Dataframes are essentially named lists of vectors of equal length.
After running install.packages("tibble"), you should see something like:
trying URL 'https://cran.rstudio.com/bin/macosx/big-sur-arm64/contrib/4.5/tibble_3.3.0.tgz'
Content type 'application/x-gzip' length 692985 bytes (676 KB)
==================================================
downloaded 676 KB
The downloaded binary packages are in
    /var/folders/vl/wq4b0z_s3mj_rvz59myqgmx40000gn/T//RtmpdWk47v/downloaded_packages
Once you are done, comment out the install command.
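The conversion code is not preserved in the text. The slides install tibble, presumably for this step; the sketch below instead uses base R's as.data.frame so that it runs without extra packages:

```r
weight <- c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75)
height <- c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
bmi <- weight / ((height / 100)^2)
m <- cbind(height, weight, bmi)

# Convert the matrix to a dataframe; matrix columns become named columns
df_from_matrix <- as.data.frame(m)

class(df_from_matrix)      # "data.frame"
is.list(df_from_matrix)    # TRUE: a dataframe is a list of columns
names(df_from_matrix)      # "height" "weight" "bmi"
```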
Within a dataframe:
We can also create a dataframe manually in the following way:
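A minimal sketch using the base R data.frame function, assuming the weight and height values used earlier as the two columns:

```r
# Each argument becomes a column; all columns must have the same length
people <- data.frame(
  weight = c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75),
  height = c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
)

str(people)    # compact summary of the structure
```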
Some important dataframe properties include:
- nrow - number of rows
- ncol - number of columns
- dim - both the number of rows and columns
- rownames - reveals the index numbers of the dataframe
- colnames - reveals the column names
Here is how they would work
[1] 10
[1] 2
[1] 10  2
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
[1] "weight" "height"
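The calls behind these outputs are not shown. A sketch that reproduces them, assuming a dataframe (here called people, a hypothetical name) with weight and height columns:

```r
people <- data.frame(
  weight = c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75),
  height = c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
)

nrow(people)       # 10
ncol(people)       # 2
dim(people)        # 10 2
rownames(people)   # "1" "2" ... "10"
colnames(people)   # "weight" "height"
```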
These properties allow us to also make changes to the dataframe.
For example, we can change column names:
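A sketch of renaming by assigning to colnames. The data are the first five values of the earlier vectors, and the new names weight_kg and height_cm are hypothetical:

```r
people <- data.frame(
  weight = c(88, 72, 85, 52, 71),
  height = c(168, 177, 177, 177, 178)
)

# Replace all column names at once
colnames(people) <- c("weight_kg", "height_cm")

# Or change a single name by position
colnames(people)[2] <- "height_in_cm"

colnames(people)   # "weight_kg" "height_in_cm"
```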
The glimpse function from dplyr gives a compact overview of the dataframe: its dimensions, column names, column types and first values.
The $ operator is a shortcut for getting a single column, by name, from a data.frame:
head and tail allow us to see the beginning and the end of our dataframe.
For example, the following command gives us the first 4 rows:
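The command itself is not preserved in the text. A sketch, assuming head is called with 4 as its second argument on a dataframe with weight and height columns, together with the $ operator and tail:

```r
people <- data.frame(
  weight = c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75),
  height = c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
)

first_four <- head(people, 4)    # first 4 rows
last_two   <- tail(people, 2)    # last 2 rows

# $ extracts one column as a plain vector
weights <- people$weight
mean(weights)                    # 68.5
```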
- is.na() and na.rm
- mean(), paste(), sort(), rep(), seq()

You now have the foundation to write reproducible scripts and explore real data in R.
Popescu (JCU): Lab 1