- .qmd files
- mean(), paste(), and sort()

By the end, you’ll be able to write simple R scripts and explore your own data.
We will install two programs
We can do that by going to: https://posit.co/download/rstudio-desktop/
Install the version of RStudio relevant for your OS.
Note that there are different files for Apple silicon (M1/M2) Macs and for Intel Macs.
The RStudio interface looks like this:
Quarto is the next-generation version of R Markdown from Posit (formerly RStudio) that allows us to combine code and text in one document.
Quarto files have the .qmd extension.
You can produce a wide variety of output types:
You can now start typing.
To use Quarto with R, you should install the rmarkdown R package:
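The installation step can be sketched as follows. The `interactive()` guard and the `has_rmarkdown` name are additions for illustration: they keep the install from running again while a document is rendered.

```r
# Check whether rmarkdown is already installed
has_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE)

# Install it once, only when working interactively and only if it is missing
if (interactive() && !has_rmarkdown) {
  install.packages("rmarkdown")
}
```

In practice, typing `install.packages("rmarkdown")` once in the RStudio console is enough.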
Let us now use R to understand how it works.
Let’s create a new Quarto document and save it in your “week2” folder.
[1] 2
[1] 5
After we type 2+3 and press Enter, 2+3 is sent to your computer’s processor.
The returned value 5 is then printed in the console.
Note that this value is not kept in memory (RAM): 5 is simply printed in the console.
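A small sketch of the difference between printing a value and storing it. The name `result` is illustrative and not from the original slides.

```r
2 + 3              # evaluated and printed as [1] 5, then discarded
result <- 2 + 3    # evaluated and stored under the name `result`; nothing is printed
result             # prints the stored value
```

Only the assigned value stays available for later use.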
You can create Quarto Notebooks and presentations easily.
You simply need to change the preamble.
This is, for example, the preamble for the course syllabus:
This is the output:
This is, for example, the preamble for today's presentation:
---
title: "L2: Introduction to R, Quarto, and R environments"
author:
  name: Bogdan G. Popescu
  email: bogdan.popescu@johncabot.edu
  affiliations: John Cabot University
format:
  revealjs:
    slide-number: c/t
    show-slide-number: all
    preview-links: auto
    width: 1050
    height: 700
    fontsize: 24pt
    footer: "Popescu (JCU): Lecture 2"
    sansfont: Latin Modern Roman
    embed-resources: true
---
      
# Slide 1
This is an example
# Slide 2
This is another example
This is the output for slide 1
This is the output for slide 2
This is the output for slide 3
[1] 2
[1] "Hello"
[1] "Hello"
Note that when x is 1 or 3, we say that x is a numeric object.
When x is "Hello" or 'Hello', we say that x is a string object (R calls this type character).
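The following sketch assigns different values to `x` and checks their type. The specific checks, `is.numeric` and `is.character`, are chosen here for illustration.

```r
x <- 3
is.numeric(x)        # TRUE

x <- "Hello"
is.character(x)      # TRUE

"Hello" == 'Hello'   # TRUE: single and double quotes create the same string
```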
| Operator | Meaning |
|---|---|
| `+` | Addition |
| `-` | Subtraction |
| `*` | Multiplication |
| `/` | Division |
| `^` | Exponent |
Here are examples of this:
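A short sketch of the arithmetic operators in use. The values 7 and 2 are arbitrary and not taken from the original slides.

```r
a <- 7
b <- 2

a + b   # 9
a - b   # 5
a * b   # 14
a / b   # 3.5
a ^ b   # 49
```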
Note that for numbers that are either very large or very small, R uses scientific notation.
Example:
Note that 1000000 has 6 zeros.
The number 1/1000000 can therefore also be written as \(1 \times 10^{-6}\) or \(\left(\frac{1}{10}\right)^6\).
Thus, the output produced by R, 1e-06, makes sense.
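A minimal sketch of this behaviour, assuming the example on the slide was 1/1000000. The name `small` is illustrative.

```r
small <- 1 / 1000000
small                               # printed as 1e-06
format(small, scientific = FALSE)   # "0.000001": the same value without scientific notation
```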
Infinity is treated as a special type of number: Inf or -Inf.
Here are examples:
This results in Inf. Strictly speaking, division by zero is undefined.
In the limit, however, as a positive number approaches zero, its reciprocal becomes larger and larger without bound, which is why R returns Inf.
Another example:
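The following sketch shows how R handles infinity. The variable names are illustrative, and the `0 / 0` line is an addition showing the related special value NaN.

```r
pos_inf <- 1 / 0     # Inf
neg_inf <- -1 / 0    # -Inf

1 / 0.001            # 1000: the reciprocal grows as the divisor shrinks
1 / 0.000001         # 1e+06

0 / 0                # NaN (Not a Number), not Inf
```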
Conditions are expressions that use conditional operators and that have TRUE or FALSE as a result.
The conditional operators available in R are listed below:
| Operator | Meaning |
|---|---|
| `==` | Equal |
| `>` | Greater than |
| `>=` | Greater than or equal |
| `<` | Less than |
| `<=` | Less than or equal |
| `!=` | Not equal |
| `&`, `\|` | And, Or |
We can use conditional operators in the following examples:
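A short sketch of conditional operators, using an arbitrary value `x <- 3` that is not from the original slides.

```r
x <- 3

x == 3               # TRUE
x > 5                # FALSE
x != 5               # TRUE
(x > 1) & (x < 5)    # TRUE: both conditions hold
(x > 5) | (x < 5)    # TRUE: at least one condition holds
```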
| Value | Meaning |
|---|---|
| `Inf` | Infinity |
| `NA` | Not Available |
| `NaN` | Not a Number |
| `NULL` | Empty object |
R is an object-oriented language where each object belongs to a class.
The class function accepts an object and returns its class name.
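For instance, a quick sketch of `class` applied to a few simple values:

```r
class(600)       # "numeric"
class(600L)      # "integer" (the L suffix marks a whole number)
class("Hello")   # "character"
class(TRUE)      # "logical"
```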
The vector is the simplest data structure in R.
It is an ordered collection of values of the same type:
- numeric (numbers with a decimal point) or integer (whole numbers)
- character
- logical

A vector of length 1 can be created by typing 600, "Hello" or TRUE.
c function
A vector of length > 1 can be created by using the c function with various inputs.
For example, we can create a vector of length 2:
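A minimal sketch of a length-2 vector. The name `ages` and the values are illustrative.

```r
ages <- c(25, 31)   # combine two numbers into one vector
ages                # [1] 25 31
length(ages)        # 2
```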
- Descriptive names: ages, income_levels, country_codes
- average_income (snake_case)
- averageIncome (camelCase)
- Invalid names: 2nd_vector, #income (a name cannot start with a digit, and # starts a comment)
- Names to avoid because they already refer to R functions: mean, sum, data
- Names are case-sensitive: Income ≠ income

We can also create a vector made out of different elements:
We can also create a vector made out of character elements:
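A sketch of a character vector, with illustrative values. The second part is an addition: because a vector holds values of one type only, mixing types makes R convert everything to a common type.

```r
country_codes <- c("IT", "RO", "US")
class(country_codes)   # "character"

# Mixing types: all elements are converted to character
mixed <- c(1, "a", TRUE)
mixed                  # "1" "a" "TRUE"
class(mixed)           # "character"
```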
Other than the c function, there are three additional operations for creating vectors:
- : operator
- seq function
- rep function

: operator
We can create a vector with consecutive numbers:
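A short sketch of the `:` operator. The variable names are illustrative.

```r
one_to_ten <- 1:10
one_to_ten          # 1 2 3 4 5 6 7 8 9 10

countdown <- 5:1    # the sequence can also run downwards
countdown           # 5 4 3 2 1
```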
seq function
The seq function gives more flexibility by introducing three additional parameters:

- from - where to start
- to - where to end
- by - step size
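A sketch of `seq` with its three parameters. The values and variable names are illustrative.

```r
evens <- seq(from = 2, to = 10, by = 2)
evens    # 2 4 6 8 10

steps <- seq(from = 0, to = 1, by = 0.25)
steps    # 0.00 0.25 0.50 0.75 1.00
```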
The rep function replicates its argument to create a repetitive vector:

- x - the object to replicate
- times - how many times to replicate the vector
- each - how many times to replicate each element
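A sketch contrasting `times` and `each`, with illustrative inputs:

```r
by_times <- rep(x = c(1, 2), times = 3)   # repeat the whole vector
by_times                                  # 1 2 1 2 1 2

by_each <- rep(x = c(1, 2), each = 3)     # repeat each element in turn
by_each                                   # 1 1 1 2 2 2
```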
The is.na function detects missing (NA) values.
It accepts any vector.
It returns a logical vector, with TRUE in place of NA values and FALSE in place of non-NA values.
Example:
This is how we detect NAs
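A minimal sketch of NA detection, using an illustrative vector `income_levels`:

```r
income_levels <- c(1200, NA, 850, NA)

is.na(income_levels)        # FALSE TRUE FALSE TRUE
sum(is.na(income_levels))   # 2: the number of missing values
```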
Many functions, such as sum and mean, have an na.rm parameter to exclude NA values from the calculation.
Example:
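A sketch of the effect of `na.rm`, with illustrative values:

```r
income_levels <- c(1200, NA, 850)

mean(income_levels)                  # NA: one missing value makes the result NA
mean(income_levels, na.rm = TRUE)    # 1025
sum(income_levels, na.rm = TRUE)     # 2050
```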
sort function
The sort function returns the values of a vector in sorted order (the related order function returns the sorting indices instead).
For example:
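A sketch of sorting with illustrative values. The `order` call is included only to show the difference between sorted values and sorting indices.

```r
ages <- c(31, 25, 47)

sort(ages)                      # 25 31 47
sort(ages, decreasing = TRUE)   # 47 31 25
order(ages)                     # 2 1 3: positions that would sort the vector
```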
paste and paste0 function
The paste function is used to “paste” text values.
The sep argument determines the separating character, with the default being sep=" " (a single space).
Example:
Alternatively:
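A sketch of `paste` with the default and with a custom separator. The strings are illustrative.

```r
paste("Hello", "world")              # "Hello world" (default sep = " ")
paste("Hello", "world", sep = "-")   # "Hello-world"
paste("week", 1:3, sep = "_")        # "week_1" "week_2" "week_3"
```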
We can use paste to build the names of files.
paste() concatenates its arguments, inserting the separator given by sep between them.
paste0() concatenates its arguments with no separator.
Examples:
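A sketch of building file names with `paste0`. The file names are hypothetical and not from the original slides.

```r
paste0("week", 2)    # "week2"

file_names <- paste0("data_", 2020:2022, ".csv")
file_names           # "data_2020.csv" "data_2021.csv" "data_2022.csv"

# paste0 behaves like paste with an empty separator
identical(paste0("a", "b"), paste("a", "b", sep = ""))   # TRUE
```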
      height weight      bmi
 [1,]    168     88 31.17914
 [2,]    177     72 22.98190
 [3,]    177     85 27.13141
 [4,]    177     52 16.59804
 [5,]    178     71 22.40879
 [6,]    172     69 23.32342
 [7,]    165     61 22.40588
 [8,]    171     61 20.86112
 [9,]    178     51 16.09645
[10,]    170     75 25.95156
[1] TRUE
[1] 10  3
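A sketch of how a matrix like the one printed above could be built. It assumes the columns were combined with `cbind` and that bmi is weight in kg divided by squared height in metres, which reproduces the printed values; the original code is not shown in the slides.

```r
height <- c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)   # cm
weight <- c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75)             # kg
bmi <- weight / (height / 100)^2                                # assumed formula

m <- cbind(height, weight, bmi)   # one column per vector

is.matrix(m)   # TRUE
dim(m)         # 10 3
```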
The matrix we just created can be turned into a dataframe.
Dataframes are essentially lists of equal-length vectors with names.
# A tibble: 10 × 3
   height weight   bmi
    <dbl>  <dbl> <dbl>
 1    168     88  31.2
 2    177     72  23.0
 3    177     85  27.1
 4    177     52  16.6
 5    178     71  22.4
 6    172     69  23.3
 7    165     61  22.4
 8    171     61  20.9
 9    178     51  16.1
10    170     75  26.0
[1] "height" "weight" "bmi"   
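The output above is printed as a tibble, so the slides probably used a tibble conversion. As a base-R sketch of the same idea, assuming a small matrix with only the first three rows:

```r
# A small matrix with named columns (first three rows only, for brevity)
m <- cbind(height = c(168, 177, 177),
           weight = c(88, 72, 85))

df <- as.data.frame(m)   # base-R conversion from matrix to dataframe

is.data.frame(df)   # TRUE
names(df)           # "height" "weight"
```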
Within a dataframe:
We can also create a dataframe manually in the following way:
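A sketch of manual creation with `data.frame`, using the weight and height values that appear in the outputs of these slides. The name `df` is an assumption.

```r
df <- data.frame(
  weight = c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75),
  height = c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
)
df   # prints 10 rows and 2 columns
```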
We can view our dataframe(s) by clicking on them in the Environment pane.
Some important dataframe properties include:
- nrow - number of rows
- ncol - number of columns
- dim - both the number of rows and columns
- rownames - reveals the index numbers of the dataframe
- colnames - reveals the column names

Here is how they work:
[1] 10
[1] 2
[1] 10  2
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
[1] "weight" "height"
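The calls behind outputs like these can be sketched as follows, here on a shortened three-row dataframe, so the numbers differ from the ten-row output above.

```r
df <- data.frame(weight = c(88, 72, 85), height = c(168, 177, 177))

nrow(df)       # 3
ncol(df)       # 2
dim(df)        # 3 2
rownames(df)   # "1" "2" "3"
colnames(df)   # "weight" "height"
```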
These properties also allow us to make changes to the dataframe.
For example, we can change column names:
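A sketch of renaming one column by assigning to `colnames`. The new name `body_weight` matches the output shown in these slides; the three-row dataframe is a shortened illustration.

```r
df <- data.frame(weight = c(88, 72, 85), height = c(168, 177, 177))

colnames(df)[1] <- "body_weight"   # rename the first column only
colnames(df)                       # "body_weight" "height"
```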
The glimpse function from dplyr gives a compact overview of the dataframe:
Rows: 10
Columns: 2
$ body_weight <dbl> 88, 72, 85, 52, 71, 69, 61, 61, 51, 75
$ height      <dbl> 168, 177, 177, 177, 178, 172, 165, 171, 178, 170
The $ operator is a shortcut for getting a single column, by name, from a data.frame:
Example:
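A sketch of the `$` operator on a shortened, illustrative dataframe:

```r
df <- data.frame(body_weight = c(88, 72, 85), height = c(168, 177, 177))

df$height         # 168 177 177: the column as a plain vector
mean(df$height)   # 174
```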
head and tail allow us to see the beginning and the end of our dataframe.
For example, the following command gives us the first 4 entries:
The following command gives us the last 4 entries:
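Both commands can be sketched as follows, assuming the dataframe is called `df` and holds the ten rows used in these slides.

```r
df <- data.frame(
  body_weight = c(88, 72, 85, 52, 71, 69, 61, 61, 51, 75),
  height = c(168, 177, 177, 177, 178, 172, 165, 171, 178, 170)
)

head(df, 4)   # rows 1 to 4
tail(df, 4)   # rows 7 to 10
```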
- is.na() and na.rm
- mean(), paste(), sort(), rep(), seq()

You now have the foundation to write reproducible scripts and explore real data in R.
Popescu (JCU): Lecture 2