```
#Removing previous datasets in memory
rm(list = ls())
#Loading the relevant libraries
#install.packages("gghighlight")
library(ggplot2)
library(gridExtra)
library(gghighlight)
```

# Lab 5: Statistics

The Normal Distribution

Before we get into coding, let us go back to a short explanation of how frequencies relate to probability mass functions (pmf) and cumulative distribution functions (cdf).
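As a quick illustration of that link, the sketch below turns raw counts into relative frequencies (an empirical pmf) and then into running totals (an empirical cdf). The `rolls` vector is made-up toy data, not part of the lab.

```r
# Hypothetical toy data: 10 rolls of a die (made up for illustration)
rolls <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 6)

# Absolute frequencies: how often each value occurs
freq <- table(rolls)

# Relative frequencies: each count divided by the total (empirical pmf)
pmf <- prop.table(freq)

# Running sum of the relative frequencies (empirical cdf)
cdf <- cumsum(pmf)

print(freq)
print(pmf)
print(cdf)
```

The relative frequencies sum to 1, and the last value of the running sum is therefore also 1.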

There are four functions in R for calculating normal distributions: `rnorm`, `pnorm`, `qnorm`, and `dnorm`.
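A minimal sketch of what each of the four functions returns for a standard normal distribution. The input values (0, 1.96, 0.975) and the variable names are chosen only for illustration.

```r
# dnorm: height of the density curve at x = 0
density_at_0 <- dnorm(0, mean = 0, sd = 1)

# pnorm: cumulative probability P(X <= 1.96)
prob_below <- pnorm(1.96, mean = 0, sd = 1)

# qnorm: the inverse of pnorm; the x value below which 97.5% of the mass lies
quantile_975 <- qnorm(0.975, mean = 0, sd = 1)

# rnorm: three random draws (the seed makes them reproducible)
set.seed(123)
draws <- rnorm(3, mean = 0, sd = 1)

print(density_at_0)
print(prob_below)
print(quantile_975)
print(draws)
```

Note that `pnorm` and `qnorm` undo each other: `qnorm(pnorm(1.96))` gives back 1.96.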

# 1 Generating Random Values from a Standard Normal Distribution (mean: 0, sd: 1)

The `rnorm` function generates a vector of normally distributed random numbers. We can specify the mean \(\mu\) and the standard deviation \(\sigma\). Remember that for a standard normal distribution the mean \(\mu = 0\) and the standard deviation \(\sigma = 1\).
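To make the role of the two arguments concrete, here is a small sketch. It relies on the fact that `rnorm` uses `mean = 0` and `sd = 1` when they are omitted; the values 10 and 2 are arbitrary choices for illustration.

```r
# Draws with the default arguments (mean = 0, sd = 1)
set.seed(123)
z <- rnorm(5)

# The same draws with the arguments written out
set.seed(123)
z_explicit <- rnorm(5, mean = 0, sd = 1)
print(identical(z, z_explicit))

# A non-standard normal: same seed, but mean 10 and sd 2 (arbitrary values)
set.seed(123)
x <- rnorm(5, mean = 10, sd = 2)

# Each draw is the standard normal draw shifted by mu and scaled by sigma
print(all.equal(x, 10 + 2 * z))
```

In other words, changing the mean shifts the values and changing the standard deviation stretches them.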

Let us see how it works by creating three samples with 10, 100, and 1000 observations.

## 1.1 Creating a sample of 10 observations

```
#Set seed allows us to make sure everyone gets the same results
set.seed(123)
#Step1: Creating 10 obs. with mean 0 and sd of 1
generated_data_10obs <- rnorm(10, mean=0, sd=1)
#Step2: Turning the created data into a data frame
generated_data_10obs <- data.frame(generated_data_10obs)
#Step3: Inspecting the first 10 obs.
head(generated_data_10obs, n=10)
```

```
#Step4: Renaming the variable inside the data frame
names(generated_data_10obs)<-"value"
head(generated_data_10obs, n=10)
```
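Before plotting, it can help to check the sample numerically. The sketch below rebuilds the same 10-observation data frame so that it runs on its own (the name `sample_10obs` is introduced here only for illustration) and computes the sample mean and standard deviation. With so few draws, these will generally not equal 0 and 1 exactly.

```r
# Rebuild the 10-observation sample with the same seed as above
set.seed(123)
sample_10obs <- data.frame(value = rnorm(10, mean = 0, sd = 1))

# Sample mean and standard deviation of the 10 draws
sample_mean <- mean(sample_10obs$value)
sample_sd <- sd(sample_10obs$value)

print(sample_mean)
print(sample_sd)
```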

```
#Step5: Plotting the data
figure_1 <- ggplot(data = generated_data_10obs, aes(x=value))+
geom_histogram()+
theme_bw()+
ggtitle("10 Obs.")
figure_1
```
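When called without arguments, `geom_histogram()` falls back to 30 bins and prints a message suggesting you pick a better value, which leaves a 10-observation histogram mostly empty. A possible variant, assuming 5 bins is a reasonable choice for such a small sample (the number and the object names are illustrative, not part of the lab):

```r
library(ggplot2)

# Rebuild the 10-observation sample so the example runs on its own
set.seed(123)
sample_10obs <- data.frame(value = rnorm(10, mean = 0, sd = 1))

# Same histogram as figure_1, but with an explicit (illustrative) bin count
figure_1_5bins <- ggplot(data = sample_10obs, aes(x = value)) +
  geom_histogram(bins = 5) +
  theme_bw() +
  ggtitle("10 Obs. (5 bins)")
figure_1_5bins
```

Setting `binwidth` instead of `bins` is an alternative when you prefer to fix the width of each bar rather than their number.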