1 Introduction
The goal of this practical is to familiarize yourself with R and the RStudio environment.
The objectives of this session will be to:
- Understand the purpose of each pane in RStudio
- Do basic computation with R
- Define variables and assign data to variables
- Manage a workspace in R
- Call functions
- Manage packages
- Be ready to produce graphics!

1.2 Some R background
R is a programming language and free software environment for statistical
computing and graphics supported by the R Foundation for Statistical Computing.
- Created by Ross Ihaka and Robert Gentleman
- initial version released in 1995
- free and open-source implementation of the S programming language
- currently developed by the R Development Core Team.
Reasons to use it:
- It’s open source, which means that we have access to every bit of the underlying computer code to check that our results are correct (which is always a good point in science).
- It’s free, well documented, and runs almost everywhere.
- It has a large (and growing) user base among scientists.
- It has a large library of external packages available for performing diverse tasks:
  - 18082 available packages on https://cran.r-project.org/
  - 2041 available packages on http://www.bioconductor.org
  - >500k available repositories using R on https://github.com/
1.3 How do I use R?
Unlike other statistical software programs like Excel, SPSS, or Minitab that provide point-and-click interfaces, R is an interpreted language.
This means that you have to write instructions for R, so you are going to learn to write code in R.
R is usually used in a terminal in which you can type or paste your R code.
But navigating between your terminal, your code and your plots can be tedious, which is why in 2021 there is a better way to use R!
1.4 RStudio, the R Integrated development environment (IDE)
An IDE application provides comprehensive facilities to computer programmers for software development. RStudio is free and open-source.
To open RStudio, you can install the RStudio application and open the app.
Otherwise you can use the link and the login details provided to you by email. The web version of RStudio is the same as the application, except that you can open it in any recent browser.
1.5 RStudio interface
1.6 The same console as before (in Red box)
1.7 Errors, warnings, and messages
The R console is a textual interface, which means that you will enter code, but it also means that R is going to write information back to you and that you will have to pay attention to what is written.
There are 3 categories of messages that R can send you: Errors prefaced with Error in…, Warnings prefaced with Warning:, and Messages which don’t start with either Error or Warning.
- Errors: you must consider them as a red light. You must figure out what is causing them. Usually you can find useful clues in the error message about how to solve the problem.
- Warnings: warnings are a yellow light. The code is running but you have to pay attention. It’s almost always a good idea to try to fix warnings.
- Messages: these are just friendly messages from R telling you how things are running.
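The three categories can be triggered on purpose to see what they look like in the console. The lines below are only an illustrative sketch: the calls and the message text are arbitrary choices, not part of the practical, and try() is used here only so that the error does not stop the remaining lines when everything is run in one go.

```r
# Error: log() cannot work on text, so R refuses to compute anything.
# try() prints the "Error in ..." text but lets the next lines run.
error_result <- try(log("a"))

# Warning: the code runs and returns NaN, but R flags the suspicious result.
warning_result <- sqrt(-1)

# Message: purely informative, nothing is wrong.
message("Everything is running fine")
```

Typed one by one in the console, each line prints its own kind of feedback; only the first one is a red light.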
2 R as a calculator
Now that we know what we should do and what to expect, we are going to try some basic R instructions. A computer can perform all the operations that a calculator can do, so let’s start with that:
- Add: +
- Divide: /
- Multiply: *
- Subtract: -
- Exponents: ^ or **
- Parentheses: ( and )
Now open RStudio. Type the commands shown in color in a blue box into the terminal. The expected results will always be printed in white in a blue box.
You can copy-paste, but I advise you to practice writing directly in the terminal.
As with every language, you will become more familiar with R by using it.
To validate the line at the end of your command, press Return.
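As a quick warm-up, here is one line per operator from the list above. The numbers 7 and 2 are arbitrary; the comment at the end of each line gives the value R prints.

```r
7 + 2        # add: 9
7 - 2        # subtract: 5
7 * 2        # multiply: 14
7 / 2        # divide: 3.5
7 ^ 2        # exponent: 49
7 ** 2       # exponent, alternative spelling: 49
(7 + 2) * 2  # parentheses group the addition first: 18
```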
2.1 First commands
You should see a > character before a blinking cursor. The > is called a prompt. The prompt is shown when you can enter a new line of R code.
1 + 100
For standard output, R prefixes each line of results with [N], where N is the index of the first element printed on that line.
Here the result fits on one line, which starts with [1]:
[1] 101
Do the same thing, but press ⏎ (return) after typing +.
1 +
The console displays +.
The > can become a + in case of multi-line code.
As there are two sides to the + operator, R knows that you still need to enter the right side of your formula.
It is waiting for the rest of the command. Write just 100 and press ⏎:
100
[1] 101
2.2 R keeps to the mathematical order
The order of operations in R is the natural mathematical order:
3 + 5 * 2
[1] 13
You can use parentheses ( ) to change this order:
(3 + 5) * 2
[1] 16
But too many parentheses can be hard to read:
(3 + (5 * (2 ^ 2))) # hard to read
[1] 23
3 + 5 * (2 ^ 2) # if you forget some rules, this might help
[1] 23
Note: The text following a # is a comment. It will not be interpreted by R. In the future, I advise you to use comments a lot to explain in your own words what the command means.
2.3 Scientific notation
For small or large numbers, R will automatically switch to scientific notation:
2/10000
[1] 2e-04
2e-4 is shorthand for 2 * 10^(-4).
You can use e to write your own scientific notation:
5e3
[1] 5000
2.4 Mathematical functions
R is distributed with a large number of existing functions.
To call a mathematical function, you must write function_name(<number>).
For example, for the natural logarithm:
log(1) # natural logarithm
[1] 0
log10(10) # base-10 logarithm
[1] 1
exp(0.5)
[1] 1.648721
Compute the factorial of 9 (9!):
9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1
[1] 362880
or
factorial(9)
[1] 362880
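The same function_name(<number>) pattern works for the other mathematical functions that ship with R. The few calls below are an arbitrary selection to try out, not a list taken from this practical.

```r
sqrt(16)                     # square root: 4
abs(-3)                      # absolute value: 3
round(3.14159, digits = 2)   # rounding to 2 decimal places: 3.14
sqrt(abs(-16))               # calls can be nested, innermost first: 4
```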
2.5 Comparing things
We have seen some examples showing that R can do all the things that a calculator can do. But when we speak of a programming language, we think of writing computer programs. Programs are collections of instructions that perform specific tasks. If we want our future programs to be able to make automatic choices, we need them to be able to perform comparisons.
Comparisons can be made with R. The result will be a TRUE or FALSE value (which is not a number as before but a boolean type, called logical in R).
Try the following operators to get a TRUE, then change your command to get a FALSE.
You can use the ↑ (up arrow) key to edit the last command and go through your history of commands.
- equality (note two equal signs read as “is equal to”)
1 == 1
[1] TRUE
- inequality (read as “is not equal to”)
1 != 2
[1] TRUE
- less than
1 < 2
[1] TRUE
- less than or equal to
1 <= 1
[1] TRUE
- greater than
1 > 0
[1] TRUE
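The list above has no "greater than or equal to" example, and equality between decimal numbers deserves a word of caution. The short sketch below covers both; the numbers are arbitrary, and all.equal() is the base R function for comparing numbers within a small tolerance.

```r
1 >= 1                             # greater than or equal to: TRUE

# Decimal numbers are stored approximately, so == can surprise you.
0.1 + 0.2 == 0.3                   # FALSE

# all.equal() compares with a small tolerance; isTRUE() turns the
# answer into a plain TRUE or FALSE.
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE
```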
Summary so far
- R is a programming language and free software environment for statistical computing and graphics (free & opensource) with a large library of external packages available for performing diverse tasks.
- RStudio is an IDE application that provides comprehensive facilities to computer programmers for software development.
- R can be used as a calculator
- R can perform comparisons
3 Variables and assignment
In addition to being able to perform a huge number of computations very fast, computers can also store information in memory. This is a mandatory function to load your data and store intermediate states in your analysis.
In R, <- is the assignment operator (read as “the left member takes the value of the right member”).
= also exists but is not recommended! It will be used preferentially in other cases. (We will see them later.)
If you really don’t want to press two consecutive keys for assignment, you can press alt + - to write <-.
RStudio provides lots of such shortcuts (you can display them by pressing alt + shift + k).
We assign a value to x; x is called a variable.
x <- 1/40
We can then ask R to display the value of x.
x
[1] 0.025
3.1 The environment
You now see the value of x in the environment box (in red).
This variable is present in your work environment. You can use it to perform different mathematical applications.
log(x)
[1] -3.688879
You can assign another value to x.
x <- 100
log(x)
[1] 4.60517
x <- x + 1 # x becomes 101 (100 + 1)
y <- x * 2
y
[1] 202
A variable can be assigned a numeric value as well as a character value.
Just put your character (or string) between double quotes " when you assign this value.
z <- "x" # One character
z
[1] "x"
a <- "Hello world" # Multiple characters == String
a
[1] "Hello world"
You cannot mix different types of variables together:
x + z
Error in x + z : non-numeric argument to binary operator
How do you test the type of a variable?
is.character(z)
[1] TRUE
b <- 1/40
b
[1] 0.025
typeof(b)
[1] "double"
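To tie the type tests together, here is a small sketch with two throwaway variables. The names n and s are hypothetical (chosen so as not to overwrite the variables of the practical), and try() is only there so that the deliberate error does not stop the remaining lines.

```r
n <- 1/40            # a number
s <- "Hello world"   # a string

is.numeric(n)     # TRUE
is.character(n)   # FALSE
typeof(s)         # "character"
class(n)          # "numeric"

# Mixing the two types in a calculation is an error.
mix <- try(n + s)

# Text that looks like a number can be converted explicitly.
as.numeric("0.025") + n   # 0.05
```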
You can type is. and press the tabulation key.
RStudio will show you a list of functions whose names start with is.
This is called autocompletion; don’t hesitate to spam your tabulation key as you write R code.
3.2 Variables names
Variable names can contain letters, numbers, underscores and periods.
They cannot start with a number nor contain spaces at all.
Different people use different conventions for long variable names, including:
- periods.between.words
- underscores_between_words
- camelCaseToSeparateWords
What you use is up to you, but be consistent.
Which of the following are valid R variable names?
min_height
max.height
_age
.mass
MaxLength
min-length
2widths
celsius2kelvin
Solution
min_height
max.height
.mass
MaxLength
celsius2kelvin
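If you want R itself to check the answer, one possible approach (an illustration, not the method used in this practical) is the base function make.names(), which rewrites any text into a valid name. A candidate is valid exactly when make.names() leaves it unchanged. The sketch uses c() to group the candidates, which is introduced in section 4.

```r
# The eight candidate names from the exercise, stored as text.
candidates <- c("min_height", "max.height", "_age", ".mass",
                "MaxLength", "min-length", "2widths", "celsius2kelvin")

# make.names() repairs invalid names (for example "_age" becomes "X_age"),
# so a name is valid when it comes back unchanged.
is_valid <- candidates == make.names(candidates)

candidates[is_valid]
# "min_height" "max.height" ".mass" "MaxLength" "celsius2kelvin"
```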
3.3 Functions are also variables
logarithm <- log
Try to use the logarithm variable.
An R function can have several arguments:
function (x, base = exp(1))
- base is a named argument
- unnamed arguments are read from left to right
- named arguments break the reading order
- named arguments make your code more readable
To know more about the log function, we can read its manual.
help(log)
or
?log
This block allows you to view the different outputs (?help, graphs, etc.).
Test that your logarithm function can work in base 10.
Solution
10^logarithm(12, base = 10)
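The rules about named and unnamed arguments can be checked directly on log(), whose arguments are x and base. The value 100 is an arbitrary choice for this sketch.

```r
log(100, base = 10)      # base given by name: 2
log(100, 10)             # unnamed, read left to right (x, then base): 2
log(base = 10, x = 100)  # named arguments can come in any order: 2
log(100)                 # base omitted, so the default exp(1) is used
```

The first form is usually the most readable: the main input stays unnamed and the option is named.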
3.4 A code editor
We are now going to write our first function. We could do it directly in the R console, with multi-line commands, but this process is tedious.
Instead we are going to use the RStudio code editor panel to write our code. You can go to File > New File > R script to open your editor panel.
Writing functions
We can define our own function with:
- a function name,
- the declaration of the function type: function,
- arguments: between ( ),
- { and } to open and close the function body.
Here is an example of a function declaration with two arguments a and b.
function_name <- function(a, b){

}
- a series of operations,
The arguments a and b are accessible from within the function body as the variables a and b.
function_name <- function(a, b){
  result_1 <- operation1(a, b)
  result_2 <- operation2(result_1, b)
}
- a return operation.
At the end of a function, we want to return a result, so that a function call evaluates to this result.
function_name <- function(a, b){
  result_1 <- operation1(a, b)
  result_2 <- operation2(result_1, b)
  return(result_2)
}
Note: if you don’t use return, the value of the last expression evaluated in your function body is returned by default.
Try to write a function that tests whether a number is even.
You can use the %% modulo operator.
Name this function even_test and use the == comparison to test whether the result of the modulo is equal to 0.
Solution
even_test <- function(x){
  modulo_result <- x %% 2
  is_even <- modulo_result == 0
  return(is_even)
}
even_test(4)
[1] TRUE
even_test(3)
[1] FALSE
Solution
even_test2 <- function(x){
  (x %% 2) == 0
}
even_test2(4)
[1] TRUE
even_test2(3)
[1] FALSE
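A function of our own can also have a named argument with a default value, just like base in log(). The function below is a hypothetical generalisation of even_test, written only as a sketch: the name is_divisible and its argument by are not part of the practical.

```r
# Hypothetical helper: is x divisible by `by`?
# `by` has a default value of 2, so is_divisible(x) behaves like even_test(x).
is_divisible <- function(x, by = 2) {
  (x %% by) == 0
}

is_divisible(10)           # default by = 2: TRUE
is_divisible(10, by = 3)   # named argument: FALSE
is_divisible(9, 3)         # unnamed, read left to right: TRUE
```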
RStudio offers you great flexibility in running code from within the editor window. There are buttons, menu choices, and keyboard shortcuts. To run the current line, you can
- click on the Run button above the editor panel, or
- select “Run Lines” from the “Code” menu, or
- hit Ctrl + Return in Windows or Linux or Cmd + Return on OS X.
To run a block of code, select it and then Run.
If you have modified a line of code within a block of code you have just run, there is no need to reselect the section and Run; you can use the next button along, “Rerun the previous region”. This will run the previous code block including the modifications you have made.
3.5 Cleaning up
We can now clean up our environment:
rm(x)
What happened in the Environment panel? Check the documentation of this command.
Solution
?rm
ls()
[1] "a" "b" "bioconductor_packages"
[4] "biocPackages" "cran_packages" "even_test"
[7] "even_test2" "logarithm" "url"
[10] "y" "z"
Combine rm and ls to clean up your Environment.
Solution
rm(list = ls())
ls()
character(0)
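Between removing one variable and removing everything, there is a middle ground: ls() accepts a pattern argument, so only some of the variables can be listed and removed. The variable names below (tmp_a, tmp_b, keep_me) are made up for this sketch.

```r
tmp_a <- 1
tmp_b <- "scratch"
keep_me <- 42

rm(tmp_a)                          # remove a single variable

# ls(pattern = ...) lists only the names matching the pattern,
# here every name starting with "tmp_".
rm(list = ls(pattern = "^tmp_"))

exists("tmp_b")     # FALSE: removed
exists("keep_me")   # TRUE: untouched
```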
Summary so far:
- Assigning a variable is done with <-.
- The assigned variables are listed in the environment box.
- Variable names can contain letters, numbers, underscores and periods.
- Functions are also variables and can be written in several forms.
- An editing box is available in RStudio.
4 Complex variable type
You can only go so far with the variables we have already seen. In R there are also complex variable types, which can be seen as combinations of simple variable types.
4.1 Vector
Vectors are simple ordered collections of values that all have the same type (not to be confused with R’s list type, which can mix types).
c(1, 2, 3, 4, 5)
[1] 1 2 3 4 5
or
c(1:5)
[1] 1 2 3 4 5
A mathematical calculation can be performed on the elements of the vector:
2^c(1:5)
[1] 2 4 8 16 32
x <- c(1:5)
2^x
[1] 2 4 8 16 32
Note: this kind of operation is called vectorisation and is very powerful in R.
To determine the type of the elements of a vector:
typeof(x)
[1] "integer"
typeof(x + 0.5)
[1] "double"
x + 0.5
[1] 1.5 2.5 3.5 4.5 5.5
is.vector(x)
[1] TRUE
Vectors can be extended to named vectors:
y <- c(a = 1, b = 2, c = 3, d = 4, e = 5)
y
a b c d e
1 2 3 4 5
We can compare the elements of two vectors:
x
[1] 1 2 3 4 5
y
a b c d e
1 2 3 4 5
x == y
a b c d e
TRUE TRUE TRUE TRUE TRUE
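Vectorisation applies to every operator seen in section 2, comparisons included. Here is a short sketch with throwaway vectors; the names v and named are hypothetical, chosen so that x and y are left untouched.

```r
v <- c(1:5)

v * 10    # every element is multiplied: 10 20 30 40 50
v + v     # element-by-element addition: 2 4 6 8 10
v > 2     # element-by-element comparison: FALSE FALSE TRUE TRUE TRUE

named <- c(a = 1, b = 2)
names(named)   # the names of a named vector: "a" "b"
```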
Summary so far
- A variable can be of different types: numeric, character, vector, function, etc.
- Calculations and comparisons apply to vectors.
- Do not hesitate to use the help box to understand functions!
We will see other complex variable types during this course.
5 Packages
As we have seen, R has a large library of external packages available for performing diverse tasks.
5.1 Installing packages
install.packages("tidyverse")
or click on Tools and Install Packages...
install.packages("ggplot2")
5.2 Loading packages
sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rvest_1.0.1
loaded via a namespace (and not attached):
[1] knitr_1.33 xml2_1.3.2 magrittr_2.0.1 R6_2.5.1
[5] rlang_0.4.11 fansi_0.5.0 stringr_1.4.0 httr_1.4.2
[9] tools_4.1.1 xfun_0.25 utf8_1.2.2 htmltools_0.5.1.1
[13] ellipsis_0.3.2 yaml_2.2.1 assertthat_0.2.1 digest_0.6.27
[17] tibble_3.1.3 lifecycle_1.0.0 crayon_1.4.1 bookdown_0.23
[21] klippy_0.0.0.9500 vctrs_0.3.8 curl_4.3.2 evaluate_0.14
[25] rmarkdown_2.10 stringi_1.7.3 compiler_4.1.1 pillar_1.6.2
[29] rmdformats_1.0.2 pkgconfig_2.0.3
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
✓ ggplot2 3.3.5 ✓ purrr 0.3.4
✓ tibble 3.1.3 ✓ dplyr 1.0.7
✓ tidyr 1.1.3 ✓ stringr 1.4.0
✓ readr 2.0.1 ✓ forcats 0.5.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
x dplyr::filter() masks stats::filter()
x readr::guess_encoding() masks rvest::guess_encoding()
x dplyr::lag() masks stats::lag()
sessionInfo()
5.3 Unloading packages
unloadNamespace("tidyverse")
sessionInfo()
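The load and unload cycle can be tried without installing anything. The sketch below uses splines, chosen only because it ships with R and is not attached by default; it checks search(), the list of attached packages, instead of reading the whole sessionInfo() output. It detaches with detach(), a lighter alternative to the unloadNamespace() call shown above.

```r
# TRUE if the package is installed (nothing is attached yet).
installed <- requireNamespace("splines", quietly = TRUE)

library(splines)                                    # attach the package
attached_after_library <- "package:splines" %in% search()

detach("package:splines")                           # detach it again
attached_after_detach <- "package:splines" %in% search()

# The pkg::function notation seen in the tidyverse conflict messages
# calls a function from a given package explicitly.
stats::median(c(1, 3, 5))   # 3
```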
See you in Session 2: “Introduction to Tidyverse”