Cite as: Alfonso Garmendia (2019) R for life sciences. Chapter 2: Operations in R. http://personales.upv.es/algarsal/R-tutorials/02_Tutorial-2_R-operations.html

Also available in other formats (pdf, docx, …): https://drive.google.com/drive/folders/19w914WCg8BVTVBE_zpgShmg2vpjguV1e?usp=sharing

Originals in bitbucket repository: https://bitbucket.org/alfonsogar/tea_daa_tutorials

Written in R Markdown, using RStudio.

# Operations

## Syntax and operators

We have already seen in chapter 1 several operators to access data in an object or to assign data to it.

\$ , @ : Access a component of an object by name (\$ for lists and data frames, @ for the slots of S4 objects).

[ , [[ : Index the components of an object.

? : Help.

<- : Assignment, right to left. Using = for assignment is not advisable. To type " <- " quickly in RStudio, press ALT + " - " .

~ : Used in formulae (type it with Alt-4, or Alt-Ñ on a Spanish keyboard).

### Arithmetic operators

Listed in order of precedence, from highest to lowest. Logical values are coerced to numeric (TRUE becomes 1 and FALSE becomes 0); character strings are not coerced, so "1" + 1 is an error.

^ : Exponentiation.

%/% , %% : Integer quotient and remainder of a division.

\* , / : Multiplication and division.

\+ , - : Addition and subtraction.
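
The precedence rules can be checked directly at the console. The following is a minimal sketch; the variable names are only illustrative and the numbers were chosen so that each pair of expressions gives a different result.

```r
# ^ binds tighter than the unary minus
neg_power <- -2^2            # same as -(2^2), gives -4
pos_power <- (-2)^2          # gives 4

# %% and %/% bind tighter than * and /
tight <- 2 * 5 %% 3          # same as 2 * (5 %% 3), gives 4
loose <- (2 * 5) %% 3        # gives 1

# Integer quotient and remainder rebuild the dividend
quotient  <- 17 %/% 5        # 3
remainder <- 17 %% 5         # 2
rebuilt   <- 5 * quotient + remainder   # 17

# Logical values are coerced to numbers in arithmetic
two <- TRUE + TRUE           # 2
```

When in doubt about precedence, adding parentheses costs nothing and makes the intention explicit.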

### Comparison operators

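The output is a single logical value or a vector of logical values (TRUE or FALSE).

< , > , <= , >= : Less than, greater than, less than or equal to, greater than or equal to.

== , != : Equal to, not equal to.

%in% : Tests whether each element on the left matches any element on the right.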
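
A short sketch of how comparisons behave on a vector may help here. The vector v and the other names are made up for the illustration. The last lines show a common pitfall: == on decimal numbers can fail because of floating-point rounding, and all.equal() is one way to compare with a tolerance.

```r
v <- c(3, 8, 5, 8)

greater <- v > 4             # FALSE TRUE TRUE TRUE
equal8  <- v == 8            # FALSE TRUE FALSE TRUE
inside  <- v %in% c(5, 3)    # TRUE FALSE TRUE FALSE

# Logical vectors can be counted (TRUE counts as 1) or used as an index
how_many <- sum(v > 4)       # 3
selected <- v[v > 4]         # 8 5 8

# Floating-point pitfall: exact equality may fail for decimals
exact <- 0.1 + 0.2 == 0.3                     # FALSE
close <- isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE
```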

### Logical operators

! : Logical NOT

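& , && : Logical AND. & works element by element on vectors; && compares single values and stops evaluating as soon as the result is known.

| , || : Logical OR, with the same distinction between | and || .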
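
The difference between the single and the double forms is easy to miss, so here is a minimal sketch with illustrative vectors a and b. The single forms work element by element; the double forms expect a single value on each side (recent versions of R signal an error for longer vectors) and skip the right-hand side when the left-hand side already decides the result.

```r
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

and_ab <- a & b              # TRUE FALSE FALSE
or_ab  <- a | b              # TRUE TRUE FALSE
not_a  <- !a                 # FALSE FALSE TRUE

# Short-circuit: stop() is never evaluated, so no error is raised
skipped_and <- FALSE && stop("never evaluated")   # FALSE
skipped_or  <- TRUE  || stop("never evaluated")   # TRUE
```

For this reason && and || are the usual choice inside if() conditions, while & and | are used to build logical vectors for indexing.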

Let’s see some examples with operators:

#### Some examples with operators ####
x <- -1:12                    # Create vector
x                             # See vector
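x + 1                         # add 1 to each element
2 * x + 3                     # same as (2*x)+3
2 * (x + 3)
x %% 2                        # remainder of x / 2: the result is periodic
x %% 5                        # remainder of x / 5: the result is periodic
x %/% 5                       # integer quotient of x / 5
x / 5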

### Logical AND ("&&") has higher precedence than OR ("||"):
TRUE || TRUE && FALSE   # is the same as
TRUE || (TRUE && FALSE) # and different from
(TRUE || TRUE) && FALSE

### Special operators have higher precedence than "!" (logical NOT).
### You can use this for %in% :
1:10 %in% c(2, 3, 5, 7)
!1:10 %in% c(2, 3, 5, 7)      # same as !(1:10 %in% c(2, 3, 5, 7))
!(1:10 %in% c(2, 3, 5, 7))
### but the "!( ... )" form is recommended in this case.

We can see a table of operators with their order of precedence, and more examples, using ?Syntax.

### Some arithmetic commands

There are too many commands in R to list them all, but some of them are frequently used for calculations.

These commands return a number:

#### Commands that return a number ####
sum(x)            # sum of the elements of x
prod(x)           # product of the elements of x
max(x)            # maximum of the elements of x
min(x)            # minimum of the elements of x
which.max(x)      # index of the maximum of the elements of x
which.min(x)      # index of the minimum of the elements of x
which(x == 2)     # indices of all the elements of x equal to 2
length(x)         # number of elements in x
mean(x)           # mean of the elements in x
median(x)         # median of the elements in x
var(x)            # variance of the elements in x
sd(x)             # standard deviation of the elements in x

### Sometimes it is useful to round the result, for example:
round(sd(x), 2)   # round(x, n) rounds the elements of x to n decimals

And these operations can modify either a single number or all the numbers in a vector or a matrix:

#### Commands to modify vectors ####
log(x, 2)          # logarithm in base 2 ; log(x, base)
sqrt(x)            # Square root of x. (NaN: Not a Number, for negative values)

###  match(x, y) returns, for each element of x, the position of its
###  first match in y (NA if there is no match)
match(x, 2)

###  na.omit(x) removes the observations with missing data
na.omit(log(x, 2)) # (NA: Not Available)
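
To make the behaviour of match(), which() and na.omit() concrete, here is a small sketch using the same x as above. The other names are illustrative, and suppressWarnings() is only used to silence the warning that log() gives for the negative value.

```r
x <- -1:12

# match() gives positions in the second argument, NA when there is no match
pos <- match(c(2, 20), x)        # 4 NA: 2 is the 4th element of x, 20 is absent

# which() gives the indices of ALL the elements that satisfy the condition
multiples <- which(x %% 5 == 0)  # 2 7 12 (the values 0, 5 and 10)

# log() of -1 is NaN and log() of 0 is -Inf
logs  <- suppressWarnings(log(x, 2))
clean <- na.omit(logs)           # drops the NaN, keeps -Inf
n_clean <- length(clean)         # 13
```

Note that na.omit() removes NaN as well as NA, but not infinite values; is.finite() can be used to filter those out if needed.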

And of course, it is possible to combine different commands. For example, to calculate the standard error (SE) of x, which is:

$\mbox{Standard error} = SE = \frac{\tilde{S}} {\sqrt{n}}$

where $\tilde{S}$ is the standard deviation of x and $n$ is the sample size (number of items in the sample).

#### Standard error formula ####
sd(x) / sqrt( length(x) )     # Standard error of x
#     or even
round( sd(x) / sqrt( length(x) ), 2)   # SE rounded to two decimals
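
If the standard error is needed several times, the same combination can be wrapped in a small function. This is only a sketch: standard_error is a hypothetical name (it is not a base R function), and it assumes a numeric vector without missing values.

```r
# Hypothetical helper: standard error of a numeric vector without NAs
standard_error <- function(v) {
  sd(v) / sqrt(length(v))
}

x <- -1:12
se_x <- standard_error(x)        # same as sd(x) / sqrt(length(x))
se_rounded <- round(se_x, 2)     # 1.12
```

For this x the variance is 17.5 and n is 14, so the SE is sqrt(17.5 / 14) = sqrt(1.25), about 1.12.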

## Conditionals and loops

The most used ones are if() and for(). Other control flow commands are while() and repeat. if() can be used either with or without else. They work in much the same way as control statements in any Algol-like language. The expressions break and next are also important to control the flow of a loop: break leaves the loop, and next skips to the following iteration.
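
Since while(), repeat, break and next do not appear in the examples of this chapter, here is a minimal sketch of each one. All the variable names are illustrative.

```r
# while(): repeats as long as the condition is TRUE
i <- 1
while (i <= 3) {
  print(i)
  i <- i + 1
}

# repeat: loops until break is reached
k <- 0
total <- 0
repeat {
  k <- k + 1
  total <- total + k          # running sum 1, 3, 6, 10, 15
  if (total > 10) {
    break                     # leave the loop
  }
}

# next: skip the rest of the current iteration
odd <- c()
for (j in 1:6) {
  if (j %% 2 == 0) {
    next                      # even numbers are skipped
  }
  odd <- c(odd, j)
}

# if() with else: else goes on the same line as the closing brace
if (total > 10) {
  size <- "big"
} else {
  size <- "small"
}
```

After running it, i is 4, k is 5, total is 15, odd contains 1 3 5 and size is "big".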

Braces are not needed when the body is a single statement, but it is advisable to always use them, because omitting them is a frequent source of errors.
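
A typical error caused by missing braces can be sketched as follows; the counter names are made up for the example. Without braces only the first statement belongs to the loop, whatever the indentation suggests.

```r
count_a <- 0
count_b <- 0

# Without braces: only the first statement is inside the loop
for (i in 1:3)
  count_a <- count_a + 1
  count_b <- count_b + 1      # misleading indentation: runs once, after the loop

count_c <- 0
count_d <- 0

# With braces: both statements are repeated
for (i in 1:3) {
  count_c <- count_c + 1
  count_d <- count_d + 1
}
```

Here count_a ends as 3 but count_b as 1, while count_c and count_d are both 3.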

Examples:

#### Use of the "for" command ####
x <- -1:12                    # Create vector
for (i in 1:5) print(1:i)      # Print numbers
## [1] 1
## [1] 1 2
## [1] 1 2 3
## [1] 1 2 3 4
## [1] 1 2 3 4 5
### Always leave spaces before and after the parentheses of the for command.

#### Use of braces in "for" command ####
for (i in 1:5) { # print list of numbers
  print(1:i)
} # print list of numbers
## [1] 1
## [1] 1 2
## [1] 1 2 3
## [1] 1 2 3 4
## [1] 1 2 3 4 5
####  example of "for" using 2^n, print() and paste() ####
for (i in x) { # print results for 2^n
  y <- 2^i
  z <- paste(i, ": ", y, sep = "")     # Design the output
  print(z)                             # Output
} # print results for 2^n
## [1] "-1: 0.5"
## [1] "0: 1"
## [1] "1: 2"
## [1] "2: 4"
## [1] "3: 8"
## [1] "4: 16"
## [1] "5: 32"
## [1] "6: 64"
## [1] "7: 128"
## [1] "8: 256"
## [1] "9: 512"
## [1] "10: 1024"
## [1] "11: 2048"
## [1] "12: 4096"
#### Same example including several if() to align results ####
### paste0() is the same as paste(), but without separator. See ?paste.
for (i in x) { # for numbers in x
  y <- 2^i                           # Result
  ### Output before the colon
  if (i < 0 || i > 9)                # if lower than 0 or higher than 9
    number <- paste0(i, ":")         # without space
  if (i %in% 0:9)                    # if included in 0 to 9
    number <- paste0(" ", i, ":")    # with one space
  ### Output after the colon
  if (y < 10)                        # if lower than 10
    result <- paste0("   ", y)       # with three spaces
  if (y < 100 && y >= 10)            # if lower than 100 and at least 10
    result <- paste0("  ", y)        # with two spaces
  if (y < 1000 && y >= 100)          # if lower than 1000 and at least 100
    result <- paste0(" ", y)         # with one space
  if (y < 10000 && y >= 1000)        # if lower than 10000 and at least 1000
    result <- y                      # without spaces
  ### Output
  print(paste(number, result))       # Output
} # for numbers in x
## [1] "-1:    0.5"
## [1] " 0:    1"
## [1] " 1:    2"
## [1] " 2:    4"
## [1] " 3:    8"
## [1] " 4:   16"
## [1] " 5:   32"
## [1] " 6:   64"
## [1] " 7:  128"
## [1] " 8:  256"
## [1] " 9:  512"
## [1] "10: 1024"
## [1] "11: 2048"
## [1] "12: 4096"
#### Clean the environment ####
### It is always a good idea to remove all old objects from your environment
rm(i, number, result, x, y, z)
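
As an alternative to the chain of if() statements, the padding can be left to sprintf(), which is vectorised and so needs no loop. This is a different technique from the one used above, shown only as a sketch: the format string and the name aligned are choices made for this example, and the spacing differs slightly from the output above because every result is right-aligned in a field of four characters.

```r
x <- -1:12

# "%2d" pads the integer to width 2, "%4g" pads the result to width 4
aligned <- sprintf("%2d: %4g", x, 2^x)

print(aligned[1])      # "-1:  0.5"
print(aligned[14])     # "12: 4096"

rm(x, aligned)         # clean the environment again
```

See ?sprintf for the meaning of the format codes.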

# Exercises

1. Open the data frame iris from {datasets}. Use the help to learn about this data. In which units are the length and width of sepals and petals measured? How many variables and observations are there in iris?

2. Create a vector with the species names. Remember that the genus starts with a capital letter and the species epithet is written in lower case (e.g. “Iris setosa”).

3. Create a vector with the names of all the quantitative variables.

4. Make a data frame with the combination of the two previous vectors like this:

##            Species     Variable
## 1      Iris setosa Sepal.Length
## 2      Iris setosa  Sepal.Width
## 3      Iris setosa Petal.Length
## 4      Iris setosa  Petal.Width
## 5  Iris versicolor Sepal.Length
## 6  Iris versicolor  Sepal.Width
## 7  Iris versicolor Petal.Length
## 8  Iris versicolor  Petal.Width
## 9   Iris virginica Sepal.Length
## 10  Iris virginica  Sepal.Width
## 11  Iris virginica Petal.Length
## 12  Iris virginica  Petal.Width

5. Using the data frame from exercise 4, make a data frame with the following variables:
• Species.
• Variable.
• Mean, the mean for each variable and species.
• Standard_error, the standard error for each variable and species.
• Median, the median for each variable and species.
• Minimum, the minimum for each variable and species.
• Maximum, the maximum for each variable and species.

Use the commands seen in this and previous chapters to make the code as neat as possible. Remember to comment each step so that you know what you are doing.