Operations

Syntax and operators

In chapter 1 we already saw several operators for addressing components of an object or assigning data to it.

Addressing operators

$ , @ : Access a component of an object by name ($ for lists and data frames, @ for slots of S4 objects).

[ , [[ : Index components of an object.

? : Help.

<- : Assignment, right to left. Using = for assignment is not advisable. Remember that, to type “<-” easily in RStudio, you can use Alt + “-”.

~ : Tilde, used in formulae (type it with Alt-4 or Alt-Ñ on a Spanish keyboard).
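
As a brief illustration of the addressing and assignment operators above, here is a minimal sketch. The list `plant`, its component names and the variable names in the formula are invented for this example; `@` is not shown because it needs an S4 object.

```r
# Hypothetical list used only for illustration
plant <- list(genus = "Iris", petals = c(1.4, 4.7, 6.0))

plant$genus          # component by name
plant[["petals"]]    # [[ ]] returns the component itself (a numeric vector)
plant["petals"]      # single [ ] returns a list containing the component
plant$petals[2]      # second element of the petals vector

# ~ builds a formula object (height and age are invented names)
f <- height ~ age
class(f)             # "formula"
```

The difference between `[` and `[[` is the one that most often surprises: `[` keeps the container, `[[` extracts what is inside it.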

Arithmetic operators

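Listed roughly by order of precedence; the exception is %/% and %%, which bind more tightly than * and / (see ?Syntax). If the objects are not numeric, R coerces them to numeric when possible: logical values become 1 (TRUE) and 0 (FALSE), whereas character strings produce an error.

^ : Exponentiation.

* , / : Multiplication and division.

+ , - : Addition and subtraction.

%/% , %% : Integer quotient and remainder of a division.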
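
A few one-line checks may help to fix the precedence rules in mind. This is only a sketch with arbitrary numbers, not an exhaustive table (?Syntax has the full list).

```r
-2^2            # -4 : ^ binds more tightly than the unary minus, so -(2^2)
(-2)^2          #  4 : parentheses change the result
2 * 5 %% 3      #  4 : %% binds more tightly than *, so 2 * (5 %% 3)
(2 * 5) %% 3    #  1 : remainder of 10 divided by 3
7 %/% 2         #  3 : integer quotient
7 %% 2          #  1 : remainder
TRUE + TRUE     #  2 : logicals are coerced to numeric
```

When in doubt, adding parentheses costs nothing and makes the intention explicit.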

Comparison operators

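The output is a single logical value or a vector of logical values (TRUE or FALSE).

< , > , <= , >= : Less than, greater than, less than or equal to, greater than or equal to.

== , != : Equal to and not equal to.

%in% : Tests whether each element on the left matches any element on the right.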
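
The comparison operators work element by element, which makes them convenient for counting and filtering. A minimal sketch, with an invented vector `sizes`:

```r
sizes <- c(3, 7, 10)          # hypothetical values

bigger  <- sizes > 5          # FALSE  TRUE  TRUE
is_seven <- sizes == 7        # FALSE  TRUE FALSE
found   <- sizes %in% c(10, 3)  # TRUE FALSE  TRUE

sum(bigger)                   # 2 : TRUE counts as 1, so this counts matches
sizes[bigger]                 # 7 10 : a logical vector can be used as an index
```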

Logical operators

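! : Logical NOT

& , && : Logical AND (& works element by element on vectors; && is meant for single values)

| , || : Logical OR (| works element by element on vectors; || is meant for single values)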

Let’s see some examples with operators:

#### Some examples with operators ------------------------------------
x <- -1:12                    # Create vector
x                             # See vector
x + 1                         # add 1 to each element
2 * x + 3                     # same as (2 * x) + 3
2 * (x + 3)
x %% 2                        #-- is periodic
x %% 5                        #-- is periodic
x %/% 5
x / 5

### Logical AND ("&&") has higher precedence than OR ("||"): 
TRUE || TRUE && FALSE   # is the same as
TRUE || (TRUE && FALSE) # and different from
(TRUE || TRUE) && FALSE

### Special operators have higher precedence than "!" (logical NOT).
### You can use this for %in% :
1:10 %in% c(2, 3, 5, 7)
!1:10 %in% c(2, 3, 5, 7)      # same as !(1:10 %in% c(2, 3, 5, 7))
!(1:10 %in% c(2, 3, 5, 7))
### but the "!( ... )" form is advised in this case. 
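
The single and double forms of the logical operators behave differently, which the following sketch tries to show. The vectors `a` and `b` and the value `n` are invented for the example.

```r
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

both   <- a & b     # TRUE FALSE FALSE : element by element
either <- a | b     # TRUE  TRUE FALSE
not_a  <- !a        # FALSE FALSE TRUE

# && and || are meant for single values, typically inside if()
n <- 4
in_range <- n > 0 && n < 10    # TRUE

# They also stop as soon as the answer is known (short-circuit),
# so the right-hand side below is never evaluated
skipped <- FALSE && stop("this is never run")   # FALSE
```

As a rule of thumb, `&` and `|` suit filtering vectors, while `&&` and `||` suit conditions in `if()` and `while()`.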

A table of operators with their order of precedence, and more examples, can be seen using ?Syntax.

Some arithmetic commands

There are too many commands in R to list them all, but some of them are frequently used for calculations.

These commands return a number:

#### Commands that return a number --------------------------------------
sum(x)            # sum of the elements of x
prod(x)           # product of the elements of x
max(x)            # maximum of the elements of x
min(x)            # minimum of the elements of x
which.max(x)      # index of the maximum of the elements of x
which.min(x)      # index of the minimum of the elements of x
which(x == 2)     # index of the elements that fit
length(x)         # number of elements in x
mean(x)           # mean of the elements in x
median(x)         # median of the elements in x
var(x)            # variance of the elements in x
sd(x)             # standard deviation of the elements in x

### Sometimes it is useful to round the result, for example:
round(sd(x), 2)   # round(x, n) rounds the elements of x to n decimals

And these operations transform either a single number or all the numbers in a vector or a matrix:

#### Commands to modify vectors -----------------------------------
log(x, 2)          # logarithm in base 2 ; log(x, base)
sqrt(x)            # Square root of x. (NaN: Not a Number)

###  match(x, y) returns, for each element of x, the position of its first
###  match in y (NA if there is no match)
match(x, 2)      

###  na.omit(x) removes the observations with missing data (NA or NaN)
na.omit(log(x, 2)) # (NA: Not Available)
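
Because match() and %in% are easy to confuse, a small side-by-side sketch may help, followed by two ways of dealing with NaN. The vectors `values`, `lookup` and `roots` are invented for the example.

```r
values <- c(2, 5, 9)     # hypothetical values to look up
lookup <- c(5, 2, 2)     # hypothetical reference vector

positions <- match(values, lookup)   # 2 1 NA : position of the first match
present   <- values %in% lookup      # TRUE TRUE FALSE : is there a match?
kept      <- values[present]         # 2 5 : the elements of values found in lookup

# sqrt() of a negative number gives NaN (with a warning, silenced here)
roots <- suppressWarnings(sqrt(c(4, -1, 9)))   # 2 NaN 3
is.na(roots)                                   # FALSE TRUE FALSE
clean <- as.numeric(na.omit(roots))            # 2 3
mean(roots, na.rm = TRUE)                      # 2.5 : many commands accept na.rm
```

To obtain the matching elements themselves rather than their positions, indexing with %in% is usually the more direct route.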

And of course, it is possible to combine different commands. For example, to calculate the standard error (SE) of x, which is:

\[\mbox{Standard error} = SE = \frac{\tilde{S}} {\sqrt{n}}\]

where \(\tilde{S}\) is the standard deviation of x and n is the sample size (number of items in the sample).

#### Standard error formula ------------------------------------
sd(x) / sqrt( length(x) )     # Standard error of x
#     or even
round( sd(x) / sqrt( length(x) ), 2)   # SE rounded to two decimals
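
The same calculation can be split into named steps, which makes it easier to check each part of the formula. The sketch below also shows one possible way to handle missing values; the object names (`n`, `s`, `se`, `x_na`, `n_obs`, `se_na`) are chosen only for this illustration.

```r
x <- -1:12
n  <- length(x)          # sample size: 14
s  <- sd(x)              # standard deviation of x
se <- s / sqrt(n)        # standard error
round(se, 3)             # 1.118

# With a missing value, sd() returns NA unless told to ignore it,
# and length() would count the NA as an observation
x_na  <- c(x, NA)
sd(x_na)                                      # NA
n_obs <- sum(!is.na(x_na))                    # 14 non-missing observations
se_na <- sd(x_na, na.rm = TRUE) / sqrt(n_obs) # same value as se
```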

Conditionals and looping commands

The most used ones are if() and for(). Other control flow commands are while() and repeat. if() can be used either with or without else. They work in much the same way as control statements in any Algol-like language. The expressions break and next are also important for controlling the flow.

Braces are not necessary when the body is a single statement on the same line, but it is advisable to always use them, because omitting them is a frequent source of errors.
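
A minimal sketch of the kind of error that missing braces can cause, followed by an if() with else written with braces. The names `count_no_braces`, `count_braces`, `temperature` and `status` are invented for the example.

```r
# Without braces, only the first statement belongs to the loop
count_no_braces <- 0
for (i in 1:3)
    count_no_braces <- count_no_braces + 1
    count_no_braces <- count_no_braces + 10   # runs once, AFTER the loop
count_no_braces                               # 13

# With braces, both statements are repeated
count_braces <- 0
for (i in 1:3) {
    count_braces <- count_braces + 1
    count_braces <- count_braces + 10
}
count_braces                                  # 33

# if() with else; keeping "} else {" on one line avoids syntax errors
temperature <- 12                             # hypothetical value
if (temperature > 30) {
    status <- "hot"
} else if (temperature > 15) {
    status <- "mild"
} else {
    status <- "cold"
}
status                                        # "cold"
```

Indentation alone does not group statements in R; only the braces do.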

Examples:

#### Use of the "for" command -----------------------------------
x <- -1:12                    # Create vector
for (i in 1:5) print(1:i)      # Print numbers
## [1] 1
## [1] 1 2
## [1] 1 2 3
## [1] 1 2 3 4
## [1] 1 2 3 4 5
### Always leave spaces before and after the parentheses of the for command.

#### Use of braces in "for" command 
for (i in 1:5) { # print list of numbers
    print(1:i) 
} # print list of numbers
## [1] 1
## [1] 1 2
## [1] 1 2 3
## [1] 1 2 3 4
## [1] 1 2 3 4 5
####  example of "for" using 2^n, print() and paste() 
for (i in x) { # print results for 2^n
   y <- 2^i
   z <- paste(i, ": ", y, sep = "")     # Design the output
   print(z)                             # Output
} # print results for 2^n
## [1] "-1: 0.5"
## [1] "0: 1"
## [1] "1: 2"
## [1] "2: 4"
## [1] "3: 8"
## [1] "4: 16"
## [1] "5: 32"
## [1] "6: 64"
## [1] "7: 128"
## [1] "8: 256"
## [1] "9: 512"
## [1] "10: 1024"
## [1] "11: 2048"
## [1] "12: 4096"
#### Same example including several if() to align results -------------------
### paste0() is the same as paste(), but without a separator. See ?paste.
for (i in x) { # for numbers in x
   y <- 2^i                           # Result
   ### Output before the colon
   if (i < 0 || i > 9)                # if lower than 0 or higher than 9
       number <- paste0(i, ":")       # without space
   if (i %in% 0:9)                    # if included in 0 to 9
       number <- paste0(" ", i, ":")  # with one space
   ### Output after the colon
   if (y < 10)                    # if lower than 10
       result <- paste0("   ", y) # with three spaces
   if (y < 100 && y >= 10)        # if lower than 100 and at least 10
       result <- paste0("  ", y)  # with two spaces
   if (y < 1000 && y >= 100)      # if lower than 1000 and at least 100
       result <- paste0(" ", y)   # with one space
   if (y < 10000 && y >= 1000)    # if lower than 10000 and at least 1000
       result <- y                # without spaces
   ### Output
   print(paste(number, result))   # Output
} # for numbers in x
## [1] "-1:    0.5"
## [1] " 0:    1"
## [1] " 1:    2"
## [1] " 2:    4"
## [1] " 3:    8"
## [1] " 4:   16"
## [1] " 5:   32"
## [1] " 6:   64"
## [1] " 7:  128"
## [1] " 8:  256"
## [1] " 9:  512"
## [1] "10: 1024"
## [1] "11: 2048"
## [1] "12: 4096"
#### Clean the environment -----------------------------
### It is always a good idea to remove all old objects from your environment
rm(i, number, result, x, y, z)
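
The examples above use only for() and if(). As a complement, here is a minimal sketch of while(), repeat, break and next, which were mentioned at the start of this section. All object names and starting values are invented for the illustration.

```r
# while(): repeat as long as the condition is TRUE
n <- 1
while (n < 100) {
    n <- n * 2
}
n                        # 128 : first power of 2 that is not below 100

# repeat: loops forever until break is reached
value <- 27              # hypothetical starting value
steps <- 0
repeat {
    value <- value %/% 2 # integer division by 2
    steps <- steps + 1
    if (value == 0) {
        break            # leave the loop
    }
}
steps                    # 5 : 27 -> 13 -> 6 -> 3 -> 1 -> 0

# next: skip the rest of the current iteration
odd_sum <- 0
for (i in 1:10) {
    if (i %% 2 == 0) {
        next             # even numbers are skipped
    }
    odd_sum <- odd_sum + i
}
odd_sum                  # 25 : 1 + 3 + 5 + 7 + 9
```

A repeat loop without a reachable break never ends, so the stopping condition deserves a careful check.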

Exercises

  1. Open the iris data frame {datasets}. Use the help to learn about these data. In which units are the length and width of sepals and petals measured? How many variables and observations are there in iris?

  2. Create a vector with the species names. Remember that the genus should start with a capital letter and the species should be in lowercase (e.g. “Iris setosa”).

  3. Create a vector with the names of all quantitative variables.

  4. Make a data frame with the combination of the two previous vectors like this:

##            Species     Variable
## 1      Iris setosa Sepal.Length
## 2      Iris setosa  Sepal.Width
## 3      Iris setosa Petal.Length
## 4      Iris setosa  Petal.Width
## 5  Iris versicolor Sepal.Length
## 6  Iris versicolor  Sepal.Width
## 7  Iris versicolor Petal.Length
## 8  Iris versicolor  Petal.Width
## 9   Iris virginica Sepal.Length
## 10  Iris virginica  Sepal.Width
## 11  Iris virginica Petal.Length
## 12  Iris virginica  Petal.Width
  5. Using the data frame from exercise 4, make a data frame with the following variables:
  6. Install the package “writexl” and use the command write_xlsx to create a “yourname.xlsx” file with your data frame.

If you want, you can also use the command WriteXLS from the WriteXLS package, but you will need Perl installed on your computer.

Use the commands seen in this and previous chapters to make the code as neat as possible. Remember to comment each step so that you know what you are doing. When sourcing your script, the xlsx file should appear without errors or warnings.

Always comment out the “install.packages()” line.
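
One possible layout for the top of such a script is sketched below. It is not a solution to the exercises: the data frame `example_df` and the file name are invented, and the check with requireNamespace() is just one way (an assumption of this sketch, not a requirement) to keep the script from failing on a computer where writexl is missing.

```r
# install.packages("writexl")   # run once by hand; keep it commented in scripts,
                                # otherwise the package is reinstalled on every run

# Hypothetical data frame and output file, for illustration only
example_df <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
out_file   <- file.path(tempdir(), "example.xlsx")

if (requireNamespace("writexl", quietly = TRUE)) {
    writexl::write_xlsx(example_df, out_file)   # write the data frame to xlsx
} else {
    message("Package 'writexl' is not installed; nothing was written.")
}
```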

 


 

About this tutorial

Cite as: Alfonso Garmendia (2023) R for life sciences. Chapter 2: Operations in R. http://personales.upv.es/algarsal/R-tutorials/02_Tutorial-2_R-operations.html.

Available also in other formats (pdf, docx, …): https://drive.google.com/drive/folders/19w914WCg8BVTVBE_zpgShmg2vpjguV1e?usp=sharing.

Other similar tutorials: https://garmendia.blogs.upv.es/r-lecture-notes/

Originals are in bitbucket repository: https://bitbucket.org/alfonsogar/tea_daa_tutorials.

 

Document written in Rmarkdown, using Rstudio.

 

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.