A new way of thinking about graphics

What is ggplot2

ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics.

You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

Installation

ggplot2 is on CRAN, so installation is straightforward:

### Install ggplot2 package --------------------
install.packages("ggplot2", dependencies = TRUE)

Remember that in your scripts the install.packages() command should always be commented out and placed at the beginning of the script.

### Packages
# install.packages("ggplot2", dependencies = TRUE)

library(ggplot2)

Cheatsheet

There is an official cheatsheet here: https://rstudio.github.io/cheatsheets/html/data-visualization.html


Usage

All ggplot2 plots begin with a call to ggplot(), supplying default data and aesthetic mappings, specified by aes(). You then add layers, scales, coords and facets with +. To save a plot to disk, use ggsave().

To plot data with ggplot2, the data must be in a data frame in long format (one row per observation, one column per variable).
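As a minimal sketch of what long format means, the example below builds a small wide data frame and reshapes it with base R's stack(), so that each row holds one observation. The object names (`wide`, `long`, `p_long`), the column names and the values are invented for illustration and are not part of the original tutorial.

```r
library(ggplot2)

# Hypothetical wide data: one column per treatment (illustrative values only)
wide <- data.frame(control = c(4.1, 3.9, 4.3),
                   treated = c(5.2, 5.0, 5.6))

# stack() returns a long data frame with the columns `values` and `ind`
long <- stack(wide)
names(long) <- c("height", "treatment")

# In long format each aesthetic maps to a single column
p_long <- ggplot(long, aes(x = treatment, y = height)) +
    geom_point()
p_long
```

For larger reshaping jobs, a dedicated package may be more convenient than stack(); base R is used here only to keep the sketch self-contained.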

To produce a plot with ggplot2, we must supply three things:

  1. A data frame containing our data.
  2. How the columns of the data frame can be translated into positions, colors, sizes, and shapes of graphical elements (“aesthetics”).
  3. The actual graphical elements to display (“geometric objects”).

Let’s make our first ggplot.

### Plot relationship between sepal length and width ----------
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point()

The call to ggplot() sets up the data frame, and aes() specifies which of its variables to plot.

aes() defines the “aesthetics”, which is how columns of the data frame map to graphical attributes such as x and y position, color, size, etc. Arguments to aes() may refer to columns of the data frame directly.

We then literally add layers of graphics (“geoms”) to this.
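A related distinction, not spelled out in the text above, is between mapping an aesthetic to a column (inside aes()) and setting it to a fixed value (outside aes(), as an argument to the geom). The following sketch illustrates both; the object names `p_mapped` and `p_fixed` are chosen only for this example.

```r
library(ggplot2)

# Mapped: color varies with the Species column, so it goes inside aes()
p_mapped <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point(aes(color = Species))

# Set: every point gets the same fixed color, so it goes outside aes()
p_fixed <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point(color = "blue")

p_mapped
p_fixed
```

Only the mapped version produces a legend, because only there does the color carry information from the data.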

Further aesthetics can be used. Any aesthetic can be mapped to either a numeric or a categorical variable, and ggplot2 will choose an appropriate scale.

### Plot relationship between sepal length and width
### related to Species and Petal.Length -----------------------
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, 
                 color = Species, size = Petal.Length)) +
    geom_point()
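To see the automatic choice of scale in isolation, the sketch below maps color first to a categorical column and then to a numeric one. The object names `p_discrete` and `p_continuous` are illustrative only.

```r
library(ggplot2)

# Categorical variable -> discrete color scale (one color per level)
p_discrete <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                               color = Species)) +
    geom_point()

# Numeric variable -> continuous color scale (a gradient)
p_continuous <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                                 color = Petal.Length)) +
    geom_point()

p_discrete
p_continuous
```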

We can also use different geometries.

### Plot relationship between sepal length and width
### related to Species ----------------------------
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, 
                 color = Species)) +
    geom_smooth() +
    geom_point()
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

### boxplot of sepal width for each species ----------------------
g <- ggplot(iris, aes(x = Species, y = Sepal.Width,
                      fill = Species))

g + geom_boxplot()

g + geom_boxplot() +
    geom_jitter()

Other layers of information

Different layers of information in ggplot

Besides data, aesthetics and geometries, you can add other layers of information. For example, the previous plot can be drawn with a different theme.

### boxplot of sepal width for each species --------------------
g +
    geom_boxplot() +
    geom_jitter() +
    theme_bw()
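Themes are only one kind of additional layer. As a hedged sketch of how facets, scales and labels can be combined with +, the example below splits the scatter plot into one panel per species, swaps the color scale and adds a title and axis labels. The palette, the label texts and the object name `p_layers` are arbitrary choices for illustration.

```r
library(ggplot2)

p_layers <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                             color = Species)) +
    geom_point() +
    facet_wrap(~ Species) +                      # one panel per species
    scale_color_brewer(palette = "Dark2") +      # a different color scale
    labs(title = "Sepal dimensions by species",  # title and axis labels
         x = "Sepal length (cm)",
         y = "Sepal width (cm)") +
    theme_bw()
p_layers
```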

To save your plots

To save a particular plot from an R script you can use the command ggsave().
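A minimal sketch of ggsave() follows. The file name and the plot size are assumptions made for this example, and a temporary directory is used so that the sketch does not write into your project; replace `out_file` with a path of your choice.

```r
library(ggplot2)

p_box <- ggplot(iris, aes(x = Species, y = Sepal.Width, fill = Species)) +
    geom_boxplot() +
    theme_bw()

# Hypothetical output file in a temporary directory
out_file <- file.path(tempdir(), "boxplot_sepal_width.pdf")

# The file extension selects the output format;
# width and height are given here in centimetres
ggsave(out_file, plot = p_box, width = 15, height = 10, units = "cm")
```

Passing the plot explicitly through the plot argument avoids depending on which plot was displayed last.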

For Quarto or R Markdown documents, however, it is better to set the fig.path option in the first (setup) R chunk, giving the directory and file name prefix under which all plots from the document will be saved.

knitr::opts_chunk$set(fig.path  = 'results/Fig_')

More features

The best way to discover more features is to search for them online when you need them.

Exercises

In a new Rmarkdown document, copy and answer the following questions:

  1. Create a ggplot with the trees data using
  2. Try different geometries.

Which combination best represents the data?

  3. Look at the R Graph Gallery https://www.r-graph-gallery.com/ and choose a fancy plot you like. Copy and paste the code from the example and modify it: change the data, the theme or any other part of it.


About this tutorial

Cite as: Alfonso Garmendia (2024) R for life sciences. Chapter 6, ggplot: A new way of thinking about graphics. http://personales.upv.es/algarsal/R-tutorials/06_Tutorial-6_R-ggplot.html

Available also in other formats (pdf, docx, …): https://drive.google.com/drive/folders/19w914WCg8BVTVBE_zpgShmg2vpjguV1e?usp=sharing.

Other similar tutorials: https://garmendia.blogs.upv.es/r-lecture-notes/

Originals are in the Bitbucket repository: https://bitbucket.org/alfonsogar/tea_daa_tutorials.

 

Document written in R Markdown, using RStudio.

 

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.