R for Bioinformatic Analyses

Hannah Tavalire and Bill Cresko - University of Oregon

January 2019 - Cesky Krumlov

Lecture 1 - Using R for Biostatistical Analyses

But first a beautiful chair

Before we talk about how amazing R is…

  • navigate into this directory: ~/workshop_materials/evomics_stats_2019/
  • type ‘git pull’

Why use R?

  • R is a statistical programming language (derived from S)
  • Superb data management & graphics capabilities
  • You can write your own functions
  • Powerful and flexible
  • Runs on all computer platforms
  • Well established system of packages and documentation
  • Active development and dedicated community
  • Can use a nice GUI front end such as RStudio
  • Reproducibility
    • keep your scripts to see exactly what was done
    • distribute these with your data
    • embed your R analyses in polished RMarkdown files
  • FREE

R resources

Running R

  • Need to make sure that you have R installed
  • Run R from the command line
    • just type R
    • can run it locally as well as on clusters
  • Install an R Integrated Development Environment (IDE)
    • RStudio: http://www.rstudio.com
    • Makes working with R much easier, particularly for a new R user
    • Runs on Windows, Mac or Linux OS
    • We’re running as a server on the AWS instances

RStudio

Exercise 1.1 - Exploring RStudio

  • Open RStudio by adding :8787 to your AMI url
  • Take a few minutes to familiarize yourself with the RStudio environment by locating the following features:
    • See what types of new files can be made in RStudio by clicking the top left icon, then open a new R script.
    • The windows clockwise from top left are: the code editor, the workspace and history, the plots and files window, and the R console.
    • In the plots and files window, click on the packages and help tabs to see what they offer.
  • Now open the file called Exercises_for_R_Lectures.Rmd in /workshop_materials/evomics_stats_2019/03.Exercises/
    • This file will serve as your digital notebook for parts of the workshop and contains the other exercises.

Introduction to RMarkdown

RMarkdown

Exercise 1.2 - Intro to RMarkdown Files

  • Take a few minutes to familiarize yourself with RMarkdown files by completing exercise 1.2 in your exercises document.

BASICS of R

  • Commands can be submitted through
    • terminal, console or scripts
    • can be embedded as code chunks in RMarkdown
  • On these slides, code chunks are evaluated and their output is shown
    • output is shown after the two # symbols
    • the number in [] is the index of the first output value printed on that line
  • R follows the normal order of mathematical operations (PEDMAS)

BASICS of R

Input code chunk and then output

## [1] 16

Input code chunk and then output

## [1] 16
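
The code chunks that produced these two outputs did not survive on the slides. As a minimal sketch, assuming two expressions chosen only because they evaluate to 16, the following shows how operator precedence plays out:

```r
# Two hypothetical expressions that both evaluate to 16
a <- 4 * 4          # plain multiplication
b <- 4 + 3 * 2^2    # exponent first (2^2 = 4), then 3 * 4 = 12, then 4 + 12
print(a)
print(b)
```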

Assigning Variables

  • A better way to do this is to assign variables
  • Variables are assigned values using the <- operator.
  • Variable names must begin with a letter (or a dot that is not followed by a digit) and may contain letters, digits, dots and underscores.
  • Do keep in mind that R is case sensitive.

Assigning Variables

## [1] 6
## [1] 4

These do not work
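
The original chunks are missing here. A sketch consistent with the outputs 6 and 4 shown above, followed by two assignments that R rejects (the specific values and names are assumptions), might look like this:

```r
x <- 2        # assign the value 2 to x
x * 3         # 6
y <- x * 3    # store the result in a new variable
y - 2         # 4

# These do not work, so they are left commented out:
# 3y <- 3     # syntax error: a name cannot start with a digit
# 3*y <- 3    # error: the left-hand side is not a valid assignment target
```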

Arithmetic operations on variables

  • Arithmetic operations can be performed on variables as easily as on numbers.
## [1] 14
## [1] 144
## [1] 2.484907

Arithmetic operations on variables

  • Note that the last of these - log - is a built-in function of R, and therefore the argument of the function needs to be put in parentheses
  • These parentheses will be important, and we’ll come back to them later when we add further arguments after the first one inside the parentheses
  • The outcome of calculations can be assigned to new variables as well, and the results can be checked using the print() function

Arithmetic operations on variables

## [1] 67
## [1] 69022864
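
The outputs on the last few slides (14, 144, 2.484907, then 67 and 69022864) are consistent with the sketch below. The starting values 12, 67 and 124 are inferred from those outputs rather than taken from the lost code, so treat them as assumptions.

```r
x <- 12
x + 2       # 14
x^2         # 144
log(x)      # natural logarithm of 12, about 2.484907

y <- 67
print(y)    # 67
w <- 124    # hypothetical value, chosen so that (w * y)^2 matches the output
z <- (w * y)^2
print(z)    # 69022864
```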

STRINGS

  • Operations can be performed on character variables as well
  • Note that “characters” need to be set off by quotation marks to differentiate them from numbers
  • The c stands for concatenate
  • Note that we are using the same variable names as we did previously, which means that we’re overwriting our previous assignment
  • A good rule of thumb is to use new names for each variable, and make them short but still descriptive

STRINGS

## [1] "I Love"
## [1] "Biostatistics"
## [1] "I Love"        "Biostatistics"
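
A sketch that reproduces the three outputs above, assuming the variable names x, y and z (the bullets say earlier names were reused, but the exact names are not shown):

```r
x <- "I Love"            # character values go in quotation marks
print(x)
y <- "Biostatistics"
print(y)
z <- c(x, y)             # c() combines the two strings into one vector
print(z)
length(z)                # 2: the vector has two elements
```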

VECTORS

  • In general R thinks in terms of vectors
    • a vector is an ordered set of values of one type: characters, factors or numbers (e.g. “I Love”)
    • it will benefit any R user to try to write scripts with that in mind
    • it will simplify most things
  • Vectors can be assigned directly using the ‘c()’ function and then entering the exact values.

VECTORS

##  [1]  2  3  4  2  1  2  4  5 10  8  9
##  [1]  5  6  7  5  4  5  7  8 13 11 12
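
The two outputs above differ by exactly 3 in every position, which suggests a vectorized addition. A sketch under that assumption (the names n and z are hypothetical):

```r
n <- c(2, 3, 4, 2, 1, 2, 4, 5, 10, 8, 9)   # enter the values directly
print(n)
z <- n + 3        # the operation is applied to every element at once
print(z)
```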

FACTORS

  • The vector x is now what is called a character vector (“I Love”).
  • Sometimes we would like to treat the characters as categories that group observations in subsequent calculations.
  • These are called factors, and we can redefine our character variables as factors.
  • This might seem a bit strange, but it’s important for statistical analyses where we might want to see the mean or variance for two different treatments.

FACTORS

## [1] I Love
## Levels: I Love
  • Note that factor levels are reported alphabetically

FACTORS

  • We can also determine how R “sees” a variable using str() or class() functions.
  • This is a useful check when importing datasets or verifying that you assigned a class correctly
##  chr "I Love"
## [1] "character"
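
The factor and class outputs above can be reproduced roughly as follows. The name x_factor and the treatment labels are illustrative assumptions, not the original code.

```r
x <- "I Love"
x_factor <- as.factor(x)     # redefine the character vector as a factor
print(x_factor)              # prints the value and its levels

str(x)                       # compact description: chr "I Love"
class(x)                     # "character"
class(x_factor)              # "factor"

# Hypothetical treatment labels: levels are sorted alphabetically by default
treatment <- factor(c("treated", "control", "treated"))
levels(treatment)            # "control" "treated"
```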

Types or ‘classes’ of vectors of data

Types of vectors of data

  • int stands for integers

  • dbl stands for doubles, or real numbers

  • chr stands for character vectors, or strings

  • dttm stands for date-times (a date + a time)

  • lgl stands for logical, vectors that contain only TRUE or FALSE

  • fctr stands for factors, which R uses to represent categorical variables with fixed possible values

  • date stands for dates

Types of vectors of data

  • Logical vectors can take only three possible values:
    • FALSE
    • TRUE
    • NA which is ‘not available’.
  • Integer and double vectors are known collectively as numeric vectors.
    • In R numbers are doubles by default.
  • Integers have one special value: NA, while doubles have four:
    • NA
    • NaN which is ‘not a number’
    • Inf
    • -Inf
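
A short sketch of how these types and special values show up at the console, using only base R functions:

```r
typeof(1)         # "double": numbers are doubles by default
typeof(1L)        # "integer": the L suffix requests an integer
typeof(TRUE)      # "logical"
is.numeric(1L)    # TRUE: integers and doubles are both numeric

1 / 0             # Inf
-1 / 0            # -Inf
0 / 0             # NaN
NA_integer_ + 1L  # NA: missing values propagate through arithmetic
is.na(c(1, NA, NaN))   # FALSE TRUE TRUE: is.na() is also TRUE for NaN
```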

Basic Statistics

Many functions exist to operate on vectors.

  • Arguments modify or direct the function in some way
    • There are many arguments for each function, some of which are defaults
    • Tab complete is helpful to view argument options
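
As an illustration, assuming the numeric vector from the earlier vector slide, here are a few base R summary functions and one argument (na.rm) that changes how a function behaves:

```r
n <- c(2, 3, 4, 2, 1, 2, 4, 5, 10, 8, 9)
mean(n)       # arithmetic mean
median(n)     # middle value
var(n)        # sample variance
sd(n)         # sample standard deviation
range(n)      # smallest and largest value
length(n)     # number of elements

m <- c(n, NA)               # the same data with one missing value added
mean(m)                     # NA, because na.rm defaults to FALSE
mean(m, na.rm = TRUE)       # the argument tells mean() to drop the NA first
```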

Getting Help

  • Getting Help on any function is very easy - just type a question mark and the name of the function.
  • There are functions for just about anything within R and it is easy enough to write your own functions if none already exist to do what you want to do.
  • In general, function calls have a simple structure: a function name, a set of parentheses and an optional set of parameters/arguments to send to the function.
  • Help pages exist for all functions that, at a minimum, explain what parameters exist for the function.
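
For example, the following base R calls are common ways to look up a function; mean, seq and rnorm are used here only as familiar examples:

```r
?mean                   # open the help page for mean()
help("seq")             # the same thing, using the help() function
args(rnorm)             # show the arguments of rnorm() and their defaults

arg_names <- names(formals(rnorm))   # argument names as a character vector
print(arg_names)        # "n" "mean" "sd"
```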

Getting Help

Creating vectors

  • Creating a vector of new data by entering it by hand can be a drag
  • However, it is also very easy to use functions such as
    • seq
    • sample

Creating vectors

  • What do the arguments mean?
##   [1]  0.0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3
##  [15]  1.4  1.5  1.6  1.7  1.8  1.9  2.0  2.1  2.2  2.3  2.4  2.5  2.6  2.7
##  [29]  2.8  2.9  3.0  3.1  3.2  3.3  3.4  3.5  3.6  3.7  3.8  3.9  4.0  4.1
##  [43]  4.2  4.3  4.4  4.5  4.6  4.7  4.8  4.9  5.0  5.1  5.2  5.3  5.4  5.5
##  [57]  5.6  5.7  5.8  5.9  6.0  6.1  6.2  6.3  6.4  6.5  6.6  6.7  6.8  6.9
##  [71]  7.0  7.1  7.2  7.3  7.4  7.5  7.6  7.7  7.8  7.9  8.0  8.1  8.2  8.3
##  [85]  8.4  8.5  8.6  8.7  8.8  8.9  9.0  9.1  9.2  9.3  9.4  9.5  9.6  9.7
##  [99]  9.8  9.9 10.0
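
The output above, 101 values from 0.0 to 10.0 in steps of 0.1, is what seq() returns for the call sketched here. The name seq_1 is an assumption, and the arguments are written out by name to answer the question on the slide.

```r
seq_1 <- seq(from = 0.0, to = 10.0, by = 0.1)   # start, end, step size
print(seq_1)
length(seq_1)                 # 101 values

# length.out asks for a number of values instead of a step size
seq(from = 0, to = 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00
```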

Creating vectors

##   [1] 10.0  9.9  9.8  9.7  9.6  9.5  9.4  9.3  9.2  9.1  9.0  8.9  8.8  8.7
##  [15]  8.6  8.5  8.4  8.3  8.2  8.1  8.0  7.9  7.8  7.7  7.6  7.5  7.4  7.3
##  [29]  7.2  7.1  7.0  6.9  6.8  6.7  6.6  6.5  6.4  6.3  6.2  6.1  6.0  5.9
##  [43]  5.8  5.7  5.6  5.5  5.4  5.3  5.2  5.1  5.0  4.9  4.8  4.7  4.6  4.5
##  [57]  4.4  4.3  4.2  4.1  4.0  3.9  3.8  3.7  3.6  3.5  3.4  3.3  3.2  3.1
##  [71]  3.0  2.9  2.8  2.7  2.6  2.5  2.4  2.3  2.2  2.1  2.0  1.9  1.8  1.7
##  [85]  1.6  1.5  1.4  1.3  1.2  1.1  1.0  0.9  0.8  0.7  0.6  0.5  0.4  0.3
##  [99]  0.2  0.1  0.0

Creating vectors

##   [1] 100.00  98.01  96.04  94.09  92.16  90.25  88.36  86.49  84.64  82.81
##  [11]  81.00  79.21  77.44  75.69  73.96  72.25  70.56  68.89  67.24  65.61
##  [21]  64.00  62.41  60.84  59.29  57.76  56.25  54.76  53.29  51.84  50.41
##  [31]  49.00  47.61  46.24  44.89  43.56  42.25  40.96  39.69  38.44  37.21
##  [41]  36.00  34.81  33.64  32.49  31.36  30.25  29.16  28.09  27.04  26.01
##  [51]  25.00  24.01  23.04  22.09  21.16  20.25  19.36  18.49  17.64  16.81
##  [61]  16.00  15.21  14.44  13.69  12.96  12.25  11.56  10.89  10.24   9.61
##  [71]   9.00   8.41   7.84   7.29   6.76   6.25   5.76   5.29   4.84   4.41
##  [81]   4.00   3.61   3.24   2.89   2.56   2.25   1.96   1.69   1.44   1.21
##  [91]   1.00   0.81   0.64   0.49   0.36   0.25   0.16   0.09   0.04   0.01
## [101]   0.00

Creating vectors

##   [1] 100.00  98.01  96.04  94.09  92.16  90.25  88.36  86.49  84.64  82.81
##  [11]  81.00  79.21  77.44  75.69  73.96  72.25  70.56  68.89  67.24  65.61
##  [21]  64.00  62.41  60.84  59.29  57.76  56.25  54.76  53.29  51.84  50.41
##  [31]  49.00  47.61  46.24  44.89  43.56  42.25  40.96  39.69  38.44  37.21
##  [41]  36.00  34.81  33.64  32.49  31.36  30.25  29.16  28.09  27.04  26.01
##  [51]  25.00  24.01  23.04  22.09  21.16  20.25  19.36  18.49  17.64  16.81
##  [61]  16.00  15.21  14.44  13.69  12.96  12.25  11.56  10.89  10.24   9.61
##  [71]   9.00   8.41   7.84   7.29   6.76   6.25   5.76   5.29   4.84   4.41
##  [81]   4.00   3.61   3.24   2.89   2.56   2.25   1.96   1.69   1.44   1.21
##  [91]   1.00   0.81   0.64   0.49   0.36   0.25   0.16   0.09   0.04   0.01
## [101]   0.00
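
The descending sequence and the two identical tables of squares are consistent with the sketch below, which assumes the same vector was squared in two different ways. The variable names are hypothetical.

```r
seq_2 <- seq(from = 10.0, to = 0.0, by = -0.1)   # a negative step counts down
print(seq_2)

sq_a <- seq_2 * seq_2     # element-by-element multiplication
sq_b <- seq_2^2           # the same result using the power operator
print(sq_b)
all.equal(sq_a, sq_b)     # TRUE: both approaches agree
```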

R Interlude

Complete Exercises 1.3-1.6

Drawing samples from distributions

  • Here is a way to create your own data sets that are random samples…

Drawing samples from distributions

  • You’ve probably figured out that y from the last example is drawing numbers with equal probability.
  • What if you want to draw from a distribution?
  • Again, play around with the arguments in the parentheses to see what happens.
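
The sampling code itself is missing from the slides. A minimal sketch, assuming sample() for equal-probability draws and rnorm() for draws from a normal distribution (all argument values are illustrative):

```r
set.seed(1)      # fix the random number generator so results can be repeated

# Equal probability: each integer from 1 to 10000 is equally likely to be drawn
y <- sample(1:10000, size = 20, replace = FALSE)
print(y)

# Draws from a normal distribution with a chosen mean and standard deviation
x <- rnorm(n = 1000, mean = 0, sd = 1)
hist(x)          # histogram of the 1000 draws
```

Changing size, mean or sd and rerunning the chunk is a quick way to see what each argument does.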

Drawing samples from distributions

  • dnorm() generates the probability density, which can be plotted using the curve() function.
  • Note that the curve is added to the plot using add=TRUE
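
A sketch of overlaying the theoretical density on a histogram of random draws. The sample size and distribution parameters are assumptions.

```r
set.seed(1)
draws <- rnorm(n = 1000, mean = 0, sd = 1)

# freq = FALSE puts densities, not counts, on the y axis,
# so the histogram and the density curve share a scale
hist(draws, freq = FALSE)

# Inside curve(), x is the plotting variable, not an object in the workspace
curve(dnorm(x, mean = 0, sd = 1), add = TRUE, col = "red")
```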

Visualizing Data in R

Visualizing Data

  • So far you’ve been visualizing just the list of output numbers
  • Except for the last example where I snuck in a hist function.
  • You can also visualize all of the variables that you’ve created using the plot function (as well as a number of more sophisticated plotting functions).
  • Each of these is called a high level plotting function, which sets the stage
  • Low level plotting functions will tweak the plots and make them beautiful
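
To illustrate the distinction, here is a sketch with made-up data in which plot() is the high level function that creates the figure and points(), abline() and text() are low level functions that add to it:

```r
x <- seq(0, 10, by = 0.1)
y <- x^2

# High level: sets up the axes, labels and the main line
plot(x, y, type = "l", xlab = "x", ylab = "x squared", main = "A simple plot")

# Low level: add elements to the existing plot
points(x[c(1, 51, 101)], y[c(1, 51, 101)], pch = 19, col = "blue")
abline(h = 50, lty = 2)       # dashed horizontal reference line
text(2, 55, "y = 50")         # label placed just above the line
```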

Visualizing Data

Putting plots in a single figure

  • The first line of the lower script tells R that you are going to create a composite figure that has two rows and two columns (on next slide)
    • Can you tell how?
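
The script referred to here is not shown. A common way to do this in base R is par(mfrow = ...), sketched below with arbitrary example plots:

```r
set.seed(1)
x <- rnorm(100)

old_par <- par(mfrow = c(2, 2))   # 2 rows and 2 columns, filled row by row
hist(x)
boxplot(x)
plot(x)
plot(density(x))
par(old_par)                      # restore the previous layout
```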

Putting plots in a single figure

R Interlude

Complete Exercises 1.7-1.8

Working with Imported Datasets in R

Creating Data Frames in R

  • As you have seen, in R you can generate your own random data set drawn from nearly any distribution very easily.
  • Often we will want to use collected data.
  • Now, let’s make a dummy dataset to get used to dealing with data frames
    • Set up three variables (habitat, temp and elevation) as vectors

Creating Data Frames in R

  • Create a data frame where vectors become columns
##             habitat temp elevation
## Reedy Lake    mixed  3.4       0.0
## Pearcadale      wet  3.4       9.2
## Warneet         wet  8.4       3.8
## Cranbourne      wet  3.0       5.0
## Lysterfield     dry  5.6       5.6
## Red Hill        dry  8.1       4.1
  • Now you have a hand-made data frame with row names
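
The table above can be rebuilt with the sketch below. The values and row names are taken from the printed output, and the object name sites is an assumption.

```r
# Three variables stored as vectors
habitat <- factor(c("mixed", "wet", "wet", "wet", "dry", "dry"))
temp <- c(3.4, 3.4, 8.4, 3.0, 5.6, 8.1)
elevation <- c(0.0, 9.2, 3.8, 5.0, 5.6, 4.1)

# Each vector becomes a column of the data frame
sites <- data.frame(habitat, temp, elevation)
rownames(sites) <- c("Reedy Lake", "Pearcadale", "Warneet",
                     "Cranbourne", "Lysterfield", "Red Hill")
print(sites)
str(sites)    # check the class of each column
```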

R Interlude: Reading in Data Frames in R

  • A strength of R is being able to import data from an external source
    • Create the same table that you did above in a spreadsheet using LibreOffice
    • Export it to comma separated and tab separated text files for importing into R.
    • The first will read in a comma-delimited file, whereas the second is a tab-delimited
    • In both cases the header and row.names arguments indicate that there is a header row and row label column
    • Note that a file name by itself makes R look in the current working directory, whereas a full path can also be used
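
The two import commands are missing from the slides. The sketch below assumes read.table() with the header and row.names arguments mentioned above. It first writes two tiny files to a temporary directory so that it runs on its own, and the file names are hypothetical.

```r
# Create small example files (two rows of the table from the previous slide)
csv_file <- file.path(tempdir(), "sites.csv")
tsv_file <- file.path(tempdir(), "sites.txt")
writeLines(c("site,habitat,temp,elevation",
             "Reedy Lake,mixed,3.4,0.0",
             "Pearcadale,wet,3.4,9.2"), csv_file)
writeLines(c("site\thabitat\ttemp\televation",
             "Reedy Lake\tmixed\t3.4\t0.0",
             "Pearcadale\twet\t3.4\t9.2"), tsv_file)

# header = TRUE: the first line holds column names
# row.names = 1: the first column holds row labels
sites_csv <- read.table(csv_file, header = TRUE, row.names = 1, sep = ",")
sites_tsv <- read.table(tsv_file, header = TRUE, row.names = 1, sep = "\t")
print(sites_csv)
```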

Reading in Data Frames in R

Exporting Data Frames in R

  • you will get more practice with this during the next R interlude
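
As a sketch of exporting, assuming the hand-made data frame from earlier and base R's write.csv() and write.table() (output paths are hypothetical and point to a temporary directory):

```r
sites <- data.frame(
  habitat = c("mixed", "wet", "wet", "wet", "dry", "dry"),
  temp = c(3.4, 3.4, 8.4, 3.0, 5.6, 8.1),
  elevation = c(0.0, 9.2, 3.8, 5.0, 5.6, 4.1),
  row.names = c("Reedy Lake", "Pearcadale", "Warneet",
                "Cranbourne", "Lysterfield", "Red Hill")
)

out_csv <- file.path(tempdir(), "sites_out.csv")
out_tsv <- file.path(tempdir(), "sites_out.txt")

write.csv(sites, file = out_csv)                      # comma separated
write.table(sites, file = out_tsv, sep = "\t",        # tab separated
            quote = FALSE, col.names = NA)            # NA adds a blank header for the row labels

back <- read.csv(out_csv, row.names = 1)              # read it back as a check
```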

Where we left off…

  • Use :8787 to access RStudio again
  • Please rename your Exercises document before doing any git pulls today, so that you do not overwrite your work!
    • You can do this through the terminal window in RStudio using the mv command
  • Working with imported datasets and reading and writing datasets
  • Next up: indexing!
  • But first- a note about arguments…

Arguments in R Functions

  • Sometimes R can guess what you mean because of order…
##  [1]  17.600298   1.043059  -2.164877   5.635400   4.439286  -9.417456
##  [7] -26.140246   3.868278  -7.574435  12.027035
  • But sometimes if the order isn’t right, you can confuse R and get something you really didn’t want…
##  [1] 1000 1000 1000 1000 1000 1000 1000 1000 1000 1000

Arguments in R Functions

  • A work-around and best-practice: include the arguments!!
##  [1]  6.869129 10.663631  5.367006 19.060287 10.631596 13.703436  5.277918
##  [8]  4.030967 11.677516  7.926794
##  [1]  6.869129 10.663631  5.367006 19.060287 10.631596 13.703436  5.277918
##  [8]  4.030967 11.677516  7.926794
  • Notice we also set the seed to replicate our sample results!
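
The calls behind these outputs are not shown. The sketch below assumes rnorm() and illustrative argument values. It contrasts positional matching with named arguments and shows how set.seed() makes a random sample repeatable.

```r
# Positional matching: R assumes the order n, mean, sd
a <- rnorm(10, 0, 10)        # 10 draws, mean 0, sd 10

# Same function, different order: 10 draws, mean 1000, sd 0,
# so every value is exactly 1000
b <- rnorm(10, 1000, 0)
print(b)

# Named arguments remove the ambiguity, in any order
set.seed(42)
c1 <- rnorm(n = 10, mean = 10, sd = 5)
set.seed(42)                 # the same seed gives the same draws
c2 <- rnorm(sd = 5, mean = 10, n = 10)
identical(c1, c2)            # TRUE
```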

Indexing in data frames

  • Next up - indexing just a subset of the data
  • This is a very important feature in R, that allows you to analyze just a subset of the data.

Indexing in data frames

  • You can also assign a subset of values, or a single value, from a data set to a new variable
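
The indexing examples did not survive extraction. Here is a sketch using the hand-made data frame from earlier (the object names are assumptions):

```r
sites <- data.frame(
  habitat = c("mixed", "wet", "wet", "wet", "dry", "dry"),
  temp = c(3.4, 3.4, 8.4, 3.0, 5.6, 8.1),
  elevation = c(0.0, 9.2, 3.8, 5.0, 5.6, 4.1),
  row.names = c("Reedy Lake", "Pearcadale", "Warneet",
                "Cranbourne", "Lysterfield", "Red Hill")
)

sites[1, ]                # first row, all columns
sites[, 2]                # all rows, second column
sites$temp                # a column by name
sites[sites$habitat == "wet", ]   # rows that meet a condition

# Assign a subset, or a single value, to a new variable
warm_sites <- sites[sites$temp > 5, ]
first_temp <- sites[1, "temp"]
```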

Indexing in data frames

  • You can perform operations on particular levels of a factor
  • Note that the first argument is the numerical column vector, and the second is the factor column vector.
  • The third is the operation. Reversing the first two does not work
    • Tab complete will tell you the correct order for arguments
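
The function is not named on the slide. The description (numeric vector, then factor, then operation) matches base R's tapply(), so the sketch below assumes that function and reuses the habitat and temp values from the hand-made data frame.

```r
habitat <- factor(c("mixed", "wet", "wet", "wet", "dry", "dry"))
temp <- c(3.4, 3.4, 8.4, 3.0, 5.6, 8.1)

# numeric vector first, grouping factor second, operation third
mean_temp <- tapply(temp, habitat, mean)
print(mean_temp)          # one mean per habitat level

# Naming the arguments makes the roles explicit
tapply(X = temp, INDEX = habitat, FUN = var)
```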

R Interlude

Complete Exercises 1.9-1.10

Lecture 2 - Collaboration, Documentation and Reproducibility

Collaboration using Git and GitHub

Git and GitHub

https://learngitbranching.js.org/

Clone the repository

  • First make a new directory into which you will clone our course repository
    • This will prevent you from overwriting any of the documents you have edited
    • And it’s good practice to do it again
  • You should work through the terminal application and use Unix to do this
  • Open the terminal and navigate to your new directory and type the following:

Update the repository

  • Now to update the repository you just need to use these commands
  • The first command just tells you if anything has changed
  • If so, do the second!
  • This is much safer than git pull

Git and GitHub Interlude: Exercise 2.1

  • Please read the directions carefully to prevent pushing to the wrong repository

Clarity using Markdown and LaTeX

What is markdown?

  • Lightweight formal markup languages are used to add formatting to plaintext documents
    • Adding basic syntax to the text will make elements look different once rendered/knit
    • Available in many base editors (e.g., Atom text editor)
  • You then need a markdown application with a markdown processor/parser to render your text files into something more exciting
    • Static and dynamic outputs!
    • pdf, HTML, presentations, websites, scientific articles, books etc

What are knitr and Pandoc?

  • knitr is an R package that renders markdown files
  • Pandoc is a general tool for converting markdown files into other formats
  • https://pandoc.org
  • Can include math using LaTeX
  • GitHub will render markdown directly
  • Markdown can easily be rendered within most editors now
  • Within RStudio just use the knit button to render markdown
  • Markdown syntax is very easy

Formatting text

  • *Italic* or _Italic_
  • **Bold** or __Bold__

Formatting text

“You know the greatest danger facing us is ourselves, an irrational fear of the unknown. But there’s no such thing as the unknown — only things temporarily hidden, temporarily not understood.”

— Captain James T. Kirk

Formatting lists

  • list_element
    • sub_list_element
    • sub_list_element
    • sub_list_element
  • list_element
    • sub_list_element

Formatting lists

  1. One
  2. Two
  3. Three
  4. Four

Inserting images or URLs

Link: [link text](URL)

Image: ![alt text](path or URL)

What is LaTeX?

  • Pronounced «Lah-tech» or «Lay-tech» (to rhyme with «Bertolt Brecht»)
  • A document preparation system for high-quality typesetting
  • It is most often used for medium-to-large technical or scientific documents
  • Can be used for almost any form of publishing.
  • Typesetting journal articles, technical reports, books, and slide presentations
  • Allows for precise mathematical statements
  • https://www.latex-project.org

What is LaTeX?

  • LaTeX is not a word processor!
  • LaTeX encourages authors not to worry too much about the appearance of their documents but to concentrate on getting the right content.
  • Control over large documents containing sectioning, cross-references, tables and figures.
  • Typesetting of complex mathematical formulas.
  • Automatic generation of bibliographies and indexes.
  • Multi-lingual typesetting
  • https://bookdown.org/yihui/bookdown/

What is LaTeX?

  • Importantly, LaTeX can be included right into RMarkdown documents
  • The following slides have some examples

Operators and Symbols

\[ \large a^x, \sqrt[n]{x}, \vec{\jmath}, \tilde{\imath}\]

\[ \large \alpha, \beta, \gamma\]

Operators and Symbols

\[ \large\approx, \neq, \nsim \]

\[\large \partial, \mathbb{R}, \flat\]

Equations

Binomial sampling equation

\[\large f(k) = {n \choose k} p^{k} (1-p)^{n-k}\]

Poisson Sampling Equation

\[\large Pr(Y=r) = \frac{e^{-\mu}\mu^r}{r!}\]

Integrals

\[\iint xy^2\,dx\,dy =\frac{1}{6}x^2y^3\]

Matrix formulations

\[ \begin{matrix} -2 & 1 & 0 & 0 & \cdots & 0 \\ 1 & -2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & -2 & 1 & \cdots & 0 \\ 0 & 0 & 1 & -2 & \ddots & \vdots \\ \vdots & \vdots & \vdots & \ddots & \ddots & 1 \\ 0 & 0 & 0 & \cdots & 1 & -2 \end{matrix} \]

Including LaTeX and Code into Markdown Files

  • Explicit inclusion of code and mathematical equations helps with reproducibility
  • Need to designate the ‘environment’ as being code or math
  • Can be included in-line or in ‘chunks’

In-line versus fenced

This equation, \(y=\frac{1}{2}\), is included inline

Whereas this equation \[y=\frac{1}{2}\] is put on a separate line

Markdown is very flexible

  • You can import RMarkdown templates into RStudio and open as a new Rmarkdown file
  • Better yet there are packages that add functionality
  • When you install the package it will show up in the ‘From Template’ section of the ‘new file’ startup screen
  • There are packages to make
    • books
    • journal articles
    • slide shows
    • interactive exercises
    • many more
  • Some of these use ‘Shiny’
    • an interactive web based application
    • allows users to input and get output

Final Thoughts on Markdown, LaTeX and GitHub

  • Many forms/flavors of markdown
    • RMarkdown is one flavor of markdown; HTML is a separate markup language that markdown is often rendered to
    • There is a GitHub flavored markdown
    • Once you learn one, all the others are very easy
  • The goal is increased collaboration and reproducibility
    • Allows you to easily work with others by sharing the markdown file
    • Allows formal representation of code and math
    • Allows others to run your code directly
    • Allows you to produce reports for non-technical readers
    • All files are easily shared on GitHub
  • Once you start using Markdown you won’t stop…

R Interlude | Exploring RMarkdown

  • Exercise 2.2

Data wrangling and exploratory data analysis (EDA)

A biological example to get us started

Say you perform an experiment on two strains of stickleback fish, one from an ocean population (RS) and one from a freshwater lake (BP), in which you make the fish microbe free. Microbes in the gut are known to interact with the gut epithelium in ways that lead to a proper maturation of the immune system.

A biological example to get us started

You carry out an experiment by treating multiple fish from each strain so that some of them have a conventional microbiota, and some are inoculated with only one bacterial species. You then measure the levels of gene expression in the stickleback gut using RNA-seq. You suspect that the sex of the fish might be important so you track it too.

A biological example to get us started

Collecting Data with Analyses in Mind

  • How should the data set be organized to best analyze it?
  • What are the key properties of the variables?
  • Why does that matter for learning R?
  • Why does that matter for performing statistical analyses?

Data set rules of thumb (aka Tidy Data)

  • Store a copy of data in non-proprietary formats
  • Leave an uncorrected file when doing analyses
  • Maintain effective metadata about the data
  • When you add observations to a database, add rows
  • When you add variables to a database, add columns
  • A column of data should contain only one data type

Tidyverse family of packages

  • Hadley Wickham and others have written R packages to modify data

  • These packages do many of the same things as base functions in R

  • However, they are specifically designed to do them faster and more easily

  • Wickham also wrote the package ggplot2 for creating elegant graphics

  • GG stands for ‘Grammar of Graphics’

Example of a tibble

Key functions in dplyr for data frames

  • Pick observations by their values with filter().
  • Reorder the rows with arrange().
  • Pick variables by their names with select().
  • Create new variables with functions of existing variables with mutate().
  • Collapse many values down to a single summary with summarise().

filter(), arrange() & select()
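
The examples for this slide are missing. Here is a sketch of the three verbs, assuming the dplyr package is installed and reusing the site data from the data frame slides, with the row names moved into a site column:

```r
library(dplyr)

sites <- data.frame(
  site = c("Reedy Lake", "Pearcadale", "Warneet",
           "Cranbourne", "Lysterfield", "Red Hill"),
  habitat = c("mixed", "wet", "wet", "wet", "dry", "dry"),
  temp = c(3.4, 3.4, 8.4, 3.0, 5.6, 8.1),
  elevation = c(0.0, 9.2, 3.8, 5.0, 5.6, 4.1)
)

wet_sites <- filter(sites, habitat == "wet")   # keep rows by value
by_temp <- arrange(sites, desc(temp))          # reorder rows, warmest first
site_temp <- select(sites, site, temp)         # keep columns by name
print(by_temp)
```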

mutate() & transmute()

mutate() adds a new variable that is a function of other variable(s)

transmute() keeps only the new variable(s), dropping the old ones
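
A sketch contrasting the two functions, assuming dplyr is installed. The temp_centered column is a made-up example of a derived variable.

```r
library(dplyr)

sites <- data.frame(
  site = c("Reedy Lake", "Pearcadale", "Warneet",
           "Cranbourne", "Lysterfield", "Red Hill"),
  temp = c(3.4, 3.4, 8.4, 3.0, 5.6, 8.1)
)

# mutate(): existing columns are kept and the new one is appended
added <- mutate(sites, temp_centered = temp - mean(temp))

# transmute(): only the columns named in the call are kept
only_new <- transmute(sites, site, temp_centered = temp - mean(temp))

names(added)
names(only_new)
```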

group_by( ) & summarize( )

This first function allows you to aggregate data by values of categorical variables (factors)

Once you have done this aggregation, you can then calculate values (in this case the mean) of other variables split by the new aggregated levels of the categorical variable

group_by( ) & summarize( )

  • Note - you can get a lot of missing values!
  • That’s because aggregation functions obey the usual rule of missing values:
    • if there’s any missing value in the input, the output will be a missing value.
    • fortunately, all aggregation functions have an na.rm argument which removes the missing values prior to computation
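
A sketch of grouping, summarising and handling missing values, assuming dplyr is installed. One temperature has been replaced with NA purely for illustration, so the values are not real data.

```r
library(dplyr)

sites <- data.frame(
  habitat = c("mixed", "wet", "wet", "wet", "dry", "dry"),
  temp = c(3.4, 3.4, NA, 3.0, 5.6, 8.1)    # NA inserted for illustration
)

by_habitat <- group_by(sites, habitat)      # aggregate by the categorical variable

# Without na.rm the "wet" group returns NA
with_na <- summarise(by_habitat, mean_temp = mean(temp))

# na.rm = TRUE removes missing values before computing the mean
no_na <- summarise(by_habitat, mean_temp = mean(temp, na.rm = TRUE))
print(no_na)
```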

R INTERLUDE | Complete Exercise 2.3-2.4

Graphical Communication

ggplot2 and the Grammar of Graphics

  • GG stands for ‘Grammar of Graphics’
  • A good paragraph uses good grammar to convey information
  • A good figure uses good grammar in the same way
  • Seven general components can be used to create most figures

Graphical representation | general approaches

  1. Distributions of data
    • location
    • spread
    • shape
  2. Associations between variables
    • relationship among two or more variables
    • differences among groups in their distributions

Graphical representation | general approaches

  1. Distributions of data
    • bar graph
    • histogram
    • box plot
  2. Associations between variables
    • pie chart
    • grouped bar graph
    • mosaic plot
    • box plot
    • scatter plot
    • dot plot ‘stripchart’

Box Plot

  • Displays median, first and third quartile, range, and extreme observations
  • Can be combined with mean and standard error of the mean
  • Concise way to visualize many aspects of distribution

Scatter Plot

  • Displays association between two numerical variables
  • Goal is association not magnitude or frequency
  • Points fill the space available

Examples of the good, bad and the ugly of graphical representation

  • Examples of bad graphs and how to improve them.
  • Courtesy of K.W. Broman
  • www.biostat.wisc.edu/~kbroman/topten_worstgraphs/

Ticker tape parade

A line to no understanding

A cup of hot nothing

A bake sale of pie charts

Whack a mole

Graphical communication best practices

“Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space”

— Edward Tufte

Principles of effective display

  • Show the data
  • Encourage the eye to compare differences
  • Represent magnitudes honestly and accurately
  • Draw graphical elements clearly, minimizing clutter
  • Make displays easy to interpret

“Above all else show the data” | Tufte 1983

“Maximize the data to ink ratio, within reason” | Tufte 1983

Draw graphical elements clearly, minimizing clutter

“A graphic does not distort if the visual representation of the data is consistent with the numerical representation” – Tufte 1983

Represent magnitudes honestly and accurately

How Fox News makes a figure …

How Fox News makes a figure …

“Graphical excellence begins with telling the truth about the data” – Tufte 1983

Using ggplot2 to make nice figures

The geom_bar function

The geom_bar function

Now try this…

The geom_bar function

and this…

The geom_bar function

and finally this…
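
The four bar chart examples are missing from these slides. As a sketch, assuming ggplot2 is installed and using its built-in diamonds data set (the original may have used different data), a plain bar chart can be built up in steps:

```r
library(ggplot2)

# Counts of diamonds in each cut category
p1 <- ggplot(diamonds, aes(x = cut)) + geom_bar()

# Colour each bar by its own category
p2 <- ggplot(diamonds, aes(x = cut, fill = cut)) + geom_bar()

# Fill by a second variable to get stacked bars
p3 <- ggplot(diamonds, aes(x = cut, fill = clarity)) + geom_bar()

# position = "dodge" places the groups side by side
p4 <- ggplot(diamonds, aes(x = cut, fill = clarity)) +
  geom_bar(position = "dodge")

print(p4)
```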

The geom_histogram and geom_freqpoly functions

With geom_histogram you can make a histogram

The geom_histogram and geom_freqpoly functions

With geom_freqpoly you can make a frequency polygon
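
A sketch of both geoms, again assuming ggplot2's built-in diamonds data and illustrative bin widths:

```r
library(ggplot2)

# Histogram of a continuous variable; binwidth sets the width of each bar
p_hist <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(binwidth = 0.5)

# Frequency polygon: the same counts drawn as lines, one line per cut
p_poly <- ggplot(diamonds, aes(x = carat, colour = cut)) +
  geom_freqpoly(binwidth = 0.1)

print(p_hist)
print(p_poly)
```

Lines overlap more cleanly than bars, so the frequency polygon is often the easier choice when comparing several groups.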

The geom_boxplot function

Boxplots are very useful for visualizing data

The geom_boxplot function

The geom_boxplot function
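
The boxplot examples were images that are no longer present. As a sketch, assuming ggplot2's built-in mpg data set:

```r
library(ggplot2)

# One box per vehicle class, showing the distribution of highway mileage
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

# coord_flip() swaps the axes, which helps when category labels are long
p_box_flipped <- p_box + coord_flip()

print(p_box)
print(p_box_flipped)
```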

The geom_point & geom_smooth functions

The geom_point & geom_smooth functions

The geom_point & geom_smooth functions

The geom_point & geom_smooth functions
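
A sketch of a scatter plot with a fitted line, assuming ggplot2's built-in mpg data. A linear fit is requested explicitly here, which may differ from the smoother used on the original slides.

```r
library(ggplot2)

# Scatter plot of engine displacement against highway mileage
p_point <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(colour = class))

# Add a straight-line fit on top of the points
p_smooth <- p_point +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p_smooth)
```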

Combining geoms

Adding labels

Themes
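
The last three slides (combining geoms, labels, themes) can be illustrated in one sketch. It assumes ggplot2's built-in mpg data, and the label text is made up.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(aes(colour = class)) +                          # first geom
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # second geom on the same axes
  labs(
    title = "Highway mileage falls with engine size",
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Vehicle class"
  ) +
  theme_bw()                                                 # swap the default theme

print(p)
```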

Arranging Multiple Figures- Flexdashboard

  • Modify YAML header to specify graph orientation

Arranging Multiple Figures- Flexdashboard

FINAL R INTERLUDE | Complete Exercise 2.5