INTRODUCTION TO RSTUDIO ON AWS
Barry DeCicco
Ann Arbor Tech Meetup, May 6, 2020
CONTENTS
Setting up the AWS Instance.
Orientation on RStudio.
SETTING UP AWS INSTANCE
RSTUDIO ON AWS
You need to use the right AMI (Amazon Machine Image) on AWS, one set up for RStudio.
Set that up, and connect to Dropbox.
SELECT ‘NEXT’ ON EACH STEP AND ACCEPT THE DEFAULTS UNTIL YOU HIT ‘CONFIGURE SECURITY GROUP’
USING RSTUDIO
This is where you can write and save commands.
Console – issue commands; output appears here.
Environment – list of objects, including data sets.
Multi-purpose windows: list of packages, file structure, help, plot viewer.
Menu bar.
When ‘Files’ is selected, you can navigate through the file system.
When ‘Plots’ is selected, you can see the plots which you have created (it’s empty now, but will have plots later).
When ‘Packages’ is selected, you can see your installed packages (think Python modules), along with information about them and whether or not they are loaded.
When ‘Help’ is selected, you access help resources.
PACKAGES
R comes with a base set of packages. You can think of packages as similar to modules in Python.
To obtain additional packages, you must INSTALL them, using the install.packages() command. They usually install from a CRAN mirror (CRAN is the Comprehensive R Archive Network, the main repository for R packages).
To use packages, you must LOAD them, using the library() command.
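The install-then-load steps can be sketched as below. The helper name load_or_install is made up for this illustration (it is not part of R), and the example uses the ‘stats’ package, which ships with R, so nothing is actually downloaded.

```r
# Hypothetical helper (not part of R): install a package only if it is
# missing, then load it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)               # downloads from the configured CRAN mirror
  }
  library(pkg, character.only = TRUE)   # LOAD the package
  # Report whether the package is now attached to the search path.
  invisible(paste0("package:", pkg) %in% search())
}

# 'stats' ships with R, so this call only loads it.
loaded <- load_or_install("stats")
```

For a package you do not have yet, the same call would go through install.packages() first.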
PACKAGES (CON.)
To list installed packages, type:
installed.packages()
To load the list into a data frame for ease of use, type:
package_list <- as.data.frame(installed.packages())
You can browse data frames here.
Objects are listed here.
You can type commands into the console.
BASICS OF R LANGUAGE
R is case sensitive.
Counting starts at 1 (x[1] is the first element of
x).
class(x) will return the class (type) of x.
BASICS OF R LANGUAGE - VECTOR
R’s basic unit is the vector (column).
It is one dimensional, with a length.
To create a vector: x <- c(1,2,3,4) [the arrow is the ‘assign’ operator]
A single element (e.g., created by ‘X <- 1’) is a vector of length 1.
All elements of a vector have to be of the same type (e.g., numeric, character, logical).
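A small sketch of these basics, using the same x as above. The names first and mixed are only for illustration.

```r
x <- c(1, 2, 3, 4)           # a numeric vector of length 4
first <- x[1]                # counting starts at 1
X <- 1                       # a different object from x: R is case sensitive

# Mixing types in c() does not fail; R coerces everything to one common
# type (here, character).
mixed <- c(1, "two", TRUE)

class(x)                     # "numeric"
length(X)                    # 1
class(mixed)                 # "character"
```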
BASICS OF R LANGUAGE - LIST
A list is an ordered set of elements, which can
be of different types.
An element can be itself a list.
To create a list:
y <- list(1,2,3,'four', c(6,7))
class(y) returns [1] "list"
For many purposes, a list is a single element,
unless something iterates through it.
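A short sketch of what ‘iterating through’ a list looks like, using the same y. The names n_elements, element_classes and nested are only for illustration.

```r
y <- list(1, 2, 3, 'four', c(6, 7))

n_elements <- length(y)               # 5: the vector c(6, 7) counts as one element
element_classes <- sapply(y, class)   # apply class() to each element in turn

# An element can itself be a list.
nested <- list(1, list("a", "b"))
class(nested[[2]])                    # "list"
```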
BASICS OF R LANGUAGE – DATA FRAME
A data frame is a list of vectors, all of the same length.
The vectors can be of different types.
To create a data frame, you can:
Import a file.
Coerce another type of object (e.g., a matrix).
Combine some vectors:
z <- c(1,4,9,16)
data_xz <- data.frame(x,z)
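Two of these routes (combining vectors and coercing a matrix) can be sketched as follows. The objects m, data_m and df_mixed are made up for the illustration.

```r
x <- c(1, 2, 3, 4)
z <- c(1, 4, 9, 16)
data_xz <- data.frame(x, z)        # combine vectors: columns are named x and z

m <- matrix(1:6, nrow = 3)         # a 3 x 2 matrix
data_m <- as.data.frame(m)         # coerce: columns get default names V1, V2

# Columns can be of different types, as long as the lengths match.
df_mixed <- data.frame(id = 1:3, name = c("a", "b", "c"))
```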
ACCESSING ELEMENTS OF A VECTOR
(HTTPS://WWW.R-BLOGGERS.COM/R-ACCESSORS-EXPLAINED/
WILL EXPLAIN WHAT SORT OF OBJECTS ARE RETURNED)
For a vector, single brackets can be used:
class(x) returns: [1] "numeric"
x[2] returns: [1] 2 (the [1] is the index of the first element printed on that line of output).
Elements can also be accessed by Boolean logic:
x <- c(1,2,3,4)
x > 3 returns: [1] FALSE FALSE FALSE TRUE
x[x > 3] returns: [1] 4
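The same accessors, plus two common variations (a range of positions and a negative index), as a short sketch. The variable names on the left are only for illustration.

```r
x <- c(1, 2, 3, 4)

second <- x[2]           # by position
is_big <- x > 3          # a logical vector, one TRUE/FALSE per element
big <- x[is_big]         # keep only the elements where is_big is TRUE
first_two <- x[1:2]      # a range of positions
without_first <- x[-1]   # a negative index drops that position
```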
ACCESSING ELEMENTS OF A LIST
Since each element of a list can itself be a list, it’s a bit
more complicated.
Using single brackets on a list returns a list.
Using double brackets on a list returns the element itself (here, a vector).
y[1] returns a list of length 1, whose only element is 1.
y[[1]] returns the element itself: the numeric vector 1.
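A quick way to see the difference is to check the class of each result. This is a small sketch with the same y; the names sub_list, element and inner are only for illustration.

```r
y <- list(1, 2, 3, 'four', c(6, 7))

sub_list <- y[1]      # single brackets: still a list (of length 1)
element <- y[[1]]     # double brackets: the element itself
inner <- y[[5]][2]    # pull out the vector c(6, 7), then take its second element

class(sub_list)       # "list"
class(element)        # "numeric"
```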
ACCESSING ELEMENTS OF A DATA FRAME
The classic way to select a column (vector, variable, field) of a data frame is using the ‘$’ operator:
data_xz$x will return x, which is the column ‘1,2,3,4’.
You can also use brackets for [row, column]:
data_xz[1,2] returns the value in the first row, second column.
data_xz[1,] returns the first row, all columns (and data_xz[,1] returns the first column, all rows).
data_xz[data_xz$x>2,] returns the rows of data_xz where x > 2.
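These accessors can be tried on the small data frame built from x and z. This is a sketch for a plain data.frame; the names on the left are only for illustration.

```r
x <- c(1, 2, 3, 4)
z <- c(1, 4, 9, 16)
data_xz <- data.frame(x, z)

col_x <- data_xz$x                     # one column, as a vector
cell <- data_xz[1, 2]                  # first row, second column
first_row <- data_xz[1, ]              # first row, all columns (still a data frame)
first_col <- data_xz[, 1]              # first column, all rows (a vector)
big_rows <- data_xz[data_xz$x > 2, ]   # rows where x > 2
```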
IMPORTING DATA
There is a package ‘tidyverse’ which contains a
number of useful data manipulation and plotting
commands; it is the heart of common modern R usage.
To load it, give the command ‘library(tidyverse)’.
You will get some notes as it loads, including which commands from other packages are masked.
Masked means that a command ‘doit’ from package A is hidden by another command with the same name, ‘doit’, from a package B that was loaded later. Typing ‘doit’ then runs B’s version; A’s version is still available as ‘A::doit’.
IMPORTING DATA (CON)
The command ‘read_csv’ is used to import data from
.csv files.
It has many optional arguments; in this case they are
not needed.
To create a data frame called ‘sheet1’:
sheet1 <- read_csv("/home/rstudio/Dropbox/Barry DeCicco's shared workspace/Documents/R Studio Projects/7 Visualizations You Should Learn in R/Big Mart Datasets/Big Mart Dataset - Sheet1.csv")
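To try an import without the Dropbox file, here is a small self-contained sketch. It writes a made-up two-row .csv to a temporary file and reads it back with base R’s read.csv(), a stand-in used here so the example runs without the tidyverse installed. The file contents and the name small_sheet are assumptions for the illustration.

```r
# Write a tiny, made-up .csv file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("item,price", "bread,2.5", "milk,1.2"), csv_path)

# Base R import; with the tidyverse loaded, read_csv(csv_path) is called
# the same way and returns a tibble instead of a plain data frame.
small_sheet <- read.csv(csv_path)

names(small_sheet)   # "item" "price"
```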
Command echoed here, along with specifications chosen by R
Command run from here
Data frame ‘sheet1’ listed in objects
POKING AND PRODDING OBJECTS
In the upper right-hand window (‘Global Environment’), objects are listed, in categories such as ‘Data’ and ‘Values’ (variables, lists, etc.).
To view them, you can click on the object name for data, or use the command: View(object).
A spreadsheet-like view will pop up. You can filter and sort this view.
POKING AND PRODDING OBJECTS (CON)
To find out what type of object an object is, use the command: class(object).
class(x) will return ‘numeric’, since x is a numeric vector.
dim(sheet1) will return the number of rows and columns.
length(y) will return the length of the list y.
length(sheet1) will return the number of columns of sheet1 (since a data frame is a list of vectors, which are the columns).
Typing an object’s name will cause R to print that object out, in varying degrees of usefulness. The output will depend on the class of the object.
Since the tidyverse command ‘read_csv()’ was used to import the .csv file, ‘sheet1’ is a tibble, which is a data frame with extra features. One of those is better printing.
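The same commands can be tried on the small objects from the earlier slides. In this sketch the data frame data_xz stands in for sheet1. Note that the function names are lower case, since R is case sensitive.

```r
x <- c(1, 2, 3, 4)
y <- list(1, 2, 3, 'four', c(6, 7))
data_xz <- data.frame(x, z = c(1, 4, 9, 16))

x_class <- class(x)          # "numeric"
df_dim <- dim(data_xz)       # rows, then columns: 4 2
y_len <- length(y)           # 5 elements in the list
df_len <- length(data_xz)    # 2: a data frame is a list of its columns

data_xz                      # typing the name prints the object
```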
EDA
To find out what the column types of a data frame/tibble are, use the command: lapply(sheet1, class).
[‘lapply’ is a member of a set of ‘apply’ functions with different prefixes, which iterate a function through an object, and then return one of a variety of object types.]
To see the first/last few rows, use ‘head()’ and ‘tail()’.
Because sheet1 is a tibble, those functions will return a
tibble (a subset of the original tibble).
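A small sketch of these commands on a plain data frame (data_xz from the earlier slides stands in for sheet1). It also shows sapply(), the ‘apply’ variant that simplifies its result to a vector where it can.

```r
data_xz <- data.frame(x = c(1, 2, 3, 4), z = c(1, 4, 9, 16))

col_classes_list <- lapply(data_xz, class)   # returns a list, one entry per column
col_classes_vec <- sapply(data_xz, class)    # same idea, simplified to a named vector

top <- head(data_xz, 2)      # first 2 rows
bottom <- tail(data_xz, 2)   # last 2 rows
```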
EDA
str(x) will return the ‘structure’ of x, which is a set of
parts and names for those parts.
summary(object) will return a summary, which will vary depending on the class of the object.
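A short sketch of str() and summary() on small made-up objects, showing how the summary changes with the class of the object. The names col_summary and level_counts are only for illustration.

```r
data_xz <- data.frame(x = c(1, 2, 3, 4), z = c(1, 4, 9, 16))

str(data_xz)                          # prints the structure: columns, their types, first values

col_summary <- summary(data_xz$z)     # numeric vector: min, quartiles, median, mean, max
level_counts <- summary(factor(c("a", "b", "a")))   # factor: counts per level

col_summary[["Mean"]]                 # 7.5
level_counts[["a"]]                   # 2
```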
GRAPHS
➢ In the Long Long Ago Time, the reason that people
started using R was for its graphics.
➢ The most common graphics package is ggplot2, which is based on the book ‘The Grammar of Graphics’.
➢ The method is to build a graph up from basic commands and modifiers. This can allow exploration.
➢ These graph objects can be stored, then have commands added to them, and then be printed.
➢ There are many, many, many extension packages to
ggplot2.
GGPLOT EXAMPLE
For this example, we will use the mpg data frame,
which comes with the ggplot2 package (most
packages have sample data frames).
It is directly accessed by ‘ggplot2::mpg’ (package::item, which also works for functions within packages).
First, issue the ggplot() command, specifying the data set: ggplot(data=mpg) + [the ‘+’ means the command continues on the next line]
Second, specify a GEOM (what to plot) and an AES (aesthetic – what columns to use):
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
These plots can have more features, either from the start or added on later:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class)) +
  facet_wrap(~ cyl, nrow = 2)
ONE ADDITIONAL THING
Most output from functions can be assigned to objects:
> model_for_mpg = lm(cty ~ cyl + trans, data = ggplot2::mpg)
> model_for_mpg

Call:
lm(formula = cty ~ cyl + trans, data = ggplot2::mpg)

Coefficients:
    (Intercept)              cyl    transauto(l3)    transauto(l4)    transauto(l5)    transauto(l6)
         30.469           -2.013           -1.416           -2.159           -2.536           -2.039
  transauto(s4)    transauto(s5)    transauto(s6)  transmanual(m5)  transmanual(m6)
         -1.065           -1.056           -1.014           -1.144           -1.495
ONE ADDITIONAL THING
➢ At this point there is an object containing the linear model results.
➢ This has parts, which can be extracted as variables or objects.
➢ There are packages (such as ‘broom’) which can convert this into a data frame, allowing the analysis of sets of models, and easy presentation of results.
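A minimal sketch of pulling parts out of a model object, using only base R. It uses the built-in mtcars data set rather than ggplot2::mpg, so that no package is needed; the model formula and the variable names are assumptions for the illustration. The last line builds a coefficient table by hand, roughly the kind of data frame that ‘broom’ produces in a tidier form.

```r
# mtcars ships with R; it stands in for ggplot2::mpg here.
model <- lm(mpg ~ cyl + wt, data = mtcars)

part_names <- names(model)                  # the parts stored in the model object
coefs <- coef(model)                        # named vector of coefficients
resids <- residuals(model)                  # one residual per row of mtcars
r_squared <- summary(model)$r.squared       # a single number from the summary

# Coefficient table (estimate, std. error, t value, p value) as a data frame.
coef_table <- as.data.frame(summary(model)$coefficients)
```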
RMARKDOWN AND KNITR
➢ Everything until now has been done in an R script file.
➢ You can also create an R Markdown file (File > New File > R Markdown).
➢ These files mix code chunks, text and the output from the code chunks.
➢ The resulting output can be in .rtf, .html or .pdf (extra setup is required for .pdf, on your desktop).
➢ The code chunks can be in over 20 languages!
‘Knit’ button causes document to be generated.
Block of text.
Code chunk – can be printed or not in the final document.
QUESTIONS?
REFERENCES
‘Advanced Data Analysis from an Elementary Point of View’ (a statistics class taught in R): https://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/
R for Data Science: https://r4ds.had.co.nz/
The tidyverse: https://www.tidyverse.org/
Examples of ggplot graphs: Top 50 ggplot2 Visualizations - The Master List (With Full R Code): http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html
Dynamic Documents with R and Knitr (by Yihui Xie): https://yihui.name/knitr/
REFERENCES (CON.)
Getting R (on your own computer): https://cran.r-project.org/
Getting RStudio (on your own computer): https://rstudio.com/