Data Wrangling
with dplyr and tidyr
Cheat Sheet
RStudio® is a trademark of RStudio, Inc. • CC BY RStudio • info@rstudio.com • 844-448-1212 • rstudio.com
Syntax - Helpful conventions for wrangling
dplyr::tbl_df(iris)
Converts data to tbl class. tbl’s are easier to examine than data frames. R displays only the data that fits onscreen:
Source: local data frame [150 x 5]
   Sepal.Length Sepal.Width Petal.Length
1           5.1         3.5          1.4
2           4.9         3.0          1.4
3           4.7         3.2          1.3
4           4.6         3.1          1.5
5           5.0         3.6          1.4
..          ...         ...          ...
Variables not shown: Petal.Width (dbl), Species (fctr)
dplyr::glimpse(iris)
Information dense summary of tbl data.
utils::View(iris)
View data set in spreadsheet-like display (note capital V).
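A brief sketch of the inspection helpers above. It assumes a recent dplyr release, where `as_tibble()` (re-exported from the tibble package) fills the role that `tbl_df()` has on this sheet; that substitution is an assumption about the installed version, not part of the original sheet.

```r
library(dplyr)

# as_tibble() stands in for tbl_df() here (assumption: recent dplyr/tibble).
iris_tbl <- as_tibble(iris)

# Printing a tibble shows only the rows and columns that fit on screen.
print(iris_tbl)

# glimpse() lists every column with its type and its first few values.
glimpse(iris_tbl)
```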
dplyr::%>%
Passes object on left-hand side as first argument (or . argument) of function on right-hand side.
"Piping" with %>% makes code more readable, e.g.
iris %>%
  group_by(Species) %>%
  summarise(avg = mean(Sepal.Width)) %>%
  arrange(avg)
x %>% f(y) is the same as f(x, y)
y %>% f(x, ., z) is the same as f(x, y, z)
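The two equivalences above can be checked directly. This is a minimal sketch using `head()` as the example function; the variable names are illustrative only.

```r
library(dplyr)

# By default the pipe passes its left-hand side as the first argument.
nested <- head(iris, 3)
piped  <- iris %>% head(3)
same_first <- identical(piped, nested)

# A dot places the left-hand side in another argument position.
dot_piped <- 3 %>% head(iris, .)
same_dot <- identical(dot_piped, nested)

# The chain from the sheet, read top to bottom instead of inside out.
by_species <- iris %>%
  group_by(Species) %>%
  summarise(avg = mean(Sepal.Width)) %>%
  arrange(avg)
```

Both `same_first` and `same_dot` should be `TRUE`, since each piped call expands to the same `head(iris, 3)`.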
Tidy Data - A foundation for wrangling in R
In a tidy data set, each variable is saved in its own column and each observation is saved in its own row.
Tidy data complements R’s vectorized operations. R will automatically preserve observations as you manipulate variables. No other format works as intuitively with R.
Reshaping Data - Change the layout of a data set
tidyr::gather(cases, "year", "n", 2:4)
Gather columns into rows.
tidyr::unite(data, col, ..., sep)
Unite several columns into one.
dplyr::data_frame(a = 1:3, b = 4:6)
Combine vectors into data frame (optimized).
dplyr::arrange(mtcars, mpg)
Order rows by values of a column (low to high).
dplyr::arrange(mtcars, desc(mpg))
Order rows by values of a column (high to low).
dplyr::rename(tb, y = year)
Rename the columns of a data frame.
tidyr::spread(pollution, size, amount)
Spread rows into columns.
tidyr::separate(storms, date, c("y", "m", "d"))
Separate one column into several.
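The reshaping verbs can be tried on a small table. The sketch below uses made-up data frames (`cases_wide` and `dates` are hypothetical stand-ins, not the EDAWR data sets named on the sheet). In newer tidyr releases `gather()` and `spread()` still work but are superseded by `pivot_longer()` and `pivot_wider()`.

```r
library(tidyr)

# Hypothetical wide table: one row per country, one column per year.
cases_wide <- data.frame(
  country = c("FR", "DE", "US"),
  `2011` = c(7000, 5800, 15000),
  `2012` = c(6900, 6000, 14000),
  `2013` = c(7000, 6200, 13000),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# gather() stacks columns 2:4 into key/value pairs (wide to long).
cases_long <- gather(cases_wide, "year", "n", 2:4)

# spread() reverses the operation (long to wide).
cases_back <- spread(cases_long, year, n)

# separate() splits one column into several; unite() pastes them back.
dates  <- data.frame(date = c("2015-01-15", "2015-02-20"),
                     stringsAsFactors = FALSE)
parts  <- separate(dates, date, c("y", "m", "d"))
joined <- unite(parts, date, y, m, d, sep = "-")
```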
Subset Observations (Rows)
dplyr::filter(iris, Sepal.Length > 7)
Extract rows that meet logical criteria.
dplyr::distinct(iris)
Remove duplicate rows.
dplyr::sample_frac(iris, 0.5, replace = TRUE)
Randomly select fraction of rows.
dplyr::sample_n(iris, 10, replace = TRUE)
Randomly select n rows.
dplyr::slice(iris, 10:15)
Select rows by position.
dplyr::top_n(storms, 2, date)
Select and order top n entries (by group if grouped data).
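A short sketch of the row-subsetting verbs on `iris`. The result names are illustrative; the conditions show how comparison and Boolean operators combine inside `filter()`.

```r
library(dplyr)

# Rows meeting a logical condition; conditions combine with & and |.
long_sepals <- filter(iris, Sepal.Length > 7)
setosa_wide <- filter(iris, Species == "setosa" & Sepal.Width >= 4)

# %in% tests membership in a set of values.
two_species <- filter(iris, Species %in% c("setosa", "virginica"))

# Rows by position, and duplicate rows removed.
rows_10_15  <- slice(iris, 10:15)
unique_rows <- distinct(iris)

# Random sampling; set.seed() makes the draw reproducible.
set.seed(1)
ten_rows <- sample_n(iris, 10, replace = TRUE)
```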
Logic in R - ?Comparison, ?base::Logic
<    Less than                   !=                  Not equal to
>    Greater than                %in%                Group membership
==   Equal to                    is.na               Is NA
<=   Less than or equal to       !is.na              Is not NA
>=   Greater than or equal to    &,|,!,xor,any,all   Boolean operators
Subset Variables (Columns)
dplyr::select(iris, Sepal.Width, Petal.Length, Species)
Select columns by name or helper function.
Helper functions for select - ?select
select(iris, contains("."))
Select columns whose name contains a character string.
select(iris, ends_with("Length"))
Select columns whose name ends with a character string.
select(iris, everything())
Select every column.
select(iris, matches(".t."))
Select columns whose name matches a regular expression.
select(iris, num_range("x", 1:5))
Select columns named x1, x2, x3, x4, x5.
select(iris, one_of(c("Species", "Genus")))
Select columns whose names are in a group of names.
select(iris, starts_with("Sepal"))
Select columns whose name starts with a character string.
select(iris, Sepal.Length:Petal.Width)
Select all columns between Sepal.Length and Petal.Width (inclusive).
select(iris, -Species)
Select all columns except Species.
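To see what each helper picks up, it can help to look at the resulting column names. A minimal sketch on `iris` (the variable names are illustrative):

```r
library(dplyr)

# Helpers match on column names, not on the values in the columns.
sepal_cols  <- names(select(iris, starts_with("Sepal")))
length_cols <- names(select(iris, ends_with("Length")))

# contains() matches a literal string; matches() takes a regular expression.
dotted_cols <- names(select(iris, contains(".")))
regex_cols  <- names(select(iris, matches(".t.")))

# Ranges and negation work on bare column names.
range_cols  <- names(select(iris, Sepal.Length:Petal.Width))
no_species  <- names(select(iris, -Species))
```

Note that `contains(".")` treats the dot literally, while in `matches(".t.")` each dot stands for any single character.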
Learn more with browseVignettes(package = c("dplyr", "tidyr")) • dplyr 0.4.0 • tidyr 0.2.0 • Updated: 1/15
devtools::install_github("rstudio/EDAWR") for data sets
Group Data
dplyr::group_by(iris, Species)
Group data into rows with the same value of Species.
dplyr::ungroup(iris)
Remove grouping information from data frame.
iris %>% group_by(Species) %>% summarise(…)
Compute separate summary row for each group.
Summarise Data
dplyr::summarise(iris, avg = mean(Sepal.Length))
Summarise data into single row of values.
dplyr::summarise_each(iris, funs(mean))
Apply summary function to each column.
dplyr::count(iris, Species, wt = Sepal.Length)
Count number of rows with each unique value of
variable (with or without weights).
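A small sketch contrasting an overall summary with a grouped one, plus `count()` with and without weights. The column names in the results (`avg`, `tallest`, `n_widths`) are illustrative choices.

```r
library(dplyr)

# Without grouping, summarise() collapses the data to a single row.
overall <- summarise(iris, avg = mean(Sepal.Length), n = n())

# With grouping, it returns one row per group.
per_species <- iris %>%
  group_by(Species) %>%
  summarise(
    avg      = mean(Sepal.Length),
    tallest  = max(Sepal.Length),
    n_widths = n_distinct(Sepal.Width),
    n        = n()
  )

# count() tallies rows per value; wt sums a column instead of counting rows.
counts   <- count(iris, Species)
weighted <- count(iris, Species, wt = Sepal.Length)
```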
Make New Variables
dplyr::mutate(iris, sepal = Sepal.Length + Sepal.Width)
Compute and append one or more new columns.
dplyr::mutate_each(iris, funs(min_rank))
Apply window function to each column.
dplyr::transmute(iris, sepal = Sepal.Length + Sepal.Width)
Compute one or more new columns. Drop original columns.
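The difference between `mutate()` and `transmute()` is easiest to see side by side. A minimal sketch on `iris`:

```r
library(dplyr)

# mutate() keeps every existing column and appends the new one.
with_sum <- mutate(iris, sepal = Sepal.Length + Sepal.Width)

# transmute() keeps only the columns it creates.
only_sum <- transmute(iris, sepal = Sepal.Length + Sepal.Width)

names(with_sum)
names(only_sum)
```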
Summarise uses summary functions, functions that take a vector of values and return a single value, such as:
dplyr::first          First value of a vector.
dplyr::last           Last value of a vector.
dplyr::nth            Nth value of a vector.
dplyr::n              Number of values in a vector.
dplyr::n_distinct     Number of distinct values in a vector.
IQR                   IQR of a vector.
min                   Minimum value in a vector.
max                   Maximum value in a vector.
mean                  Mean value of a vector.
median                Median value of a vector.
var                   Variance of a vector.
sd                    Standard deviation of a vector.
Mutate uses window functions, functions that take a vector of values and return another vector of values, such as:
dplyr::lead           Copy with values shifted by 1.
dplyr::lag            Copy with values lagged by 1.
dplyr::dense_rank     Ranks with no gaps.
dplyr::min_rank       Ranks. Ties get min rank.
dplyr::percent_rank   Ranks rescaled to [0, 1].
dplyr::row_number     Ranks. Ties go to first value.
dplyr::ntile          Bin vector into n buckets.
dplyr::between        Are values between a and b?
dplyr::cume_dist      Cumulative distribution.
dplyr::cumall         Cumulative all
dplyr::cumany         Cumulative any
dplyr::cummean        Cumulative mean
cumsum                Cumulative sum
cummax                Cumulative max
cummin                Cumulative min
cumprod               Cumulative prod
pmax                  Element-wise max
pmin                  Element-wise min
iris %>% group_by(Species) %>% mutate(…)
Compute new variables by group.
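A sketch of window functions applied within groups. The `scores` data frame is a made-up example chosen so that ties are visible; each function returns one value per input row, computed separately for each team.

```r
library(dplyr)

# Hypothetical data: team "a" has a tie at 10.
scores <- data.frame(
  team  = c("a", "a", "a", "b", "b"),
  score = c(10, 10, 20, 5, 7),
  stringsAsFactors = FALSE
)

ranked <- scores %>%
  group_by(team) %>%
  mutate(
    rank_min   = min_rank(score),    # ties share the lowest rank, gaps follow
    rank_dense = dense_rank(score),  # ties share a rank, no gaps
    running    = cumsum(score),      # running total within the team
    previous   = dplyr::lag(score),  # prior row's value, NA at group start
    share      = score / sum(score)  # summary value recycled across the group
  ) %>%
  ungroup()
```

The last column mixes a summary function (`sum`) into `mutate()`: the single group total is recycled to every row of that group.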
Combine Data Sets
a          b
x1 x2      x1 x3
A  1       A  T
B  2       B  F
C  3       D  T
Mutating Joins
dplyr::left_join(a, b, by = "x1")
Join matching rows from b to a.
x1 x2 x3
A  1  T
B  2  F
C  3  NA
dplyr::right_join(a, b, by = "x1")
Join matching rows from a to b.
x1 x3 x2
A  T  1
B  F  2
D  T  NA
dplyr::inner_join(a, b, by = "x1")
Join data. Retain only rows in both sets.
x1 x2 x3
A  1  T
B  2  F
dplyr::full_join(a, b, by = "x1")
Join data. Retain all values, all rows.
x1 x2 x3
A  1  T
B  2  F
C  3  NA
D  NA T
Filtering Joins
dplyr::semi_join(a, b, by = "x1")
All rows in a that have a match in b.
x1 x2
A  1
B  2
dplyr::anti_join(a, b, by = "x1")
All rows in a that do not have a match in b.
x1 x2
C  3
y          z
x1 x2      x1 x2
A  1       B  2
B  2       C  3
C  3       D  4
Set Operations
dplyr::intersect(y, z)
Rows that appear in both y and z.
x1 x2
B  2
C  3
dplyr::union(y, z)
Rows that appear in either or both y and z.
x1 x2
A  1
B  2
C  3
D  4
dplyr::setdiff(y, z)
Rows that appear in y but not z.
x1 x2
A  1
Binding
dplyr::bind_rows(y, z)
Append z to y as new rows.
x1 x2
A  1
B  2
C  3
B  2
C  3
D  4
dplyr::bind_cols(y, z)
Append z to y as new columns.
Caution: matches rows by position.
x1 x2 x1 x2
A  1  B  2
B  2  C  3
C  3  D  4
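The example tables `a`, `b`, `y` and `z` from this section can be rebuilt in a few lines to try the joins and set operations. This is a sketch; the result names are illustrative.

```r
library(dplyr)

a <- data.frame(x1 = c("A", "B", "C"), x2 = 1:3, stringsAsFactors = FALSE)
b <- data.frame(x1 = c("A", "B", "D"), x3 = c(TRUE, FALSE, TRUE),
                stringsAsFactors = FALSE)

# Mutating joins add columns from b to a.
left  <- left_join(a, b, by = "x1")   # all rows of a
inner <- inner_join(a, b, by = "x1")  # only keys found in both
full  <- full_join(a, b, by = "x1")   # every key from either table

# Filtering joins keep or drop rows of a without adding columns.
semi <- semi_join(a, b, by = "x1")
anti <- anti_join(a, b, by = "x1")

y <- data.frame(x1 = c("A", "B", "C"), x2 = 1:3, stringsAsFactors = FALSE)
z <- data.frame(x1 = c("B", "C", "D"), x2 = 2:4, stringsAsFactors = FALSE)

# Set operations compare whole rows, so y and z need the same columns.
both    <- dplyr::intersect(y, z)
either  <- dplyr::union(y, z)
only_y  <- dplyr::setdiff(y, z)

# bind_rows() stacks the tables and keeps duplicates.
stacked <- bind_rows(y, z)
```

The `dplyr::` prefix on the set operations is a defensive choice here, since base R also defines `intersect`, `union` and `setdiff` for vectors.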