The R Language
A Hands-on Introduction
Venkatesh-Prasad Ranganath
http://about.me/rvprasad
What is R?
• A dynamically typed programming language
• http://cran.r-project.org/
• Open source and free
• Provides common programming language constructs/features
• Multiple programming paradigms
• Numerous libraries focused on various data-rich topics
• http://cran.r-project.org/web/views/
• Ideal for statistical calculation; lately, the go-to tool for data analysis
• Accompanied by RStudio, a simple and powerful IDE
• http://rstudio.org
Data Types (Modes)
• Numeric
• Character
• Logical (TRUE / FALSE)
• Complex
• Raw (bytes)
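The modes listed above can be checked at the console. The following is a minimal sketch; the variable names and values are made up for illustration and are not from the slides.

```r
# Illustrative values, one per basic mode (all names are hypothetical)
n <- 3.14            # numeric
s <- "hello"         # character
b <- TRUE            # logical
z <- 1 + 2i          # complex
r <- charToRaw("A")  # raw (bytes)

# mode() reports the storage mode of each value
modes <- c(mode(n), mode(s), mode(b), mode(z), mode(r))
print(modes)
```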
Data Structures
• Vectors
• Matrices
• Arrays
• Lists
• Data Frames
• Factors
• Tables
Data Structures: Vectors
• A sequence of objects of the same (atomic) data type
• Creation
• x <- c("a", "b", "c") [ <- is the assignment operator ]
• y <- seq(5, 9, 2) = c(5, 7, 9)
• y <- 5:7 = c(5, 6, 7) [ m:n is equivalent to seq(m, n, 1) ]
• y <- c(1, 4:6) = c(1, 4, 5, 6) [ no nesting / always flattened ]
• z <- rep(1, 3) = c(1, 1, 1)
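The creation forms on this slide can be tried directly. This sketch uses the same calls as the slide; only the variable names v1 to v4 are invented here.

```r
# Each line builds a vector with one of the constructors shown above
v1 <- seq(5, 9, 2)   # from 5 to 9 in steps of 2
v2 <- 5:7            # integer sequence from 5 to 7
v3 <- c(1, 4:6)      # the nested sequence is flattened into one vector
v4 <- rep(1, 3)      # the value 1 repeated three times

print(v1)
print(v2)
print(v3)
print(v4)
```

Note that 5:7 produces integers while c(5, 6, 7) produces doubles, so the two compare equal with == but are not identical().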
Data Structures: Vectors
• Accessing
• x[1] [ 1-based indexing ]
• x[2:3]
• x[c(2,3)] = x[2:3]
• x[-1] [ Negative subscripts imply exclusion ]
• Naming
• names(x) <- c("a", "b", "c") [ Makes x["a"] equivalent to x[1] ]
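A short sketch of the access patterns above. The vector v, its values, and its names are assumptions chosen for illustration.

```r
v <- c(10, 20, 30)             # hypothetical values

first <- v[1]                  # indexing starts at 1, not 0
middle_last <- v[2:3]          # a range of positions
without_first <- v[-1]         # a negative subscript drops that position

names(v) <- c("a", "b", "c")   # hypothetical names, one per element
by_name <- v["a"]              # a named element can be looked up by its name

print(without_first)
print(by_name)
```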
Data Structures: Vectors
• Operations
• x <- c(5, 6, 7)
• x + 2 = c(7, 8, 9) [ Vectorized operations ]
• x > 5 = c(FALSE, TRUE, TRUE)
• subset(x, x > 5) = c(6, 7)
• which(x > 5) = c(2, 3)
• ifelse(x > 5, NaN, x) = c(5, NaN, NaN)
• sqr <- function (n) { n * n }
• sapply(x, sqr) = c(25, 36, 49)
• sqr(x) = c(25, 36, 49)
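The operations on this slide can be run as one script. The sketch below reuses x and sqr from the slide; the other variable names are invented. It also shows that plain logical indexing, x[mask], gives the same result as subset().

```r
x <- c(5, 6, 7)
sqr <- function(n) { n * n }

shifted <- x + 2            # arithmetic applies to every element
mask <- x > 5               # comparison yields a logical vector
kept <- x[mask]             # logical indexing, same result as subset(x, x > 5)
positions <- which(mask)    # positions of the TRUE entries

looped <- sapply(x, sqr)    # calls sqr once per element
direct <- sqr(x)            # sqr is already vectorized, so one call suffices

print(kept)
print(positions)
print(direct)
```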
Data Structures: Vectors
• Operations
• x <- c(5, 6, 7)
• any(x > 5) = TRUE [ How about all(x > 5)? ]
• sum(c(1, 2, 3, NA), na.rm = TRUE) = 6 [ Why is na.rm required? ]
• sort(c(7, 6, 5)) = c(5, 6, 7)
• order(c(7, 6, 5)) = ???
• subset(x, x > 5) = c(6, 7)
• head(1:100) = ???
• tail(1:100) = ???
• How is x == c(5, 6, 7) different from identical(x, c(5, 6, 7))?
• Try str(x)
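One way to explore the questions above is to run the calls and inspect the results. This is a sketch; the variable names are invented for illustration.

```r
x <- c(5, 6, 7)

all_gt <- all(x > 5)                             # FALSE: 5 is not greater than 5
with_na <- sum(c(1, 2, 3, NA))                   # NA: a missing value propagates
without_na <- sum(c(1, 2, 3, NA), na.rm = TRUE)  # NA is dropped before summing

idx <- order(c(7, 6, 5))     # positions that would sort the input
first_six <- head(1:100)     # head() returns the first six elements by default
last_six <- tail(1:100)      # tail() returns the last six elements by default

elementwise <- x == c(5, 6, 7)       # one logical value per element
whole <- identical(x, c(5, 6, 7))    # a single logical for the whole object

print(idx)
print(elementwise)
print(whole)
```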
Data Structures: Matrices
• A two-dimensional arrangement of objects of the same (atomic) data type
• Creation
• y <- matrix(nrow=2, ncol=3) [ matrix filled with NA ]
• y <- matrix(c(1, 2, 3, 4, 5, 6), nrow=2) =
    1 3 5
    2 4 6
• y <- matrix(c(1, 2, 3, 4, 5, 6), nrow=2, byrow=TRUE) =
    1 2 3
    4 5 6
• Accessing
• y[1,2] = 2
• y[,2:3] = [ How about y[1,]? ]
    2 3
    5 6
• What’s the difference between y[2,] and y[2,, drop=FALSE]?
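The fill order and the effect of drop=FALSE are easiest to see by checking dimensions. A minimal sketch follows; the variable names other than y are invented here.

```r
col_major <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)        # filled column by column
y <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)  # filled row by row

cell <- y[1, 2]                   # row 1, column 2
row_vec <- y[2, ]                 # a single row collapses to a plain vector
row_mat <- y[2, , drop = FALSE]   # drop = FALSE keeps it as a 1 x 3 matrix

print(dim(row_vec))   # NULL, because a plain vector has no dimensions
print(dim(row_mat))
```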
Data Structures: Matrices
• Naming
• rownames() and colnames()
• Operations
• nrow(y) = 2 [ number of rows ]
• ncol(y) = 3 [ number of columns ]
• apply(y, 1, sum) = c(6, 15) [ apply sum to each row ]
• apply(y, 2, sum) = c(5, 7, 9) [ apply sum to each column ]
• t(y) = [ transpose a matrix ]
    1 4
    2 5
    3 6
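Naming and the apply() calls can be combined, as in the sketch below. The row and column labels are assumptions chosen for illustration.

```r
y <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)
rownames(y) <- c("r1", "r2")       # hypothetical row labels
colnames(y) <- c("a", "b", "c")    # hypothetical column labels

row_sums <- apply(y, 1, sum)   # margin 1: one result per row
col_sums <- apply(y, 2, sum)   # margin 2: one result per column
yt <- t(y)                     # transpose: 3 rows, 2 columns
by_label <- y["r2", "b"]       # labels can be used in place of positions

print(row_sums)
print(col_sums)
print(yt)
```

The results of apply() carry the row or column names along, which makes the output easier to read than bare positions.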
Data Structures: Matrices
• Operations
• rbind(y, c(7, 8, 9)) =
    1 2 3
    4 5 6
    7 8 9
• cbind(y, c(7, 8)) =
    1 2 3 7
    4 5 6 8
• Try str(y)
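A runnable version of the two binding calls, assuming y is the row-filled 2 x 3 matrix from the earlier slide. The names taller and wider are invented here.

```r
y <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)

taller <- rbind(y, c(7, 8, 9))   # appends a row, giving a 3 x 3 matrix
wider <- cbind(y, c(7, 8))       # appends a column, giving a 2 x 4 matrix

print(taller)
print(wider)
```

Neither call modifies y; each returns a new matrix.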
Data Structures: Matrices
• What will this yield?
m <- matrix(nrow=4, ncol=4)
m <- ifelse(row(m) == col(m), 1, 0.3)
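To check an answer, the snippet can be run and inspected. In this sketch, row(m) and col(m) give each cell's row and column index, so the test is TRUE only on the diagonal.

```r
m <- matrix(nrow = 4, ncol = 4)          # 4 x 4 matrix of NA
m <- ifelse(row(m) == col(m), 1, 0.3)    # 1 where row index equals column index

print(m)
print(diag(m))   # the diagonal entries
```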
Data Structures: Lists
• A sequence of objects (of possibly different data types)
• Creation
• k <- list(c(1, 2, 3), …)
• l <- list(f1 = c(1, 2, 3), f2 = …) [ f1 and f2 are tags ]
• Accessing
• k[2:3]
• k[[2]] [ How about k[2]? ]
• l$f1 = c(1, 2, 3) [ Is it same as l[1] or l[[1]]? ]
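The difference between single and double brackets can be seen in a small sketch. Apart from c(1, 2, 3) and the tags f1 and f2, the list contents below are assumptions, since the slide does not show them.

```r
k <- list(c(1, 2, 3), "text", TRUE)        # second and third elements are hypothetical
l <- list(f1 = c(1, 2, 3), f2 = "text")    # the value of f2 is hypothetical

sub_list <- k[2]     # single brackets return a list of length 1
element <- k[[2]]    # double brackets return the element itself
tail_part <- k[2:3]  # a range also returns a list

print(class(sub_list))
print(class(element))
print(identical(l$f1, l[[1]]))
```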
Data Structures: Lists
• Naming
• names(k) <- c(…)
• Operations
• lapply(list(1:2, 9:10), sum) = list(3, 19)
• sapply(list(1:2, 9:10), sum) = c(3, 19)
• l$f1 <- NULL = ???
• str(l) = ???
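A sketch contrasting lapply() and sapply(), and showing what assigning NULL does to a list element. The list l and its f2 value are assumptions for illustration.

```r
as_list <- lapply(list(1:2, 9:10), sum)   # always returns a list
as_vec <- sapply(list(1:2, 9:10), sum)    # simplifies to a vector when it can

l <- list(f1 = c(1, 2, 3), f2 = "text")   # the value of f2 is hypothetical
l$f1 <- NULL                              # assigning NULL removes the element
remaining <- names(l)

print(as_list)
print(as_vec)
print(remaining)
```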
Data Structures: Data Frames
• A two-dimensional table of objects where different columns can be of different types.
• Creation
• x <- data.frame(names = c(…, "jill"), age = c(…))
• Accessing
• x$names = c(…, "jill") [ How about x[[1]]? ]
• x[1] = ???
• x[c(1,2)] = ???
• x[1,] = ???
• x[,1] = ???
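The access forms above differ in whether they return a vector or a data frame. The sketch below assumes a two-row data frame with columns names and age; the row values are hypothetical, since the slide's originals are not recoverable.

```r
# Hypothetical rows; only the column names come from the slides
x <- data.frame(names = c("jack", "jill"), age = c(4, 6),
                stringsAsFactors = FALSE)

col_vec <- x$names     # a plain vector
col_df <- x[1]         # a one-column data frame
two_cols <- x[c(1, 2)] # a data frame with both columns
row_df <- x[1, ]       # a one-row data frame
col_again <- x[, 1]    # a plain vector again

print(class(col_vec))
print(class(col_df))
print(class(row_df))
```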
Data Structures: Data Frames
• Naming
• rownames() and colnames()
• Operations
• x[x$age > 5,] = data.frame(names = c("jill"), age = c(…))
• subset(x, age > 5) = ???
• apply(x, 1, sum) = ???
• y <- data.frame(1:3, 5:7)
• apply(y, 1, mean) = ???
• lapply(y, mean) = ???
• sapply(y, mean) = ???
• Try str(y)
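The filtering and apply-family calls can be compared side by side. In this sketch the rows of x are hypothetical, and y is given explicit column names a and b (the slide's data.frame(1:3, 5:7) would generate names automatically).

```r
# Hypothetical rows; only the column names come from the slides
x <- data.frame(names = c("jack", "jill"), age = c(4, 6),
                stringsAsFactors = FALSE)

older <- x[x$age > 5, ]          # logical row filter
older2 <- subset(x, age > 5)     # same filter, column referenced by name

y <- data.frame(a = 1:3, b = 5:7)   # column names a and b are assumptions
row_means <- apply(y, 1, mean)      # one mean per row
col_list <- lapply(y, mean)         # one mean per column, as a list
col_means <- sapply(y, mean)        # one mean per column, as a named vector

print(older)
print(row_means)
print(col_means)
```

lapply() and sapply() treat a data frame as a list of columns, which is why they work per column without a margin argument.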
Factors (and Tables)
• Type for categorical/nominal values.
• Example
• xf <- factor(c(1:3, 2, 4:5))
• Try xf and str(xf)
• Operations
• table(xf) = ???
• with(mtcars, split(mpg, cyl)) = ???
• with(mtcars, tapply(mpg, cyl, mean)) = ???
• by(mtcars, mtcars$cyl, function(m) { median(m$mpg) }) = ???
• aggregate(mtcars, list(mtcars$cyl), median) = ???
• You can use cut to bin values and create factors. Try it.
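A sketch of factors, grouping, and binning with cut(). It uses the built-in mtcars dataset as the slide does; the break points passed to cut() are an arbitrary choice made for this illustration.

```r
xf <- factor(c(1:3, 2, 4:5))
lv <- levels(xf)        # the distinct values, stored as character labels
counts <- table(xf)     # how often each level occurs

# Mean mpg for each cylinder count in the built-in mtcars dataset
mpg_by_cyl <- with(mtcars, tapply(mpg, cyl, mean))

# Bin horsepower into three ranges; the break points are hypothetical
bins <- cut(mtcars$hp, breaks = c(0, 100, 200, 400))

print(counts)
print(mpg_by_cyl)
print(table(bins))
```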
Basic Graphs
• with(mtcars, boxplot(mpg))
• hist(mtcars$mpg)
• with(mtcars, plot(hp, mpg))
• dotchart(VADeaths)
• Try plot(aggregate(mtcars, list(mtcars$cyl), median))
You can get the list of datasets via ls("package:datasets")
Stats 101 using R
• mean
• median
• What about mode?
• fivenum
• quantile
• sd
• var
• cov
• cor
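On the question of mode: base R's mode() reports an object's storage mode, not the most frequent value, and base R has no built-in function for the statistical mode. The helper stat_mode below is a hypothetical sketch of one way to compute it; the sample vector is made up.

```r
# Hypothetical helper: returns the most frequent value(s) of a vector
stat_mode <- function(v) {
  u <- unique(v)                  # distinct values, in order of appearance
  tab <- tabulate(match(v, u))    # count of each distinct value
  u[tab == max(tab)]              # keep every value that reaches the top count
}

v <- c(2, 3, 3, 5, 7, 3, 5)       # illustrative data

print(mean(v))
print(median(v))
print(stat_mode(v))
print(fivenum(v))                 # min, lower hinge, median, upper hinge, max
print(mode(v))                    # the storage mode, here "numeric"
```

When several values tie for the highest count, this sketch returns all of them.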
Data Exploration using R
Let’s get our hands dirty!
