What is R?
    What can you do with R?
               Getting Help
               Conclusions




The Statistical Significance of R

                Premal P. Vora
             Penn State Harrisburg

         School of Business Administration
             Middletown, PA 17057.
                   fpv at psu.edu


               October 19, 2009




            Premal P. Vora    The Statistical Significance of R


What is R?

     R is a system for statistical computation and graphics.
     It consists of a language plus a run-time environment with
     graphics, a debugger, access to certain system functions,
     and the ability to run programs stored in script files.
     Released under GNU GPL.
     Officially a part of GNU.
     OS: Linux, Unix, Windows and Mac
     CPUs: i386, alpha, arm, hppa, ia64, m68k, mips/mipsel,
     powerpc, s390, x86_64, powerpc-apple-darwin,
     mips-sgi-irix, i386-freebsd, rs6000-ibm-aix, and
     sparc-sun-solaris.



Why should anyone care?

     “I keep saying that the sexy job in the next 10 years will be
     statisticians,” said Hal Varian, chief economist at Google.
     “And I’m not kidding.”
     From a NY Times article published on August 6, 2009.
     Available at
     http://www.nytimes.com/2009/08/06/technology/06stats.html
     Drowning in data.
     Many, many closed commercial statistics packages
     available but not clear whether there is one winner.
     R has been widely and enthusiastically embraced in
     academia and in industry.
     R is open source, fast, solid, extensible.



Origins of R


     Created by Ross Ihaka and Robert Gentleman both at the
     University of Auckland (New Zealand).
     Follows the language definition of S as much as possible.
     S created at Bell Labs (Becker, Chambers, Wilks).
     R has lexical scoping – S does not.
     A lot like Scheme (Steele and Sussman) under the hood.
     Now developed by the R Development Core Team.
     Current stable release 2.9.2 (released on 2009-08-04).
     Available at http://www.r-project.org
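
     The lexical scoping noted above can be illustrated with a closure:
     a function remembers the variables of the environment in which it
     was defined. This is a minimal sketch; make_counter is a
     hypothetical name chosen for illustration and is not from the talk.

```r
# make_counter() is a hypothetical helper used only for illustration.
make_counter <- function() {
  count <- 0                  # lives in the environment of this call
  function() {
    count <<- count + 1       # updates 'count' where the function was defined
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

counter_a()   # first call on counter_a
counter_a()   # second call: counter_a remembers its own 'count'
counter_b()   # counter_b starts from its own, separate 'count'
```

     Each counter keeps a separate count because each call to
     make_counter() creates a new enclosing environment.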





Fundamental purpose of R




     A software tool for making inferences from data.
     Syntax and structure of the language allow the researcher to
     focus on asking the right questions and on coaxing
     answers from the data.
     Answers are trustworthy.






Philosophy of R


     A language for expressions and for assignments.
     Expressions are evaluated and the result is immediately
     displayed.
     Assignments also evaluate an expression, but the result is
     assigned to an object and not printed.
     All assigned objects are held in memory (the workspace).
     A facility to save and to load the workspace is
     available.
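
     The difference between expressions and assignments, and the
     save/load facility, can be sketched as follows. The object name x
     and the temporary file are assumptions made for illustration.

```r
2 + 3            # an expression: evaluated and printed immediately
x <- 2 + 3       # an assignment: evaluated and stored in x, nothing printed
x                # typing the name prints the stored value

# Save the assigned object to a file and load it back.
# tempfile() keeps this sketch from writing into the working directory.
f <- tempfile(fileext = ".RData")
save(x, file = f)
rm(x)            # remove x from memory
load(f)          # x is restored from the file
x
```

     To save everything that has been assigned in a session rather
     than a single object, save.image() can be used in the same way.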






Simple numerical computation



     Operators: +, -, *, /, ^, ...
     Comparison and logical operators: ==, >, >=, !=, &, |, ...
     Hundreds of mathematical, statistical, and other functions:
     sqrt, log, log10, cos, tan, sum, min, max, mean, median,
     sort, ...
     Functions operate on numbers and a variety of data
     objects.
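
     A short sketch of the operators and functions listed above. The
     vector v holds made-up values chosen only for illustration.

```r
7 / 2                  # arithmetic
2^10                   # exponentiation
5 >= 3                 # a comparison returns a logical value
(5 > 3) & (2 != 2)     # logical values combined with &

v <- c(4, 9, 16, 25)   # illustrative data
sqrt(v)                # functions work element by element on vectors
sum(v)
mean(v)
median(v)
sort(v, decreasing = TRUE)
v > 10                 # comparisons are applied element by element too
```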






(Data) objects

     Data objects: scalars, vectors, matrices, lists, dataframes.
     Objects can contain numbers, strings, logical quantities, or
     other objects.
     All elements in vectors and matrices must be of the same
     “mode” (R converts non-conforming elements on the fly
     when necessary).
     A list is a flexible data object that can contain other data
     objects, each of a different mode.
     A dataframe is like a matrix, but each column is a vector with
     its own “mode”. This allows a collection of data of
     different modes to be treated as one object.
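
     The data objects described above can be sketched as follows. All
     names and values are invented for illustration.

```r
v <- c(1.5, 2, 3)                  # numeric vector
mixed <- c(1, "a", TRUE)           # elements are converted to a common mode
mode(mixed)                        # the common mode here is "character"

m <- matrix(1:6, nrow = 2)         # matrix with 2 rows and 3 columns

# A list can hold elements of different modes, including other objects.
l <- list(id = 1L, name = "Ann", scores = v)

# A dataframe: columns of equal length, each with its own mode.
df <- data.frame(
  name   = c("Ann", "Bob", "Cy"),
  age    = c(31, 45, 27),
  member = c(TRUE, FALSE, TRUE)
)
str(df)                            # structure of the whole object
sapply(df, mode)                   # mode of each column
```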




Visualization/Graphics


     Base R automatically loads packages for creating visual
     displays of data.
     Very strong in this area.
     Graphics are customizable.
     Many open-source add-on packages for graphics are
     available.
     You can always write your own using the graphics
     primitives in R if dissatisfied with what’s available.
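
     A minimal sketch of a customized plot built from a high-level
     call plus a few graphics primitives. The data are simulated and
     the output file is a temporary PDF; both are assumptions made so
     that the sketch runs without a display.

```r
# Simulated data, for illustration only.
set.seed(1)
x <- 1:50
y <- 2 + 0.5 * x + rnorm(50, sd = 3)

# Draw into a temporary PDF so the sketch also runs without a screen device.
out <- tempfile(fileext = ".pdf")
pdf(out)

# High-level plot with customized title, labels, symbol, and color.
plot(x, y, main = "Simulated x,y data", xlab = "x", ylab = "y",
     pch = 19, col = "steelblue")

# Primitives that add to the existing plot.
abline(h = mean(y), lty = 2)                           # reference line
text(10, max(y), "dashed line: mean of y", pos = 4)    # label
segments(x0 = 25, y0 = min(y), x1 = 25, y1 = max(y), col = "grey")

dev.off()                                              # close the PDF device
```

     At an interactive prompt, the pdf() and dev.off() calls can be
     dropped and the plot appears in a graphics window.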






Extension




     Language for creating your own functions.
     Collect a group of functions into a package and share it
     with others if you like.
     Get feedback from the user base.
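
     Writing a function takes only a few lines. The sketch below
     defines coef_var, a hypothetical function (not from the talk)
     that computes the coefficient of variation.

```r
# coef_var() is a hypothetical example: standard deviation divided by the mean.
coef_var <- function(x, na.rm = FALSE) {
  sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

coef_var(c(10, 12, 9, 11))
coef_var(c(10, NA, 9, 11), na.rm = TRUE)   # missing values dropped on request
```

     As one possible route to sharing, package.skeleton() in base R
     generates a starting directory layout for a package from
     functions such as this one.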






Live Demo


    Simple numerical computations.
    Math functions.
    Logical comparisons.
    Vectors, vector arithmetic.
    Reading a dataframe.
    Making tabular summaries of data.
    Visualization of x,y data.
    Running linear regression: y = a + bx + e.
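
     The slides do not record the demo itself. The sketch below shows
     what the last few steps might look like, assuming the mtcars data
     set that ships with R as a stand-in for the demo data, with
     y = mpg and x = wt.

```r
# Write the built-in mtcars data to a temporary CSV file, then read it back,
# to mimic reading a dataframe from disk. The file is an assumption.
f <- tempfile(fileext = ".csv")
write.csv(mtcars, f)
cars <- read.csv(f, row.names = 1)

head(cars)                  # first few rows
table(cars$cyl)             # tabular summary: number of cars by cylinder count
summary(cars$mpg)           # numerical summary of one column

# Linear regression y = a + b*x + e, with y = mpg and x = wt.
fit <- lm(mpg ~ wt, data = cars)
coef(fit)                   # estimates of a (intercept) and b (slope)
summary(fit)                # standard errors, t statistics, R-squared
```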





Help Sources

     Built-in help: ?functionname
     Built-in help: help.search("subject")
     Built-in help: example(functionname)
     Online help: At the project website...several well-written
     manuals
     Online help: Several mailing lists including R-help
     Some mailing lists are devoted to specific special-interest
     groups such as R-SIG-Finance, R-SIG-ecology, etc.
     Support from third-party commercial firms is also available
     (for a fee).




Strengths


     Fast, reliable, powerful, flexible, solid alternative to
     commercial packages.
     Well-known in academia and in industry.
     Many packages (2,000?) for different tasks available.
     Active development, eager support.
     Availability of support from third-party commercial firms
     makes it viable for proprietary use.
     Open source, GPL License.






Weaknesses (All personal observations)




     Memory management restricts data set size to the size of (RAM
     + swap space - space occupied by other processes).
     Learning curve for basic to intermediate usage is relatively
     flat, but thereafter steep.
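
     Because objects are held in memory, it can help to check how much
     space a data object occupies. A minimal sketch, using an
     illustrative vector of one million numbers:

```r
n <- 1e6
x <- numeric(n)                        # one million double-precision values
object.size(x)                         # size in bytes
print(object.size(x), units = "MB")    # the same size in megabytes
```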



