R vs Python vs SAS
Oliver Frost
Wednesday, 18 January 2017
18/1/2017 Copyright Consolidata Ltd 2017 1
Today’s session:
• A (very quick) introduction to business intelligence and the big data
industry.
• The role of the analyst.
• What is R? What is Python? What is SAS?
• Why should I learn them?
• What can I use them for?
Oliver Frost
GitHub: https://github.com/olfrost
Twitter: @Consolidata
LinkedIn: https://uk.linkedin.com/in/olliefrost
Consolidata Ltd
Twitter: @ConsolidataLtd
http://www.consolidata.co.uk
Background
• Cognitive Neuroscience BSc
• Multiple disciplines – biology, chemistry,
psychology, sociology:
• Designing experiments
• Data collection and research methods
• Testing for significance, power calculations,
predictive modelling
• Data protection, data ethics
• Now working as a data engineer:
• Cleaning, reshaping and normalising survey
data for a market research company
• Developing the Consolidata Data Platform.
• Active member of the data analytics
community
Working as an analyst
• You may be familiar with some tools already, depending
on where you’ve come from:
• Excel and Office tools
• SPSS, MATLAB
• SQL
• BI and analytics are a bit of a continuous process:
• Cleaning data – missing values? Bad data?
• Reshape data – is the data in the right format?
• Loading – how much is there?
• Find patterns – do these patterns add value?
• Presentation – can you tell a story?
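The first few steps of that process can be sketched in a few lines of Python. This is only an illustration using the standard library: the survey rows, the column names and the helper functions (`clean`, `reshape_long`, `question_means`) are made up for the example and are not taken from any real project.

```python
from statistics import mean

# Hypothetical survey responses in "wide" format; None marks a missing value.
raw_rows = [
    {"respondent": "a", "q1": 4, "q2": 5},
    {"respondent": "b", "q1": None, "q2": 3},
    {"respondent": "c", "q1": 2, "q2": None},
]

def clean(rows, questions):
    """Cleaning: keep only respondents who answered every question."""
    return [r for r in rows if all(r[q] is not None for q in questions)]

def reshape_long(rows, questions):
    """Reshaping: one (respondent, question, answer) record per given answer."""
    return [
        {"respondent": r["respondent"], "question": q, "answer": r[q]}
        for r in rows
        for q in questions
        if r[q] is not None
    ]

def question_means(long_rows):
    """Finding patterns: average answer per question."""
    by_question = {}
    for rec in long_rows:
        by_question.setdefault(rec["question"], []).append(rec["answer"])
    return {q: mean(vals) for q, vals in by_question.items()}

questions = ["q1", "q2"]
complete = clean(raw_rows, questions)         # strict: complete cases only
long_rows = reshape_long(raw_rows, questions) # lenient: keep every given answer
summary = question_means(long_rows)
```

Whether to drop incomplete respondents or keep their partial answers is a judgement call that depends on the analysis; the sketch shows both options side by side.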
What is R?
• R is an open-source programming language, developed by academics
and statisticians
• Originally built for maths and statistical analysis, but slowly becoming an
all-purpose language:
• Collect and analyse social media data
• Text analytics
• Predict trends
• Train machines to make predictions
• Scrape data from websites
• Also a great visualisation tool!
What is R?
• It’s easy to learn
• It’s free to use
• R skills are in demand
• The language is becoming increasingly
popular
• Open-source means you know exactly
what your program is doing
• Integration with other tools like Excel, SQL
Server and pretty much any data analysis
tool!
• Shorter development cycles because new
modules and packages are being released
all the time
What is Python?
• An all-purpose, general language that works on multiple platforms
• High-level and easy to learn, like R
• More commonly used for machine
learning and predictive modelling
(particularly good for academics and
data scientists)
• Open source and free to learn and use
• More commonly used by developers
Source: http://spectrum.ieee.org/computing/software/the-2016-top-programming-languages (IEEE - Institute of
Electrical and Electronics Engineers)
What is SAS?
• Statistical Analysis System
• Stores data in tables and can be used for:
• Writing reports
• Developing applications
• Data warehousing
• Data mining
• You don’t have to be technical…
What do businesses use these tools for?
• Building “data pipelines”:
• New data is coming in all the time
• Needs to be extracted, transformed and loaded
• Needs to be fast
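As a rough idea of what "extract, transform, load" looks like in code, here is a minimal Python sketch using only the standard library. The CSV batch, the `orders` table and the three function names are assumptions made for this example; a real pipeline would read from files, APIs or message queues and load into a proper database rather than an in-memory SQLite one.

```python
import csv
import io
import sqlite3

# Hypothetical incoming batch; in practice this would arrive from a file or API.
INCOMING_CSV = """order_id,customer,amount
1,alice, 19.99
2,bob,
3,carol,5.00
"""

def extract(text):
    """Extract: parse raw CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim whitespace, convert types, skip rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # bad data: no amount recorded
        cleaned.append((int(row["order_id"]), row["customer"].strip(), float(amount)))
    return cleaned

def load(conn, records):
    """Load: insert the cleaned records in a single batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(INCOMING_CSV)))
loaded_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

Inserting the whole batch with `executemany` rather than row by row is one small way to address the "needs to be fast" point.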
What do businesses use these tools for?
• Descriptive Analytics
• These skills are in demand.
• Businesses want to know about their
historical data.
• They also want to know what is happening
right now.
• New marketing opportunities? Save time
and money in current processes?
• Machine learning and data science?
• Can our customers be divided into clusters?
• Can we predict what a customer is likely to
buy and make recommendations?
• Can we detect fraud? Can we predict risk?
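To make the clustering question concrete, here is a small Python sketch that splits customers into two groups by annual spend. It uses a hand-written one-dimensional k-means, chosen here purely for illustration (the talk does not name a specific algorithm), and the spend figures are invented. In practice you would more likely reach for an established library in R or Python.

```python
from statistics import mean

def kmeans_1d(values, centres, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest centre, then
    move each centre to the mean of the values assigned to it."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Keep the old centre if a cluster ends up empty.
        centres = [mean(c) if c else centres[i] for i, c in enumerate(clusters)]
    return centres, clusters

# Hypothetical annual spend per customer.
spend = [12, 15, 14, 210, 190, 205]

# Start with one centre at the lowest spend and one at the highest.
centres, clusters = kmeans_1d(spend, centres=[min(spend), max(spend)])
```

With this data the low spenders and high spenders end up in separate clusters, which is the kind of segmentation a marketing team might then act on.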
Conclusions
• Learning a language can be intimidating, especially from a
non-technical background.
• But from my experience, it was absolutely worth it.
• No need to pick one tool over the others; they are all great.
• I would recommend R, though…
