The Colorful World of
Data Science
Sreejith C
Data Scientist
Calpine Labs
UVJ Technologies
Kochi
Overview
- Presentation: Introduction to Data Science
- Demonstration: Loan Prediction Problem
- Exploratory data analysis in Python
- Data Munging in Python
- Building a Predictive Model in Python
Logistic Regression
Decision Tree
Random Forest
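The exploratory data analysis and data munging steps listed in the overview might look roughly like the sketch below. It assumes pandas is installed; the toy records and column names are hypothetical stand-ins, not the schema used in the actual demonstration.

```python
import pandas as pd

# Hypothetical toy records; column names are assumptions for illustration only.
loans = pd.DataFrame({
    "Gender": ["Male", "Female", None, "Male"],
    "ApplicantIncome": [5000, 3000, 4000, 6000],
    "LoanAmount": [120.0, None, 100.0, 200.0],
    "Loan_Status": ["Y", "N", "Y", "Y"],
})

# Exploratory data analysis: summary statistics and missing-value counts.
summary = loans.describe()
missing_before = loans.isnull().sum()
print(summary)
print(missing_before)

# Data munging: fill numeric gaps with the median, categorical gaps with the mode.
loans["LoanAmount"] = loans["LoanAmount"].fillna(loans["LoanAmount"].median())
loans["Gender"] = loans["Gender"].fillna(loans["Gender"].mode()[0])

missing_after = int(loans.isnull().sum().sum())
print("Missing values after munging:", missing_after)
```

Median and mode imputation are only one simple choice; a real data set may call for a different strategy per column.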
What is Data Science ?
The Science of
- Discovering what we don’t know from data
- Obtaining predictive, actionable insight from data
- Creating Data Products that have business impact now
- Communicating relevant business stories from data
- Building confidence in decisions that drive business value
“Data science is clearly a blend of the hackers’ arts,
statistics and machine learning...
and the expertise in mathematics and the domain of
the data for the analysis to be interpretable...
It requires creative decisions and open-mindedness in
a scientific context”
Hilary Mason and Chris Wiggins
Hilary Mason is an American data scientist and the founder of technology startup Fast Forward Labs, as well as Data Scientist in Residence at Accel Partners. She was the Chief Scientist at bitly.
Christopher H. Wiggins is an associate professor of applied mathematics at Columbia University, the first Chief Data Scientist at The New York Times, and co-founder and co-organizer of hackNY (hackNY.org).
THE DATA SCIENCE VENN DIAGRAM
Who is a Data Scientist ?
“We realized that as our organizations grew, we both had to figure
out what to call the people on our teams.
Business analyst and Data analyst seemed too limiting.
The focus of our teams was to work on data applications that would
have an immediate and massive impact on the business.
The term that seemed to fit best was data scientist:
those who use both data and science to create something new”
DJ Patil
Chief Data Scientist of the United States Office of Science and Technology Policy. Along with Jeff Hammerbacher, Patil is credited with coining the term "data scientist".
What Does a Data Scientist
Do?
“... on any given day, a team member could author a multistage
processing pipeline in Python,
design a hypothesis test, perform a regression analysis over data
samples with R,
design and implement an algorithm for some data-intensive product
or service in Hadoop,
communicate the results of our analyses to other members of the
organization”
Jeff Hammerbacher
Data scientist, chief scientist, and cofounder at Cloudera. Along with DJ Patil, Hammerbacher is credited with coining the term "data scientist".
Machine Learning
- Regression
- Classification
- Clustering
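The three task families above differ in what they predict: a continuous value, a discrete label, or a grouping of unlabelled points. A minimal sketch, assuming NumPy and scikit-learn are installed; the toy data and variable names are invented for illustration and do not come from the talk.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression

# One-feature toy data: three small values and three large values.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])

# Regression: predict a continuous value (targets follow y = 2x + 1 exactly).
y_continuous = 2 * X.ravel() + 1
reg = LinearRegression().fit(X, y_continuous)
predicted_value = reg.predict([[4.0]])[0]

# Classification: predict a discrete label from labelled examples.
y_labels = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X, y_labels)
predicted_label = clf.predict([[11.5]])[0]

# Clustering: group unlabelled points; no targets are supplied.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_

print(predicted_value, predicted_label, cluster_ids)
```

Note that the cluster numbering is arbitrary: only the grouping is meaningful, not which group is called 0 or 1.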
Big Data Analytics
How to become a data scientist ?
Data scientists need to know how to code
Python
R
Julia
Java
Scala
SQL / NoSQL
Spark / Hadoop
Data scientists need to be comfortable with
mathematics & statistics.
Data scientists need to know machine learning &
software engineering.
Putting the pieces together .....
SIMPLE (Students' Innovations in Morphology Phonology and
Language Engineering) groups
CLEAR (Computational Linguistics in Engineering And
Research) magazine
- Blog / Write about your experience
- Build sample projects
- Share ideas
Puzzle
A huntsman can hit a target with a probability of 0.8.
He sees a flock of birds (150 birds) atop a banyan tree.
He takes aim and fires 5 consecutive shots.
Question: How many birds remain on the tree?
Don't lose the big picture!!
0! At the first shot, every bird that is not hit flies away.
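For contrast, this is the calculation the puzzle tempts you into. It is a sketch that assumes the five shots are independent (the slide does not say so) and treats the number of hits as binomial. The arithmetic is correct, but it answers a different question from the one asked, which is the point of the puzzle.

```python
from math import comb

p_hit = 0.8  # probability that a single shot hits, from the puzzle
shots = 5    # number of shots fired


def prob_hits(k: int) -> float:
    """Probability of exactly k hits in `shots` independent shots (binomial)."""
    return comb(shots, k) * p_hit**k * (1 - p_hit) ** (shots - k)


expected_hits = shots * p_hit               # mean of the binomial: 4.0
p_at_least_one = 1 - (1 - p_hit) ** shots   # about 0.9997

print("Expected hits:", expected_hits)
print("P(at least one hit):", round(p_at_least_one, 5))
```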
Loan Prediction Problem
The challenge is to predict the approval status of a loan (Approved / Rejected).
Link: https://github.com/sreejithc321/ML_Regression/tree/master/loan_prediction
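The three models named in the overview (logistic regression, decision tree, random forest) can be compared on a binary approve/reject task along the following lines. This is a sketch, not the code from the linked repository: it assumes scikit-learn is installed and uses synthetic data from `make_classification` as a stand-in for the loan records, with hyperparameters chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the loan data: 1 = approved, 0 = rejected (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

Accuracy on a held-out split is the simplest comparison; on real loan data, cross-validation and class-imbalance checks would likely be worth adding.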
Demonstration
Connect with me at: http://in.linkedin.com/in/sreejithc321
Follow me at: https://twitter.com/sreejithc321
