ahmed.rebai@esprit.tn, Lotfi.ncib@esprit.tn
Introduction to Data Science
Dr Ahmed Rebai, PhD in nuclear physics
Dr Lotfi Ncib, PhD in applied maths
Outline
▪ The Data Explosion
▪ The Data History
▪ Why Data Science?
▪ What is Data Science?
▪ Steps in The Data Science Process
▪ Career in Data Science
▪ Data Science Tools
The Data Explosion
How Much Data Is Collected Every Minute of the Day in 2019?
The Data History
Since the dawn of time and up until 2005, humans had created 130 exabytes of data.
▪ 2005: 130 exabytes
▪ 2010: 1,200 exabytes
▪ 2015: 7,900 exabytes
▪ 2020: 40,900 exabytes
Byte (B)
Kilobyte (kB): 1,000 = 10^3 bytes
Megabyte (MB): 1,000,000 = 10^6 bytes
Gigabyte (GB): 1,000,000,000 = 10^9 bytes
Terabyte (TB): 1,000,000,000,000 = 10^12 bytes
Petabyte (PB): 1,000,000,000,000,000 = 10^15 bytes
Exabyte (EB): 1,000,000,000,000,000,000 = 10^18 bytes
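The unit ladder above can be sketched in a few lines of Python. This is only an illustration: the names `UNITS`, `humanize`, and `to_bytes` are made up for this example, and it assumes decimal units where each step is a factor of 1000, as in the list above.

```python
# Decimal byte units: each step up is a factor of 1000.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a value expressed in `unit` into a plain byte count."""
    return int(value * 1000 ** UNITS.index(unit))


def humanize(num_bytes):
    """Format a byte count with the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1000,
        # or at the last unit we know about.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


# Yearly totals quoted on the slide, in exabytes.
for year, exabytes in [(2005, 130), (2010, 1200), (2015, 7900), (2020, 40900)]:
    print(year, humanize(to_bytes(exabytes, "EB")))
```

Run on the slide's figures, the 2010 total already crosses into zettabytes, which is why the larger unit becomes convenient.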
Start with 1 byte of space.
If we zoom out 1000 times, we get a page of text (1 kB), about 500 characters.
Zoom out another 1000 times and we get a book of about 500 pages, which takes 1 MB.
Zoom out another 1000 times and we get 1 GB. That is enough to hold a complete human genome once encoded (it usually takes about 725 MB).
Zoom out another 1000 times and we reach 1 TB, enough to hold a recording of someone's life for 8 years (everything they do, every minute or second).
Zoom out another 1000 times and we reach 1 PB. The Amazon rainforest covers 1.4 billion acres with about 500 trees per acre, or 700 billion trees. If you chopped down all these trees, turned them into paper, and filled the paper with letters on both sides, you would have close to 1 PB.
Zoom out another 1000 times and we reach 1 EB (1000 PB).
1 zettabyte (ZB) = 1000 exabytes (EB)
Why Data Science?
Salary trends have followed the impact of data science. With a national
average salary of $118,000 (which increases to $126,000 in Silicon Valley), data
science has become a lucrative career path where you can solve hard
problems and drive social impact.
Data scientist is the sexiest career of
the 21st century.
Statistical analysis and data mining were the
hottest skills that got recruiters' attention in
2014, 2015, 2016, 2017, and 2018.
The US alone faces a shortage of more than
150,000 data analysts and an additional 1.5
million data-savvy managers.
What is Data Science?
“The ability to take data — to be able to understand it, to process it, to extract value
from it, to visualize it, to communicate it — that’s going to be a hugely important
skill in the next decades.”
- Hal Varian, chief economist at Google and UC Berkeley professor of information
sciences, business, and economics
Data science is the area of study that involves extracting insights from vast
amounts of data using various scientific methods, algorithms, and processes.
Data science is the science which uses computer science, statistics, machine
learning, visualization, and human-computer interaction to collect, clean, integrate,
analyze, visualize, and interact with data in order to create data products.
Steps in The Data Science Process
ACQUIRE → PREPARE → ANALYZE → REPORT → ACT
Data Engineering
Computational Data Science
Step 1: Acquire Data
▪ Identify data sets
▪ Retrieve data
▪ Query data
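As a rough sketch of the retrieve and query bullets, the example below loads a few rows into an in-memory SQLite database and runs one aggregate query. The table name, column names, and rows are hypothetical and invented only for illustration; a real project would pull from files, APIs, or an existing database.

```python
import sqlite3

# Hypothetical toy data set, invented purely for illustration.
ROWS = [
    ("alice", "retail", 120.0),
    ("bob", "finance", 75.5),
    ("carol", "retail", 42.0),
]


def load_purchases(rows):
    """Retrieve: load raw rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE purchases (customer TEXT, sector TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def total_by_sector(conn):
    """Query: sum the amounts per sector and return them as a dict."""
    cursor = conn.execute(
        "SELECT sector, SUM(amount) FROM purchases "
        "GROUP BY sector ORDER BY sector"
    )
    return dict(cursor.fetchall())


conn = load_purchases(ROWS)
print(total_by_sector(conn))
```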
Step 2: Prepare Data
▪ Explore Data
➢ Understand nature of data
➢ Preliminary analysis
▪ Pre-process Data
➢ Clean
➢ Transform
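The explore, clean, and transform bullets can be illustrated with plain Python. This is a minimal sketch under stated assumptions: the records are hypothetical, "dirty" is taken to mean a missing field or a negative age, and min-max scaling is just one possible transform, not one the slides prescribe.

```python
import statistics

# Hypothetical raw records; None and the negative age stand in for dirty values.
raw = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": -5, "income": 48000},
    {"age": 41, "income": 75000},
]


def explore(records, field):
    """Preliminary analysis: count missing values and summarize the rest."""
    present = [r[field] for r in records if r[field] is not None]
    return {
        "missing": len(records) - len(present),
        "min": min(present),
        "max": max(present),
        "median": statistics.median(present),
    }


def clean(records):
    """Clean: drop rows with a missing field or an impossible (negative) age."""
    return [
        r for r in records
        if r["age"] is not None and r["income"] is not None and r["age"] >= 0
    ]


def transform(records):
    """Transform: add income rescaled to the [0, 1] range (min-max scaling)."""
    incomes = [r["income"] for r in records]
    low, high = min(incomes), max(incomes)
    span = (high - low) or 1  # avoid dividing by zero when all values are equal
    return [{**r, "income_scaled": (r["income"] - low) / span} for r in records]


print(explore(raw, "age"))
print(transform(clean(raw)))
```

Exploring first matters here: the summary exposes the negative age, which is what motivates the cleaning rule.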
Step 3: Analyze Data
▪ Select analytical techniques
▪ Build models
➢ Classification (e.g. spam detection)
➢ Regression
➢ Clustering
➢ Dimensionality reduction
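To make the classification technique concrete, here is a small k-nearest-neighbours (KNN) sketch for a spam-style problem. The two features and all training points are hypothetical, and KNN is only one of many classifiers that could be selected at this step.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training examples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features per message: (number of links, number of "!" marks).
train = [
    ((0, 0), "ham"),
    ((1, 0), "ham"),
    ((0, 1), "ham"),
    ((5, 4), "spam"),
    ((6, 5), "spam"),
    ((4, 6), "spam"),
]

print(knn_predict(train, (5, 5)))
print(knn_predict(train, (1, 1)))
```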
Step 4: Communicate Results
▪ What to present
▪ How to present
Step 5: Turning Insights into Action
▪ Results
▪ Purpose
Career in Data Science
Domain Expertise
❖ Health care
❖ Retail
❖ Finance
❖ Education
❖ …
Programming Languages
❖ Python
❖ R
❖ C++
❖ Java
❖ Julia
❖ Scala
❖ …
Projects
❖ Sentiment analysis
❖ Card fraud detection
❖ Customer segmentation
❖ Image classification
❖ Loan default detection
❖ …
Lingo/Foundations
❖ Machine Learning
❖ Deep Learning
❖ Classification
❖ Regression
❖ Clustering
❖ Decision trees
❖ KNN
❖ SVM
❖ K-means
❖ PAC
❖ …
Math/Statistics/Probability
❖ Linear algebra
❖ Bayes' theorem
❖ Mean, median, and mode
❖ Covariance and correlation
❖ Central Limit Theorem
❖ Normal distribution
❖ …
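Several of the statistics items listed above (mean, median, mode, covariance, correlation) fit in a short Python sketch. The sample numbers are hypothetical, and the helper names `covariance` and `correlation` are written out by hand here for illustration rather than taken from any particular library.

```python
import math
import statistics


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


# Hypothetical sample values.
values = [2, 3, 3, 5, 7]
print(statistics.mean(values), statistics.median(values), statistics.mode(values))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]  # exactly 2 * xs, so the correlation should be 1
print(covariance(xs, ys), correlation(xs, ys))
```

Covariance depends on the units of both variables, while correlation is unit-free and always lies between -1 and 1, which is why the two are usually reported together.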
Data Science Tools
Visualization Tools
Modeling Tools