Paradoxes in
Data Science
Pier Paolo Ippolito
Hello!
▰ Data Scientist at SAS Institute
▰ Towards Data Science writer
▰ Freelancer
▰ MSc Artificial Intelligence
(University of Southampton)
My Work
Paradoxes
A paradox arises when reasoning from premises accepted as true leads to a
conclusion that seems logically unreasonable. Because Machine Learning models
build their knowledge from data, they are susceptible to paradoxes that surface
between training and testing.
Agenda
▰ Data Science without data: Modelling and
Simulations
▰ Simpson's Paradox
▰ Accuracy Paradox
▰ Learnability-Gödel Paradox
▰ The Law of Unintended Consequences
1.
Modelling and Simulations in
Data Science
Using Data Science and Machine
Learning even when there is no data
available.
“Essentially all models are
wrong, but some are
useful.”
George E. P. Box, Statistics for Experimenters,
second edition, 2005, page 440
Epidemic Modelling: COVID-19
Modelling Approaches
There are two main types of programmable simulation models:
▰ Mathematical Models: use mathematical symbols and relationships to
summarise processes. Compartmental Models in Epidemiology are a
typical example of mathematical models (e.g. SIR, SEIR).
▰ Process Models: based on a list of steps handcrafted by the designer to
represent an environment (e.g. Agent-Based Modelling).
Compartmental Models
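As a rough illustration of a compartmental model, the SIR equations (dS/dt = -βSI/N, dI/dt = βSI/N - γI, dR/dt = γI) can be integrated with a simple forward-Euler loop. This is only a sketch: the function name `simulate_sir` and every parameter value below are assumptions chosen for illustration, not values fitted to COVID-19 data or taken from the presentation.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with forward Euler.

    beta  : transmission rate per day (hypothetical value supplied by the caller)
    gamma : recovery rate per day (hypothetical value supplied by the caller)
    Returns a list of (day, susceptible, infected, recovered) tuples.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(0.0, s, i, r)]
    steps_per_day = round(1 / dt)
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((float(day), s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative parameters only (beta / gamma = 3).
    for day, s, i, r in simulate_sir(1000, 1, beta=0.3, gamma=0.1, days=160)[::20]:
        print(f"day {day:5.0f}  S={s:7.1f}  I={i:7.1f}  R={r:7.1f}")
```

Forward Euler is used here only because it keeps the sketch short; a dedicated ODE solver would normally be a better choice for real work.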
Agent Based Models
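A process model of the same epidemic can be sketched as a list of handcrafted steps applied to individual agents. The example below is a minimal, well-mixed agent-based simulation in plain Python; the function name `run_abm`, the daily contact rule, and all default parameter values are assumptions made for illustration and are not the model used in the presentation.

```python
import random

SUSCEPTIBLE, INFECTED, RECOVERED = "S", "I", "R"


def run_abm(n_agents=200, initial_infected=5, contacts_per_day=4,
            p_transmit=0.1, p_recover=0.1, days=60, seed=0):
    """Simulate agents that meet random others once per day.

    Returns one dict of state counts per simulated day.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    states = ([INFECTED] * initial_infected
              + [SUSCEPTIBLE] * (n_agents - initial_infected))
    counts = []
    for _ in range(days):
        infected_today = [a for a, st in enumerate(states) if st == INFECTED]
        newly_infected, newly_recovered = set(), set()
        for agent in infected_today:
            # Step 1: the infected agent meets a few random agents.
            for _ in range(contacts_per_day):
                other = rng.randrange(n_agents)
                if states[other] == SUSCEPTIBLE and rng.random() < p_transmit:
                    newly_infected.add(other)
            # Step 2: the infected agent may recover.
            if rng.random() < p_recover:
                newly_recovered.add(agent)
        # Step 3: apply all state changes at the end of the day.
        for a in newly_infected:
            states[a] = INFECTED
        for a in newly_recovered:
            states[a] = RECOVERED
        counts.append({st: states.count(st)
                       for st in (SUSCEPTIBLE, INFECTED, RECOVERED)})
    return counts


if __name__ == "__main__":
    for day, c in enumerate(run_abm(), start=1):
        if day % 10 == 0:
            print(day, c)
```

Unlike the compartmental equations, each run depends on the random seed, so results are usually averaged over many runs.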
Web Application
✔ Streamlit interface
✔ Docker support
✔ Automatic update of COVID-19 stats every 24 hours
✔ International news updated every 2 hours
✔ Docker container hosted on the Azure Container registry, deployed on an Azure
Web App
Web Application
Practical Demonstration
✔ Complete Publication
✔ Extras
✔ Open Source Code
✔ Web Application
✔ Medium Article
Forest Fire Simulation
(Mesa)
Forest Fire Simulation
(Hash)
2.
Simpson’s Paradox
«Simpson's paradox is a phenomenon in probability and statistics in which a trend
appears in several different groups of data but disappears or reverses when these
groups are combined. The paradox can be resolved when causal relations are
appropriately addressed in the statistical modelling.»
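The reversal described in the definition above can be reproduced with a few lines of arithmetic. The counts below are hypothetical, invented purely for this sketch: treatment "A" has the higher success rate inside each severity group, yet treatment "B" looks better once the groups are pooled, because the two treatments were applied to very different mixes of cases.

```python
# Hypothetical (successes, total) counts per treatment and severity group.
COUNTS = {
    ("A", "mild"): (9, 10),
    ("B", "mild"): (80, 100),
    ("A", "severe"): (60, 100),
    ("B", "severe"): (5, 10),
}


def group_rates(counts):
    """Success rate for every (treatment, group) pair."""
    return {key: ok / total for key, (ok, total) in counts.items()}


def pooled_rates(counts):
    """Success rate per treatment after combining all groups."""
    totals = {}
    for (treatment, _group), (ok, total) in counts.items():
        prev_ok, prev_total = totals.get(treatment, (0, 0))
        totals[treatment] = (prev_ok + ok, prev_total + total)
    return {t: ok / total for t, (ok, total) in totals.items()}


if __name__ == "__main__":
    for key, rate in sorted(group_rates(COUNTS).items()):
        print(key, f"{rate:.2f}")
    for treatment, rate in sorted(pooled_rates(COUNTS).items()):
        print(treatment, "pooled", f"{rate:.2f}")
```

Here severity acts as the confounder: most "A" cases are severe and most "B" cases are mild, which is why the pooled comparison is misleading.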
3.
Accuracy Paradox
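The accuracy paradox is usually illustrated on imbalanced data, where a model that never predicts the rare class can still score a higher accuracy than a model that is actually useful. The sketch below uses made-up labels (5 positives out of 100) and two hypothetical sets of predictions; the helper names are assumptions for this example only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the model actually finds."""
    found = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual = sum(t == positive for t in y_true)
    return found / actual


# Made-up, imbalanced labels: 5 positive cases followed by 95 negative ones.
y_true = [1] * 5 + [0] * 95

# Hypothetical model 1: always predicts the majority (negative) class.
always_negative = [0] * 100

# Hypothetical model 2: finds 4 of the 5 positives but raises 10 false alarms.
screening = [1] * 4 + [0] * 1 + [1] * 10 + [0] * 85

if __name__ == "__main__":
    for name, pred in [("always_negative", always_negative),
                       ("screening", screening)]:
        print(name, accuracy(y_true, pred), recall(y_true, pred))
```

The first model wins on accuracy while detecting no positive case at all, which is why metrics such as recall or precision are commonly reported alongside accuracy on imbalanced problems.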
“When a measure becomes
a target, it ceases to be a
good measure.”
Charles Goodhart
Pareto Principle
4.
Learnability-Gödel Paradox
Kurt Gödel is one of the most famous mathematicians of the last century. Among his
most important results are the two Incompleteness Theorems:
▰ According to these theorems, any consistent formal system rich enough to express
arithmetic contains statements that can be neither proved nor disproved within it.
The whole field of Data Science is deeply interconnected with mathematical thinking,
so it inherits this limitation, which leads us to a paradox (the Learnability-Gödel
Paradox).
▰ As a consequence, whether it is possible to make extrapolations from a population
sample can, for some learning problems, be a question that the underlying axioms
cannot settle either way.
5.
The Law of Unintended
Consequences
6.
Conclusion
In this presentation, I introduced some of the main paradoxes related to Data Science.
Many other common paradoxes also have potential implications for Data Science and
Artificial Intelligence. Some examples are:
• Friendship paradox in Network Analysis
• Berkson’s Paradox
• Braess’s Paradox
• Moravec’s Paradox
• Birthday Paradox
Thank you! Questions?
Contacts:
▰ LinkedIn
▰ GitHub
▰ Online Portfolio
▰ Towards Data Science