Data Scientists: Myths &
Mathemagical Powers
      James Kobielus
James Kobielus shoots down
10 myths about Data Scientists



      “Data Scientists: Myths and Mathemagical Powers,”
    James Kobielus, Thinking Inside the Box, June 29, 2012
Myth #1




Data scientists are mythical
 beings, like the unicorns.
IBMbigdatahub.com
Myth #2




 Data scientists are an elite
bunch of precious eggheads.
Data scientists get their fingernails
  dirty dumping piles of data into
 analytical sandboxes, cleansing,
  and sifting through it for useful
patterns that may or may not exist.
  Then, they do it all over again.



It’s often mind-numbingly detailed grunt work,
not the sport of armchair data philosophers.

              Reality #2
Myth #3




Data scientists are a nouveau
   fad that will soon fade.
The term “data scientist” has been
around for years, and the various
   advanced analytics specialties
  that fall under it are even older.
Recently, the term has been used
 in the convergence of disciplines
    that have become super-hot.


Steady growth in job listings and academic
curricula is undeniable. This is no fad.

             Reality #3
Myth #4




Data scientists are all just
  PhD statisticians who
 failed to make tenure.
Many data scientists acquired
 their quantitative and statistical
   modeling skills in college, but
   pursued degrees in business
  administration, economics and
engineering. They actually know
    about business problems.


Many data scientists you’ll encounter in the
working world are business domain specialists!

            Reality #4
Myth #5




  Data scientists are just BI
specialists with fancier titles.
Many longtime BI power users
 are, in fact, data scientists of a
 sort. They are business domain
  specialists whose jobs involve
multivariate analysis, forecasting,
what-if modeling, and simulation.



Career development may stall if they don’t
stay up to speed on topics like Hadoop and
predictive modeling.

             Reality #5
Myth #6




 Data scientists aren’t really
scientists in any meaningful
     sense of the word.
Statistical controls are the
  bedrock of true science—the core
responsibility of the data scientist. If
 data scientists are confirming their
 findings through statistical controls
and real-world experiments, they’re
     scientists, plain and simple.


True science is nothing without
observational data.

               Reality #6
Myth #7




 Data scientists need fancy,
 expensive statistical power
tools to get their work done.
The job of data scientists is to
 look for hidden patterns. They can
accomplish this through user-friendly
  visualization tools, search-driven
 BI tools and other approaches that
   don’t require a deep mastery of
          statistical analysis.


The market for cost-effective exploratory
BI tools has many vendors, including
IBM Cognos.

              Reality #7
Myth #8




Data scientists simply pour
data into Hadoop and pull
out mind-blowing insights.
The data scientist will be the
first to tell you that Hadoop is
just another platform for deep
      exploration into data.




There isn’t a magic Ouija board through
which the big data spirits speak to
us mere mortals.

           Reality #8
Myth #9




 Data scientists are analytics
junkies who couldn’t care less
 about business applications.
If you spend time with any real-
  world data scientist, they’ll bend
    your ear discussing how they
tackled a specific business problem,
 such as reducing customer churn,
  targeting offers across channels,
    and mitigating financial risks.


Most data scientists aren’t nerds. They know
people regard all this big data lingo as
confusing jargon.

             Reality #9
Myth #10




Data scientists don’t have any
responsibilities that force them
   out of their ivory towers.
That used to be the case. However,
 as next best action and real-world
experiments become ubiquitous, the
  data scientist is evolving into the
  role that stokes, tweaks and fuels
        the operational engine.



Data scientists test the analytic-centric
models at the heart of agile business
processes.

             Reality #10
For more from James Kobielus and
  other big data thought leaders,
     visit The Big Data Hub at
       IBMbigdatahub.com
