data science @ The New York Times
chris.wiggins@columbia.edu
chris.wiggins@nytimes.com
@chrishwiggins
references: bit.ly/icerm
“data science”
jobs, jobs, jobs
data science: mindset & toolset
drew conway, 2010
modern history:
2009
“data science”
blogs, blogs, blogs
The first time I heard "data science" was in 2007 while reading a proposal that my adviser had passed along, outlining an academic program similar to what we think of as data
science.
“data science”
ancient history: 2001
data science
context
home schooled
PhD in topology
“By the end of late 1945, I was a
statistician rather than a topologist”
invented: “bit”
invented: “software”
invented: “FFT”
“the progenitor of data science.” - @mshron
“The Future of Data Analysis,” 1962
John W. Tukey
introduces:
“Exploratory data analysis”
Tukey 1965, via John Chambers
TUKEY BEGAT S WHICH BEGAT R
Tukey 1972
? 1972
Jerome H. Friedman
Tukey 1975
In 1975, while at Princeton, Tufte was asked to teach a
statistics course to a group of journalists who were visiting
the school to study economics. He developed a set of
readings and lectures on statistical graphics, which he
further developed in joint seminars he subsequently taught
with renowned statistician John Tukey (a pioneer in the field
of information design). These course materials became the
foundation for his first book on information design, The
Visual Display of Quantitative Information.
TUKEY BEGAT VDQI
Tukey 1977
TUKEY BEGAT EDA
fast forward -> 2001
“The primary agents for change should be
university departments themselves.”
histories
1. in academia -> Bell: as heretical
statistics (see also Breiman)
2. in industry: as job description
historical rant: bit.ly/data-rant
biology: 1892 vs. 1995
biology changed for good.
genetics: 1837 vs. 2012
ML toolset; data science mindset
arxiv.org/abs/1105.5821 ; github.com/rajanil/mkboost
data science: mindset & toolset
1851
news: 20th century
church state
news: 21st century
church state
engineering
newspapering: 1851 vs. 1996
example:
millions of views per hour (2015)
data science: the web
is your “online presence”
data science: the web
is a microscope
data science: the web
is an experimental tool
data science: the web
is an optimization tool
newspapering: 1851 vs. 1996 vs. 2008
“a startup is a temporary organization in search of a
repeatable and scalable business model” —Steve Blank
every publisher is now a startup
data history / data science @ NYT
learnings
- supervised learning
- unsupervised learning
- reinforcement learning
cf. modelingsocialdata.org
stats.stackexchange.com
from “are you a bayesian or a frequentist”
—michael jordan
$L = \sum_{i=1}^{N} \varphi\big(y_i f(x_i; \beta)\big) + \lambda \lVert \beta \rVert$
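A minimal sketch of how a loss of this shape can be evaluated: a sum over the N examples of a margin loss applied to y_i times the model score f(x_i), plus a penalty on the size of the parameters. Everything specific here is an assumption for illustration, not taken from the slide: the parameter name `beta`, the penalty weight `lam`, labels in {-1, +1}, a linear score f, the logistic margin loss, and the Euclidean norm for the penalty.

```python
import math


def logistic_loss(margin):
    """Margin loss log(1 + exp(-margin)); one common choice, assumed here."""
    # Branch on the sign so exp() never receives a large positive argument.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def linear_score(x, beta):
    """Hypothetical model f(x; beta): a plain dot product."""
    return sum(b * xi for b, xi in zip(beta, x))


def regularized_loss(xs, ys, beta, lam, phi=logistic_loss):
    """Sum of phi(y_i * f(x_i; beta)) over the data, plus lam * ||beta||."""
    data_term = sum(phi(y * linear_score(x, beta)) for x, y in zip(xs, ys))
    penalty = lam * math.sqrt(sum(b * b for b in beta))  # Euclidean norm (assumed)
    return data_term + penalty


if __name__ == "__main__":
    xs = [(1.0, 2.0), (-1.0, -0.5)]   # toy feature vectors
    ys = [1, -1]                      # labels in {-1, +1}
    print(regularized_loss(xs, ys, beta=[0.5, 0.5], lam=0.1))
```

Swapping `phi` for another margin loss (hinge, exponential) or the penalty for a different norm changes the estimator but not the overall structure: a data-fit term plus a complexity term.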
supervised learning, e.g.,
“the funnel”
cf. modelingsocialdata.org
interpretable supervised learning
supercoolstuff
cf. modelingsocialdata.org
arxiv.org/abs/q-bio/0701021
optimization & learning, e.g.,
“How The New York Times Works,” Popular Mechanics, 2015
optimization & prediction, e.g.,
“How The New York Times Works,” Popular Mechanics, 2015
(some models)
(some moneys)
recommendation as supervised learning
recommendation as predictive modeling
bit.ly/AlexCTM
unsupervised learning, e.g.,
cf. daeilkim.com ; import bnpy
modeling your audience
bit.ly/Hughes-Kim-Sudderth-AISTATS15
(optimization, ultimately)
also allows recommendation as inference
prescriptive modeling, e.g.,
[figure: Reporting, Learning, Test, Optimizing, Explore; rows labeled unsupervised, supervised, reinforcement]
common requirements in
data science:
1. people
2. ideas
3. things
cf. USAF
things:
what does DS team deliver?
- build data prototypes
- build APIs
- impact roadmaps
- build data prototypes
cf. daeilkim.com
- in puppet, w/python2.7
- collaboration w/pers. team
- build APIs
- impact roadmaps
flickr/McJex
data science: ideas
data skills
- data engineering
- data science
- data visualization
- data product
- data multiliteracies
- data embeds
cf. “data scientists at work”, ch 1
data science: people
- new mindset > new toolset
summary:
pay attention to:
1. people
2. ideas
3. things
cf. USAF
thanks to the data science team!
data science @ The New York Times
chris.wiggins@columbia.edu
chris.wiggins@nytimes.com
@chrishwiggins
