Getting Started with Data Science
December 2017
http://bit.ly/data-science-sd
Deskhub-main - stake2017!
About me
Jordan Zurowski
Thinkful Community Manager
MA in Industrial/Organizational Psychology
About you
I already have a career in data
I'm interested in switching into a data career
I just want to see what all the fuss is about
About Thinkful
Thinkful helps people become developers or data
scientists through 1-on-1 mentorship and project-based
learning
These workshops are built using this approach.
Today's Goals
What is Data Science?
How and why has the field emerged?
What do data scientists do?
Next steps
Example: LinkedIn 2006
“[LinkedIn] was like arriving at a conference
reception and realizing you don’t know
anyone. So you just stand in the corner
sipping your drink—and you probably leave
early.”
-LinkedIn Manager, June 2006
Enter: Data Scientist
Jonathan Goldman
Joined LinkedIn in 2006, when it had only
8M users (450M in 2016)
Started experiments to predict
people’s networks
Engineers were dismissive: “you
can already import your
address book”
The Result
Other Examples
Uber — Where drivers should hang out
Tala — Microfinance loan approval
Why now?
Big Data: datasets whose size is
beyond the ability of typical database
software tools to capture, store,
manage, and analyze
Brief history of "big data"
Trend "started" in 2005
Web 2.0 - Majority of content is created
by users
Mobile accelerates this — data/person
skyrockets
Big Data
90% of the data in the world
today has been created in the
last two years alone
- IBM, May 2013
The Problem
The Solution
Data Scientists - Jack of All Trades
Data Science is just the beginning
“The United States alone faces a shortage
of 140,000 to 190,000 people with deep
analytical skills as well as 1.5 million
managers and analysts to analyze big
data and make decisions based on their
findings.”
- McKinsey
The Process - LinkedIn Example
Frame the question
Collect the raw data
Process the data
Explore the data
Communicate results
Case: Frame the Question
What questions do we want to answer?
Case: Frame the Question
What connections (type and number) lead to
higher user engagement?
Which connections do people want to make
but are currently limited from making?
How might we predict these types of
connections with limited data from the user?
Case: Collect the Data
What data do we need to answer these
questions?
Case: Collect the Data
Connection data (who is connected to whom?)
Demographic data (what is the profile of the
connection?)
Engagement data (how do they use the site?)
Case: Process the Data
How is the data “dirty” and how can we clean
it?
Case: Process the Data
User input
Redundancies
Feature changes
Data model changes
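
A minimal Python sketch of what cleaning might look like for the first two items above (free-text user input and redundant records). The rows, field names, and the `normalize_company` helper are hypothetical and only for illustration; they are not LinkedIn's actual data or method.

```python
import re

# Hypothetical raw profile rows: free-text input produces many spellings
# of the same employer, and the same row can appear more than once.
raw_profiles = [
    {"user_id": 1, "company": "  IBM "},
    {"user_id": 2, "company": "I.B.M."},
    {"user_id": 3, "company": "ibm corp"},
    {"user_id": 3, "company": "ibm corp"},  # redundant duplicate row
    {"user_id": 4, "company": "Thinkful, Inc."},
]

# Legal suffixes to ignore when comparing names (illustrative, not exhaustive).
SUFFIXES = {"inc", "corp", "llc", "ltd"}


def normalize_company(name):
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    words = [w for w in cleaned.split() if w not in SUFFIXES]
    return " ".join(words)


def clean(profiles):
    """Normalize company names and keep one row per (user, company) pair."""
    seen = set()
    result = []
    for row in profiles:
        key = (row["user_id"], normalize_company(row["company"]))
        if key not in seen:
            seen.add(key)
            result.append({"user_id": key[0], "company": key[1]})
    return result


cleaned_profiles = clean(raw_profiles)
for row in cleaned_profiles:
    print(row)
```

Real cleaning rules depend on the data at hand; the point is that every rule is an explicit, testable decision.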
Case: Explore the Data
What are the meaningful patterns in the
data?
Case: Explore the Data
Triangle closing
Time overlaps
Geographic overlaps
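
"Triangle closing" can be sketched in a few lines of Python: if two people share connections but are not connected themselves, suggest that they connect, ranking suggestions by the number of mutual connections. The graph and the `suggest_connections` helper below are hypothetical; this is one simple way to express the idea, not LinkedIn's implementation.

```python
from collections import Counter

# Hypothetical undirected connection graph: person -> set of connections.
connections = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev"},
    "dev": {"ben", "cara", "eli"},
    "eli": {"dev"},
}


def suggest_connections(graph, person):
    """Rank people `person` is not connected to by count of mutual connections."""
    mutual_counts = Counter()
    for friend in graph[person]:
        for friend_of_friend in graph[friend]:
            is_self = friend_of_friend == person
            already_connected = friend_of_friend in graph[person]
            if not is_self and not already_connected:
                mutual_counts[friend_of_friend] += 1
    return mutual_counts.most_common()


# "dev" shares two connections with "ana" (ben and cara), closing two triangles.
print(suggest_connections(connections, "ana"))
```

Time overlaps and geographic overlaps could be added as extra signals in the same way, for example by scoring pairs who worked at the same company during the same years.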
Case: Communicate Findings
How do we communicate this? To whom?
Case: Communicate Findings
“People You May Know” feature increased
clickthrough by 30% (generating millions
more page views)
Tools
SQL Queries
Business Analytics Software
Machine Learning Algorithms
#1: SQL Queries
SQL is the standard querying language
to access and manipulate databases
#1: SQL Queries
SELECT full_name FROM friends WHERE age > 22;
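
To try the query above without installing a database server, here is a small sketch using Python's built-in `sqlite3` module. The `friends` table contents are made-up placeholder rows, and an `ORDER BY` clause is added only so the output order is predictable.

```python
import sqlite3

# In-memory database with a hypothetical `friends` table matching the query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (full_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO friends (full_name, age) VALUES (?, ?)",
    [("Ada Example", 31), ("Bo Sample", 19), ("Cy Placeholder", 23)],
)

# Same filter as the slide, with the age threshold passed as a parameter.
rows = conn.execute(
    "SELECT full_name FROM friends WHERE age > ? ORDER BY full_name", (22,)
).fetchall()
names = [name for (name,) in rows]
print(names)
conn.close()
```

Passing the threshold as a parameter rather than pasting it into the SQL string is the usual habit, since it keeps the query safe when the value comes from user input.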
#2: Visualization Software
Business analytics software for your database
enabling you to easily find and communicate
insights visually
#2: Visualization Software
#3: Machine Learning Algorithms
Machine learning algorithms provide
computers with the ability to learn
without being explicitly programmed —
“programming by example”
Iris Data Set
Iris Data Set
Iris Data Set
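
To make "programming by example" concrete, here is a minimal pure-Python sketch of a nearest-neighbor classifier: instead of writing rules for each species, we hand the program labeled examples and let it pick the label of the closest one. The six rows are typed in by hand in the style of the Iris data set and should be treated as illustrative; the `classify` helper is hypothetical. In practice a library such as scikit-learn would typically be used.

```python
import math

# (sepal length, sepal width, petal length, petal width) in cm, with a label.
# Hand-typed, illustrative rows in the style of the Iris data set.
training = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]


def classify(sample, examples):
    """Return the label of the training example closest to `sample`."""
    nearest = min(examples, key=lambda ex: math.dist(sample, ex[0]))
    return nearest[1]


# No species-specific rules were written; the labels come from the examples.
print(classify((5.0, 3.4, 1.5, 0.2), training))
print(classify((6.5, 3.0, 5.8, 2.2), training))
```

`math.dist` requires Python 3.8 or later. With more training rows, the same function keeps working unchanged, which is the appeal of learning from examples.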
Use Cases for Machine Learning
Classification - Predict categories
Regression - Predict values
Anomaly/Fraud Detection - Find unusual occurrences
Clustering - Discover structure
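
As one small illustration of the anomaly detection use case in the list above, the sketch below flags values that sit far from the median, measured in median absolute deviations. The transaction amounts, the `find_anomalies` helper, and the threshold of 5 are all assumptions chosen for the example; real fraud detection systems are considerably more involved.

```python
import statistics

# Hypothetical transaction amounts for one account; one is clearly unusual.
amounts = [42.0, 38.5, 45.2, 40.1, 39.9, 950.0, 41.3, 43.7]


def find_anomalies(values, threshold=5.0):
    """Flag values far from the median, in units of median absolute deviation."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # At least half the values equal the median; this simple rule
        # cannot score the rest, so report nothing.
        return []
    return [v for v in values if abs(v - median) / mad > threshold]


print(find_anomalies(amounts))
```

The median is used instead of the mean because a single extreme value would otherwise drag the baseline toward itself and hide the anomaly.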
It may seem like a daunting opportunity
But if you're interested...
Knowledge of statistics, algorithms, &
software
Comfort with languages & tools (Python,
SQL, Tableau)
Inquisitiveness and intellectual curiosity
Strong communication skills
It’s all teachable!
Ways to keep learning
For aspiring developers...
Source: Bureau of Labor Statistics
Thinkful's track record of getting students jobs
92% of grads placed in full-time tech jobs
Job guarantee
Third-party audited jobs report: https://www.thinkful.com/outcomes
Our students receive unprecedented support
1-on-1 Learning Mentor
1-on-1 Career Mentor
Program Manager
San Diego Community
You
1-on-1 mentorship enables flexible learning
Learn anywhere,
anytime, and on your
own schedule
You don't have to quit
your job to start a career
transition
Try us out!
Learn Python, Python’s
data science toolkit, and
an intro to statistics.
Initial 3-week prep course
includes six mentor
sessions for $250.
Option to continue on to
the Data Science bootcamp
Talk to me (or email
jordan@thinkful.com) if
you’re interested
