www.upxacademy.com
How to Crack Big Data & Data
Science Roles
Peeyush Taori
London Business School, AQR, AQR Asset Management
Institute, Indian School of Business
Manvender Singh
Founder, UpX Academy
MBA, Indian School of Business, Hyderabad
Agenda of Today’s Infosession
• Why is there buzz about Big Data, Machine Learning & Data Science
• What is the future of Big Data & Data Science as a career?
• Which companies are hiring for Big Data, Machine learning & Data
Science experts?
• How to position yourself to crack these roles?
• Interview questions for Big Data & Data Science professionals
• Info about upcoming batches
• Q&A
A quick look at some people you will meet
Peeyush Taori Manvender Singh Madhu Reddy Arun Reddy
Chief Instructor Founder Student Services Student Services
What this session is
• Insights that you won't find on the internet
• Focused on the end goal (career opportunities), not the starting point (learning Big Data & Data Science)
• Understand Big Data & Data Science career opportunities across geographies & industries
• Understand how to make a career transition into Big Data & Data Science
• Address your questions related to career opportunities in Big Data & Data Science
What this session is not
• Not an introductory session on Big Data & Data Science
• Attend Big Data and Data Science trial classes
Big Data Trial class 12-1 pm Sunday 11th Sept
Data Science Trial class 1-2 pm Sunday 11th Sept
The buzz
“The Sexiest job of the 21st century”
“#1 most wanted hires in USA in 2016”
“Shortage of 140k to 190k data scientists in US alone by 2018”
“We’re moving from a mobile first world to AI first world”
How does Big Data analytics affect our daily lives?
More use cases: http://upxacademy.com/2016/05/31/big-data-use-cases-industries/
Machine learning applications
Self driving cars: Google, Baidu, Tesla
have implemented this technology.
Speech recognition: Google Now, Siri, Cortana
Genetics: Clustering algorithms are
used in genetics to help find genes
associated with a particular disease.
Face recognition: Facebook
automatically tags people in photos
where they appear.
Major acquisitions of ML and Big Data start-ups
2016
Intel acquired AI startup Nervana
Systems for $350 million
Twitter acquired machine learning
startup Magic Pony Technology for $150
million
Apple acquired Machine-Learning
Startup Turi for $200 Million
OpenAI, a non-profit AI research company, is funded by
business magnate Elon Musk
2015
Microsoft acquired Metanautix, a Big
Data Analytics company
Big Data & Data Science - Together
• Fundamentally, part of the same team
– Big Data programming and data science go hand in hand
• Firms need to deal with huge amounts of data
– Storage, Computation, Coherent Data View – Big Data
– Analytics, Statistics, Prediction – Data Science
• Let’s consider them in isolation for now
Big Data…What and Why?
• Characterized by the 3 Vs
– Volume
– Velocity
1. 3 exabytes of data (3 billion GB) are generated every day
2. 13 million new videos are added to YouTube every month
3. 300 million photos are uploaded to Facebook every day
– Variety
1. Structured, semi-structured, unstructured
• Data is the most valuable asset
– Create insights and value
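As a quick sanity check of the volume figure above, the unit conversion can be worked through in a few lines. This is a minimal sketch that assumes decimal (SI) units, where 1 exabyte is 10^18 bytes and 1 gigabyte is 10^9 bytes; the function and constant names are illustrative and do not come from the slides.

```python
# Assumption: decimal (SI) units, i.e. 1 GB = 10**9 bytes and 1 EB = 10**18 bytes.
BYTES_PER_GB = 10**9
BYTES_PER_EB = 10**18


def exabytes_to_gigabytes(exabytes):
    """Convert a whole number of exabytes to gigabytes (decimal units)."""
    return exabytes * BYTES_PER_EB // BYTES_PER_GB


daily_gb = exabytes_to_gigabytes(3)
print(f"{daily_gb:,} GB per day")  # 3,000,000,000 GB per day
```

With binary units (1 GiB = 2^30 bytes) the number would differ slightly, so it helps to state which convention a quoted figure uses.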
Technology Landscape
• General Batch Processing (2004 – 2013)
• Specialized Systems (2007 – 2015?): iterative, interactive, ML, streaming, graph, SQL, etc.
– Pregel, Dremel, Impala, GraphLab, Giraph, Drill, Tez, S4, Storm, Mahout
• General Unified Engine (2014 – ?)
Career Paths
Big Data Developer
• Excel at Big Data programming
• Hadoop, Pig, Hive, HBase, Spark
• Big Data Engineer, Consultant, Big Data Architect
Big Data Analytics
• Wear data analytics and big data programming hats
• Hadoop, Spark, Statistics, Analytics, Data Science, R, Python
• Big Data Analyst, Consultant, Big Data scientist
Big Data Job Trends
Now, let us consider Data Science
Data Science…What and Why?
How to crack Big Data and Data Science roles
Skills a recruiter seeks
Typical Workday of a Data Scientist
• Gather data
– Programming, web scraping, DB
• Transform data
– DB skills, data manipulation, mathematics & stats
• Data modeling
– Machine learning, stats, algorithms
• Data reporting
– Inference, business acumen, visualization
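The four stages above (gather, transform, model, report) can be sketched end to end in a few lines. This is a minimal illustration, not a production workflow: the CSV content is made-up toy data, the column names are hypothetical, and the model is a plain least-squares line fitted with the standard library only.

```python
import csv
import io
import statistics

# Made-up toy data for illustration; a real source would be a database, API, or scraper.
RAW_CSV = """experience_years,salary_lakhs
1,10
2,12
,15
4,16
5,18
"""


def gather(raw_text):
    """Gather: read rows from a CSV source into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: drop rows with missing fields and convert strings to floats."""
    cleaned = []
    for row in rows:
        if row["experience_years"] and row["salary_lakhs"]:
            cleaned.append((float(row["experience_years"]), float(row["salary_lakhs"])))
    return cleaned


def model(points):
    """Model: fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def report(slope, intercept):
    """Report: turn the fitted numbers into a sentence a stakeholder can read."""
    return f"Each extra year adds about {slope:.1f} lakhs (baseline {intercept:.1f} lakhs)."


slope, intercept = model(transform(gather(RAW_CSV)))
print(report(slope, intercept))
```

In practice each stage would lean on dedicated tools (SQL, a data-frame library, a machine learning library, a charting tool), but the shape of the work stays the same.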
Data Science Job Trends
Demand across geographies
• Hottest market in US and Europe currently
• Demand outstrips supply
• Average salary of $100,000 for Big Data Engineers and $120,000 for Data Scientists
• Similarly, £60,000 in UK
• Fastest growing job sector in India
• Average starting salary: INR 10 lakhs
• Salaries shoot up with skill set and experience
Who is recruiting?
• Basically, everyone!
• Thought leaders
– Google, Facebook, Amazon
• Data-driven firms
– Uber, Twitter, NBC, Flipkart
• IT giants
– Catching up to the buzz
– Infosys, Cognizant, IBM, Accenture…
• Data analytics focused startups/companies
– Arcadia, DataHero, Walmart Labs, Mu Sigma, Fractal Analytics, Flutura
• Traditional businesses
– DNV, Wal-Mart, Sears, DHL
Building a Resume
• A CV typically gets about 20-30 seconds of attention
• Prior Big Data/Data Science experience
– Most recent first (chronological)
– Project
– Clear, concise articulation of responsibilities and tools used
• Keep other experience to a minimum
• Demonstration of Big Data/Data Science skills
– Certification
– Personal projects/POC/competitions
• Finally, KISS
– Keep It Simple and Short
No prior experience?
• Demonstration of certified skills takes top priority
• Experience of working on Big Data/Data Science projects
• Experience of distributed computing
• Knowledge of fringe skills
• Intra-organization
– Low barriers to movement
– Certification and POC put you in the spotlight
What not to put in resume
• Recruiters receive a lot of CVs
• Formatting and presentation matter
• Many firms use keyword-extraction tools
• Buzzwords without knowledge are a strict no-no
• Keep length to max 2 pages
Big Data Top Interview Questions - Generic
• Explain Big Data technologies
• Walk us through your previous Big Data project
• What is Hadoop and how is it related to MapReduce
• Hadoop daemons & their roles in a Hadoop cluster
• Explain MapReduce
• Difference between Spark and Hadoop
• How do I deal with Streaming data
• Hive, Pig, and MapReduce
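For the "Explain MapReduce" question, it can help to walk through the three phases on a word count. The sketch below is a single-process illustration of the programming model in plain Python; it does not use the Hadoop API, and the function names are chosen here for clarity only.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "data science", "Big Data analytics"])
print(counts)
```

On a real cluster the map and reduce calls run in parallel on different machines and the shuffle moves data across the network; the logic per record and per key is the same as here.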
Big Data Top Interview Questions - Specific
• Difference between Hadoop 1.0 and 2.0
• Architecture of Spark
• Indexing process in HDFS
• HDFS Block and Input Split
Data Science Top Interview Questions - Generic
• Explain various Machine Learning techniques
• Walk us through your recent data science project
• Difference between supervised and unsupervised
• Assumptions of linear regression
• How do random forests work
• Trade-off between classification and regression
Data Science Top Interview Questions - Specific
• How do you handle missing data
• Differentiate: Lift, KPI, model fitting
• Collaborative filtering, n-grams, KNN
• Assumptions of LDA and QDA
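For the missing-data question, two common starting points are dropping incomplete rows and filling gaps with the column mean. The sketch below shows both, assuming missing values are represented as None; the function names and the sample list are illustrative, and other strategies (median, model-based imputation) may suit a given dataset better.

```python
import statistics


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


def drop_missing(rows):
    """Alternative: discard any row that contains a missing value."""
    return [row for row in rows if all(v is not None for v in row)]


ages = [25.0, None, 35.0, 30.0, None]  # toy values for illustration
print(impute_mean(ages))  # [25.0, 30.0, 35.0, 30.0, 30.0]
print(drop_missing([(25.0, 1), (None, 0), (35.0, None)]))  # [(25.0, 1)]
```

Mean imputation keeps every row but shrinks the variance of the column, while dropping rows keeps the observed distribution but loses data; being able to state that trade-off is usually what the interviewer is looking for.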
Class FAQs
• Where do the classes take place & what are the class timings?
• Can I attend trial classes before enrolling?
• Do I have to purchase any software?
• What's the difference between a certificate of completion and certification?
• What if I miss a class?
• How do I get my doubts answered after the class?
Payment FAQs
• 20% off on course fee after trial classes. Valid till tomorrow midnight. Use coupon code UPX20
• One-time payment on website
• Credit card EMI option: currently available for ICICI, HDFC, Kotak & Amex
• 3-month interest-free EMI option for select corporates
Coordinates
Manav manav@upxacademy.com
Peeyush peeyush@upxacademy.com
Student Service Team: info@upxacademy.com
1800-123-1260
Fasahath/Madhu : 733-736-0431/37
Q&A
