Data Engineer CareerTalk
WeCloudData
@WeCloudData | weclouddata | tordatascience
Introduction
Edwin Guo
Timeline: 2013, 2014, 2015, 2016, 2017, 2018, 2019
Introduction
Agenda
Data Engineer
What is a Data Engineer?
Data engineers are mainly tasked with transforming data into a format that can be
easily analyzed. They do this by developing, maintaining, and testing infrastructures
for data generation. Data engineers work closely with data scientists and are largely in
charge of architecting solutions for data scientists that enable them to do their jobs.
In addition, data engineers possess a plethora of technical skills and the ability to
approach problems in a creative manner.
Data Engineer
What does a Data Engineer do?
Batch mode
Streaming mode (components shown: Twitter API, Kinesis, Redshift)
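As a rough illustration of the difference between the two modes named on these slides, the sketch below counts hashtags once over a complete, bounded set of records (batch) and then incrementally as each record arrives (streaming). Everything here is an assumption made for illustration: the sample records, the function names, and the in-memory list that stands in for a source such as the Twitter API or a Kinesis stream. No real service is called.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List

# Hypothetical sample records standing in for tweets; not real API output.
SAMPLE_TWEETS: List[Dict[str, str]] = [
    {"id": "1", "text": "Learning #spark and #python"},
    {"id": "2", "text": "Streaming with #kinesis"},
    {"id": "3", "text": "More #python today"},
]


def extract_hashtags(text: str) -> List[str]:
    """Return the lowercase hashtags found in a piece of text."""
    return [w[1:].lower() for w in text.split() if w.startswith("#") and len(w) > 1]


def batch_count(tweets: Iterable[Dict[str, str]]) -> Counter:
    """Batch mode: process the full, bounded data set and return one result."""
    counts: Counter = Counter()
    for tweet in tweets:
        counts.update(extract_hashtags(tweet["text"]))
    return counts


def stream_counts(tweets: Iterable[Dict[str, str]]) -> Iterator[Counter]:
    """Streaming mode: update running counts and emit a snapshot per event."""
    counts: Counter = Counter()
    for tweet in tweets:
        counts.update(extract_hashtags(tweet["text"]))
        yield Counter(counts)  # copy so earlier snapshots are not mutated


if __name__ == "__main__":
    print(dict(batch_count(SAMPLE_TWEETS)))
    for snapshot in stream_counts(SAMPLE_TWEETS):
        print(dict(snapshot))
```

The batch function returns a single answer after seeing all the data, while the streaming generator yields an updated answer after every event; once the stream is exhausted, the last snapshot matches the batch result.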
Data Engineer
What are a Data Engineer's responsibilities?
Data Engineer
What are a Data Engineer's required skills?
Data Science
DW vs DL vs DM
Data Engineer
Hiring Companies
Banking, Telecom, Consulting, Startups
Data Engineer
Required skills
Requirements:
• Bachelor's degree in Computer Science/Engineering or equivalent experience. Master's degree preferred.
• Experience with large-scale distributed systems, microservices, and service-oriented architectures.
• Extensive experience with AWS and other cloud offerings.
• Strong development skills in Scala, Java, Python, and/or C++.
• Experience with caching technologies such as Redis and Memcached.
• Knowledge of various databases and database technologies: Oracle, Postgres, Cassandra (NoSQL).
• Exposure to implementing real-time streaming data pipelines on large volumes of data using Kafka and Spark.
• Experience with data processing (ETL, data warehousing, etc.).
• Big Data technologies and languages (Pig, Hive, Spark, Hadoop).
• Familiarity with version control software, such as Git.
• Highly proficient in object-oriented design and development.
• Experience with automation and load-testing frameworks.
• Build, test, and maintain optimal data pipeline architecture.
• Assemble large, complex data sets to meet both functional and non-functional requirements.
• Build the infrastructure necessary for optimal extraction, transformation, and loading of data.
• Identify, design, implement, and enhance internal processes.
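To make the "extraction, transformation, and loading" item above more concrete, here is a minimal ETL sketch using only the Python standard library. It is an illustration under stated assumptions, not a production design: the CSV content, the `users` table, and all function names are hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import csv
import io
import sqlite3
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical raw input; in practice this could be a file, an API, or a queue.
RAW_CSV = """user_id,signup_date,country
1,2019-01-05,ca
2,2019-02-11,US
3,,ca
"""


def extract(raw: str) -> Iterator[Dict[str, str]]:
    """Extract: read rows from the raw CSV text."""
    yield from csv.DictReader(io.StringIO(raw))


def transform(rows: Iterable[Dict[str, str]]) -> List[Tuple[int, str, str]]:
    """Transform: drop rows without a signup date, normalize country codes."""
    cleaned = []
    for row in rows:
        if not row["signup_date"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["user_id"]), row["signup_date"], row["country"].upper())
        )
    return cleaned


def load(rows: List[Tuple[int, str, str]], conn: sqlite3.Connection) -> None:
    """Load: write the cleaned rows into a (hypothetical) users table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(user_id INTEGER PRIMARY KEY, signup_date TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(raw: str, conn: sqlite3.Connection) -> int:
    """Run extract -> transform -> load and return the number of rows loaded."""
    rows = transform(extract(raw))
    load(rows, conn)
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection), "rows loaded")
    print(connection.execute("SELECT * FROM users ORDER BY user_id").fetchall())
```

Keeping the three stages as separate functions makes each one testable on its own, which is also where the testing and automation skills listed above tend to come in.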
Unleash your data potential!
Entry Level: $70K - $80K
Experienced: $90K - $110K
Senior Position: $120K - $140K
Expert: $150K+
WeCloudData offers a Data Engineer accelerator program. We
specialize in teaching the newest open-source tools and
techniques, such as Hadoop, Spark, Python, Machine
Learning, Deep Learning, and Cloud.
TYPE OF DATA JOB SEEKERS
Sensors
Data
Machine Learning
Artificial Intelligence
Robots
Actions
Triggers

Data Engineer Intro - WeCloudData
