Darshil – Data Engineering Road Map:
1. Computer Science Fundamentals (If you don’t have a CS background)
Start here if you don’t have a computer science background. As a data engineer, you need a good
grasp of CS fundamentals to understand large systems and how they work.
These videos will give you a basic understanding of CS fundamentals;
the first 7 lectures from this playlist are enough.
a. CS50 2022
b. Book - Grokking Algorithms: An illustrated guide
2. Programming Language
Take any one of these courses. The main goal here is to learn how to write basic Python
code and how to work with different datasets.
a. Darshil - Python for Data Engineering (Recommended)
b. DataCamp - Data Engineering With Python
c. Coursera - Python for Everybody Specialization (Do this if you don’t know
anything about python)
d. Udemy - Python Bootcamps: Learn Python Programming and Code Training
e. freeCodeCamp - Learn Python - Full Course for Beginners
Practice Projects:
• Scrape data using the BeautifulSoup library, e.g. Amazon, Covid, Wikipedia, or any website
you like
• Build A Calculator Using Python
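As a starting point for the calculator project, here is a minimal sketch. The function name calculate and the OPERATIONS table are illustrative choices rather than part of any course above, and the sketch handles only the four basic operators.

```python
import operator

# Map each supported symbol to the function that implements it.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(left: float, symbol: str, right: float) -> float:
    """Apply one arithmetic operation to two numbers."""
    if symbol not in OPERATIONS:
        raise ValueError(f"Unsupported operator: {symbol!r}")
    if symbol == "/" and right == 0:
        raise ZeroDivisionError("Cannot divide by zero")
    return OPERATIONS[symbol](left, right)


if __name__ == "__main__":
    print(calculate(6, "*", 7))
    print(calculate(1, "/", 4))
```

Keeping the operators in a dictionary means a new operation can be added with one entry instead of another if/elif branch.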
3. SQL (Structured Query Language)
Learn the basics of SQL and how to write queries. Once you complete a course,
get hands-on practice on HackerRank or any website you like.
a. Udemy - The Complete SQL Bootcamp for the Manipulation and Analysis of Data
(Recommended)
b. Coursera - SQL for Data Science
c. DataCamp - Intro To SQL DataCamp
Practice SQL here:
• HackerRank SQL
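If you want to practice queries locally before moving to a practice site, Python's built-in sqlite3 module is enough. The sketch below is only an illustration: the orders table and its rows are made up, and it covers one common pattern (GROUP BY with HAVING and ORDER BY).

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 7.5)],
)

# Total spend per customer, keeping only totals above 10, largest first.
query = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > 10
    ORDER BY total DESC
"""
rows = conn.execute(query).fetchall()
for customer, total in rows:
    print(customer, total)
conn.close()
```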
4. Basics Of Linux
Why Linux? Because you will work with many remote machines, accessing them over SSH
and running operations on them, so it’s important to learn the basics.
You don’t have to memorize every command; just understand what the common ones do and
how to write them.
a. Udemy - Linux for Beginners: Linux Basics
b. Coursera - Linux Fundamentals
c. freeCodeCamp - Top 50 Most Popular Linux Commands (Recommended)
Do Hands-On Project
• Beginner Data Engineering Portfolio Project (Recommended)
5. Big Data Fundamentals
This section is theoretical: you need to understand how big data systems work and
their history.
a. Coursera - Big Data Specialization (Recommended)
b. Udemy - Learn Big Data: The Hadoop Ecosystem Masterclass (Do this if you
want to learn about legacy systems)
6. Data Warehouse Fundamentals + Tool
Learn the fundamentals, then learn one tool: Snowflake, BigQuery, Redshift, etc. Just
learn one and you are good!
a. Fundamentals
i. Coursera - Data Warehousing for Business Intelligence Specialization
(recommended for deep dive)
ii. Udemy - Data Warehouse Fundamentals for Beginners (recommended for quick
learning)
b. Tools
i. Snowflake - Snowflake – The Complete Masterclass
ii. Snowflake Doc - https://www.snowflake.com/certifications/
7. Learn Batch Processing + Tool
a. Spark Fundamentals
i. DataCamp - Big Data Fundamentals with PySpark (recommended)
ii. Udemy - Spark and Python for Big Data with PySpark
b. Databricks
i. Udemy - Azure Databricks & Spark Core
ii. Udemy - Databricks Certified Data Engineer Associate
iii. Coursera - Databricks for Data Engineering
8. Learn RealTime Streaming
a. Realtime Streaming (Kafka)
i. Udemy - Apache Kafka Course for Beginners: Learn Kafka Online (check this)
ii. edX - Building ETL and Data Pipelines with Bash, Airflow, and Kafka
Do Hands-On Project - Stock Market Real-Time Streaming Pipeline
9. Data Orchestration (Airflow)
a. Udemy - The Complete Hands-On Introduction to Apache Airflow
b. DataCamp - Airflow
Do Hands-On Project - Twitter Data Pipeline using Airflow
10. Cloud Computing
This is an advanced section: take the courses, then earn a certification to add value to your
resume. If you are new, start with AWS, but if you already know
another cloud, you can pursue that one instead.
a. AWS (Amazon Web Services)
i. Udemy - Ultimate AWS Certified Cloud Practitioner
ii. Udemy - Ultimate AWS Certified Solutions Architect Associate (SAA)
iii. Coursera - AWS Solution Architect Associate
b. GCP (Google Cloud Platform)
i. Coursera - Cloud Data Engineer Professional Certificate
c. Microsoft Azure
i. Coursera - Microsoft Azure Data Engineering Associate
ii. Udemy - AZ-900: Microsoft Azure Fundamentals
iii. Udemy - Azure Data Engineer Certified: 8 COURSE BUNDLE
Do Hands-On Project
1. Build ETL Pipeline Using AWS Cloud
2. Covid Data Analysis Project
3. YouTube Data Analysis (End-To-End Data Engineering Project)
11. Learn Modern Data Stack
a. Learn Basics - https://analyticsindiamag.com/modern-data-stack-and-what-we-know-about-it/
b. dbt - https://www.getdbt.com/dbt-learn/
c. Airbyte - https://airbyte.com/
d. Fivetran - https://www.fivetran.com/
12. DataOps
a. Docker Guide - https://www.coursera.org/projects/docker-for-absolute-beginners
b. Udemy - Docker & Kubernetes: The Practical Guide
Recommended Books
1. Designing Data-Intensive Applications
2. Fundamentals of Data Engineering
3. The Data Warehouse Toolkit
Read Real-World Case Studies
1. Netflix - https://netflixtechblog.medium.com/
2. AWS - https://aws.amazon.com/solutions/case-studies/
3. GCP - https://cloud.google.com/customers
4. Azure - https://azure.microsoft.com/en-us/resources/customer-stories/
Follow Me Here:
1. Twitter - https://twitter.com/parmardarshil07
2. LinkedIn - https://www.linkedin.com/in/darshil-parmar/
3. YouTube - https://www.youtube.com/c/DarshilParmar
Jayzern: Data Engineering Road Map
https://bittersweet-mall-f00.notion.site/Fundamentals-b41c33ba1ab04e858a2be06946510c7e
https://bittersweet-mall-f00.notion.site/Core-Data-Skills-4de7de1787574324852916c2ecd257a5
https://bittersweet-mall-f00.notion.site/Advanced-Data-Skills-a87ccae98b4442d4861c72836fb2d376
Arif Alam: Data Engineering Road Map
https://www.linkedin.com/pulse/roadmap-becoming-data-engineer-2023-arif-alam-/
Have you found any cool resources about data engineering? Put them here
Learning Data Engineering
Courses
• Data Engineering Zoomcamp by DataTalks.Club (free)
• Big Data Platforms, Autumn 2022: Introduction to Big Data Processing Frameworks by the
University of Helsinki (free)
• Awesome Data Engineering Learning Path
Books
• Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable
Systems by Martin Kleppmann
• Big Data: Principles and Best Practices of Scalable Realtime Data Systems by Nathan Marz, James
Warren
• Practical DataOps: Delivering Agile Data Science at Scale by Harvinder Atwal
• Data Pipelines Pocket Reference: Moving and Processing Data for Analytics by James Densmore
• Best books for data engineering
• Fundamentals of Data Engineering: Plan and Build Robust Data Systems by Joe Reis, Matt
Housley
Introduction to Data engineering terms
• https://datatalks.club/podcast/s05e02-data-engineering-acronyms.html
Data engineering in practice
Conference talks from companies, blog posts, etc
• Uber Data Archives (Uber engineering blog)
• Data Engineering Weekly (DE-focused substack)
• Seattle Data Guy (DE-focused substack)
Doing Data Engineering
Coding & Python
• CS50's Introduction to Computer Science | edX (course)
• Python for Everybody Specialization (course)
• Practical Python programming
SQL
• Intro to SQL: Querying and managing data | Khan Academy
• Mode SQL Tutorial
• Use The Index, Luke (SQL Indexing and Tuning e-Book)
• SQL Performance Explained (book)
Workflow orchestration
• What is DAG? (video)
• Airflow, Prefect, and Dagster: An Inside Look (blog post)
• Open-Source Spotlight - Prefect - Kevin Kho (video)
• Prefect as a Data Engineering Project Workflow Tool, with Mary Clair Thompson (Duke) -
11/6/2020 (video)
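To make the DAG idea concrete: a workflow is a set of tasks plus "must run before" dependencies with no cycles, and an orchestrator runs tasks in an order that respects them. The sketch below uses Python's standard graphlib module (Python 3.9+), not Airflow, Prefect, or Dagster, and the task names are made up for illustration.

```python
from graphlib import TopologicalSorter

# Each key is a task; its value is the set of tasks that must finish first.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
}

# static_order() yields every task after all of its dependencies.
run_order = list(TopologicalSorter(pipeline).static_order())
print(run_order)
```

Here "transform" and "validate" do not depend on each other, so either may come first; a real orchestrator could run them in parallel.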
ETL and ELT
• ETL vs. ELT: What’s the Difference? (blog post) (print version)
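A rough way to see the difference in code is to ask where the transform step runs. In the sketch below, which uses Python's built-in sqlite3 module as a stand-in for a warehouse, the ETL path cleans rows in Python before loading, while the ELT path loads the raw strings and cleans them with SQL inside the database. The table names, sample rows, and the simple "starts with a digit" validity rule are assumptions for illustration only.

```python
import sqlite3

# Raw input: amounts arrive as untidy strings, one of them unusable.
raw_rows = [("alice", " 30 "), ("bob", "12.5"), ("carol", "n/a")]


def parse_amount(text):
    """Return the amount as a float, or None if it is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


conn = sqlite3.connect(":memory:")

# ETL: transform in application code, then load only the clean rows.
conn.execute("CREATE TABLE etl_sales (customer TEXT, amount REAL)")
parsed = [(name, parse_amount(amount)) for name, amount in raw_rows]
clean = [(name, amount) for name, amount in parsed if amount is not None]
conn.executemany("INSERT INTO etl_sales VALUES (?, ?)", clean)

# ELT: load the raw strings as-is, then transform inside the database.
conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw_rows)
conn.execute(
    """
    CREATE TABLE elt_sales AS
    SELECT customer, CAST(TRIM(amount) AS REAL) AS amount
    FROM raw_sales
    WHERE TRIM(amount) GLOB '[0-9]*'
    """
)

etl_result = conn.execute("SELECT * FROM etl_sales ORDER BY customer").fetchall()
elt_result = conn.execute("SELECT * FROM elt_sales ORDER BY customer").fetchall()
print(etl_result)
print(elt_result)
```

Both paths end with the same clean table; the ELT path also keeps the raw data in raw_sales, so the transform can be changed and re-run later without re-extracting.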
Data lakes
• An Introduction to Modern Data Lake Storage Layers (Hudi, Iceberg, Delta Lake) (blog post)
• Lake House Architecture @ Halodoc: Data Platform 2.0 (blog post)
Data warehousing
• Guide to Data Warehousing. Short and comprehensive information… | by Tomas Peluritis (blog
post)
• Snowflake, Redshift, BigQuery, and Others: Cloud Data Warehouse Tools Compared (blog post)
Streaming
•
DataOps
• DataOps 101 with Lars Albertsson – DataTalks.Club (podcast)
Monitoring and observability
• Data Observability: The Next Frontier of Data Engineering with Barr Moses (podcast)
Analytics engineering
• Analytics Engineer: New Role in a Data Team with Victoria Perez Mola (podcast)
• Modern Data Stack for Analytics Engineering - Kyle Shannon (video)
• Analytics Engineering vs Data Engineering | RudderStack Blog (blog post)
• Learn the Fundamentals of Analytics Engineering with dbt (course)
Data mesh
• Data Mesh in Practice - Max Schultze (video)
Cloud
• https://acceldataio.medium.com/data-engineering-best-practices-how-netflix-keeps-its-data-infrastructure-cost-effective-dee310bcc910
Reverse ETL
• TODO: What is reverse ETL?
• https://datatalks.club/podcast/s05e02-data-engineering-acronyms.html
• Open-Source Spotlight - Grouparoo - Brian Leonard (video)
• Open-Source Spotlight - Castled.io (Reverse ETL) - Arun Thulasidharan (video)
Career in Data Engineering
Career in data engineering:
• From Data Science to Data Engineering with Ellen König – DataTalks.Club (podcast)
• Big Data Engineer vs Data Scientist with Roksolana Diachuk – DataTalks.Club (podcast)
• What Skills Do You Need to Become a Data Engineer (blog post)
• The future history of Data Engineering (blog post)
• What Skills Do Data Engineers Need (blog post)
Data Engineering Management
• Becoming a Data Engineering Manager with Rahul Jain – DataTalks.Club (podcast)
Data engineering projects
• How To Start A Data Engineering Project - With Data Engineering Project Ideas (video)
• Data Engineering Project for Beginners - Batch edition (blog post)
• Building a Data Engineering Project in 20 Minutes (blog post)
• Automating Nike Run Club Data Analysis with Python, Airflow and Google Data Studio | by Rich
Martin | Medium (blog post)
Data Engineering Resources
Blogs
• Start Data Engineering
Podcasts
• The Data Engineering Podcast
• DataTalks.Club Podcast (only some episodes are about data engineering)
Communities
• DataTalks.Club
• /r/dataengineering
Meetups
• Sydney Data Engineers
People to follow on Twitter and LinkedIn
•
YouTube channels
• Karolina Sowinska - YouTube
• Seattle Data Guy - YouTube
• Andreas Kretz - YouTube
• DataTalksClub - YouTube (only some videos are about data engineering)
Resource aggregators
• Reading List by Lars Albertsson
• GitHub - igorbarinov/awesome-data-engineering: A curated list of data engineering tools for
software developers (focus is more on tools)
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
CC BY 4.0

Data_Engineering_Learning_Roadmap.pdf
