I have researched many product-based companies such as Google, Amazon, Microsoft, Walmart, PayPal, Uber, and Netflix for DE roles, and I found that these 30 Data Pipeline & Architecture Design questions are asked in almost every interview, at both the fresher and experienced levels.

𝐁𝐞𝐠𝐢𝐧𝐧𝐞𝐫-𝐋𝐞𝐯𝐞𝐥 𝐃𝐚𝐭𝐚 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐀𝐫𝐜𝐡𝐢𝐭𝐞𝐜𝐭𝐮𝐫𝐞 𝐃𝐞𝐬𝐢𝐠𝐧
1. Design a data pipeline to process logs from web servers.
2. Design a batch ETL pipeline to process e-commerce transactions.
3. Design a streaming data pipeline for real-time stock prices.
4. Design a solution to ingest and store sensor data from IoT devices.
5. Design a data ingestion pipeline for CSV/JSON files from S3 to Redshift.
6. Design a user clickstream data pipeline.
7. Design a pipeline to clean and aggregate marketing campaign data.
8. Design a daily job that syncs data from MySQL to BigQuery.
9. Design a basic data lake architecture.
10. Design a system that processes and analyzes ride-sharing trip data.
11. Design a data pipeline to detect fraud in payment transactions.
12. Design a system to track real-time delivery status in a food app.
13. Design an ETL pipeline for mobile app usage metrics.
14. Design a workflow to migrate data between two cloud environments.
15. Design a pipeline to monitor and alert on data quality issues.

𝐄𝐱𝐩𝐞𝐫𝐢𝐞𝐧𝐜𝐞𝐝-𝐋𝐞𝐯𝐞𝐥 𝐃𝐚𝐭𝐚 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐀𝐫𝐜𝐡𝐢𝐭𝐞𝐜𝐭𝐮𝐫𝐞 𝐃𝐞𝐬𝐢𝐠𝐧
16. Design a real-time analytics platform like Uber's Michelangelo.
17. Design a scalable log aggregation and querying system like ELK.
18. Design a CDC (Change Data Capture) system using Debezium and Kafka.
19. Design a batch + streaming hybrid architecture (Lambda/Kappa).
20. Design a warehouse architecture supporting SCD.
21. Design a distributed ETL pipeline using Spark or PySpark.
22. Design a time-series data warehouse for monitoring and IoT.
23. Design an event-driven architecture for order processing using Kafka.
24. Design a metadata management system like Apache Atlas.
25. Design a data catalog and lineage tracker.
26. Design a self-healing pipeline with retry, alert, and failover.
27. Design a real-time dashboard using Kafka + Flink + Druid.
28. Design a scalable system for A/B testing analysis.
29. Design a data pipeline to feed a recommendation engine.
30. Design a multi-tenant data platform for product analytics at scale.

Start implementing these to stand out in your next Data Engineer role.

Join the community: https://lnkd.in/giE3e9yH
- 𝐌𝐨𝐜𝐤 𝐈𝐧𝐭𝐞𝐫𝐯𝐢𝐞𝐰𝐬 𝐟𝐨𝐫 𝐃𝐚𝐭𝐚 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐬: https://lnkd.in/g8Pqypt5
- 𝐈𝐧𝐭𝐞𝐫𝐯𝐢𝐞𝐰 𝐩𝐫𝐞𝐩 & 𝐏𝐫𝐨𝐯𝐞𝐧 𝐓𝐢𝐩𝐬: https://lnkd.in/gUEVYCGy
- 𝐑𝐞𝐬𝐮𝐦𝐞 𝐑𝐞𝐯𝐢𝐞𝐰 𝐚𝐧𝐝 𝐎𝐩𝐭𝐢𝐦𝐢𝐳𝐚𝐭𝐢𝐨𝐧: https://lnkd.in/gp3yZsfW

👋 Follow for more
Super useful list. Perfect prep material for DE interviews. Thanks for sharing, Nishant Kumar.
Can you make a CI/CD mechanism with GitHub/Azure DevOps? Also automation with Selenium and Python.
This is such a valuable list — thanks for putting this together! 🙌 I’ve just started my journey into Data Engineering and am currently learning Azure Data Factory. Seeing architecture questions like “design a batch ETL pipeline” or “real-time streaming system” helps me understand what skills I should focus on. I’m currently exploring how ADF can be used for batch pipelines — would love to hear how others approached these using Azure tools.
Excellent compilation—these design questions cover both fundamentals and advanced scenarios every data engineer should master.
Great Share
Super useful roundup, perfect for interview prep. Thanks for sharing!
Great Share Nishant Kumar
Very helpful, Nishant Kumar. Thanks for sharing.
These are some amazing beginner- and experienced-level architecture design questions to master, Nishant Kumar!