U B E R | Data
Hadoop Infrastructure @Uber Past, Present and Future
Mayank Bansal
Who Am I
Senior Software Engineer, Uber
Hadoop Committer, Oozie PMC
Past: Sr. Staff @ eBay, worked on Hadoop
Sr. Eng @ Yahoo, worked on Oozie
Agenda
• Past
• Challenges
• Present
• Done Along the way
• Future
• Work Ahead
Uber’s Mission
“Transportation as reliable as running water, everywhere, for everyone”
75+ Countries, 500+ Cities, and growing…
How Uber works
Data Driven Decisions
Uber’s Data Audience
● 1000s of City Operators (Uber Ops!)
○ On-the-ground teams who run and scale Uber’s transportation network
● 100s of Data Scientists and Analysts
○ Spread across various functional groups including Engineering, Marketing, BizDev, etc.
● 10s of Engineering Teams
○ Focused on building automated Data Applications
Data Infra Once Upon a Time (2014)
[Architecture diagram. Labels: Kafka Logs, Key-Val DB, RDBMS DBs, S3, Applications, …; ETL; Vertica Data Warehouse; EMR; Business Ops, A/B Experiments, Adhoc Analytics, City Ops, Data Science]
Pain Points
● Scalability
○ Data grew faster than we expected
● Reliability
○ There were no checks in place to validate
Hadoop Scale (2015)
● ~Few servers, some data
● No Hive, no Presto
● ~100 jobs/day
● Some Spark apps
Data Infrastructure Today
[Architecture diagram. Labels: Kafka8 Logs, Schemaless DB, SOA DBs, Service Accounts, …; ETL; HDFS; Hive; Spark | Presto; Vertica; Machine Learning, Experimentation, Data Science, Adhoc Analytics, Ops/Data Science, City Ops]
Hadoop Scale Today
● ~Few thousand servers
● Many, many PBs
● ~20k Hive queries/day
● ~100k Presto queries/day
● 100k jobs/day
● Few thousand Spark apps/day
A Few things we solved along the way..
● Strict Schema Management
○ Because our largest data audience is SQL savvy! (1000s of Uber Ops!)
○ SQL = Strict Schema
● Big Data Processing Tools Unlocked: Hive, Presto and Spark
○ Migrate SQL-savvy users from Vertica to Hive & Presto (1000s of Ops & 100s of data scientists & analysts)
○ Spark for more advanced users: 100s of data scientists
A Few things we solved along the way..
● Scalable Ingestion Model
○ Data grows exponentially
○ Need to think about this from the beginning
● Data Tools
○ Automated Hive registration (Hdrone)
○ Janus, an HTTP endpoint for running Hive and Presto queries
○ Used by query builder
Yay, We Did it !!!
Now What ???
Hadoop Evolution @ Uber
2014: Few nodes, some data
2015: ~100s of nodes, few PB of data
2016: ~1000 nodes, ~10s of PB of data
2017: ~5000 nodes, ~100s of PB of data
Hadoop Evolution @ ebay
3000+ nodes, 30,000+ cores, 50+ PB
Hadoop Cluster Utilization
• Over-provisioning for peak loads
• Over-capacity in anticipation of future growth
Mesos Evolution @ Uber
2014: 0 nodes
2015: Few nodes
2016: ~1000s of nodes
2017: ~10s of thousands of nodes
Hadoop Evolution @ ebay
3000+ nodes, 30,000+ cores, 50+ PB
Mesos Cluster Utilization
• Over-provisioning for peak loads
• Over-capacity in anticipation of future growth
End Goal
Online
Presto
What do we need?
GLOBAL VIEW OF RESOURCES
Available Resource Managers
Mesos vs YARN
YARN | MESOS | Note
Single Level Scheduler | Two Level Scheduler | Scales Better
Use cgroups for isolation | Use cgroups for isolation | Similar Isolation
CPU, Memory as a resource | CPU, Memory and Disk as a resource | Disk is better
Works well with Hadoop workloads | Works well with longer running services | This is Important
YARN supports time-based reservations | Mesos does not have support for reservations | Imp for batch SLAs
Dominant resource scheduling | Scheduling is done by frameworks and depends on case to case basis | Better for batch
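The "Dominant resource scheduling" entry in the comparison refers to Dominant Resource Fairness (DRF), a policy YARN schedulers can apply when tasks need several resource types at once. Below is a minimal sketch of the idea, not YARN's implementation. The names `dominant_share` and `drf_allocate`, and the cluster and demand figures, are hypothetical and chosen only for illustration. The sketch assumes each user has a fixed per-task demand and that the scheduler repeatedly serves the user with the lowest dominant share.

```python
def dominant_share(allocated, capacity):
    """Largest fraction of any single resource this user currently holds."""
    return max(allocated.get(r, 0) / capacity[r] for r in capacity)


def drf_allocate(capacity, demands):
    """Hand out tasks one at a time to the user with the lowest dominant share.

    capacity: total cluster resources, e.g. {"cpu": 9, "mem": 18}
    demands:  per-task demand for each user (hypothetical input shape)
    Returns the number of tasks launched per user.
    """
    for name, demand in demands.items():
        if not any(demand.get(r, 0) > 0 for r in capacity):
            raise ValueError(f"user {name} must demand some resource")

    used = {r: 0 for r in capacity}
    allocated = {name: {} for name in demands}
    tasks = {name: 0 for name in demands}
    active = list(demands)

    while active:
        # Pick the user who is currently furthest behind.
        name = min(active, key=lambda n: dominant_share(allocated[n], capacity))
        demand = demands[name]
        fits = all(used[r] + demand.get(r, 0) <= capacity[r] for r in capacity)
        if not fits:
            # This user's next task no longer fits; stop considering them.
            active.remove(name)
            continue
        for r in capacity:
            amount = demand.get(r, 0)
            used[r] += amount
            allocated[name][r] = allocated[name].get(r, 0) + amount
        tasks[name] += 1

    return tasks


# Illustrative cluster: 9 CPUs and 18 GB of memory.
result = drf_allocate(
    {"cpu": 9, "mem": 18},
    {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}},
)
print(result)
```

In this example user A is memory-heavy and user B is CPU-heavy, and both end up with the same dominant share, which is the property that makes this style of policy attractive for mixed batch workloads.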
In a Nutshell
YARN is good for Hadoop / Batch
Mesos is good for Longer Running Services
Let’s tie them together
• Myriad is a Mesos framework for Apache YARN
• Mesos manages data center resources
• YARN manages Hadoop workloads
• Myriad
  • Gets resources from Mesos
  • Launches Node Managers
Myriad’s Limitations: Static Resource Partitioning
• YARN will handle the resources handed over to it
• Mesos will work on the rest of the resources
Myriad’s Limitations: Resource Over Subscription
• YARN will never be able to do over subscription
• Node Manager will go away
• Fragmentation of resources
• Mesos over subscription can kill YARN too
Myriad’s Limitations
• No Global Quota Enforcement
• No Global Priorities
Myriad’s Limitations
• Elastic Resource Management
• Utilization
• Stability
• Long List …
Unified Scheduler
Few Takeaways …
• We need one scheduling layer across all workloads
• Partitioning resources is not good
• Can save at least 20-30% of resources
• Stability and simplicity win in production
• Multiple levels of resource management and scheduling will not be scalable
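The reasoning behind the partitioning takeaway can be sketched with simple arithmetic: when each workload has its own cluster sized for its own peak, total capacity is the sum of the peaks, while a shared pool only needs to cover the peak of the combined demand. The demand figures and function names below are invented for illustration and are not measurements from the talk; the resulting percentage only shows the shape of the argument.

```python
# Hypothetical hourly demand (in hosts) for two workload classes.
# These numbers are made up for illustration only.
DEMAND = {
    "online": [80, 100, 60, 40],   # peaks early
    "batch":  [30, 50, 70, 100],   # peaks late
}


def partitioned_capacity(demand):
    """Each workload gets its own cluster, sized for its own peak."""
    return sum(max(series) for series in demand.values())


def shared_capacity(demand):
    """One pool, sized for the peak of the combined demand."""
    return max(sum(step) for step in zip(*demand.values()))


partitioned = partitioned_capacity(DEMAND)   # 100 + 100 = 200
shared = shared_capacity(DEMAND)             # max(110, 150, 130, 140) = 150
saving = 1 - shared / partitioned            # 0.25 for these made-up numbers

print(f"partitioned={partitioned} shared={shared} saving={saving:.0%}")
```

The saving depends entirely on how far apart the peaks are; if both workloads peak at the same time, the shared pool needs as much capacity as the partitioned setup.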
High Level Characteristics
• Global Quota Management
• Central Scheduling policies
• Over subscription for both Online and Batch
• Isolation and bin packing
• SLA guarantees at Global Level
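As an illustration of the bin packing characteristic listed above, here is a minimal first-fit-decreasing sketch over a single resource dimension. It is a textbook heuristic used as a stand-in, not a description of how any particular scheduler places tasks; the function name, host capacity and task sizes are hypothetical.

```python
def first_fit_decreasing(task_sizes, host_capacity):
    """Pack tasks onto as few equal-sized hosts as the heuristic manages.

    Tasks are sorted largest first, and each task goes to the first host
    that still has room. Returns one list of task sizes per host.
    """
    hosts = []   # tasks placed on each host
    free = []    # remaining capacity on each host
    for size in sorted(task_sizes, reverse=True):
        if size > host_capacity:
            raise ValueError(f"task of size {size} cannot fit on any host")
        for i, remaining in enumerate(free):
            if size <= remaining:
                hosts[i].append(size)
                free[i] -= size
                break
        else:
            # No existing host has room, so open a new one.
            hosts.append([size])
            free.append(host_capacity - size)
    return hosts


# Six tasks (CPU cores each) packed onto hosts with 10 cores.
placement = first_fit_decreasing([4, 8, 1, 4, 2, 1], host_capacity=10)
print(placement)
```

A real scheduler would pack across several dimensions (CPU, memory, disk, GPU) and respect isolation constraints, so this single-dimension version only conveys the basic idea of filling hosts tightly.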
Unified Scheduler
Different Schedulers
Peloton
Peloton - Architecture
Peloton – Initial Results
Peloton – Done so far
• Batch Workloads Support
• Spark Support
• GPU support
• Distributed TensorFlow support
• Gang Scheduling
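Gang scheduling means that a group of tasks is placed all together or not at all, which matters for workloads where every worker must be running before progress can be made. The sketch below shows that all-or-nothing rule under simplifying assumptions: one resource type (CPUs), identical tasks, and a hypothetical `place_gang` helper. It does not reflect Peloton's actual API or placement logic.

```python
def place_gang(free_cpus, task_cpus, gang_size):
    """Try to place `gang_size` identical tasks, all or nothing.

    free_cpus: mapping of host name to free CPUs; updated only on success.
    Returns {host: number_of_tasks} if the whole gang fits, otherwise None.
    """
    plan = {}
    remaining = gang_size
    for host, free in free_cpus.items():
        fit = min(free // task_cpus, remaining)
        if fit > 0:
            plan[host] = fit
            remaining -= fit
        if remaining == 0:
            break

    if remaining > 0:
        # Part of the gang would be left waiting, so place nothing.
        return None

    # Commit the plan only once every task has a slot.
    for host, count in plan.items():
        free_cpus[host] -= count * task_cpus
    return plan


cluster = {"h1": 4, "h2": 2, "h3": 8}
first = place_gang(cluster, task_cpus=2, gang_size=4)    # fits across three hosts
second = place_gang(cluster, task_cpus=2, gang_size=4)   # only 3 slots left
print(first, second, cluster)
```

The key design point is that the cluster state is modified only after the full plan is known to fit, so a gang that cannot be fully placed never holds resources while waiting.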
Peloton – WIP
• YARN APIs
• Stateful and stateless services
• Separate placement engines
  • Stateful
  • Stateless
• Control panel
• Peloton deploys Peloton
Peloton – Timelines
• Beta Released
• Production Early Q3
• Open Source – Q3-Q4 time frame
Peloton – Team
Min Shi Jimmy Eskil
Zhitao Tengfei Anant Mayank
Questions?
mabansal@uber.com
mayank@apache.org
Thank You !!!
