an introduction to pinot
Jean-François Im <jfim@linkedin.com>
2016-01-04 Tue
outline
Introduction
When to use Pinot?
An overview of the Pinot architecture
Managing Data in Pinot
Data storage
Realtime data in Pinot
Retention
Conclusion
introduction
what is pinot?
∙ Distributed near-realtime OLAP datastore
∙ Used at LinkedIn for various user-facing (“Who viewed
my profile,” publisher analytics, etc.), client-facing (ad
campaign creation and tracking) and internal analytics
(XLNT, EasyBI, Raptor, etc.)
what is pinot?
∙ Offers a SQL query interface on top of a custom-written
data store
∙ Offers near-realtime ingestion of events from Kafka (a
few seconds latency at most)
∙ Supports pushing data from Hadoop
∙ Can combine data from Hadoop and Kafka at runtime
∙ Scales horizontally and linearly if data size or query
rate increases
∙ Fault tolerant (any component can fail without causing
availability issues, no single point of failure)
∙ Automatic data expiration
example of queries
SELECT
weeksSinceEpochSunday,
distinctCount(viewerId)
FROM mirrorProfileViewEvents
WHERE vieweeId = ... AND
(viewerPrivacySetting = 'F' OR
... OR viewerPrivacySetting = '') AND
daysSinceEpoch >= 16624 AND
daysSinceEpoch <= 16714
GROUP BY weeksSinceEpochSunday
TOP 20 LIMIT 0
how does “who viewed my profile” work?
usage of pinot at linkedin
∙ Over 50 use cases at LinkedIn
∙ Several thousands of queries per second across
multiple data centers
∙ Operates 24x7, exposes metrics for production
monitoring
∙ The internal de facto solution for scalable data
querying
when to use pinot?
design limitations
∙ Pinot is designed for analytical workloads (OLAP), not
transactional ones (OLTP)
∙ Data in Pinot is immutable (e.g., no UPDATE statement),
though it can be overwritten in bulk
∙ Realtime data is append-only (can only load new rows)
∙ There is no support for JOINs or subselects
∙ There are no UDFs for aggregation (work in progress)
when to use pinot?
∙ When you have an analytics problem (How many of “x”
happened?)
∙ When you have many queries per day and require low
query latency (otherwise use Hadoop for one-time ad
hoc queries)
∙ When you can’t pre-aggregate data to be stored in
some other storage system (otherwise use Voldemort
or an OLAP cubing solution)
an overview of the pinot
architecture
controller, broker and server
∙ There are three components in Pinot: Controller, broker
and server
∙ Controller: Handles cluster-wide coordination using
Apache Helix and Apache Zookeeper
∙ Broker: Handles query fan out and query routing to
servers
∙ Server: Responds to query requests originating from
the brokers
controller, broker and server
∙ All of these components are redundant, so there is no
single point of failure by design
∙ Uses Zookeeper as a coordination mechanism
managing data in pinot
getting data into pinot
∙ Let’s first look at the offline case. We have data in
Hadoop that we would like to get into Pinot.
getting data into pinot
∙ Data in pinot is packaged into segments, which contain
a set of rows
∙ These are then uploaded into Pinot
getting data into pinot
∙ A segment is a pre-built index over this set of rows
∙ Data in Pinot is stored in columnar format (we’ll get to
this later)
∙ Each input Avro file maps to one Pinot segment
getting data into pinot
∙ Each segment file that is generated contains both the
minimum and maximum timestamp contained in the
data
∙ Each segment file also has a sequential number
appended to the end
∙ mirrorProfileViewEvents_2015-10-04_2015-10-04_0
∙ mirrorProfileViewEvents_2015-10-04_2015-10-04_1
∙ mirrorProfileViewEvents_2015-10-04_2015-10-04_2
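As a hedged illustration (the exact code is internal to Pinot, but the naming pattern is visible in the examples above), a segment name can be composed from the table name, the minimum and maximum dates in the data, and a sequence number:

```python
# Illustrative sketch only, not Pinot's actual implementation:
# names follow the pattern <tableName>_<minDate>_<maxDate>_<sequenceNumber>
def segment_name(table: str, min_date: str, max_date: str, seq: int) -> str:
    return f"{table}_{min_date}_{max_date}_{seq}"

names = [
    segment_name("mirrorProfileViewEvents", "2015-10-04", "2015-10-04", i)
    for i in range(3)
]
# names[0] == 'mirrorProfileViewEvents_2015-10-04_2015-10-04_0'
```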
getting data into pinot
∙ Data uploaded into Pinot is stored on a segment basis
∙ Uploading a segment with the same name overwrites
the data that currently exists in that segment
∙ This is the only way to update data in Pinot
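A toy model of this overwrite-by-name behavior (a sketch of the semantics, not Pinot code):

```python
# Toy model: a store keyed by segment name, where re-uploading under
# the same name replaces the previous contents of that segment.
store = {}

def upload_segment(name, rows):
    store[name] = rows  # same name -> overwrite; this is the only "update" path

upload_segment("events_2015-10-04_2015-10-04_0", [{"viewerId": 1}])
upload_segment("events_2015-10-04_2015-10-04_0",
               [{"viewerId": 1}, {"viewerId": 2}])
# The second upload replaced the first; the segment now holds 2 rows,
# it did not grow to 3.
```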
data storage
data orientation: rows and columns
∙ Most OLTP databases store data in a row-oriented
format
∙ Pinot stores its data in a column-oriented format
∙ If you have heard the terms array of structures (AoS)
and structure of arrays (SoA), this is the same idea
benefits of column-orientation
∙ Queries only read the data they need (columns not
used in a query are not read)
∙ Individual row lookups are slower, aggregations are
faster
∙ Compression can be a lot more effective, as related
data is packed together
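The difference can be sketched in a few lines of Python (a toy model of the two layouts, not Pinot internals):

```python
# Row-oriented (array of structures): each record carries every field,
# so an aggregation walks over whole records.
rows = [
    {"country": "US", "views": 3},
    {"country": "CA", "views": 5},
    {"country": "US", "views": 2},
]
total_row = sum(r["views"] for r in rows)  # touches every record

# Column-oriented (structure of arrays): only the 'views' column is
# scanned; the 'country' column is never read.
columns = {"country": ["US", "CA", "US"], "views": [3, 5, 2]}
total_col = sum(columns["views"])

assert total_row == total_col == 10
```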
a couple of tricks
∙ Pinot uses a couple of techniques to reduce data size
∙ Dictionary encoding allows us to deduplicate repetitive
data in a single column (e.g., country, state, gender)
∙ Bit packing allows us to pack multiple values in the
same byte/word/dword
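A minimal sketch of both techniques (illustrative only; Pinot's actual encodings are more sophisticated):

```python
import math

# Dictionary encoding: replace repeated strings with small integer ids.
values = ["US", "CA", "US", "US", "CA"]
dictionary = sorted(set(values))             # ['CA', 'US']
ids = [dictionary.index(v) for v in values]  # [1, 0, 1, 1, 0]

# Bit packing: with a 2-entry dictionary each id needs only 1 bit,
# so all five ids fit into a single byte.
bits_per_id = max(1, math.ceil(math.log2(len(dictionary))))
packed = 0
for i, v in enumerate(ids):
    packed |= v << (i * bits_per_id)
# packed == 0b01101 == 13: five values in one byte instead of five strings
```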
realtime data in pinot
tables: offline and realtime
∙ Pinot has two kinds of tables: offline and realtime
∙ An offline table stores data that has been pushed from
Hadoop, while a realtime table sources its data from Kafka
∙ These two tables are disjoint and can contain the same
data
data ingestion
∙ Realtime data ingestion is done through Kafka
∙ In the open source release, there is a JSON decoder
and an Avro decoder for messages
∙ This architecture allows plugging in new data ingestion
sources (e.g., other message queuing systems), though
at this time there are no other sources implemented
hybrid querying
∙ Since realtime and offline tables are disjoint, how are
they queried?
∙ If an offline and realtime table have the same name,
when a broker receives a query, it rewrites it to two
queries, one for the offline and one for the realtime
table
hybrid querying
∙ Data is partitioned according to a time column, with a
preference given to offline data
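A hedged sketch of this time-based routing (the broker's actual boundary logic is internal; the boundary value and the comparison direction here are assumptions for illustration):

```python
# Toy model: given a time boundary, the offline table answers everything
# up to the boundary (offline data preferred) and the realtime table
# answers the remainder.
def route(days_since_epoch: int, boundary: int) -> str:
    return "offline" if days_since_epoch <= boundary else "realtime"

boundary = 16710  # e.g., the last day fully covered by a Hadoop push

assert route(16624, boundary) == "offline"   # older data: offline table
assert route(16714, boundary) == "realtime"  # recent data: realtime table
```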
data
∙ Since there are two data sources for the same data, if
there is an issue with one (e.g., a Kafka/Samza issue or a
Hadoop cluster issue), the other one is used to answer
queries
∙ This means that you don’t get called in the middle of
the night for data-related issues and there’s a large
time window for fixing issues
retention
retention
∙ Tables in Pinot can have a customizable retention
period
∙ Segments will be expunged automatically when their
last timestamp is past the retention period
∙ This is done by a process called the retention manager
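The retention manager's decision can be modeled roughly as follows (a simplified sketch; the real process works on segment metadata rather than a dictionary):

```python
import time

# Toy sketch: drop any segment whose latest timestamp has aged out of
# the table's retention window.
def expired_segments(segments, retention_days, now=None):
    now = now if now is not None else time.time()
    cutoff = now - retention_days * 86400  # retention window in seconds
    return [name for name, max_ts in segments.items() if max_ts < cutoff]

now = 1_600_000_000
segments = {
    "seg_old": now - 100 * 86400,  # last timestamp 100 days ago
    "seg_new": now - 1 * 86400,    # last timestamp 1 day ago
}
expired_segments(segments, retention_days=90, now=now)  # ['seg_old']
```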
retention
∙ Offline and realtime tables have different retention
periods. For example, “who viewed my profile?” has a
realtime retention of seven days and an offline
retention period of 90 days.
∙ This means that even if the Hadoop job doesn’t run for
a couple of days, data from the realtime flow will
answer the query
conclusion
conclusion
∙ Pinot is a distributed, near-realtime analytical data
store that can handle interactive queries over large
amounts of data
∙ It’s used for various internal and external use-cases at
LinkedIn
∙ It’s open source! (github.com/linkedin/pinot)
∙ Ping me if you want to deploy it, I’ll help you out
