1© Cloudera, Inc. All rights reserved.
Apache Kudu
Updatable Analytical Storage for Modern Data Platform
Sho Shimauchi | Sales Engineer | Cloudera
Who Am I?
Sho Shimauchi
Sales Engineer / Technical Evangelist
Joined Cloudera in 2011
The First Employee in Cloudera APJ
Email: sho@cloudera.com
Twitter: @shiumachi
•  Founded in 2008
•  1600+ Clouderans
•  Machine learning and analytics platform
•  Shared data experience
•  Cloud-native and cloud-differentiated
•  Open-source innovation and efficiency
Rakuten Card + Cloudera
Rakuten Card replaced its mainframe with Cloudera Enterprise in 2017
Apache Spark improved the performance of the batch processes by more than 2x
Please join Cloudera World Tokyo 2017 to see Kobayashi-san’s keynote!
www.clouderaworldtokyo.com
Why Kudu?
Use Cases and Motivation
Cloudera Enterprise
The modern platform for machine learning and analytics optimized for the cloud
[Diagram: Cloudera Enterprise platform layers]
•  Core services: Data Engineering, Operational Database, Analytic Database, Data Science
•  Extensible services: new offerings
•  Data Catalog, Ingest & Replication, Security, Governance, Workload Management
•  Storage services: Amazon S3, Microsoft ADLS, HDFS, Kudu
Filling the Analytic Gap
[Chart: storage engines plotted by pace of analysis against pace of data (unchanging, append-only, fast changing with frequent updates)]
•  HDFS: fast scans, analytics, and processing of stored data
•  HBase: fast online updates and data serving
•  Kudu fills the gap: fast analytics on fast-changing or frequently updated data
•  Other chart labels: Arbitrary Storage (Active Archive), Real-Time, Analytic Gap
Modern analytic applications often require complex data flows and difficult integration work to move data between HBase and HDFS
Apache Kudu: Scalable and fast structured storage
Scalable
•  Tested on clusters of 300+ nodes (petabytes of data)
•  Designed to scale to 1000s of nodes and tens of PBs
Fast
•  Millions of read/write operations per second across cluster
•  Multiple GB/second read throughput per node
Tabular
•  Represents data in structured tables like a relational database
•  Strict schema, finite column count, no BLOBs
•  Individual record-level access to 100+ billion row tables
Apache Kudu Community
Why Kudu?
Time Series Data
Can you insert time series data in real time? How long does it take to prepare it for analysis? Can you get results and act fast enough to change outcomes?
Machine Data Analytics
Can you handle large volumes of machine-generated data? Do you have the tools to identify problems or threats? Can your system do machine learning?
Online Reporting
How fast can you add data to your data store? Are you trading off the ability to do broad analytics for the ability to make updates? Are you retaining only part of your data?
Next generation hardware
Solid-state Storage
Cheaper and faster every year. Persistent memory (3D XPoint™). Kudu can take advantage of SSD and NVM using Intel’s NVM Library.
Cheaper, Bigger Memory
RAM is cheaper and bigger every day. Kudu runs smoothly with huge RAM. Written in C++ to avoid GC issues.
Efficiency on Modern CPUs
Modern CPUs are adding cores and SIMD width, not GHz. Kudu takes advantage of SIMD instructions and concurrent data structures.
How it Works
Replication And Fault Tolerance
Tables, tablets, and tablet servers
•  Each table is horizontally partitioned into tablets
•  Range or hash partitioning
•  PRIMARY KEY (host, metric, timestamp) DISTRIBUTE BY HASH(timestamp) INTO 100 BUCKETS
•  Each tablet has N replicas (3 or 5) with Raft consensus
•  Automatic fault tolerance
•  MTTR (mean time to repair): ~5 seconds
•  Tablet servers host tablets on local disk drives
•  Master services metadata operations
•  Create/drop tables and tablets
•  Locate tablets
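The hash-partitioning bullet above can be illustrated with a minimal Python sketch. The function name `bucket_for` and the use of CRC32 are assumptions made for illustration only; Kudu's actual hash function and key encoding are different.

```python
import zlib

NUM_BUCKETS = 100  # mirrors "INTO 100 BUCKETS" in the example above


def bucket_for(timestamp: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Map the hash-partitioning column value to a bucket (tablet) index.

    CRC32 is a stand-in chosen because it is deterministic; it is NOT
    the hash Kudu uses.
    """
    encoded = timestamp.to_bytes(8, "big", signed=True)
    return zlib.crc32(encoded) % num_buckets


rows = [
    ("host1", "cpu", 1442865158),
    ("host1", "cpu", 1442865159),
    ("host2", "mem", 1442865158),
]
# Only the timestamp is hashed, so rows that share a timestamp share a bucket
# regardless of host and metric.
placement = {row: bucket_for(row[2]) for row in rows}
```

Because only `timestamp` is hashed here, writes arriving at different times spread across the buckets, while rows with the same timestamp always land together.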
Metadata
Replicated master
Acts as a tablet directory
Acts as a catalog (which tables exist, etc)
Acts as a load balancer (tracks TS liveness, re-replicates under-replicated tablets)
Caches all metadata in RAM for high performance
Client configured with master addresses
Asks master for tablet locations as needed and caches them
Client->Master: Hey Master! Where is the row for ‘tlipcon’ in table “T”?
Master->Client: It’s part of tablet 2, which is on servers {Z,Y,X}. BTW, here’s info on other tablets you might care about: T1, T2, T3, …
Client->Tablet server: UPDATE tlipcon SET col=foo
Client Meta Cache: T1: …, T2: …, T3: …
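The lookup-then-cache exchange above can be sketched in a few lines of Python. Everything here is a toy model under stated assumptions: the `Master` and `Client` classes, the key ranges, and the server names are hypothetical and do not reflect the Kudu client API.

```python
from bisect import bisect_right


class Master:
    """Toy tablet directory; each tablet owns keys from its start key onward."""

    def __init__(self):
        # (start_key, tablet_id, replica servers), sorted by start_key.
        # The ranges and the servers for T1/T3 are made up for the example.
        self.tablets = [
            ("", "T1", ["X", "Y", "Z"]),
            ("m", "T2", ["Z", "Y", "X"]),
            ("w", "T3", ["X", "Z", "W"]),
        ]
        self.lookups = 0

    def locate(self, key):
        self.lookups += 1
        # Answer the question and volunteer info on the other tablets too.
        return list(self.tablets)


class Client:
    def __init__(self, master):
        self.master = master
        self.meta_cache = []  # filled lazily from the master's reply

    def find_tablet(self, key):
        if not self.meta_cache:
            self.meta_cache = self.master.locate(key)
        starts = [t[0] for t in self.meta_cache]
        idx = bisect_right(starts, key) - 1  # last tablet starting at or before key
        _, tablet_id, servers = self.meta_cache[idx]
        return tablet_id, servers


master = Master()
client = Client(master)
first = client.find_tablet("tlipcon")   # asks the master, then caches
second = client.find_tablet("adar")     # served from the meta cache
```

After the first call the client resolves further keys locally, which is why the master stays off the hot path for reads and writes.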
Raft consensus
[Diagram: a client and three tablet servers, each with its own WAL. TS A hosts Tablet 1 (LEADER); TS B and TS C host Tablet 1 (FOLLOWER)]
1a. Client->Leader: Write() RPC
2a. Leader->Followers: UpdateConsensus() RPC
2b. Leader writes local WAL
3. Follower: write WAL
4. Follower->Leader: success
5. Leader has achieved majority
6. Leader->Client: Success!
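The numbered write steps above can be traced with a small, single-threaded Python sketch. It is a simplification under stated assumptions: the `Replica` class and `replicated_write` function are hypothetical, and real Raft also handles terms, log indexes, leader election, and concurrent RPCs, none of which appear here.

```python
class Replica:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.wal = []  # write-ahead log entries, in order

    def append(self, entry):
        """Append to the local WAL; return True on success."""
        if not self.healthy:
            return False
        self.wal.append(entry)
        return True


def replicated_write(leader, followers, entry):
    """Return True once a majority of replicas hold the entry in their WAL."""
    acks = 1 if leader.append(entry) else 0   # 2b. leader writes local WAL
    for follower in followers:                # 2a. UpdateConsensus() RPC
        if follower.append(entry):            # 3. follower writes WAL
            acks += 1                         # 4. follower -> leader: success
    majority = (1 + len(followers)) // 2 + 1
    return acks >= majority                   # 5. majority reached, 6. reply


ts_a = Replica("TS A")
ts_b = Replica("TS B")
ts_c = Replica("TS C", healthy=False)  # one follower is down
ok = replicated_write(ts_a, [ts_b, ts_c], "UPDATE tlipcon SET col=foo")
```

With three replicas the majority is two, so the write still succeeds with one follower unavailable; with both followers down it would not be acknowledged.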
How it Works
Columnar Storage
Row Storage
Scans have to read all the data, no encodings
Tweet_id, user_name, created_at, text
{23059873, newsycbot, 1442865158, Visual exp…}
{22309487, RideImpala, 1442828307, Introducing …}
…
Columnar Storage
Tweet_id: {25059873, 22309487, 23059861, 23010982}
User_name: {newsycbot, RideImpala, fastly, llvmorg}
Created_at: {1442865158, 1442828307, 1442865156, 1442865155}
text: {Visual exp…, Introducing .., Missing July…, LLVM 3.7….}
Columnar Storage
SELECT COUNT(*) FROM tweets WHERE user_name = 'newsycbot';
Only read 1 column
Tweet_id (1GB): {25059873, 22309487, 23059861, 23010982}
User_name (2GB): {newsycbot, RideImpala, fastly, llvmorg}
Created_at (1GB): {1442865158, 1442828307, 1442865156, 1442865155}
text (200GB): {Visual exp…, Introducing .., Missing July…, LLVM 3.7….}
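The single-column query above can be mimicked with a short Python sketch over the slide's sample values. The dictionary layout and the `count_where` helper are illustrative assumptions, not Kudu's on-disk format; the sizes are the ones shown on the slide.

```python
# One list per column, as in a column store.
columns = {
    "tweet_id": [25059873, 22309487, 23059861, 23010982],
    "user_name": ["newsycbot", "RideImpala", "fastly", "llvmorg"],
    "created_at": [1442865158, 1442828307, 1442865156, 1442865155],
    "text": ["Visual exp…", "Introducing ..", "Missing July…", "LLVM 3.7…."],
}
# Per-column sizes taken from the slide.
size_gb = {"tweet_id": 1, "user_name": 2, "created_at": 1, "text": 200}


def count_where(cols, column, value):
    """COUNT(*) with an equality predicate; touches only the named column."""
    return sum(1 for v in cols[column] if v == value)


matches = count_where(columns, "user_name", "newsycbot")
gb_read = size_gb["user_name"]      # data the query has to read
gb_total = sum(size_gb.values())    # data a row store would have to read
```

With the slide's sizes, the query reads 2 GB out of 204 GB, which is the point of "Only read 1 column".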
Columnar Compression
Created_at: {1442825158, 1442826100, 1442827994, 1442828527}
Created_at      Diff(created_at)
1442825158      n/a
1442826100      942
1442827994      1894
1442828527      533
64 bits each    11 bits each
Many columns can compress to a few bits per row!
Especially:
•  Timestamps
•  Time series values
•  Low-cardinality strings
Massive space savings and throughput increase!
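The diff column above is plain delta encoding, which a few lines of Python can reproduce. This is a sketch that assumes the values are sorted ascending, so every delta is non-negative; the function names are illustrative, and Kudu's actual encodings are more elaborate.

```python
def delta_encode(values):
    """Keep the first value and the differences between neighbours."""
    first = values[0]
    deltas = [b - a for a, b in zip(values, values[1:])]
    return first, deltas


def delta_decode(first, deltas):
    """Rebuild the original values by running a cumulative sum."""
    out = [first]
    for d in deltas:
        out.append(out[-1] + d)
    return out


created_at = [1442825158, 1442826100, 1442827994, 1442828527]
first, deltas = delta_encode(created_at)
# Bits needed to store the largest delta (valid for non-negative deltas).
bits_per_delta = max(deltas).bit_length()
```

The largest delta, 1894, is below 2^11 = 2048, which is where the slide's "11 bits each" comes from.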
How it Works
Write and Read Paths
LSM vs Kudu
LSM – Log Structured Merge (Cassandra, HBase, etc)
Inserts and updates all go to an in-memory map (MemStore) and later
flush to on-disk files (HFile/SSTable)
Reads perform an on-the-fly merge of all on-disk HFiles
Kudu
Shares some traits (memstores, compactions)
More complex.
Slower writes in exchange for faster reads (especially scans)
LSM Insert Path
INSERT -> MemStore
Row=r1 col=c1 val=“blah”
Row=r1 col=c2 val=“1”
flush -> HFile 1
Row=r1 col=c1 val=“blah”
Row=r1 col=c2 val=“1”
LSM Insert Path
INSERT -> MemStore
Row=r2 col=c1 val=“blah2”
Row=r2 col=c2 val=“2”
flush -> HFile 2
Row=r2 col=c1 val=“blah2”
Row=r2 col=c2 val=“2”
HFile 1
Row=r1 col=c1 val=“blah”
Row=r1 col=c2 val=“1”
LSM Update path
UPDATE -> MemStore
Row=r2 col=c1 val=“newval”
HFile 1
Row=r1 col=c1 val=“blah”
Row=r1 col=c2 val=“2”
HFile 2
Row=r2 col=c1 val=“v2”
Row=r2 col=c2 val=“5”
Note: all updates are “fully decoupled” from reads.
Random-write workload is transformed to fully sequential!
LSM Read path
MemStore
Row=r2 col=c1 val=“newval”
HFile 1
Row=r1 col=c1 val=“blah”
Row=r1 col=c2 val=“2”
HFile 2
Row=r2 col=c1 val=“v2”
Row=r2 col=c2 val=“5”
Merge based on string row keys:
R1: c1=blah c2=2
R2: c1=newval c2=5
….
CPU intensive! Must always read rowkeys
Any given row may exist across multiple HFiles: must always merge!
The more HFiles to merge, the slower it reads
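The merge-on-read behaviour above can be sketched in Python using the slide's sample cells. This is a simplified model under stated assumptions: the function name `lsm_read` is hypothetical, and store age stands in for the per-cell timestamps a real LSM store such as HBase uses to decide which version wins.

```python
import heapq


def lsm_read(stores):
    """Merge sorted cell lists from several stores, oldest store first.

    Each store is a list of (row, col, val) sorted by (row, col). The merge
    compares string row keys for every cell, which is the CPU cost the
    slide refers to.
    """
    tagged = (
        [(row, col, age, val) for row, col, val in store]
        for age, store in enumerate(stores)
    )
    merged = {}
    for row, col, _age, val in heapq.merge(*tagged):
        # For equal (row, col), newer stores sort later and overwrite.
        merged.setdefault(row, {})[col] = val
    return merged


hfile1 = [("r1", "c1", "blah"), ("r1", "c2", "2")]
hfile2 = [("r2", "c1", "v2"), ("r2", "c2", "5")]
memstore = [("r2", "c1", "newval")]
result = lsm_read([hfile1, hfile2, memstore])
```

Every additional HFile adds another input to the merge, so read cost grows with the number of files until a compaction reduces it.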
Kudu storage – Inserts and Flushes
INSERT(“todd”, “$1000”, “engineer”) -> MemRowSet
flush -> DiskRowSet 1 (name, pay, role)
Kudu storage – Inserts and Flushes
INSERT(“doug”, “$1B”, “Hadoop man”) -> MemRowSet
flush -> DiskRowSet 2 (name, pay, role)
DiskRowSet 1 (name, pay, role)
Kudu storage - Updates
MemRowSet
DiskRowSet 1: base data (name, pay, role) + Delta MS
DiskRowSet 2: base data (name, pay, role) + Delta MS
Each DiskRowSet has its own DeltaMemStore to accumulate updates
Kudu storage - Updates
UPDATE set pay=“$1M” WHERE name=“todd”
Is the row in DiskRowSet 2? (check bloom filters) Bloom says: no!
Is the row in DiskRowSet 1? (check bloom filters) Bloom says: maybe!
Search key column to find offset: rowid = 150
Delta MS of DiskRowSet 1 records 150: col 1=$1M
base data
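The update steps above (bloom check, key search, delta recorded by row offset) can be sketched as follows. This is a toy model under stated assumptions: the `BloomFilter` and `DiskRowSet` classes, the SHA-256-based hashing, and the filter size are all hypothetical, and updates to rows still in the MemRowSet are not modelled.

```python
import hashlib
from bisect import bisect_left


class BloomFilter:
    """Minimal bloom filter: may return false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class DiskRowSet:
    def __init__(self, rows):
        # rows: (name, pay, role) tuples, stored sorted by primary key (name)
        self.rows = sorted(rows)
        self.keys = [r[0] for r in self.rows]
        self.bloom = BloomFilter()
        for key in self.keys:
            self.bloom.add(key)
        self.delta_mem_store = {}  # rowid -> {column: new value}

    def try_update(self, key, column, value):
        if not self.bloom.might_contain(key):    # "Bloom says: no!"
            return False
        rowid = bisect_left(self.keys, key)      # search key column for offset
        if rowid == len(self.keys) or self.keys[rowid] != key:
            return False                         # bloom false positive
        self.delta_mem_store.setdefault(rowid, {})[column] = value
        return True


def update(rowsets, key, column, value):
    """Try each rowset in turn; stop at the first one that holds the key."""
    return any(rs.try_update(key, column, value) for rs in rowsets)


drs1 = DiskRowSet([("todd", "$1000", "engineer")])
drs2 = DiskRowSet([("doug", "$1B", "Hadoop man")])
found = update([drs2, drs1], "todd", "pay", "$1M")
```

The base data is left untouched; the new value lives only in the rowset's delta store, keyed by the integer row offset (0 in this one-row example, 150 on the slide).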
Kudu storage – Read path
Read rows in DiskRowSet 2
Then, read rows in DiskRowSet 1 (Delta MS holds 150: pay=$1M)
Any row is only in exactly one DiskRowSet – no need to merge cross-DRS!
Updates are merged based on ordinal offset within DRS: array indexing, no string compares
base data
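The read path described above can be sketched in Python: each rowset is scanned on its own and deltas are applied by integer row offset. The data layout and function names are assumptions for illustration, and the row offset is 0 rather than the slide's 150 because the example rowset holds a single row.

```python
def scan_rowset(base_columns, deltas):
    """Yield rows from one DiskRowSet with its deltas applied by row offset."""
    names = list(base_columns)
    num_rows = len(base_columns[names[0]])
    for rowid in range(num_rows):
        row = {name: base_columns[name][rowid] for name in names}
        row.update(deltas.get(rowid, {}))  # integer lookup, no key comparison
        yield row


def scan_tablet(rowsets):
    """Each row lives in exactly one rowset, so results are just concatenated."""
    for base_columns, deltas in rowsets:
        yield from scan_rowset(base_columns, deltas)


drs2 = ({"name": ["doug"], "pay": ["$1B"], "role": ["Hadoop man"]}, {})
drs1 = (
    {"name": ["todd"], "pay": ["$1000"], "role": ["engineer"]},
    {0: {"pay": "$1M"}},  # delta recorded against row offset 0
)
rows = list(scan_tablet([drs2, drs1]))
```

Unlike the LSM merge, nothing here compares row keys across files: the only join is between a base row and the deltas stored under the same offset.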
Kudu storage – Delta flushes
Flush: Delta MS -> REDO DeltaFile (0: pay=foo)
A REDO delta indicates how to transform between the ‘base data’ (columnar) and a later version
base data
Kudu storage – Major delta compaction
DiskRowSet (pre-compaction): base data (name, pay, role), Delta MS, three REDO DeltaFiles
Many deltas accumulate: lots of delta application work on reads
DiskRowSet (post-compaction): base data (name, pay, role), Delta MS, unmerged REDO deltas, UNDO deltas
Merge updates for columns with high update percentage
If a column has few updates, it doesn’t need to be rewritten: those deltas are maintained in a new DeltaFile
base data
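A rough Python sketch of the compaction above: REDO deltas for heavily updated columns are folded into the base data (leaving UNDO deltas behind), while deltas for rarely updated columns stay as REDO. The function name, the data layout, and the `threshold` parameter are all assumptions; the slide does not say how Kudu decides which columns count as heavily updated.

```python
def major_delta_compact(base, redo, threshold=0.6):
    """base: {col: [values]}; redo: {rowid: {col: new_value}}.

    Returns (new_base, remaining_redo, undo). `threshold` is a hypothetical
    fraction of updated rows above which a column is rewritten.
    """
    num_rows = len(next(iter(base.values())))
    updates_per_col = {}
    for changes in redo.values():
        for col in changes:
            updates_per_col[col] = updates_per_col.get(col, 0) + 1
    hot = {c for c, n in updates_per_col.items() if n / num_rows >= threshold}

    new_base = {c: list(v) for c, v in base.items()}
    undo, remaining_redo = {}, {}
    for rowid, changes in redo.items():
        for col, value in changes.items():
            if col in hot:
                # Remember the old value so earlier versions stay readable.
                undo.setdefault(rowid, {})[col] = base[col][rowid]
                new_base[col][rowid] = value
            else:
                remaining_redo.setdefault(rowid, {})[col] = value
    return new_base, remaining_redo, undo


base = {"name": ["doug", "todd"], "pay": [1, 2], "role": ["x", "y"]}
redo = {0: {"pay": 10}, 1: {"pay": 20, "role": "z"}}
new_base, remaining_redo, undo = major_delta_compact(base, redo)
```

Here `pay` is updated in every row and gets rewritten, while the single `role` update stays in the remaining REDO deltas.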
Kudu storage – RowSet Compactions
Before compaction:
DRS 1 (32MB): [PK=alice], [PK=joe], [PK=linda], [PK=zach]
DRS 2 (32MB): [PK=bob], [PK=jon], [PK=mary], [PK=zeke]
DRS 3 (32MB): [PK=carl], [PK=julie], [PK=omar], [PK=zoe]
Writes for “chris” have to perform bloom lookups on all 3 RS
After compaction:
DRS 4 (32MB): [alice, bob, carl, joe]
DRS 5 (32MB): [jon, julie, linda, mary]
DRS 6 (32MB): [omar, zach, zeke, zoe]
Reorganize rows to avoid rowsets with overlapping key ranges
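The effect of the rowset compaction above can be checked with a short Python sketch using the slide's keys. The helper names and the fixed rows-per-rowset split are assumptions; Kudu sizes rowsets by bytes, not by row count.

```python
import heapq


def overlapping(rowsets, key):
    """Rowsets whose [min, max] key range could contain `key`."""
    return [rs for rs in rowsets if rs[0] <= key <= rs[-1]]


def compact(rowsets, rows_per_rowset):
    """Merge sorted rowsets and re-split them into non-overlapping ranges."""
    merged = list(heapq.merge(*rowsets))
    return [
        merged[i:i + rows_per_rowset]
        for i in range(0, len(merged), rows_per_rowset)
    ]


before = [
    ["alice", "joe", "linda", "zach"],   # DRS 1
    ["bob", "jon", "mary", "zeke"],      # DRS 2
    ["carl", "julie", "omar", "zoe"],    # DRS 3
]
after = compact(before, rows_per_rowset=4)

candidates_before = len(overlapping(before, "chris"))
candidates_after = len(overlapping(after, "chris"))
```

Before compaction, "chris" falls inside the key range of all three rowsets; afterwards only one rowset can contain it, so a write needs a single lookup.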
Kudu Storage - Compactions
Main Idea: Always be compacting!
Compactions run continuously to prevent IO storms
“Budgeted” RS compactions: What is the best way to spend X MB of IO?
Physical/Logical decoupling: different replicas run compactions at different times
Conclusion
Getting Started
On the web: https://www.cloudera.com/documentation/kudu/latest.html, https://www.cloudera.com/downloads.html, https://blog.cloudera.com/?s=Kudu, kudu.apache.org
•  Apache project user mailing list: user@kudu.apache.org
•  Quickstart VM
•  Easiest way to get started
•  Impala and Kudu in an easy-to-install VM
•  CSD and Parcels
•  For installation on a Cloudera Manager-managed cluster
Training classes available: https://www.cloudera.com/more/training.html
Cloudera World Tokyo 2017
Tue, Nov 7, 2017
ANA Intercontinental Hotel
Estimated attendees: 1000
E-1: Apache Kudu on Analytical Data Platform
Register Now!
www.clouderaworldtokyo.com
Thank you
sho@cloudera.com
