How we built this: data tiering, searchable snapshots, and asynchronous search
Jason Tedor
Elasticsearch Tech Lead
This presentation and the accompanying oral presentation contain forward-looking statements, including statements
concerning plans for future offerings; the expected strength, performance or benefits of our offerings; and our future
operations and expected performance. These forward-looking statements are subject to the safe harbor provisions
under the Private Securities Litigation Reform Act of 1995. Our expectations and beliefs in light of currently
available information regarding these matters may not materialize. Actual outcomes and results may differ materially
from those contemplated by these forward-looking statements due to uncertainties, risks, and changes in
circumstances, including, but not limited to those related to: the impact of the COVID-19 pandemic on our business
and our customers and partners; our ability to continue to deliver and improve our offerings and successfully
develop new offerings, including security-related product offerings and SaaS offerings; customer acceptance and
purchase of our existing offerings and new offerings, including the expansion and adoption of our SaaS offerings;
our ability to realize value from investments in the business, including R&D investments; our ability to maintain and
expand our user and customer base; our international expansion strategy; our ability to successfully execute our
go-to-market strategy and expand in our existing markets and into new markets, and our ability to forecast customer
retention and expansion; and general market, political, economic and business conditions.
Additional risks and uncertainties that could cause actual outcomes and results to differ materially are included in
our filings with the Securities and Exchange Commission (the “SEC”), including our Annual Report on Form 10-K for
the most recent fiscal year, our quarterly report on Form 10-Q for the most recent fiscal quarter, and any
subsequent reports filed with the SEC. SEC filings are available on the Investor Relations section of Elastic’s
website at ir.elastic.co and the SEC’s website at www.sec.gov.
Any features or functions of services or products referenced in this presentation, or in any presentations, press
releases or public statements, which are not currently available or not currently available as a general availability
release, may not be delivered on time or at all. The development, release, and timing of any features or functionality
described for our products remains at our sole discretion. Customers who purchase our products and services
should make the purchase decisions based upon services and product features and functions that are currently
available.
All statements are made only as of the date of the presentation, and Elastic assumes no obligation to, and does not
currently intend to, update any forward-looking statements or statements relating to features or functions of services
or products, except as required by law.
Forward-Looking Statements
Data Tiers
A simple, integrated approach to
optimize for cost and performance
How we built this: Data tiering, snapshots, and asynchronous search
Time Series Data
• Daily: 3 TB/day
• Weekly: 21 TB/week
• Monthly: 90 TB/month
• Yearly: 1 PB/year
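The volumes above follow from a steady ingest rate of 3 TB/day. A minimal sketch of the arithmetic, assuming a 30-day month and a 365-day year (neither is stated on the slide; `accumulated_tb` is a hypothetical helper):

```python
# Ingest rate taken from the slide; the 30-day month and 365-day year are assumptions.
DAILY_TB = 3


def accumulated_tb(days: int, daily_tb: float = DAILY_TB) -> float:
    """Total data volume in TB after `days` of steady ingest."""
    return days * daily_tb


volumes = {
    "week": accumulated_tb(7),
    "month": accumulated_tb(30),
    "year": accumulated_tb(365),
}

for period, tb in volumes.items():
    print(f"{period}: {tb} TB")
```

The yearly figure works out to 1,095 TB, which the slide rounds to 1 PB.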
Search Frequency
Only when lawyers ask, Rarely, Occasionally, Constantly
Hot tier
$$$
Data from recent days
Highly relevant
Searched constantly
Warm tier
$$$
Data from recent weeks
Somewhat relevant data
Searched occasionally
Cold tier
$$$
Data from recent months
Barely relevant data
Searched rarely
Frozen tier
$$$
Data from recent years
Irrelevant data
Searched when the lawyers ask
Data tiers store data cost-effectively
1. Hot tier
2. Warm tier
3. Cold tier
4. Frozen tier
Data Tiers
• A simplified way of managing your data, both content and time
series
• Currently: use node attributes
• Very soon: use node roles data_content, data_hot, data_warm,
data_cold (and later, data_frozen)
• Then, the Elastic Stack will automatically manage the
allocation and relocation of data between tiers at the
appropriate phase
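To make the idea of moving data between tiers by age concrete, here is a small sketch that maps the age of an index to one of the node roles named above. The cutoffs (7, 30, and 365 days) are hypothetical, since the slides only speak of recent days, weeks, months, and years, and `tier_for_age` is an illustrative helper, not part of the Elastic Stack.

```python
from datetime import timedelta

# Hypothetical age cutoffs; the slides only say "days", "weeks", "months", "years".
TIER_CUTOFFS = [
    (timedelta(days=7), "data_hot"),
    (timedelta(days=30), "data_warm"),
    (timedelta(days=365), "data_cold"),
]
OLDEST_TIER = "data_frozen"


def tier_for_age(age: timedelta) -> str:
    """Return the node role that data of the given age would be placed on."""
    for cutoff, role in TIER_CUTOFFS:
        if age < cutoff:
            return role
    return OLDEST_TIER


for days in (1, 10, 90, 800):
    print(days, "days old ->", tier_for_age(timedelta(days=days)))
```

In a real deployment the thresholds would be chosen per use case; the point of the sketch is only that placement becomes a function of data age rather than of hand-maintained node attributes.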
Data Tiers
• Lower storage costs
• Streamlined operations
• Deeper insights and new use cases
Searchable Snapshots
Efficiently search your indices
stored in cheap cloud object
stores like S3
Searchable Snapshots
[Diagrams: cost markers $$$ and $$ shown for the Hot and Warm tiers; a Hot|Warm node's disk with a snapshot stored on S3; a Cold node's disk]
Cold Searchable Snapshots
• Back an index with a snapshot; replicas are no longer needed
• Data is downloaded and persisted locally
• For resilience, Elasticsearch automatically recovers from
the snapshot when needed
• Deep cost savings without really impacting performance
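The saving comes from dropping the replicas. A back-of-the-envelope sketch, assuming one replica per primary before the switch (the replica count is an assumption, the 90 TB figure is the monthly volume from the earlier slide reused for illustration, and `local_disk_tb` is a hypothetical helper):

```python
def local_disk_tb(primary_tb: float, replicas: int) -> float:
    """Local disk needed to hold the primaries plus `replicas` full copies."""
    return primary_tb * (1 + replicas)


# One month of data at 3 TB/day, as on the earlier slide.
month_tb = 90

with_replica = local_disk_tb(month_tb, replicas=1)     # replica provides resilience
snapshot_backed = local_disk_tb(month_tb, replicas=0)  # snapshot provides resilience

print(f"with one replica: {with_replica} TB of local disk")
print(f"snapshot-backed:  {snapshot_backed} TB of local disk")
```

Under that assumption the local disk footprint halves, which matches the "$/2" marker on the cost slide. The snapshot itself still has to be stored in the object store.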
Searchable Snapshots
[Diagram: relative cost across the Hot, Warm, and Cold tiers and the snapshot; cost markers shown: $$$, $$, $/2]
[Diagrams: a Frozen node with a local disk backed by S3. Index components shown: Doc Values, Stored Fields, Term Dictionary, Term Proximity, Normalization Factors, Point Values, Meta Lookup]
Frozen Searchable Snapshots
• Make an index snapshot look like a regular index
• Incoming searches download the necessary index data
• A cache persists this data, so recently searched snapshots
perform as fast as if they were local
• Costs approach the cost of storing the data in S3
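The download-on-demand behavior with a cache can be illustrated with a toy block cache. This is a sketch of the general technique (a least-recently-used cache in front of a slow store), not Elasticsearch's actual implementation: the block addressing scheme, the `BlockCache` class, and `fake_object_store` are all hypothetical, and the cache here lives in memory rather than on disk.

```python
from collections import OrderedDict
from typing import Callable, Tuple

BlockKey = Tuple[str, int]  # (file name, block number); hypothetical addressing scheme


class BlockCache:
    """LRU cache that downloads a block only when it is not already held locally."""

    def __init__(self, fetch: Callable[[BlockKey], bytes], capacity: int) -> None:
        self._fetch = fetch
        self._capacity = capacity
        self._blocks: "OrderedDict[BlockKey, bytes]" = OrderedDict()
        self.downloads = 0  # number of trips to the object store

    def read(self, key: BlockKey) -> bytes:
        if key in self._blocks:
            # Cache hit: mark as most recently used, no download needed.
            self._blocks.move_to_end(key)
            return self._blocks[key]
        # Cache miss: download from the object store and keep a local copy.
        data = self._fetch(key)
        self.downloads += 1
        self._blocks[key] = data
        if len(self._blocks) > self._capacity:
            # Evict the least recently used block to stay within capacity.
            self._blocks.popitem(last=False)
        return data


def fake_object_store(key: BlockKey) -> bytes:
    """Stand-in for a slow, cheap object store such as S3."""
    name, block = key
    return f"{name}#{block}".encode()


cache = BlockCache(fake_object_store, capacity=2)
cache.read(("terms.dict", 0))  # first search downloads the block
cache.read(("terms.dict", 0))  # repeat search is served from the cache
print("downloads so far:", cache.downloads)
```

The second read costs nothing beyond a local lookup, which is the sense in which recently searched data performs as if it were local.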
Searchable Snapshots
[Diagram: relative cost across the Hot, Warm, Cold, and Frozen tiers and the snapshot; cost markers shown: $$$, $$, $/2, $]
Searchable Snapshots
Cold: Use snapshots instead of replicas
Frozen: Search a snapshot stored on S3 (or your favorite
cloud object store)
Asynchronous Search
Long-running queries execute in
the background
Data analysis: Async Search
• Tons of Data
• Slower (Cheaper) Hardware
• Show Search Progress
Asynchronous Search
Run potentially long-running queries in the background,
allowing you to track their progress and retrieve partial
results as they become available
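The pattern described here (submit, poll for progress, read partial results) can be sketched as a toy model with a background thread. This is not the Elasticsearch API: the `AsyncSearch` class, the field names in `status()`, and the shard names are hypothetical and only illustrate the idea.

```python
import threading
import time
from typing import Callable, Dict, List


class AsyncSearch:
    """Runs a per-shard search in the background and exposes partial results."""

    def __init__(self, shards: List[str], search_shard: Callable[[str], int]) -> None:
        self._shards = shards
        self._search_shard = search_shard
        self._lock = threading.Lock()
        self._hits: Dict[str, int] = {}
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self) -> None:
        # Search shards one by one, publishing each result as soon as it is ready.
        for shard in self._shards:
            hits = self._search_shard(shard)
            with self._lock:
                self._hits[shard] = hits

    def status(self) -> dict:
        """Snapshot of progress and of the results gathered so far."""
        with self._lock:
            done = len(self._hits)
            return {
                "is_running": done < len(self._shards),
                "completed_shards": done,
                "total_shards": len(self._shards),
                "partial_hits": sum(self._hits.values()),
            }

    def wait(self) -> dict:
        """Block until the search finishes, then return the final status."""
        self._thread.join()
        return self.status()


def slow_shard_search(shard: str) -> int:
    """Stand-in for a slow search on cheap hardware; returns a fake hit count."""
    time.sleep(0.01)
    return len(shard)


search = AsyncSearch(["logs-1", "logs-2", "logs-3"], slow_shard_search)
print(search.status())  # may show partial progress while the search is running
final = search.wait()
print(final)
```

The caller is never blocked on the slowest shard: it can show a progress bar from `completed_shards` and render `partial_hits` while the rest of the search continues.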
Data Tiers, Searchable
Snapshots, and
Asynchronous Search
A simple, integrated experience
for managing the cost and
performance of your data
