© 2022 New Relic, Inc. All rights reserved.
Using Queryable State for Fun and Profit
Ron Crocker
New Relic Fellow & Lead Architect, Observability
rcrocker@newrelic.com
@RonCrocker
#FlinkForward
Safe Harbor
This presentation and the information herein (including any information that may be incorporated by reference) is provided for
informational purposes only and should not be construed as an offer, commitment, promise or obligation on behalf of New Relic, Inc.
(“New Relic”) to sell securities or deliver any product, material, code, functionality, or other feature. Any information provided hereby
is proprietary to New Relic and may not be replicated or disclosed without New Relic’s express written permission.
Such information may contain forward-looking statements within the meaning of federal securities laws. Any statement that is not a
historical fact or refers to expectations, projections, future plans, objectives, estimates, goals, or other characterizations of future
events is a forward-looking statement. These forward-looking statements can often be identified as such because the context of the
statement will include words such as “believes,” “anticipates,” “expects” or words of similar import.
Actual results may differ materially from those expressed in these forward-looking statements, which speak only as of the date
hereof, and are subject to change at any time without notice. Existing and prospective investors, customers and other third parties
transacting business with New Relic are cautioned not to place undue reliance on this forward-looking information. The achievement
or success of the matters covered by such forward-looking statements are based on New Relic’s current assumptions, expectations,
and beliefs and are subject to substantial risks, uncertainties, assumptions, and changes in circumstances that may cause the actual
results, performance, or achievements to differ materially from those expressed or implied in any forward-looking statement. Further
information on factors that could affect such forward-looking statements is included in the filings New Relic makes with the SEC from
time to time. Copies of these documents may be obtained by visiting New Relic’s Investor Relations website at ir.newrelic.com or the
SEC’s website at www.sec.gov.
New Relic assumes no obligation and does not intend to update these forward-looking statements, except as required by law. New
Relic makes no warranties, expressed or implied, in this presentation or otherwise, with respect to the information provided.
Agenda
01 Problem Context
02 Fun
03 Profit
04 Conclusions
Problem Context
Looking at all the things is hard
Some New Relic terms
Entity / Entity type: Things (or groups of things) that report telemetry to NR
● Unique identifier
● Groupable by type
NRDB: Internal analytic DB, federated among the telemetry ingest cells. Telemetry data lives at rest in NRDB.
Cell*: A deployment unit of parts of the NR system. Independent units of capacity, similarly shaped within a cell type.
* http://highscalability.com/blog/2012/5/9/cell-architectures.html
What is Lookout?
New Relic Lookout provides visibility across your entire digital estate,
highlighting any abnormal signals by looking at the golden metrics.
Golden Metrics
3-6 key summarizing metrics per entity type
Google SRE Golden Signals, extended to everything
Entity types shown: Applications, ElastiCache
swapUsageBytes:
  title: Swap usage (bytes)
  unit: BYTES
  queries:
    aws:
      select: average(aws.elasticache.SwapUsage.byRedisCluster)
      from: Metric
      eventId: entity.guid
      eventName: entity.name
What’s behind a Lookout query
Three steps
1. Fetch the entities
2. Fetch the time series data for each entity & metric
3. Process each time series and return the result
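The three steps above can be sketched as follows. This is a minimal illustration, not New Relic's implementation: `fetch_entities`, `fetch_series`, and `summarize` are hypothetical stand-ins backed by an in-memory dictionary, and taking the mean is an arbitrary choice for the processing step. It shows why fetching per entity makes the work grow with the number of entities.

```python
from statistics import mean

# Hypothetical in-memory stand-in for the telemetry store:
# (entity, metric) -> list of samples.
STORE = {
    ("cache-1", "swapUsageBytes"): [0.0, 4.0, 8.0],
    ("cache-2", "swapUsageBytes"): [2.0, 2.0, 2.0],
}

def fetch_entities():
    """Step 1: list the entities in scope."""
    return sorted({entity for entity, _ in STORE})

def fetch_series(entity, metric):
    """Step 2: fetch one time series for one entity and metric."""
    return STORE.get((entity, metric), [])

def summarize(series):
    """Step 3: reduce a time series to a single value (here, its mean)."""
    return mean(series) if series else None

def lookout_query(metric):
    # One fetch per entity, so the number of fetches grows linearly
    # with the number of entities.
    return {e: summarize(fetch_series(e, metric)) for e in fetch_entities()}

result = lookout_query("swapUsageBytes")
```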
Strategy 1: Direct NRDB Query
Low code solution
✔ Few moving parts
✔ Supports all time ranges
✘ O(n) query duration
Strategy 2: Look-aside cache, in Redis
Help us Redis, you’re our only hope
✔ Queries change from O(n) to O(1)
✘ Several moving parts
✘ 90 minute cache implies valid only for prior 30 minutes
  ○ prior 90 minutes for mini-overviews
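A look-aside buffer of this kind might look like the sketch below. It assumes a plain in-process dictionary in place of Redis, a hypothetical `LookAsideBuffer` class, and points that arrive in timestamp order. The writer appends points as they arrive and drops anything older than the 90 minute retention, and a read is one key lookup instead of a database query.

```python
from collections import defaultdict, deque

RETENTION_SECONDS = 90 * 60  # the 90 minute cache from the slide

class LookAsideBuffer:
    """Hypothetical in-process stand-in for the Redis buffer."""

    def __init__(self):
        # (entity, metric) -> deque of (timestamp, value), oldest first
        self._series = defaultdict(deque)

    def write(self, entity, metric, timestamp, value):
        points = self._series[(entity, metric)]
        points.append((timestamp, value))
        # Drop points that have aged out of the retention window.
        while points and points[0][0] <= timestamp - RETENTION_SECONDS:
            points.popleft()

    def read(self, entity, metric):
        # One key lookup per (entity, metric); no scan of the backing database.
        return list(self._series.get((entity, metric), ()))

buf = LookAsideBuffer()
buf.write("e1", "swapUsageBytes", 0, 1.0)
buf.write("e1", "swapUsageBytes", 3600, 2.0)
buf.write("e1", "swapUsageBytes", 5400, 3.0)  # the point at t=0 is now 90 minutes old
```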
Golden Metric Buffer system, Redis edition
● Filter service forwards Golden Metrics to Kafka
● Golden-signal-writer publishes Golden Metrics to Redis
● Mirror forwards Golden Metrics to centralized services (“watcher”)
ETL FTW
Fun
Let’s welcome Flink and Queryable State to the party
Queryable State?
Accesses state inside the stream processing job
Available since Flink 1.2.0
Image from https://www.ververica.com/blog/queryable-state-use-case-demo
How does Queryable State work?
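As a rough illustration of the flow described in the Flink documentation: a client outside the cluster sends a keyed lookup to a proxy, the proxy locates the parallel instance that owns the key, and that instance's state server returns the locally stored value. The sketch below simulates this in one process. The class names, the parallelism values, and the CRC32 hash are all assumptions made for illustration; Flink uses its own hashing and networking.

```python
import zlib

MAX_PARALLELISM = 128  # assumed number of key groups
PARALLELISM = 4        # assumed number of parallel operator instances

def key_group(key):
    # Stand-in hash for illustration only; Flink uses its own hash function.
    return zlib.crc32(key.encode("utf-8")) % MAX_PARALLELISM

def owner(key):
    # Map the key group onto one of the parallel instances.
    return key_group(key) * PARALLELISM // MAX_PARALLELISM

class StateServer:
    """Serves the keyed state held locally by one parallel instance."""

    def __init__(self):
        self.state = {}

    def get(self, key):
        return self.state.get(key)

class Proxy:
    """Receives a client query and forwards it to the instance owning the key."""

    def __init__(self, servers):
        self.servers = servers

    def query(self, key):
        return self.servers[owner(key)].get(key)

servers = [StateServer() for _ in range(PARALLELISM)]
# The running job writes state on whichever instance owns the key ...
servers[owner("entity-42")].state["entity-42"] = [1.0, 2.0, 3.0]
# ... and an external client reads it back through a proxy.
answer = Proxy(servers).query("entity-42")
```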
Strategy 3: Look-aside cache, in Flink
Flink Queryable State powers a new way
✔ Queries remain O(1)
✔ Fewer moving parts
✘ 90 minute cache implies valid only for prior 30 minutes
  ○ prior 90 minutes for mini-overviews
Golden Metric Buffer system, Flink edition
● Filter & mirror same as prior system
● Flink cluster running live state job
● Query service bridges query to Queryable State client
Fewer JVMs
Flink job graph
Learning 1: Important choices for Queryable State
● Data Key IS Query Key
● Data state IS Query response
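One way to read these two choices: key the stream by exactly the key clients will ask for, and keep the state already in the shape of the response, so that a query is a single lookup with no post-processing. The sketch below illustrates that idea with a plain dictionary standing in for Flink keyed state. `MetricResponse`, `on_event`, and `query` are hypothetical names, not part of the talk or of Flink.

```python
from dataclasses import dataclass, field

@dataclass
class MetricResponse:
    """Hypothetical response shape; the state stores exactly this."""
    entity_guid: str
    metric: str
    points: list = field(default_factory=list)

state = {}  # stand-in for keyed state: query key -> ready-made response

def on_event(entity_guid, metric, timestamp, value):
    # The stream is keyed by (entity_guid, metric), the same key a query uses.
    key = (entity_guid, metric)
    response = state.setdefault(key, MetricResponse(entity_guid, metric))
    response.points.append((timestamp, value))

def query(entity_guid, metric):
    # No joins, scans, or reshaping at query time: return the state as stored.
    return state.get((entity_guid, metric))

on_event("guid-1", "swapUsageBytes", 60, 10.0)
on_event("guid-1", "swapUsageBytes", 120, 12.0)
found = query("guid-1", "swapUsageBytes")
```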
Learning 2: Embrace SessionWindow
Learning 3: Sliding window without sliding window
● Append-then-rightsize
● It’s ok to use more state than the queryable state
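The slide does not spell out "append-then-rightsize". One plausible reading, sketched below, is that each incoming point is appended to the per-key series and the series is then trimmed back to the retention window, so the state always holds the most recent window without a sliding-window operator. The function name and the 90 minute window are assumptions for illustration.

```python
WINDOW_SECONDS = 90 * 60  # assumed to match the 90 minute cache mentioned earlier

def append_then_rightsize(points, timestamp, value, window=WINDOW_SECONDS):
    """Return a new list with the point appended and out-of-window points dropped."""
    appended = points + [(timestamp, value)]
    cutoff = timestamp - window
    return [(t, v) for (t, v) in appended if t > cutoff]

series = []
for t, v in [(0, 1.0), (1800, 2.0), (3600, 3.0), (5400, 4.0), (7200, 5.0)]:
    series = append_then_rightsize(series, t, v)
```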
Profit
Comparing solution costs
Lookaside cache requirements
● All data in memory: memory-resident data has minimal query time
● All data is durable: system must survive a node failure
● Growth headroom: at least 20% free capacity; growable without compromising “in memory” or “durability” requirements
Golden Metric Buffer system, Redis edition
Cost estimate, in US$K/mo (1 cell)
Classic Solution (6x cache.r6g.4xlarge + 1x m5.4xlarge)
Component       Service             Instances                US$K/mo
Buffer          ElastiCache Redis   6x cache.r6g.4xlarge     7.2
Buffer Writer   EC2                 1x m5.4xlarge            0.6
Total                                                        7.8
Estimates using AWS Pricing Calculator (on-demand strategy)
Cost estimate, in US$K/mo (27 cells)
Classic Solution (162x cache.r6g.4xlarge + 27x m5.4xlarge)
Component       Service             Instances                US$K/mo
Buffer          ElastiCache Redis   162x cache.r6g.4xlarge   194.4
Buffer Writer   EC2                 27x m5.4xlarge           16.2
Total                                                        210.6
Estimates using AWS Pricing Calculator (on-demand strategy)
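The 27-cell figures follow from the 1-cell figures by straight multiplication. The short check below reproduces them, using whole US dollars to avoid floating-point rounding. The constant names are illustrative only.

```python
CELLS = 27

# Per-cell monthly estimates from the 1-cell slide, in US$.
BUFFER_PER_CELL = 7_200        # 6x cache.r6g.4xlarge
BUFFER_WRITER_PER_CELL = 600   # 1x m5.4xlarge
REDIS_NODES_PER_CELL = 6

buffer_total = CELLS * BUFFER_PER_CELL
writer_total = CELLS * BUFFER_WRITER_PER_CELL
classic_total = buffer_total + writer_total
redis_nodes = CELLS * REDIS_NODES_PER_CELL

print(buffer_total, writer_total, classic_total, redis_nodes)
```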
Golden Metric Buffer system, Flink edition
There are a lot of Golden Metrics
Store only what’s necessary for the query
Golden Metric kind   Submetric monoid
min                  <double, min(), Double.MAX_VALUE>
max                  <double, max(), -Double.MAX_VALUE>
sum                  <double, sum(), 0.0>
latest               <double, latest(), 0.0>
count                <long, sum(), 0>
average              <<long,double>, <sum(),sum()>, <0,0.0>>
percentile           <byte[], Distribution.merge(), Distribution.empty()>
uniqueCount          <byte[], UniqueCount.merge(), UniqueCount.empty()>
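Each row above appears to read as <value type, combine operation, identity element>. The sketch below expresses a few of those rows in Python to show how partial values can be folded together and merged. The `Monoid` class is a hypothetical helper, and `percentile` and `uniqueCount` are left out because the slide does not describe `Distribution` or `UniqueCount`.

```python
import sys
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable

@dataclass(frozen=True)
class Monoid:
    combine: Callable[[Any, Any], Any]  # associative merge of two partial values
    identity: Any                       # starting value that leaves others unchanged

    def fold(self, values):
        return reduce(self.combine, values, self.identity)

# Python's largest finite float plays the role of Double.MAX_VALUE.
FLOAT_MAX = sys.float_info.max

MONOIDS = {
    "min": Monoid(min, FLOAT_MAX),
    "max": Monoid(max, -FLOAT_MAX),
    "sum": Monoid(lambda a, b: a + b, 0.0),
    "count": Monoid(lambda a, b: a + b, 0),
    # average keeps a (count, sum) pair and divides only when answering a query
    "average": Monoid(lambda a, b: (a[0] + b[0], a[1] + b[1]), (0, 0.0)),
}

samples = [3.0, 1.0, 2.0]
lowest = MONOIDS["min"].fold(samples)
highest = MONOIDS["max"].fold(samples)
total = MONOIDS["sum"].fold(samples)
count = MONOIDS["count"].fold(1 for _ in samples)
n, s = MONOIDS["average"].fold((1, x) for x in samples)
average = s / n
```

Storing the (count, sum) pair rather than the computed average is what lets two partial averages be merged correctly later.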
Store only what’s necessary for the query
Flink solution
Cost estimate, in US$K/mo
EC2 (3x r6i.8xlarge): 4.4
16 TM Flink cluster
Estimates using AWS Pricing Calculator (on-demand strategy)
Cost estimate comparison, in US$K/mo
Classic Solution (162x cache.r6g.4xlarge + 27x m5.4xlarge): 210.6
Flink solution, EC2 (3x r6i.8xlarge): 4.4
Difference: -206.2 (-97%)
Estimates using AWS Pricing Calculator (on-demand strategy)
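The difference and percentage can be checked from the two totals. The snippet below works in whole US dollars; the constant names are illustrative. The exact ratio comes out just under 98%, which the slide shows as 97%.

```python
CLASSIC_MONTHLY = 210_600  # US$, classic solution across 27 cells
FLINK_MONTHLY = 4_400      # US$, 3x r6i.8xlarge

savings = CLASSIC_MONTHLY - FLINK_MONTHLY
savings_pct = 100 * savings / CLASSIC_MONTHLY

print(savings, round(savings_pct, 1))
```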
Conclusions
What did we learn today?
We replaced a Redis, but…
… this is not a replacement for Redis
Query drives decisions when using Queryable State
You can’t spell Queryable State without Query
Queryable State has real and tangible value and deserves rescue from deprecation
Help me save Queryable State
Thank you.
rcrocker@newrelic.com
@RonCrocker
