Version 1.0
Prometheus at Scale: Thanos / Cortex / etc.
An Anant Corporation Story.
How Prometheus scales in global business platforms
Prometheus (recap)
● Multidimensional time-series data model: each series is identified by a metric name and key/value label pairs
● PromQL, now a de facto standard query language
● Time series collection via pull, or push via the Pushgateway
● Targets found via dynamic service discovery or static configuration
● Graphing / dashboarding kept separate from collection and storage
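To make the data model bullet concrete, here is a minimal Python sketch of how a series identity (metric name plus label pairs) can be modeled and rendered in the familiar `name{label="value"}` form. The `SeriesKey` class and `append` helper are hypothetical names for illustration, not Prometheus code, and label-value escaping is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesKey:
    """Identity of one time series: a metric name plus its label pairs.

    Hypothetical illustration type, not part of any Prometheus library.
    """
    name: str
    labels: tuple  # sorted (key, value) pairs, so equal label sets compare equal

    @classmethod
    def of(cls, name, **labels):
        return cls(name, tuple(sorted(labels.items())))

    def render(self):
        """Render as name{k="v",...}; escaping of label values is omitted."""
        if not self.labels:
            return self.name
        body = ",".join(f'{k}="{v}"' for k, v in self.labels)
        return f"{self.name}{{{body}}}"


def append(store, key, timestamp, value):
    """Append one (timestamp, value) sample to the series identified by key."""
    store.setdefault(key, []).append((timestamp, value))


store = {}
key = SeriesKey.of("http_requests_total", method="GET", code="200")
append(store, key, 1000, 1.0)
append(store, key, 1015, 4.0)
print(key.render(), store[key])
```

Two keys built from the same labels in a different order compare equal, which is the property that lets label pairs act as dimensions.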
Prometheus (on Kubernetes)
Prometheus at Scale
● Cortex
● Thanos
● M3DB (from Uber)
● VictoriaMetrics
● Vulcan (from DigitalOcean)
https://sysdig.com/blog/challenges-scale-prometheus/
Prometheus at Scale Needs
● Global View - Queries across multiple Prometheus servers
● Multi-Replica / High Availability - No downtime, no data loss
● Long Term Storage - Keep data in cold storage for future use
● Global Scale - Millions of containers / pods / VMs
● Community Support - Many people using it
● Community Knowledge Online - Many people documenting
Cortex
● Global View - Centralized data
● Multi-Replica / High Availability - Dedupe at write
● Long Term Storage - NoSQL Index + Chunks
○ Index (Cassandra / DynamoDB / Bigtable)
○ Chunks (Cassandra / DynamoDB / Bigtable / S3 / GCS / Azure)
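"Dedupe at write" can be sketched as follows: when several Prometheus replicas send the same data, the write path accepts samples from one elected replica per cluster and drops the rest, failing over if the elected replica goes quiet. Cortex documents an HA tracker built on this idea, keyed on cluster and replica labels. The class below is only an in-memory illustration: the `HATracker` name, the 30-second timeout, and the dict-based state are assumptions, not Cortex's implementation.

```python
class HATracker:
    """Accept samples from one elected replica per cluster; drop the others.

    Hypothetical sketch: names, timeout, and in-memory state are assumptions.
    """

    def __init__(self, failover_timeout=30.0):
        self.failover_timeout = failover_timeout
        self.elected = {}  # cluster -> (replica, time of last accepted sample)

    def accept(self, cluster, replica, now):
        current = self.elected.get(cluster)
        if current is None:
            # First replica seen for this cluster becomes the elected one.
            self.elected[cluster] = (replica, now)
            return True
        leader, last_seen = current
        if replica == leader:
            self.elected[cluster] = (leader, now)
            return True
        if now - last_seen > self.failover_timeout:
            # Elected replica has been silent too long: fail over.
            self.elected[cluster] = (replica, now)
            return True
        return False  # duplicate from a non-elected replica


tracker = HATracker(failover_timeout=30.0)
decisions = [
    tracker.accept("prod", "a", 0.0),   # "a" is elected
    tracker.accept("prod", "b", 1.0),   # dropped as a duplicate
    tracker.accept("prod", "a", 10.0),  # accepted, refreshes last-seen time
    tracker.accept("prod", "b", 41.0),  # "a" silent for 31s, so "b" takes over
]
print(decisions)
```

Because duplicates are dropped before storage, only one copy of each sample is ever written, at the cost of the write path having to track replica state.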
Thanos
● Global View - Federated Data / Fan out queries
● Multi-Replica / High Availability - Query time dedupe
● Long Term Storage - TSDB blocks in object store
○ GCS
○ S3 Compatible (Ceph / MinIO)
○ Azure Blob Storage
○ ….
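The fan-out and query-time dedupe bullets can be sketched together: a querier asks every store for matching series, then merges series that differ only in a replica label. The Thanos querier takes its replica label from the `--query.replica-label` flag, and its real merge logic is more involved than this. The sketch below assumes a label literally named `replica`, models stores as plain Python callables, and simply takes the union of timestamps with the first replica winning on ties.

```python
def fan_out(stores, metric_name):
    """Ask every store for series matching the metric name and concatenate."""
    results = []
    for store in stores:
        results.extend(store(metric_name))
    return results


def dedupe(series_list, replica_label="replica"):
    """Merge series that are identical once the replica label is removed."""
    merged = {}
    for labels, samples in series_list:
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        bucket = merged.setdefault(key, {})
        for timestamp, value in samples:
            # First replica to report a timestamp wins (a simplification).
            bucket.setdefault(timestamp, value)
    return {key: sorted(bucket.items()) for key, bucket in merged.items()}


def make_store(series):
    """Hypothetical stand-in for a store endpoint: filters by metric name."""
    return lambda name: [(l, s) for l, s in series if l["__name__"] == name]


replica_a = make_store(
    [({"__name__": "up", "job": "api", "replica": "a"}, [(10, 1.0), (20, 1.0)])]
)
replica_b = make_store(
    [({"__name__": "up", "job": "api", "replica": "b"}, [(20, 1.0), (30, 0.0)])]
)

result = dedupe(fan_out([replica_a, replica_b], "up"))
print(result)
```

Here each replica has a gap the other fills, so the merged series covers timestamps 10, 20, and 30. Unlike dedupe at write, both copies stay stored and the cost is paid on every query.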
Thanos - Basic Architecture
Thanos Architecture
Thanos / Cortex Together
Resources
● Thanos - Scalable Prometheus (https://www.infoq.com/news/2018/06/thanos-scalable-prometheus)
● Cortex Architecture (https://cortexmetrics.io/docs/architecture/)
● Thanos (https://thanos.io/)
● Challenges of Prometheus at Scale (https://sysdig.com/blog/challenges-scale-prometheus/)
● Tutorial: Prometheus at Scale (https://epsagon.com/tools/thanos-tutorial-prometheus-at-scale/)
● GitHub / Cortex (https://github.com/cortexproject/cortex)
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037

Data Engineer's Lunch #23: Thanos/Cortex
