Demystifying Real Time Analytics With TiDB
Unlocking the Power of TiFlash for Real-Time Data Insights
Kabilesh PR
Co-Founder, Mydbops LLP
33rd MyWebinar - Mydbops
About Me
Kabilesh PR
❏ Interested in Open Source DB technologies
❏ Keen interest in MySQL, TiDB & distributed SQL databases
❏ Active Tech Speaker/Blogger
❏ PingCAP Certified TiDB Professional
❏ AWS Database Specialty
❏ Founding Partner, Mydbops
Mydbops Services
Focus on MySQL, MongoDB, PostgreSQL, TiDB, Cassandra
Consulting Services
Managed Services
24*7 DBA Team
Targeted Engagement
Agenda
❏ Introduction
❏ TiDB Architecture
❏ Understanding Real-time Analytics
❏ Analytical Engine - TiFlash
❏ Enabling TiFlash
❏ Queries with TiFlash
Introduction
TiDB is an open-source, distributed HTAP (Hybrid Transactional and Analytical Processing) database compatible with the MySQL protocol.
Understanding TiDB Architecture
The TiDB SQL layer is MySQL compatible and separates compute from storage to make scaling simpler.
The Placement Driver (PD) functions as an orchestrator. It is responsible for TSO (Timestamp Oracle), scheduling, shard maintenance, metadata, and much more.
TiKV is row-based transactional storage. It offers high availability and strong consistency, and can auto-scale to hundreds of nodes at petabyte data scale.
Advantages of TiDB
Open Source
No Vendor lock-in with a
database that’s 100% open source.
Horizontal Scaling
Grants total transparency into data workloads with automatic sharding; no manual sharding is needed.
High Availability
Guarantees auto-failover and self-healing for
continuous data access.
MySQL Compatibility
Enjoy the most MySQL compatible
distributed SQL database on the planet.
Multi-Cloud
Deploy database clusters
anywhere in the world.
Mixed Workloads
Streamlined tech stack makes it
easier to produce real-time analytics.
Robust Security
Protect data with enterprise-grade
encryption both in-flight and at-rest.
Global Client Base of TiDB
Understanding Real-time Analytics
❏ Real-time analytics: the process of analyzing data as it is created, collected, and processed, providing immediate insights that enable prompt decision-making.
❏ Use cases:
Real-time fraud detection, market analysis, personalized recommendations, demand forecasting
❏ Challenges:
Data volume and velocity
Integration
Data quality
Cost
● With TiDB, real-time insights are available as and when the business happens.
● Easy integration and maintenance.
Real-Time Analytical Engine = TiFlash
What is TiFlash?
❏ An integrated columnar storage engine built exclusively for analytical workloads.
❏ It is tightly integrated with TiKV and uses the ClickHouse coprocessor to serve MPP (Massively Parallel Processing) analytical queries.
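To see why a columnar layout suits analytical queries, here is a toy Python sketch. It is not TiFlash's actual storage format, and the table and column names are made up for illustration. An aggregate over one column touches only that column's values in the columnar layout, while the row layout has to walk every full row.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# Hypothetical data; this is not how TiKV or TiFlash store bytes on disk.

rows = [
    {"order_id": 1, "customer": "a", "amount": 120},
    {"order_id": 2, "customer": "b", "amount": 80},
    {"order_id": 3, "customer": "a", "amount": 50},
]


def to_columns(row_store):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in row_store:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_row_store(row_store, column):
    """Row layout: every full row is visited just to read one field."""
    values_touched = 0
    total = 0
    for row in row_store:
        values_touched += len(row)
        total += row[column]
    return total, values_touched


def sum_column_store(column_store, column):
    """Column layout: only the requested column is read."""
    values = column_store[column]
    return sum(values), len(values)


columns = to_columns(rows)
row_result = sum_row_store(rows, "amount")
column_result = sum_column_store(columns, "amount")
print(row_result, column_result)
```

Both layouts return the same total, but the columnar scan touches a third of the values here. The gap widens as tables gain more columns.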
Data Sync with TiFlash
Data is synced to TiFlash asynchronously using an extended Raft learner algorithm: TiFlash replicas receive the Raft log as learners.
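The learner idea can be sketched in a few lines of Python. This is a simplified model, not TiKV's Raft implementation; the `Replica` class and `replicate` function are hypothetical names. The point it illustrates is that a learner receives log entries but is not counted in the commit quorum, so a slow or offline TiFlash learner does not hold back transactional writes.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    is_learner: bool = False
    online: bool = True
    log: list = field(default_factory=list)


def replicate(entry, replicas):
    """Append entry to every online replica.

    Returns True when a majority of voters (non-learners) acknowledged it.
    Learners receive the entry but never count toward the quorum.
    """
    voters = [r for r in replicas if not r.is_learner]
    acks = 0
    for replica in replicas:
        if not replica.online:
            continue
        replica.log.append(entry)
        if not replica.is_learner:
            acks += 1
    return acks > len(voters) // 2


# Three TiKV-style voters plus one TiFlash-style learner that is offline.
group = [
    Replica("tikv-1"),
    Replica("tikv-2"),
    Replica("tikv-3"),
    Replica("tiflash-1", is_learner=True, online=False),
]
committed = replicate("put k=v", group)
print(committed, group[3].log)
```

In this model the write still commits while the learner is offline; the learner has to catch up on the log later.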
Enabling TiFlash
❏ Adding a TiFlash node is an online operation and does not impact the OLTP workload.
tiflash_servers:
  - host: 10.0.1.10
tiup cluster scale-out <cluster-name> scale-out-topology.yaml
❏ After a TiFlash node is added, replication does not start by default.
❏ Replication to TiFlash can be enabled at the table level or the schema level.
ALTER TABLE table_name SET TIFLASH REPLICA count;
ALTER DATABASE db_name SET TIFLASH REPLICA count;
❏ Monitoring TiFlash replication:
SELECT * FROM information_schema.tiflash_replica;
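A minimal sketch of how the monitoring output might be interpreted, assuming the rows of `information_schema.tiflash_replica` have already been fetched into dictionaries by whatever MySQL client you use. The `AVAILABLE` and `PROGRESS` columns are taken from that table; the schema names, table names, and values below are hypothetical sample data, and `pending_replicas` is an illustrative helper.

```python
# Sample rows shaped like the output of
#   SELECT * FROM information_schema.tiflash_replica;
# Schema names, table names, and values are made up for illustration.
sample_rows = [
    {"TABLE_SCHEMA": "shop", "TABLE_NAME": "orders",
     "REPLICA_COUNT": 2, "AVAILABLE": 1, "PROGRESS": 1.0},
    {"TABLE_SCHEMA": "shop", "TABLE_NAME": "events",
     "REPLICA_COUNT": 1, "AVAILABLE": 0, "PROGRESS": 0.35},
]


def pending_replicas(rows):
    """Return (schema.table, progress) for replicas that are not yet queryable."""
    return [
        (f"{row['TABLE_SCHEMA']}.{row['TABLE_NAME']}", float(row["PROGRESS"]))
        for row in rows
        if int(row["AVAILABLE"]) != 1
    ]


pending = pending_replicas(sample_rows)
print(pending)
```

A table is ready to serve analytical reads from TiFlash once `AVAILABLE` is 1; until then `PROGRESS` (0.0 to 1.0) shows how far replication has advanced.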
Scaling TiFlash
❏ Scaling TiFlash nodes out and in is done online and does not impact the OLTP workload.
Node addition:
tiflash_servers:
  - host: 10.0.1.10
  - host: 10.0.1.12
tiup cluster scale-out <cluster-name> scale-out-topology.yaml
Adjust the table / schema replica count:
ALTER TABLE table_name SET TIFLASH REPLICA count;
ALTER DATABASE db_name SET TIFLASH REPLICA count;
Node removal:
Set the replica count to 0 for the table:
ALTER TABLE table_name SET TIFLASH REPLICA 0;
tiup cluster scale-in <cluster-name> --node <tiflash_node_id>
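Before a scale-in, it can help to check which tables still hold more TiFlash replicas than the nodes that will remain. The sketch below assumes the rule that a table's replica count should not exceed the number of remaining TiFlash nodes; setting the count to 0, as in the steps above, is the case where every node is removed. The function name and the sample table names are hypothetical.

```python
def tables_blocking_scale_in(replica_counts, current_nodes, nodes_to_remove):
    """List tables whose TiFlash replica count exceeds the nodes left after scale-in.

    replica_counts maps "schema.table" to its configured TiFlash replica count.
    """
    if nodes_to_remove < 0 or nodes_to_remove > current_nodes:
        raise ValueError("nodes_to_remove must be between 0 and current_nodes")
    remaining = current_nodes - nodes_to_remove
    return sorted(
        table for table, count in replica_counts.items() if count > remaining
    )


# Hypothetical cluster: two TiFlash nodes, one of which is being removed.
counts = {"shop.orders": 2, "shop.events": 1}
blocking = tables_blocking_scale_in(counts, current_nodes=2, nodes_to_remove=1)
print(blocking)
```

Any table in the returned list needs its replica count lowered with `ALTER TABLE ... SET TIFLASH REPLICA` before the node is taken out.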
Queries With TiFlash
Smart Selection
❏ The TiDB optimizer automatically decides whether to read from TiFlash replicas based on cost estimation.
❏ This works even with a mix of workloads.
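One way to confirm what the optimizer picked is to look at the `task` column of `EXPLAIN` output, where the storage engine appears in brackets (for example `cop[tikv]` or `mpp[tiflash]`). The sketch below assumes the plan rows have already been fetched into dictionaries; the operator ids in the sample are hypothetical, and `engines_used` is an illustrative helper.

```python
def engines_used(explain_rows):
    """Collect the storage engines named in the task column of an EXPLAIN plan."""
    engines = set()
    for row in explain_rows:
        task = row["task"]
        # Tasks executed on storage look like "cop[tikv]" or "mpp[tiflash]";
        # tasks executed on the TiDB server are plain "root".
        if "[" in task and task.endswith("]"):
            engines.add(task[task.index("[") + 1:-1])
    return engines


# Hypothetical plan rows for an aggregate query served by TiFlash.
plan = [
    {"id": "HashAgg_1", "task": "root"},
    {"id": "TableReader_2", "task": "root"},
    {"id": "TableFullScan_3", "task": "mpp[tiflash]"},
]
used = engines_used(plan)
print(used)
```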
Engine Isolation
❏ You can restrict read queries to replicas of specific storage engines, as shown below:
Config file:
[isolation-read]
engines = ["tikv", "tidb", "tiflash"]
Session:
SET SESSION tidb_isolation_read_engines = "engine list separated by commas";
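A small sketch of building the session statement for a concrete engine list, for example to pin an analytics session to TiFlash only. The helper name `isolation_read_statement` is hypothetical; the engine names are the three listed in the config above.

```python
VALID_ENGINES = ("tikv", "tidb", "tiflash")


def isolation_read_statement(engines):
    """Build a SET SESSION statement for tidb_isolation_read_engines."""
    if not engines:
        raise ValueError("at least one engine is required")
    unknown = [engine for engine in engines if engine not in VALID_ENGINES]
    if unknown:
        raise ValueError(f"unknown engines: {unknown}")
    return "SET SESSION tidb_isolation_read_engines = '{}';".format(",".join(engines))


statement = isolation_read_statement(["tiflash"])
print(statement)
```

Restricting a session to `tiflash` keeps analytical reads off the TiKV nodes; queries on tables without a TiFlash replica would then fail in that session rather than fall back to TiKV, so the list should be chosen with care.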
Manual Hint
❏ You can force TiDB to read from the TiFlash replica with a manual hint in the query, as below.
SELECT /*+ READ_FROM_STORAGE(TIFLASH[table_name]) */ ... FROM table_name;
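For queries that join several tables, the hint can name an engine per table. The following sketch only assembles the hint text; `read_from_storage_hint` is a hypothetical helper and the table names are made up. If a table has an alias in the query, the alias is what should go inside the brackets.

```python
def read_from_storage_hint(tiflash_tables=(), tikv_tables=()):
    """Build a READ_FROM_STORAGE optimizer hint for the given tables or aliases."""
    parts = []
    if tiflash_tables:
        parts.append("TIFLASH[{}]".format(", ".join(tiflash_tables)))
    if tikv_tables:
        parts.append("TIKV[{}]".format(", ".join(tikv_tables)))
    if not parts:
        raise ValueError("name at least one table")
    return "/*+ READ_FROM_STORAGE({}) */".format(", ".join(parts))


# Scan the large fact table from TiFlash and the small lookup table from TiKV.
hint = read_from_storage_hint(tiflash_tables=["orders"], tikv_tables=["customers"])
query = f"SELECT {hint} COUNT(*) FROM orders JOIN customers ON orders.customer_id = customers.id;"
print(query)
```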
TiFlash Modes
MPP Mode
❏ This mode executes queries in parallel across multiple nodes.
❏ TiDB automatically decides when to use MPP based on the optimizer's cost estimation.
Control variables: tidb_allow_mpp, tidb_enforce_mpp
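How the two control variables combine can be summarized as a small decision function. This is a sketch of my understanding of their semantics, with `tidb_allow_mpp` on and `tidb_enforce_mpp` off by default; the function name and the returned labels are hypothetical.

```python
def mpp_policy(allow_mpp=True, enforce_mpp=False):
    """Summarize how tidb_allow_mpp and tidb_enforce_mpp interact.

    Returns one of:
      "never"      - MPP mode is not used
      "cost-based" - the optimizer picks MPP when its cost estimate favors it
      "forced"     - cost estimation is ignored and MPP is chosen when possible
    """
    if not allow_mpp:
        return "never"
    if enforce_mpp:
        return "forced"
    return "cost-based"


default_policy = mpp_policy()
print(default_policy)
```

With the defaults, the behavior matches the slide: the optimizer decides per query. Even the "forced" setting cannot use MPP for a query that MPP cannot execute, for example when a table has no TiFlash replica.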
FastScan Mode
❏ With FastScan, TiFlash provides more efficient query performance but sacrifices data consistency.
❏ This mode is disabled by default.
❏ Query results might include old data of a table.
❏ Enable or disable it using the tiflash_fastscan variable.
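Why results "might include old data" can be illustrated with a toy model of versioned rows. This is only a conceptual sketch, not TiFlash's storage or scan code; the tuples and both scan functions are hypothetical. A consistent scan keeps the latest version per primary key and honors deletes, while the fast scan skips that filtering work and returns whatever versions are still stored.

```python
# Each tuple is (primary_key, version, value, deleted). Hypothetical data.
stored_versions = [
    (1, 10, "pending", False),
    (1, 20, "paid", False),      # newer version of row 1
    (2, 15, "pending", False),
    (2, 25, None, True),         # row 2 was deleted later
]


def consistent_scan(versions):
    """Keep only the latest version of each key and drop deleted rows."""
    latest = {}
    for pk, version, value, deleted in versions:
        if pk not in latest or version > latest[pk][0]:
            latest[pk] = (version, value, deleted)
    return sorted(
        (pk, value) for pk, (_, value, deleted) in latest.items() if not deleted
    )


def fast_scan(versions):
    """Toy fast scan: skip version filtering and return every stored value."""
    return sorted((pk, value) for pk, _, value, deleted in versions if not deleted)


consistent_rows = consistent_scan(stored_versions)
fast_rows = fast_scan(stored_versions)
print(consistent_rows, fast_rows)
```

In this model the fast scan returns the stale `pending` version of row 1 and the already deleted row 2, which is the kind of inconsistency the mode trades for speed.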
Any Questions?
Thank You
