Covering indexes

Stéphane Combaudon - SQLI
Indexing basics

 Data structure intended to speed up SELECTs

 Similar to an index in a book

 Overhead for every write
       ● Usually negligible compared with the speed-up for SELECTs




 Possibility to have one index for several columns
Index types
SHOW INDEX info
BTree indexes




   All leaves at the same distance from the root
   Efficient insertions, deletions
   Values are sorted
   B+Trees
           ● Efficient range scans


           ● Values stored in the leaves
BTree indexes

 Ok for most kinds of lookups:
        ● Exact full value (= xxx)


        ● Range of values (BETWEEN xx AND yy)


        ● Column prefix (LIKE 'xx%')


        ● Leftmost prefix




 Ok for sorting too

 But
        ● Not useful for LIKE '%xxx' or LIKE '%xx%'
        ● You can't skip columns
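
A composite BTree index can be pictured as a list of key tuples kept in sorted order. The sketch below is a simplified model, not a real B-Tree: a sorted Python list searched with `bisect` stands in for the index, and the sample data and the helpers `prefix_range` and `second_col_scan` are made up for illustration. It shows why a leftmost prefix can be located by binary search, while a condition on the second column alone has to look at every entry.

```python
import bisect

# Hypothetical index on (subscription, name): entries are kept sorted.
index = sorted([
    ("2009-01-01", "alice"),
    ("2009-01-01", "bob"),
    ("2009-01-02", "alice"),
    ("2009-01-03", "carol"),
])

def prefix_range(entries, first_col):
    """Find all entries whose first column equals first_col by binary search."""
    # (first_col,) sorts just before every (first_col, anything) tuple.
    lo = bisect.bisect_left(entries, (first_col,))
    # first_col + "\0" is the smallest string greater than first_col.
    hi = bisect.bisect_left(entries, (first_col + "\0",))
    return entries[lo:hi]

def second_col_scan(entries, name):
    """Skipping the first column leaves no ordering to exploit: full scan."""
    return [entry for entry in entries if entry[1] == name]

print(prefix_range(index, "2009-01-01"))
print(second_col_scan(index, "alice"))
```

Entries with the same name are scattered across the list, which is why the second helper cannot narrow its search.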
Hash indexes

 Hash table with hash and pointer to row




 Drawbacks
       ● Useful only for exact lookups (=, IN)


       ● Not supported by InnoDB or MyISAM




 Benefits
        ● Very fast


        ● Compact
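
The "exact lookups only" limitation follows from how hashing works. As a rough illustration, the sketch below uses a Python `dict` as a stand-in for a hash index (the sample rows and both helper functions are hypothetical): an equality lookup is a single probe, but a range condition gets no help from the structure because hashing does not preserve key order.

```python
# Hypothetical table: row id -> name (sample data).
rows = {1: "alice", 2: "bob", 3: "carol", 4: "bob"}

# Hash index on `name`: key -> list of row ids.
hash_index = {}
for row_id, name in rows.items():
    hash_index.setdefault(name, []).append(row_id)

def exact_lookup(name):
    """Equality lookup: one hash probe."""
    return hash_index.get(name, [])

def range_lookup(low, high):
    """Range lookup: every key has to be checked, the hash gives no ordering."""
    return sorted(
        rid
        for key, ids in hash_index.items()
        if low <= key <= high
        for rid in ids
    )

print(exact_lookup("bob"))
print(range_lookup("a", "bz"))
```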
R-Tree and T-Tree indexes

 R-Tree Indexes
        ● Same principle as B-Tree indexes


        ● Used for spatial indexes


        ● Requires the use of GIS functions


        ● MyISAM only (InnoDB also supports spatial indexes as of MySQL 5.7)




 T-Tree indexes
        ● Same principle as B-Tree indexes


        ● Specialized for in-memory storage engines


        ● Used in NDB Cluster
Index and data layouts
Data and indexes for MyISAM
 Data, primary key and secondary key (simplified)




 No structural difference between PK and secondary
  key
Data and indexes for InnoDB
 Data, primary key and secondary key (simplified)




 Two lookups needed to get row from secondary key
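
The two lookups can be pictured with a toy model. In the sketch below, plain dictionaries stand in for the B-Trees (an assumption made for brevity), and the sample rows and the helper `fetch_by_name` are hypothetical. The secondary index maps the indexed value to a primary key value rather than to a row location, so reaching the row takes a second lookup in the clustered index.

```python
# Clustered index: primary key -> full row (the data lives in the PK tree).
clustered = {
    1: {"id": 1, "name": "alice", "subscription": "2009-01-01"},
    2: {"id": 2, "name": "bob", "subscription": "2009-01-02"},
}

# Secondary index on `name`: indexed value -> primary key value.
secondary_name = {"alice": 1, "bob": 2}

def fetch_by_name(name):
    """Fetch a row through the secondary index: two lookups."""
    pk = secondary_name[name]   # lookup 1: secondary key -> primary key
    return clustered[pk]        # lookup 2: primary key -> row

print(fetch_by_name("bob"))
```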
Accessing data
Different methods to access data

 Disk : cheap but slow
         ● ~ 100 random I/O ops/s


         ● ~ 500,000 sequential I/O ops/s




 RAM : quick but expensive
       ● ~ 250,000 random accesses/s


       ● ~ 5,000,000 sequential accesses/s




 Remember :
      ● Disks are extremely slow for random accesses


      ● Not much difference for sequential accesses
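
To put the figures above in perspective, here is a small worked calculation. The rates are the ones quoted on the slide; the workload of 10,000 accesses is an arbitrary number chosen for illustration, not a measurement.

```python
# Rates quoted above, in operations per second.
DISK_RANDOM, DISK_SEQUENTIAL = 100, 500_000
RAM_RANDOM, RAM_SEQUENTIAL = 250_000, 5_000_000

def seconds_for(accesses, rate):
    """Time needed to perform `accesses` operations at `rate` ops per second."""
    return accesses / rate

accesses = 10_000  # hypothetical workload size
for label, rate in [
    ("disk, random", DISK_RANDOM),
    ("disk, sequential", DISK_SEQUENTIAL),
    ("RAM, random", RAM_RANDOM),
    ("RAM, sequential", RAM_SEQUENTIAL),
]:
    print(f"{label}: {seconds_for(accesses, rate)} s")

# How much faster RAM is than disk for each access pattern.
random_gap = RAM_RANDOM / DISK_RANDOM
sequential_gap = RAM_SEQUENTIAL / DISK_SEQUENTIAL
print(random_gap, sequential_gap)
```

With these rates, RAM is 2,500 times faster than disk for random accesses but only 10 times faster for sequential ones, which is why turning random disk reads into sequential ones pays off so much.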
Covering indexes
Index-covered queries

 When performance problems occur:
       ● Add indexes


       ● Rewrite your queries


       ● Or both




 Do you need to fetch data (often on disk) ?

 If the index contains the data, you don't

 If you don't need to fetch the rows, your query is covered by an index (= index-only query)
Index-covered queries

 Query with traditional index:
       ● Get right rows with index


       ● Get data from rows


       ● Send data back to client




 Index-covered query:
        ● Get right rows with index


        ● No need to get data from rows: the index already holds it


        ● Send data back to client
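
The difference between the two paths can be sketched in a few lines. This is a toy model under stated assumptions: a dictionary plays the table, a sorted list of tuples plays an index on (subscription, name), and both functions are hypothetical helpers that also report how many row accesses they performed.

```python
# Toy table: row id -> row (sample data).
table = {
    1: {"name": "bob", "age": 30, "subscription": "2009-01-01"},
    2: {"name": "alice", "age": 41, "subscription": "2009-01-01"},
    3: {"name": "carol", "age": 25, "subscription": "2009-01-02"},
}

# Hypothetical index on (subscription, name), each entry pointing to a row id.
index = sorted((r["subscription"], r["name"], rid) for rid, r in table.items())

def names_via_rows(day):
    """Traditional path: the index finds the rows, then each row is fetched."""
    names, fetches = [], 0
    for sub, _name, rid in index:
        if sub == day:
            names.append(table[rid]["name"])  # row access
            fetches += 1
    return names, fetches

def names_via_index_only(day):
    """Covered path: the name comes from the index entry, no row access."""
    return [name for sub, name, _rid in index if sub == day], 0

print(names_via_rows("2009-01-01"))
print(names_via_index_only("2009-01-01"))
```

Both functions return the same names, already sorted because the index is sorted; only the number of row accesses differs.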
Covering index and EXPLAIN

mysql> EXPLAIN SELECT ID FROM world.City\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: City
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 4079
        Extra: Using index
Advantages of a covering index

 No access to the rows anymore !

 Indexes smaller and easier to cache than data

 Indexes sorted by values: random access can become
  sequential access

 Additional trick with InnoDB (more later)

 => Covering indexes are very beneficial for I/O bound
  workloads
When you can't use a covering index

 SELECT *



 Indexes that don't store the values:
        ● Indexes different from BTree indexes


        ● BTree indexes with MEMORY tables


        ● Indexes on a column's prefix
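
The SELECT * case is easy to try out. The sketch below uses SQLite through Python's built-in `sqlite3` module as a stand-in for MySQL, so the plan text differs from MySQL's EXPLAIN output: SQLite says "USING COVERING INDEX" where MySQL shows "Using index". The table is a reduced, hypothetical version of a customer table and `plan` is a helper written for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL DEFAULT '',
        age INTEGER,
        subscription TEXT NOT NULL
    );
    CREATE INDEX idx_sub_name ON customer (subscription, name);
""")

def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Only indexed columns are requested: the index alone can answer the query.
covered = plan("SELECT name FROM customer WHERE subscription = '2009-01-01'")

# SELECT * also needs `age`, which the index does not contain.
not_covered = plan("SELECT * FROM customer WHERE subscription = '2009-01-01'")

print(covered)
print(not_covered)
```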
Case studies
A case study

CREATE TABLE `customer` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `name` varchar(20) NOT NULL DEFAULT '',
    `age` tinyint(4) DEFAULT NULL,
    `subscription` date NOT NULL,
    PRIMARY KEY (`id`)
) ENGINE=MyISAM



 Names of people who subscribed on 2009-01-01?
 We want this list to be sorted by name
The naïve way
mysql> EXPLAIN SELECT name FROM customer
   WHERE subscription='2009-01-01' ORDER BY name\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: customer
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5000000
        Extra: Using where; Using filesort
First try ...
mysql> ALTER TABLE customer ADD INDEX idx_name
       (name);

mysql> EXPLAIN SELECT name FROM customer
   WHERE subscription='2009-01-01' ORDER BY name\G
*************************** 1. row ***************************
          ...
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 5000000
        Extra: Using where; Using filesort
Better ...
mysql> ALTER TABLE customer ADD INDEX idx_sub
       (subscription);

mysql> EXPLAIN SELECT name FROM customer
   WHERE subscription='2009-01-01' ORDER BY name\G
*************************** 1. row ***************************
          ...
         type: ref
          key: idx_sub
         rows: 4370
        Extra: Using where; Using filesort
The ideal way

mysql> ALTER TABLE customer ADD INDEX
       idx_sub_name (subscription,name);

mysql> EXPLAIN SELECT name FROM customer
   WHERE subscription='2009-01-01' ORDER BY name\G
*************************** 1. row ***************************
          ...
         type: ref
          key: idx_sub_name
         rows: 4363
        Extra: Using where; Using index
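
The same progression can be reproduced without a MySQL server. The sketch below is an approximation using SQLite through Python's `sqlite3` module, so the wording of the plans differs: SQLite reports "USE TEMP B-TREE FOR ORDER BY" where MySQL reports "Using filesort", and "USING COVERING INDEX" where MySQL reports "Using index". The table is empty and the column types are SQLite equivalents chosen for this example; the idx_name step is left out because SQLite's planner may treat it differently from MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL DEFAULT '',
        age INTEGER,
        subscription TEXT NOT NULL
    )
""")

QUERY = "SELECT name FROM customer WHERE subscription = '2009-01-01' ORDER BY name"

def plan():
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " | ".join(row[3] for row in rows)

plans = {"no index": plan()}

conn.execute("CREATE INDEX idx_sub ON customer (subscription)")
plans["idx_sub"] = plan()

conn.execute("CREATE INDEX idx_sub_name ON customer (subscription, name)")
plans["idx_sub_name"] = plan()

for step, detail in plans.items():
    print(f"{step}: {detail}")
```

With the two-column index, the filter, the sort order and the selected column all come from the index, so neither a row lookup nor a separate sort step remains in the plan.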
Benchmarks

 Avg number of sec to run the query
       ● Without index: 3.743


       ● Index on subscription: 0.435


       ● Covering index: 0.012




 Covering index
        ● 35x faster than index on subscription


        ● 300x faster than full table scan
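
The speed-up factors follow directly from the timings listed above; the quoted 35x and 300x are conservative roundings. A short check, using only the three averages from this slide:

```python
# Average run times in seconds, as listed above.
timings = {
    "no index": 3.743,
    "index on subscription": 0.435,
    "covering index": 0.012,
}

def speedup(slow, fast):
    """How many times faster `fast` is than `slow`."""
    return timings[slow] / timings[fast]

vs_single_index = speedup("index on subscription", "covering index")
vs_full_scan = speedup("no index", "covering index")

print(round(vs_single_index), round(vs_full_scan))
```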
Even better for MyISAM

 We can keep the covering index in memory


  mysql> SET GLOBAL
  customer_cache.key_buffer_size = 130000000;
  mysql> CACHE INDEX customer IN customer_cache;
  mysql> LOAD INDEX INTO CACHE customer;

 Avg number of sec to run the query: 0.007

 This step is specific to MyISAM !
Even better for InnoDB

 InnoDB secondary keys hold primary key values


mysql> EXPLAIN SELECT name,id FROM customer
  WHERE subscription='2009-01-01' ORDER BY name

*************************** 1. row ***************************
 possible_keys: idx_sub_name
             key: idx_sub_name
           Extra: Using where; Using index
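
SQLite behaves in a comparable way, which makes the effect easy to observe locally: every SQLite index entry carries the table's rowid, and an INTEGER PRIMARY KEY column is an alias for that rowid. The sketch below is therefore an analogy, not InnoDB itself. It uses Python's `sqlite3` module with a hypothetical, empty customer table, and `plan` is a helper written for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,          -- alias for SQLite's rowid
        name TEXT NOT NULL DEFAULT '',
        age INTEGER,
        subscription TEXT NOT NULL
    );
    CREATE INDEX idx_sub_name ON customer (subscription, name);
""")

def plan(sql):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# `id` is not listed in the index definition, yet the index still covers it.
with_pk = plan(
    "SELECT name, id FROM customer "
    "WHERE subscription = '2009-01-01' ORDER BY name"
)

# `age` is stored only in the table, so the rows must be visited.
with_age = plan(
    "SELECT name, age FROM customer "
    "WHERE subscription = '2009-01-01' ORDER BY name"
)

print(with_pk)
print(with_age)
```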
Another (harder) case study

 Same table : customer

 List people who subscribed on 2009-01-01 AND
  whose name ends with xx?

 SELECT * FROM customer WHERE
  subscription='2009-01-01' AND name LIKE '%xx'

 Let's add an index on (subscription,name) ...
Another (harder) case study

mysql> EXPLAIN SELECT * FROM customer WHERE
   subscription='2009-01-01' AND name LIKE '%xx'\G
*************************** 1. row ***************************
          ...
          key: idx_sub_name
      key_len: 3
          ref: const
         rows: 500272
        Extra: Using where

 The index is not covering anymore: SELECT * needs columns the index does not hold
Query rewriting - Indexing

 Rewriting the query
        SELECT * FROM customer
        INNER JOIN (
             SELECT id FROM customer
             WHERE subscription='2009-01-01'
             AND name LIKE '%xx'
         ) AS t USING(id)

 Adding an index
       ALTER TABLE customer ADD INDEX
       idx_sub_name_id (subscription,name,id)
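
A quick way to convince yourself that the rewritten query returns the same rows is to run both forms on a small data set. The sketch below does this with SQLite through Python's `sqlite3` module as a stand-in for MySQL; the four sample rows are invented, and it checks equivalence only, it says nothing about speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL DEFAULT '',
        age INTEGER,
        subscription TEXT NOT NULL
    );
    CREATE INDEX idx_sub_name_id ON customer (subscription, name, id);
    INSERT INTO customer (id, name, age, subscription) VALUES
        (1, 'alexx', 30, '2009-01-01'),
        (2, 'bob',   25, '2009-01-01'),
        (3, 'maxx',  41, '2009-01-01'),
        (4, 'foxx',  19, '2009-02-01');
""")

ORIGINAL = """
    SELECT * FROM customer
    WHERE subscription = '2009-01-01' AND name LIKE '%xx'
"""

# The subquery touches only indexed columns; full rows are fetched
# afterwards, and only for the ids that passed the filter.
REWRITTEN = """
    SELECT * FROM customer
    INNER JOIN (
        SELECT id FROM customer
        WHERE subscription = '2009-01-01' AND name LIKE '%xx'
    ) AS t USING (id)
"""

original_rows = sorted(conn.execute(ORIGINAL).fetchall())
rewritten_rows = sorted(conn.execute(REWRITTEN).fetchall())
print(original_rows == rewritten_rows)
```

The point of the rewrite is that the LIKE '%xx' filter is evaluated on index entries instead of on full rows, so row accesses are limited to the matching ids.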
Running EXPLAIN

*************************** 1. row ***************************
 select_type: PRIMARY
          table: <derived2>
*************************** 2. row ***************************
 select_type: PRIMARY
          table: customer
*************************** 3. row ***************************
 select_type: DERIVED
         table: customer
           key: idx_sub_name_id
         Extra: Using where; Using index
Efficiency of the optimization

 Beware of the subquery

 10 subs. on that date / 3 names matching '%xx'
        ● Original query: 0.000 s


        ● Rewritten query: 0.000 s




 300,000 subs. on that date / 500 names matching '%xx'
        ● Original query: 1.284 s


        ● Rewritten query: 0.553 s




 Many intermediate situations

 Always benchmark !
InnoDB ?

 The index on (subscription,name) is already covering
  for the subquery

 Your work is easier: just rewrite the query if need be

 But you still need to benchmark
