MySQL Query Optimisation 101
€ whoami
● Federico Razzoli
● Freelance consultant
● Writing SQL since MySQL 3.23
hello@federico-razzoli.com
● I love open source, sharing, collaboration, win-win, etc
● I love MariaDB, MySQL, Postgres, etc
○ Even Db2, somehow
Why is the database important?
Remember the Von Neumann machine?
It’s always about Data
● Since then, the purpose of hardware and software has never changed:
○ Receive data
○ Process data
○ Output data
A rose by any other name...
● Feel free to use synonyms
○ Validate
○ Sanitise
○ Parse
○ Persist
○ Normalise / Denormalise
○ Cache
○ Map / Reduce
○ Print
○ Ping
○ ...
...would smell as sweet
● The database of known stars is not a ping package
● You use a DBMS to abstract data management as much as possible
○ Persistence
○ Consistency
○ Queries
○ Fast search
○ …
● That’s why “database is magic”
Busy
● But it’s just a very busy person, performing many tasks concurrently
● And each is:
○ Important (must be reliable)
○ Complex (must be fast)
○ Expected (if something goes wrong, you will complain)
In practice, the DBMS is usually the bottleneck.
Terminology
Statement: An SQL command
Query: A statement that returns a resultset
...or any other statement :)
Resultset: the output of a query, consisting of 0 or more rows
Optimiser / Query Planner: Component responsible for deciding a query’s
execution plan
Optimised query: A query whose execution plan is reasonably good
...this doesn’t imply in any way that the query is fast
Database: set of tables (schema)
Instance / Server: running MySQL daemon (cluster)
Performance
When should a query be optimised?
mysql> EXPLAIN SELECT * FROM t WHERE c < 10 \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: t
partitions: NULL
type: ALL
possible_keys: idx_c
key: idx_c
key_len: 4
ref: NULL
rows: 213030
filtered: 50.00
Extra: Using where
1 row in set, 1 warning (0.01 sec)
When should a query be optimised?
mysql> SELECT * FROM performance_schema.events_statements_summary_by_digest WHERE DIGEST =
'254a65744e661e072103b7a7630dee1c3a3b8e906f19889f7c796aebe7cdd4f8' \G
*************************** 1. row ***************************
SCHEMA_NAME: test
DIGEST: 254a65744e661e072103b7a7630dee1c3a3b8e906f19889f7c796aebe7cdd4f8
DIGEST_TEXT: SELECT * FROM `t` WHERE `c` < ?
COUNT_STAR: 1
...
SUM_ROWS_AFFECTED: 0
SUM_ROWS_SENT: 57344
SUM_ROWS_EXAMINED: 212992
SUM_CREATED_TMP_DISK_TABLES: 0
SUM_CREATED_TMP_TABLES: 0
SUM_SELECT_FULL_JOIN: 0
SUM_SELECT_FULL_RANGE_JOIN: 0
SUM_SELECT_RANGE: 0
SUM_SELECT_RANGE_CHECK: 0
SUM_SELECT_SCAN: 1
SUM_SORT_MERGE_PASSES: 0
SUM_SORT_RANGE: 0
SUM_SORT_ROWS: 0
SUM_SORT_SCAN: 0
SUM_NO_INDEX_USED: 1
SUM_NO_GOOD_INDEX_USED: 0
FIRST_SEEN: 2019-05-14 00:31:24.078967
LAST_SEEN: 2019-05-14 00:31:24.078967
...
QUERY_SAMPLE_TEXT: SELECT * FROM t WHERE c < 10
QUERY_SAMPLE_SEEN: 2019-05-14 00:31:24.078967
QUERY_SAMPLE_TIMER_WAIT: 117493874000
But how do I find impacting queries?
● It depends what you mean by “impacting”
● There are several monitoring methods (USE, etc)
● But 3 philosophies:
○ Panicking when you hear that something is down or slow
○ System-centric monitoring
○ User-centric
You can use them all.
Panicking
● Simplest method
● Do nothing to prevent anything
○ Optionally, take a lot of actions to prevent imaginary problems in imaginary ways
○ There is no evidence that your job is useless, so your boss will not fire you
System-centric
● pt-query-digest, PMM, etc
● Merge similar queries into one, normalising their text and replacing parameters
○ SELECT * FROM t WHERE b= 111 AND a = 0 -- comment
○ Select * From t Where a = 24 and b=42;
○ SELECT * FROM t WHERE a = ? AND b = ?
● Sum execution time of each occurrence (Grand Total Time)
● Optimise the queries with highest GTT
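As a rough sketch of the same idea without external tools: performance_schema already groups statements by normalised text (the digest). The query below is an illustration, not part of the original slides. It ranks digests by total time. SUM_TIMER_WAIT is expressed in picoseconds, and the LIMIT of 10 is an arbitrary choice.

```sql
-- Rank normalised statements (digests) by total execution time,
-- i.e. an approximation of the "Grand Total Time" described above.
-- Timer columns are in picoseconds; dividing by 1e12 gives seconds.
SELECT
    SCHEMA_NAME,
    DIGEST_TEXT,
    COUNT_STAR                      AS executions,
    ROUND(SUM_TIMER_WAIT / 1e12, 3) AS total_seconds,
    ROUND(AVG_TIMER_WAIT / 1e12, 6) AS avg_seconds
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
```

The counters accumulate from server start or from the last time the table was truncated, so compare snapshots if you care about a specific time window.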
User-Centric
● Calculate the cost of slowness (users don’t buy, maybe leave 4ever)
● Cost of slowness is different for different
○ URLs
○ Number of users
○ ...other variables that depend on your business
■ (day of month, country, etc)
● Set Service Level Objectives
● Monitor the latency of HTTP calls and of the services involved
● Find out what’s slowing them down
Query Performance
What makes a query “important”?
● How many times it’s executed
● It’s locking
What makes a query slow?
● Number of rows read
○ Read != return
○ Indexes are there to lower the number of reads
● Number of rows written
○ In-memory temp tables are not good
○ On-disk temp tables are worse
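One possible way to look for these symptoms, assuming performance_schema is enabled: compare rows examined with rows sent for each digest, and check how many temp tables were created. The alias examined_per_sent and the LIMIT are made up for this sketch.

```sql
-- Digests that read many rows compared to what they return,
-- together with the temporary tables they created.
SELECT
    DIGEST_TEXT,
    COUNT_STAR AS executions,
    SUM_ROWS_EXAMINED,
    SUM_ROWS_SENT,
    -- NULLIF avoids a division by zero when nothing was returned
    SUM_ROWS_EXAMINED / NULLIF(SUM_ROWS_SENT, 0) AS examined_per_sent,
    SUM_CREATED_TMP_TABLES,
    SUM_CREATED_TMP_DISK_TABLES
FROM performance_schema.events_statements_summary_by_digest
WHERE SUM_ROWS_EXAMINED > 0
ORDER BY SUM_ROWS_EXAMINED DESC
LIMIT 10;
```

A high ratio is only a hint: aggregate queries legitimately read many rows and return few.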
How do I optimise a query?
● Use indexes properly
● Avoid creation of temp tables, if possible
What is an index?
Index Types
● BTREE - ordered data structure
● HASH - hash table
● PostgreSQL has many more
● Each storage engine can implement either or both
● InnoDB uses BTREE and internally uses HASH when it thinks it’s better
● The syntax CREATE INDEX ... USING {BTREE | HASH} is generally useless
We will focus on BTREE indexes in InnoDB
Index Properties
● Primary key: unique values, not null
● UNIQUE
● Multiple columns
○ The order matters
● Column prefix (only for strings)
InnoDB Indexes
● InnoDB tables should always have a Primary Key
● The table is stored ordered by primary key
● The table itself is the primary key
○ Columns “not part of the primary key” simply don’t affect the order of rows
InnoDB Indexes
Table columns: {a, b, c}
Primary key: {a, b}
a   b   c
1   1   4
1   2   1
2   1   9
2   2   3
3   0   3
4   20  0
InnoDB Indexes
● Secondary indexes are stored separately
● They are ordered by the indexed column
● Each entry contains a reference to a primary key entry
InnoDB Indexes
Primary key: {a, b}
Index idx_c: {c}
c   a   b
0   4   20
1   1   2
3   2   2
3   3   0
4   1   1
9   2   1
Which queries will be faster?
Table columns: {a, b, c, d, e}
Primary key: {a, b}
Index idx_c: {c}
● SELECT * FROM t WHERE a = 1
● SELECT * FROM t WHERE a = 1 AND b = 2
● SELECT a, b FROM t WHERE c = 0
● SELECT d, e FROM t WHERE c = 0
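A minimal sketch for trying the four queries yourself. The column types are assumptions (the slides only name the columns), and on an empty or tiny table the optimiser may choose differently than on real data.

```sql
-- Hypothetical definition matching the slide: PK {a, b}, secondary index on c.
CREATE TABLE IF NOT EXISTS t (
    a INT UNSIGNED NOT NULL,
    b INT UNSIGNED NOT NULL,
    c INT,
    d INT,
    e INT,
    PRIMARY KEY (a, b),
    KEY idx_c (c)
) ENGINE=InnoDB;

-- Leftmost part of the primary key: a range of the table itself is read.
EXPLAIN SELECT * FROM t WHERE a = 1;

-- Whole primary key: at most one row can match.
EXPLAIN SELECT * FROM t WHERE a = 1 AND b = 2;

-- idx_c entries also contain (a, b), so the index alone can answer this.
EXPLAIN SELECT a, b FROM t WHERE c = 0;

-- d and e are not in idx_c: each match needs a lookup in the primary key.
EXPLAIN SELECT d, e FROM t WHERE c = 0;
```

When a query is answered from the index alone, the Extra column of EXPLAIN normally reports "Using index".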
More performance considerations?
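● Big primary key = big indexes (every secondary index entry stores the primary key)
● Primary key should be append-only, e.g. INTEGER UNSIGNED AUTO_INCREMENT
● These indexes are duplicates: {a} - {a, b} (the leftmost part of {a, b} already covers {a})
● This index is wrong: {a, id} (if id is the primary key, an index on {a} already ends with it)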
Index implementation
https://github.com/jeremycole/innodb_diagrams
More performance considerations?
● Writing to an index is relatively slow
● Deleting many rows leaves fragmented indexes
WHERE is the index?
Phone Book
● Indexes are ordered data structures
● Think of them as a phone book
Table: {first_name, last_name, phone, address}
Index: {last_name, first_name}
Phone Book
● I will show you some queries, and you will tell me which can be solved by using the index
● You may not know it, but your mind contains a pretty good SQL optimiser
Table: {first_name, last_name, phone, address}
Index: {last_name, first_name}
Queries
SELECT * FROM phone_book …
WHERE last_name = 'Baker'
WHERE last_name IN ('Hartnell','Baker', 'Whittaker')
WHERE last_name > 'Baker'
WHERE last_name >= 'Baker'
WHERE last_name < 'Baker'
WHERE last_name <= 'Baker'
WHERE last_name <> 'Baker'
Queries
SELECT * FROM phone_book …
WHERE last_name IS NULL
WHERE last_name IS NOT NULL
Rule #1
A BTREE can optimise
point searches and ranges
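To check the rule against a real server, something like the following could be used. The id column, the column types and the index name idx_name are assumptions added for this sketch; the slides only give the column list and the index columns.

```sql
-- Hypothetical phone book table with the index from the slides.
CREATE TABLE IF NOT EXISTS phone_book (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name  VARCHAR(50),
    phone      VARCHAR(30),
    address    VARCHAR(200),
    KEY idx_name (last_name, first_name)
) ENGINE=InnoDB;

-- Point search: one value to look up in the ordered structure.
EXPLAIN SELECT * FROM phone_book WHERE last_name = 'Baker';

-- Several point searches.
EXPLAIN SELECT * FROM phone_book
WHERE last_name IN ('Hartnell', 'Baker', 'Whittaker');

-- Range: everything after a given position in the phone book.
EXPLAIN SELECT * FROM phone_book WHERE last_name > 'Baker';

-- NULL values are indexed too, so this is a point search as well.
EXPLAIN SELECT * FROM phone_book WHERE last_name IS NULL;
```

In the output, the type and key columns show whether and how idx_name is used. With very few rows, or when a range matches most of the table, the optimiser may still prefer a full scan.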
Queries
WHERE last_name >= 'B' AND last_name < 'C'
WHERE last_name BETWEEN 'B' AND 'C'
WHERE last_name LIKE 'B%'
Queries
WHERE last_name LIKE 'B%'
WHERE last_name LIKE '%B%'
WHERE last_name LIKE '%B'
WHERE last_name LIKE 'B_'
WHERE last_name LIKE '_B_'
WHERE last_name LIKE '_B'
Rule #2
A LIKE condition
whose second operand starts with a 'constant string'
is a range
Queries
WHERE first_name = 'Tom'
WHERE last_name = 'Baker'
WHERE first_name = 'Tom' AND last_name = 'Baker'
WHERE last_name = 'Baker' AND first_name = 'Tom'
Rule #3
We can use a whole index
or its leftmost part
Queries
WHERE LEFT(last_name, 2) = 'Ba'
WHERE last_name = CONCAT('Ba', 'ker')
Rule #4
The optimiser cannot make assumptions about the results of functions/expressions.
However, wrapping a constant value into a function will produce another
constant value, which is mostly irrelevant for query optimisation.
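A sketch of the usual workaround: rewrite the condition so that the bare column is compared to a constant. The table definition repeats the slides' phone book with assumed types, an assumed id column and an assumed index name. The two forms should match the same rows for ordinary names, but check this against your own collation.

```sql
-- Hypothetical phone book table with the index from the slides.
CREATE TABLE IF NOT EXISTS phone_book (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name  VARCHAR(50),
    phone      VARCHAR(30),
    address    VARCHAR(200),
    KEY idx_name (last_name, first_name)
) ENGINE=InnoDB;

-- The column is wrapped in a function: the index order is of no help.
EXPLAIN SELECT * FROM phone_book WHERE LEFT(last_name, 2) = 'Ba';

-- Same intent written as a range on the bare column (see Rule #2).
EXPLAIN SELECT * FROM phone_book WHERE last_name LIKE 'Ba%';

-- Function applied to constants only: evaluated once, then a point search.
EXPLAIN SELECT * FROM phone_book WHERE last_name = CONCAT('Ba', 'ker');
```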
Queries
WHERE last_name = first_name
Rule #5
Comparing a column with another results in a comparison
whose operands change at every row.
The optimiser cannot filter out any row in advance.
Queries
WHERE last_name = 'Baker' AND phone = '+44 7739 427279'
Rule #6
We can use an index to restrict the search to a set of rows
And search those rows in a non-optimised fashion
Depending on this set’s size, this could be a brilliant or a terrible strategy
Queries
WHERE last_name = 'Baker' AND first_name > 'Tom'
WHERE last_name > 'Baker' AND first_name = 'Tom'
WHERE last_name > 'Baker' AND first_name > 'Tom'
Queries
WHERE last_name = 'Baker' AND first_name > 'Tom'
WHERE first_name = 'Tom' AND last_name > 'Baker'
WHERE first_name > 'Tom' AND last_name = 'Baker'
WHERE last_name > 'Baker' AND first_name > 'Tom'
Baker, Colin
Baker, Tom
Baker, Walter
Capaldi, Ada
Capaldi, Peter
Whittaker, Jody
Whittaker, Vadim
Rule #7
If we have a range condition on an index column
The next index columns cannot be used
If you prefer:
Index usage stops at the first >
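One way to observe this, as a sketch: compare the key_len column of EXPLAIN for the two queries below. A larger key_len generally means that more of the index is used to locate rows. The table definition uses assumed types, an assumed id column and an assumed index name.

```sql
-- Hypothetical phone book table with the index from the slides.
CREATE TABLE IF NOT EXISTS phone_book (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name  VARCHAR(50),
    phone      VARCHAR(30),
    address    VARCHAR(200),
    KEY idx_name (last_name, first_name)
) ENGINE=InnoDB;

-- Equality on last_name, range on first_name: both index columns
-- can be used to find the first matching entry.
EXPLAIN SELECT * FROM phone_book
WHERE last_name = 'Baker' AND first_name > 'Tom';

-- Range on last_name first: first_name can no longer narrow the range.
EXPLAIN SELECT * FROM phone_book
WHERE last_name > 'Baker' AND first_name = 'Tom';
```

In the second query the server may still check first_name while reading index entries (Extra then reports "Using index condition"), but it has to walk every entry after 'Baker' to do so.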
ORDER BY, GROUP BY
Mr Speaker talks to MySQL
Queries
ORDER BY last_name
ORDER BY first_name
ORDER BY last_name, first_name
ORDER BY first_name, last_name
Queries
GROUP BY last_name
GROUP BY first_name
GROUP BY last_name, first_name
GROUP BY first_name, last_name
Rule #8
ORDER BY and GROUP BY can take advantage of an index order
or create an internal temp table
Note: GROUP BY optimisation also depends on the function we’re using
(MAX, COUNT…).
Queries
WHERE last_name > 'Baker' ORDER BY last_name
WHERE last_name = 'Baker' ORDER BY first_name
WHERE last_name > 'Baker' ORDER BY first_name
Rule #9
If we have an ORDER BY / GROUP BY on an index column
The next index columns cannot be used
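The difference can be seen with EXPLAIN, under the same assumptions as the slides' phone book (the types, the id column and the index name are made up for this sketch).

```sql
-- Hypothetical phone book table with the index from the slides.
CREATE TABLE IF NOT EXISTS phone_book (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name  VARCHAR(50),
    phone      VARCHAR(30),
    address    VARCHAR(200),
    KEY idx_name (last_name, first_name)
) ENGINE=InnoDB;

-- Within a single last_name, index entries are already sorted by first_name.
EXPLAIN SELECT * FROM phone_book
WHERE last_name = 'Baker' ORDER BY first_name;

-- Across many last_names, first_name is not globally ordered:
-- the matching rows have to be sorted separately.
EXPLAIN SELECT * FROM phone_book
WHERE last_name > 'Baker' ORDER BY first_name;
```

When a separate sort is needed, the Extra column normally contains "Using filesort".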
Multiple Indexes
Queries
Table: {id, a, b, c, d}
idx_a: {a, d}
idx_b: {b}
WHERE a = 10 OR a = 20
WHERE a = 24 OR c = 42
WHERE a = 24 OR d = 42
WHERE a = 24 AND b = 42
WHERE a = 24 OR b = 42
WHERE a = 24 ORDER BY b
GROUP BY a ORDER BY b
Rule #10
Using multiple indexes for AND (intersection) or OR (union) is possible,
but there is a benefit only if we read MANY rows
Using different indexes in WHERE / GROUP BY / ORDER BY
is not possible
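A sketch for experimenting with this. The table name t_multi and the column types are assumptions; the index definitions follow the slide. Whether the optimiser actually merges indexes depends on the data and its statistics.

```sql
-- Hypothetical table with the two indexes from the slide.
CREATE TABLE IF NOT EXISTS t_multi (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    a  INT,
    b  INT,
    c  INT,
    d  INT,
    KEY idx_a (a, d),
    KEY idx_b (b)
) ENGINE=InnoDB;

-- OR on two indexed columns: candidate for an index merge (union).
EXPLAIN SELECT * FROM t_multi WHERE a = 24 OR b = 42;

-- AND on two indexed columns: candidate for an index merge (intersection),
-- although using a single index is often chosen instead.
EXPLAIN SELECT * FROM t_multi WHERE a = 24 AND b = 42;

-- c has no index: every row must be checked for the second condition.
EXPLAIN SELECT * FROM t_multi WHERE a = 24 OR c = 42;
```

If an index merge is chosen, EXPLAIN reports type: index_merge and lists the indexes involved in the key column.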
Thank you kindly!
https://federico-razzoli.com
info@federico-razzoli.com