Sergei Petrunia, MariaDB
New features in MariaDB/MySQL query optimizer
MySQL/MariaDB optimizer development
● Some features have common heritage
● Big releases:
– MariaDB 5.3/5.5
– MySQL 5.6
– (upcoming) MariaDB 10.0
New optimizer features
● Subqueries (FROM, IN, others)
● Batched Key Access (MRR)
● Index Condition Pushdown
● Extended Keys
● EXPLAIN UPDATE/DELETE
● PERFORMANCE_SCHEMA
● Engine-independent statistics
● InnoDB persistent statistics
Subqueries in MySQL
● Subqueries are practically unusable
● e.g. Facebook disabled them in the parser
● Reason - “naive execution”.
Naive subquery execution
● For IN (SELECT ...) subqueries:
select * from hotel
where
  hotel.country='USA' and
  hotel.name IN (select hotel_stays.hotel
                 from hotel_stays
                 where hotel_stays.customer='John Smith')
● Naive execution:
for (each hotel in USA) {
  if (John Smith stayed here) {
    …
  }
}
● Slow!
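A rough Python sketch of why the naive strategy is slow, next to one alternative: running the subquery once and keeping its result in a hash set (materialization). The table contents and the `scanned` counter are invented for illustration, and the code only mimics the access pattern; it is not how the server is implemented.

```python
# Toy data, invented for this sketch.
hotels = [
    {"name": "Plaza", "country": "USA"},
    {"name": "Ritz", "country": "USA"},
    {"name": "Krasnapolsky", "country": "NL"},
]
hotel_stays = [
    {"hotel": "Plaza", "customer": "John Smith"},
    {"hotel": "Krasnapolsky", "customer": "John Smith"},
    {"hotel": "Ritz", "customer": "Jane Doe"},
]


def naive(hotels, stays, customer):
    """Re-run the subquery for every qualifying outer row."""
    scanned = 0  # rows of hotel_stays examined
    result = []
    for h in hotels:
        if h["country"] != "USA":
            continue
        for s in stays:  # the subquery starts over for each hotel
            scanned += 1
            if s["customer"] == customer and s["hotel"] == h["name"]:
                result.append(h["name"])
                break
    return result, scanned


def materialized(hotels, stays, customer):
    """Run the subquery once and probe its result with hash lookups."""
    scanned = 0
    visited = set()
    for s in stays:
        scanned += 1
        if s["customer"] == customer:
            visited.add(s["hotel"])
    result = [h["name"] for h in hotels
              if h["country"] == "USA" and h["name"] in visited]
    return result, scanned


print(naive(hotels, hotel_stays, "John Smith"))
print(materialized(hotels, hotel_stays, "John Smith"))
```

Both functions return the same hotels. The naive version's scan count grows with (hotels in USA) x (rows in hotel_stays), while the materialized version reads hotel_stays only once.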
Naive subquery execution (2)
● For FROM (SELECT …) subqueries:
select *
from
  (select *
   from hotel
   where hotel.rooms > 500
  ) as big_hotel
where
  big_hotel.nearest_airport='AMS';
● Naive execution:
1. Retrieve all hotels with > 500 rooms, store them in a temporary table big_hotel;
2. Search in big_hotel for hotels near AMS.
● Slow!
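The two naive steps can be sketched in Python and compared with merging the derived table into the outer query, which is one way an optimizer can avoid the temporary table. The rows and the `airport` key are made up for this example, and the second return value is just an illustrative count of rows copied into the temporary table.

```python
# Toy hotel table, invented for this sketch.
hotels = [
    {"name": "A", "rooms": 800, "airport": "AMS"},
    {"name": "B", "rooms": 600, "airport": "JFK"},
    {"name": "C", "rooms": 900, "airport": "JFK"},
    {"name": "D", "rooms": 100, "airport": "AMS"},
]


def naive_derived(hotels):
    """Materialize the whole derived table, then filter it."""
    # Step 1: copy every big hotel into a temporary table.
    big_hotel = [h for h in hotels if h["rooms"] > 500]
    # Step 2: scan the temporary table for hotels near AMS.
    result = [h["name"] for h in big_hotel if h["airport"] == "AMS"]
    return result, len(big_hotel)


def merged(hotels):
    """Apply both conditions in a single pass, with no temporary table."""
    result = [h["name"] for h in hotels
              if h["rooms"] > 500 and h["airport"] == "AMS"]
    return result, 0


print(naive_derived(hotels))
print(merged(hotels))
```

The answers match, but the naive plan copies three rows it mostly throws away. In a real server the merged form also lets the optimizer consider an index on the airport column, which the temporary table does not have.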
New subquery optimizations
● Handle IN (SELECT ...)
● Handle FROM (SELECT …)
● Handle a lot of cases
● Comparison with PostgreSQL
– ~1000x slower before
– ~same order of magnitude now
● Releases
– MySQL 6.0
– MariaDB 5.5
● Sheeri Kritzer @ Mozilla seems happy with this one
– MySQL 5.6
● Subset of MariaDB 5.5's features
Subquery optimizations - summary
● Subqueries were generally unusable before MariaDB 5.3/5.5
● “Core” subquery optimizations are in
– MariaDB 5.3/5.5
– MySQL 5.6
● MariaDB has extra additions
● Further information:
https://kb.askmonty.org/en/subquery-optimizations/
Batched Key Access - background
● Big, IO-bound joins were slow
– DBT-3 benchmark could not finish*
● Reason?
● Nested Loops join hits the second table at random locations.
Batched Key Access idea
[Diagrams: Nested Loops Join vs. Batched Key Access]
Speedup reasons
● Fewer disk head movements
● Cache-friendliness
● Prefetch-friendliness
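A minimal Python sketch of the batching idea, assuming a hypothetical inner table in which each key maps to a row position on disk. `head_travel` is a made-up stand-in for seek cost, and the key order is arbitrary; the real join buffer handling in the server is more involved.

```python
def nested_loop_positions(outer_keys, index):
    """Visit the inner table in the order the outer rows arrive."""
    return [index[k] for k in outer_keys]


def bka_positions(outer_keys, index, buffer_size):
    """Collect keys in a buffer, then visit the inner table in position order."""
    positions = []
    for start in range(0, len(outer_keys), buffer_size):
        batch = outer_keys[start:start + buffer_size]
        positions.extend(sorted(index[k] for k in batch))
    return positions


def head_travel(positions):
    """Total distance between consecutive reads (a proxy for disk seeks)."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))


# Hypothetical inner table: key -> row position on disk.
index = {key: key * 10 for key in range(8)}
outer_keys = [7, 0, 5, 2, 6, 1, 4, 3]

print(head_travel(nested_loop_positions(outer_keys, index)))
print(head_travel(bka_positions(outer_keys, index, buffer_size=4)))
print(head_travel(bka_positions(outer_keys, index, buffer_size=8)))
```

In this toy setup the travel distance shrinks as the buffer grows, because more lookups get sorted together before the inner table is touched.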
Batched Key Access benchmark
set join_cache_level=6; -- enable BKA
select max(l_extendedprice)
from orders, lineitem
where
l_orderkey=o_orderkey and
o_orderdate between $DATE1 and $DATE2
Run with
● Various join_buffer_size settings
● Various sizes of the $DATE1...$DATE2 range
Batched Key Access benchmark (2)
[Chart: BKA join performance depending on buffer size. X axis: buffer size, bytes. Y axis: query time, sec. Series: query_size=1, 2 and 3, each with regular join and with BKA. Callouts: performance without BKA; performance with BKA, given sufficient buffer size.]
Batched Key Access summary
● Optimization for big, IO-bound joins
– Orders-of-magnitude speedups
● Available in
– MariaDB 5.3/5.5 (more advanced)
– MySQL 5.6
● Not fully automatic yet
– Needs to be manually enabled
– Need to set buffer sizes.
Index Condition Pushdown
alter table lineitem add index s_r (l_shipdate, l_receiptdate);
select count(*) from lineitem
where
l_shipdate between '1993-01-01' and '1993-02-01' and
datediff(l_receiptdate,l_shipdate) > 25 and
l_quantity > 40
● A new feature in MariaDB 5.3/ MySQL 5.6
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
| 1 | SIMPLE | lineitem | range | s_r | s_r | 4 | NULL | 158854 | Using index condition; Using where |
+----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+
1. Read index records in the range
   l_shipdate between '1993-01-01' and '1993-02-01'
2. Check the index condition (new: filters out records before table rows are read)
   datediff(l_receiptdate,l_shipdate) > 25
3. Read full table rows
4. Check the WHERE condition
   l_quantity > 40
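The four steps can be mimicked in Python to show where the saving comes from. This is only a sketch: the rows are invented, the sorted list `index` stands in for the s_r index, and `full_row_reads` is an illustrative counter rather than a server statistic.

```python
from datetime import date

# Toy lineitem rows, invented for this sketch.
rows = [
    {"l_shipdate": date(1993, 1, 5), "l_receiptdate": date(1993, 2, 10), "l_quantity": 45},
    {"l_shipdate": date(1993, 1, 10), "l_receiptdate": date(1993, 1, 15), "l_quantity": 50},
    {"l_shipdate": date(1993, 1, 20), "l_receiptdate": date(1993, 2, 20), "l_quantity": 10},
    {"l_shipdate": date(1993, 3, 1), "l_receiptdate": date(1993, 4, 15), "l_quantity": 48},
]
# Stand-in for index s_r: (l_shipdate, l_receiptdate, row id), kept sorted.
index = sorted((r["l_shipdate"], r["l_receiptdate"], i) for i, r in enumerate(rows))


def count_rows(use_icp):
    """Return (matching rows, full table rows read)."""
    full_row_reads = 0
    matches = 0
    for shipdate, receiptdate, rowid in index:
        # 1. Range condition on the first index column.
        if not (date(1993, 1, 1) <= shipdate <= date(1993, 2, 1)):
            continue
        # 2. Index condition, evaluated on index columns only.
        if use_icp and not (receiptdate - shipdate).days > 25:
            continue
        # 3. Read the full table row.
        row = rows[rowid]
        full_row_reads += 1
        # 4. Check the complete WHERE condition on the full row.
        if ((row["l_receiptdate"] - row["l_shipdate"]).days > 25
                and row["l_quantity"] > 40):
            matches += 1
    return matches, full_row_reads


print(count_rows(use_icp=False))
print(count_rows(use_icp=True))
```

The result is the same either way; with the index condition enabled, one fewer full row is fetched in this toy data set.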
Index Condition Pushdown - conclusions
Summary
● Applicable to any index-based access (ref, range, etc)
● Checks parts of WHERE after reading the index
● Reduces number of table records to be read
● Speedup can be like in “Using index”
– Great for IO-bound load (5x, 10x)
– Some for CPU-bound workload (2x)
Conclusions
● Have a selective condition on a column?
– Put the column into the index, at the end.
Extended keys
● Secondary indexes in InnoDB have an invisible “tail”
CREATE TABLE tbl (
  pk1 sometype,
  pk2 sometype,
  ...
  col1 sometype,
  col2 sometype,
  ...
  KEY indexA (col1, col2)
  ...
  PRIMARY KEY (pk1, pk2)
) ENGINE=InnoDB
indexA: col1 col2 | pk1 pk2 (tail)
● Before: optimizer has limited support for “tail” columns
– 'Using index' supports it
– ORDER BY col1, col2, pk1 supports it
● After MariaDB 5.5/ MySQL 5.6
– all parts of the optimizer (ref access, range access, etc.) can use the “tail”
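To make the “tail” concrete, here is a small Python sketch that models indexA entries as tuples of (col1, col2, pk1, pk2) and looks up by a leading prefix. The values and the `ref_lookup` helper are hypothetical; a real B-tree would not copy keys the way this list-based version does.

```python
from bisect import bisect_left, bisect_right

# Each entry: the declared columns (col1, col2) followed by the PK tail (pk1, pk2).
index_a = sorted([
    ("x", 1, 10, 1),
    ("x", 1, 20, 1),
    ("x", 1, 20, 2),
    ("x", 2, 10, 1),
    ("y", 1, 30, 1),
])


def ref_lookup(index, prefix):
    """Return the entries whose leading columns equal `prefix`."""
    n = len(prefix)
    keys = [entry[:n] for entry in index]  # prefixes of sorted tuples stay sorted
    lo = bisect_left(keys, prefix)
    hi = bisect_right(keys, prefix)
    return index[lo:hi]


# Lookup on the declared columns only: col1='x' AND col2=1.
print(ref_lookup(index_a, ("x", 1)))
# Lookup that also uses the tail: col1='x' AND col2=1 AND pk1=20.
print(ref_lookup(index_a, ("x", 1, 20)))
```

The second lookup narrows the range further without touching the table, which is the kind of access the extended keys feature lets the optimizer consider.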
Better EXPLAIN in MySQL 5.6
● EXPLAIN for UPDATE/DELETE/INSERT … SELECT
– shows the query plan for finding the records to update/delete
mysql> explain update customer set c_acctbal = c_acctbal - 100 where c_custkey=12354;
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | customer | range | PRIMARY | PRIMARY | 4 | NULL | 1 | Using where |
+----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
● EXPLAIN FORMAT=JSON
– Produces [big] JSON output
– Shows more information:
● Shows conditions attached to tables
● Shows whether “Using temporary; using filesort” is done to handle GROUP BY or ORDER BY.
● Shows where subqueries are attached
– No other known additions
– Will be in MariaDB 10.0
The most useful addition!
EXPLAIN FORMAT=JSON
What are the “conditions attached to tables”?
explain
select
count(*)
from
orders, customer
where
customer.c_custkey=orders.o_custkey and
customer.c_mktsegment='BUILDING' and
orders.o_totalprice > customer.c_acctbal and
orders.o_orderpriority='1-URGENT'
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 1509871 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | dbt3sf10.customer.c_custkey | 7 | Using where |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
EXPLAIN FORMAT=JSON (2)
{
  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {
        "table": {
          "table_name": "customer",
          "access_type": "ALL",
          "possible_keys": [
            "PRIMARY"
          ],
          "rows": 1509871,
          "filtered": 100,
          "attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"
        }
      },
      {
        "table": {
          "table_name": "orders",
          "access_type": "ref",
          "possible_keys": [
            "i_o_custkey"
          ],
          "key": "i_o_custkey",
          "used_key_parts": [
            "o_custkey"
          ],
          "key_length": "5",
          "ref": [
            "dbt3sf10.customer.c_custkey"
          ],
          "rows": 7,
          "filtered": 100,
          "attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"
        }
      }
    ]
  }
}
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 1509871 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | dbt3sf10.customer.c_custkey | 7 | Using where |
+----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
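Because the JSON form is machine-readable, the attached conditions can be pulled out with a few lines of Python. This sketch uses an abridged copy of the output above and assumes the plan has a single top-level `nested_loop`; the `attached_conditions` helper is made up for this example and would need extending for subqueries or other plan shapes.

```python
import json

# Abridged copy of the EXPLAIN FORMAT=JSON output shown above.
explain_json = """
{
  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {"table": {"table_name": "customer", "access_type": "ALL",
                 "attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"}},
      {"table": {"table_name": "orders", "access_type": "ref",
                 "attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"}}
    ]
  }
}
"""


def attached_conditions(text):
    """Map each table in the top-level nested loop to its attached condition."""
    plan = json.loads(text)
    return {
        step["table"]["table_name"]: step["table"].get("attached_condition")
        for step in plan["query_block"]["nested_loop"]
    }


for table, cond in attached_conditions(explain_json).items():
    print(f"{table}: {cond}")
```

This makes visible what the tabular “Using where” hides: the c_mktsegment check runs while scanning customer, and the two orders conditions run after each index lookup.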
EXPLAIN ANALYZE (kind of)
● Does EXPLAIN match the reality?
● Where is most of the time spent?
● MySQL/MariaDB don't have “EXPLAIN ANALYZE” ...
select
count(*)
from
orders, customer
where
customer.c_custkey=orders.o_custkey and
customer.c_mktsegment='BUILDING' and orders.o_orderpriority='1-URGENT'
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 149415 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | customer.c_custkey | 7 | Using index |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
Traditional solution: Status variables
Problems:
● Only row counters, no timing
● All tables are counted together
mysql> flush status;
Query OK, 0 rows affected (0.00 sec)
mysql> {run query}
mysql> show status like 'Handler%';
+----------------------------+--------+
| Variable_name | Value |
+----------------------------+--------+
| Handler_commit | 1 |
| Handler_delete | 0 |
| Handler_discover | 0 |
| Handler_icp_attempts | 0 |
| Handler_icp_match | 0 |
| Handler_mrr_init | 0 |
| Handler_mrr_key_refills | 0 |
| Handler_mrr_rowid_refills | 0 |
| Handler_prepare | 0 |
| Handler_read_first | 0 |
| Handler_read_key | 30142 |
| Handler_read_last | 0 |
| Handler_read_next | 303959 |
| Handler_read_prev | 0 |
| Handler_read_rnd | 0 |
| Handler_read_rnd_deleted | 0 |
| Handler_read_rnd_next | 150001 |
| Handler_rollback | 0 |
...
Newer solution: userstat
● In Facebook patch, Percona, MariaDB:
mysql> set global userstat=1;
mysql> flush table_statistics;
mysql> flush index_statistics;
mysql> {query}
mysql> show table_statistics;
+--------------+------------+-----------+--------------+-------------------------+
| Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
+--------------+------------+-----------+--------------+-------------------------+
| dbt3sf1 | orders | 303959 | 0 | 0 |
| dbt3sf1 | customer | 150000 | 0 | 0 |
+--------------+------------+-----------+--------------+-------------------------+
mysql> show index_statistics;
+--------------+------------+-------------+-----------+
| Table_schema | Table_name | Index_name | Rows_read |
+--------------+------------+-------------+-----------+
| dbt3sf1 | orders | i_o_custkey | 303959 |
+--------------+------------+-------------+-----------+
● Counters are per-table
– Ok as long as you don't have self-joins
● Overhead is negligible
● Counters are server-wide (other queries affect them, too)
Latest addition: PERFORMANCE_SCHEMA
● Lets you measure the *time* spent reading each table
● Has some visible overhead (Facebook's tests: 7%)
● Counters are system-wide
● Still no luck with self-joins
mysql> truncate performance_schema.table_io_waits_summary_by_table;
mysql> {query}
mysql> select
object_schema,
object_name,
count_read,
sum_timer_read, -- this is picoseconds
sum_timer_read / (1000*1000*1000*1000) as read_seconds -- this is seconds
from
performance_schema.table_io_waits_summary_by_table
where
object_schema = 'dbt3sf1' and object_name in ('orders','customer');
+---------------+-------------+------------+----------------+--------------+
| object_schema | object_name | count_read | sum_timer_read | read_seconds |
+---------------+-------------+------------+----------------+--------------+
| dbt3sf1 | orders | 334101 | 5739345397323 | 5.7393 |
| dbt3sf1 | customer | 150001 | 1273653046701 | 1.2737 |
+---------------+-------------+------------+----------------+--------------+
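The timer columns are in picoseconds, so the conversion in the query is a division by 10^12. A small Python sketch of the same arithmetic, using the two values from the output above (the `read_seconds` helper and the percentage breakdown are additions for illustration):

```python
PICOSECONDS_PER_SECOND = 10**12


def read_seconds(sum_timer_read):
    """Convert a PERFORMANCE_SCHEMA timer value (picoseconds) to seconds."""
    return sum_timer_read / PICOSECONDS_PER_SECOND


# sum_timer_read values copied from the result set above.
timers = {"orders": 5739345397323, "customer": 1273653046701}
total = sum(timers.values())

for name, picoseconds in timers.items():
    share = 100 * picoseconds / total
    print(f"{name}: {read_seconds(picoseconds):.4f} s ({share:.0f}% of read time)")
```

Seen this way, most of the read time in this run went to the orders table, which the row counters alone would not tell you.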
What are table/index statistics?
select
count(*)
from
customer, orders
where
customer.c_custkey=orders.o_custkey and customer.c_mktsegment='BUILDING';
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
| 1 | SIMPLE | customer | ALL | PRIMARY | NULL | NULL | NULL | 148305 | Using where |
| 1 | SIMPLE | orders | ref | i_o_custkey | i_o_custkey | 5 | customer.c_custkey | 7 | Using index |
+------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
MariaDB > show table status like 'orders'\G
*************************** 1. row ***************************
Name: orders
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 1495152
.............
MariaDB > show keys from orders where key_name='i_o_custkey'\G
*************************** 1. row ***************************
Table: orders
Non_unique: 1
Key_name: i_o_custkey
Seq_in_index: 1
Column_name: o_custkey
Collation: A
Cardinality: 212941
Sub_part: NULL
.................
Where does rows=7 for orders come from?
1495152 / 212941 ≈ 7
“There are on average 7 orders for a given c_custkey”
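The same arithmetic in Python, using the Rows and Cardinality values shown above. The `rows_per_key` helper is made up for this sketch, and the join estimate at the end simply multiplies the two `rows` columns from the EXPLAIN output, which is a simplification of what the optimizer does.

```python
def rows_per_key(table_rows, index_cardinality):
    """Average number of rows sharing one index key value."""
    return table_rows / index_cardinality


# Rows from SHOW TABLE STATUS, Cardinality from SHOW KEYS.
estimate = rows_per_key(1495152, 212941)
print(round(estimate, 2))

# Rough join size: customer rows scanned times orders rows per customer.
customer_rows = 148305
print(customer_rows * round(estimate))
```

If either input is off, because Rows or Cardinality was estimated from a poor sample, the per-key estimate and every plan cost built on it shift with it.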
The problem with index statistics and InnoDB
MySQL 5.5, InnoDB
● Statistics are calculated on-the-fly
– When the table is opened (server restart, DDL)
– When a sufficient number of records have been updated
– ...
● Calculation uses random sampling
– @@innodb_stats_sample_pages
● Result:
– Statistics change without warning
=> Query plans change without warning
● For example, DBT-3 benchmark
– 22 analytics queries
– Plans-per-query: avg=2.8, max=7.
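A deliberately simplified Python sketch of why sampling makes statistics unstable. It assumes a hypothetical index of 100 pages with skewed key values and estimates the number of distinct keys from 8 randomly chosen pages; InnoDB's real estimator works differently, so treat `sampled_cardinality` as an illustration only.

```python
import random


def sampled_cardinality(pages, sample_pages, rng):
    """Estimate distinct keys from a random sample of pages, scaled up."""
    sample = rng.sample(pages, sample_pages)
    distinct_in_sample = len({key for page in sample for key in page})
    return distinct_in_sample * len(pages) // sample_pages


# Hypothetical index: 100 pages of 10 keys each, with a skewed distribution.
data_rng = random.Random(0)
pages = [[int(data_rng.paretovariate(1.0)) for _ in range(10)]
         for _ in range(100)]

# Each seed plays the role of one recalculation (restart, DDL, many updates).
estimates = [sampled_cardinality(pages, 8, random.Random(seed))
             for seed in range(5)]
exact = len({key for page in pages for key in page})

print("sampled estimates:", estimates)
print("exact distinct keys:", exact)
```

Different samples will typically produce different estimates for the same unchanged data, and a different cardinality can be enough to tip the optimizer to another plan.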
Persistent table statistics
Persistent statistics v1
● Percona Server 5.5 (ported to MariaDB 5.5)
– Need to enable it: innodb_use_sys_stats_table=1
● Statistics are stored inside InnoDB
– User-visible through information_schema.innodb_sys_stats (read-only)
● Setting innodb_stats_auto_update=OFF prevents unexpected updates
Persistent statistics v2
● MySQL 5.6
– Enabled by default: innodb_stats_persistent=1
● Stored in regular InnoDB tables
– mysql.innodb_table_stats, mysql.innodb_index_stats
● Setting innodb_stats_auto_recalc=OFF prevents unexpected updates
● Can also specify persistence/auto-recalc as a table option
Persistent table statistics - summary
● Percona, then MySQL
– Made statistics persistent
– Disallowed automatic updates
● Remaining issue #1: it's still random sampling
– DBT-3 benchmark
– scale=30
– Re-ran EXPLAINs for benchmark queries
– Counted different query plans
● Remaining issue #2: limited amount of statistics
– Only on index columns
– Only AVG(#different_values)
Upcoming: Engine-independent statistics
MariaDB 10.0: Engine-independent statistics
● Collected/used on SQL layer
● No auto updates, only ANALYZE TABLE
– 100% precise statistics
● More statistics
– Index statistics (like before)
– Table statistics (like before)
– Column statistics
● MIN/MAX values
● Number of NULL / not NULL values
● Histograms
● => Optimizer will be smarter and more reliable
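To illustrate what column statistics add, here is a Python sketch that computes MIN/MAX, the NULL fraction and a small height-balanced histogram for a made-up column, then estimates the selectivity of `col <= 10`. The function names, the bucket count and the data are assumptions for this example; they do not reflect how MariaDB stores or uses its statistics.

```python
def column_stats(values, buckets=4):
    """Collect min/max, NULL fraction and a height-balanced histogram."""
    non_null = sorted(v for v in values if v is not None)
    n = len(non_null)
    # Bucket boundaries sit at equal row-count quantiles of the sorted data.
    bounds = [non_null[(i * n) // buckets] for i in range(1, buckets)]
    return {
        "min": non_null[0],
        "max": non_null[-1],
        "null_fraction": (len(values) - n) / len(values),
        "histogram_bounds": bounds,
    }


def selectivity_at_most(stats, x):
    """Coarse estimate of the fraction of non-NULL rows with value <= x."""
    bounds = stats["histogram_bounds"]
    return sum(1 for b in bounds if b <= x) / (len(bounds) + 1)


def uniform_guess(stats, x):
    """What min/max alone suggest if values were spread evenly."""
    return (x - stats["min"]) / (stats["max"] - stats["min"])


# Skewed toy column: mostly small values, two outliers, two NULLs.
values = [1, 1, 1, 1, 2, 3, 50, 100, None, None]
stats = column_stats(values)

print(stats)
print("histogram estimate:", selectivity_at_most(stats, 10))
print("uniform estimate:", round(uniform_guess(stats, 10), 2))
```

In this data 6 of the 8 non-NULL values are at most 10. The histogram estimate is coarse but much closer to that than the even-spread guess, which is the kind of improvement histograms are meant to bring on skewed columns.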
Conclusions
● Lots of new query optimizer features recently
– Subqueries now just work
– Big joins are much faster
● Need to turn it on
– More diagnostics
● Even more is coming
● Releases with features
– MariaDB 5.5
– MySQL 5.6
– (upcoming) MariaDB 10.0
Thanks
Q & A

New features-in-mariadb-and-mysql-optimizers

  • 1. Sergei Petrunia, MariaDB New features in MariaDB/MySQL query optimizer
  • 2. 12:49:092 MySQL/MariaDB optimizer development ● Some features have common heritage ● Big releases: – MariaDB 5.3/5.5 – MySQL 5.6 – (upcoming) MariaDB 10.0
  • 3. 12:49:093 New optimizer features Subqueries Batched Key Access (MRR) Index Condition Pushdown Extended Keys EXPLAIN UPDATE/ DELETE Subqueries FROM IN Others PERFORMANCE_SCHEMA Engine-independent statistics InnoDB persistent statistics
  • 4. 12:49:094 New optimizer features Subqueries Batched Key Access (MRR) Index Condition Pushdown Extended Keys EXPLAIN UPDATE/ DELETE Subqueries FROM IN Others Engine-independent statistics InnoDB persistent statistics PERFORMANCE_SCHEMA
  • 5. 12:49:095 Subqueries in MySQL ● Subqueries are practially unusable ● e.g. Facebook disabled them in the parser ● Reason - “naive execution”.
  • 6. 12:49:096 Naive subquery execution ● For IN (SELECT... ) subqueries: select * from hotel where hotel.country='USA' and hotel.name IN (select hotel_stays.hotel from hotel_stays where hotel_stays.customer='John Smith') for (each hotel in USA ) { if (john smith stayed here) { … } } ● Naive execution: ● Slow!
  • 7. 12:49:097 Naive subquery execution (2) ● For FROM(SELECT …) subquereis: 1. Retrieve all hotels with > 500 rooms, store in a temporary table big_hotel; 2. Search in big_hotel for hotels near AMS. ● Naive execution: ● Slow! select * from (select * from hotel where hotel.rooms > 500 ) as big_hotel where big_hotel.nearest_aiport='AMS';
  • 8. 12:49:098 New subquery optimizations ● Handle IN (SELECT ...) ● Handle FROM (SELECT …) ● Handle a lot of cases ● Comparison with PostgreSQL – ~1000x slower before – ~same order of magnitude now ● Releases – MySQL 6.0 – MariaDB 5.5 ● Sheeri Kritzer @ Mozilla seems happy with this one – MySQL 5.6 ● Subset of MariaDB 5.5's features
  • 9. 12:49:099 Subquery optimizations - summary ● Subqueries were generally unusable before MariaDB 5.3/5.5 ● “Core” subquery optimizations are in – MariaDB 5.3/5.5 – MySQL 5.6 ● MariaDB has extra additions ● Further information: https://kb.askmonty.org/en/subquery-optimizations/
  • 10. 12:49:0910 Subqueries Batched Key Access (MRR) Index Condition Pushdown Extended Keys EXPLAIN UPDATE/ DELETE Subqueries FROM IN Others Engine-independent statistics InnoDB persistent statistics PERFORMANCE_SCHEMA
  • 11. 12:49:0911 Batched Key Access - background ● Big, IO-bound joins were slow – DBT-3 benchmark could not finish* ● Reason? ● Nested Loops join hits the second table at random locations.
  • 12. 12:49:0912 Batched Key Access idea Nested Loops Join Batched Key Access Speedup reasons ● Fewer disk head movements ● Cache-friendliness ● Prefetch-friendliness
  • 13. 12:49:0913 Batched Key Access benchmark set join_cache_level=6; – enable BKA select max(l_extendedprice) from orders, lineitem where l_orderkey=o_orderkey and o_orderdate between $DATE1 and $DATE2 Run with ● Various join_buffer_size settings ● Various size of $DATE1...$DATE2 range
  • 14. 12:49:0914 Batched Key Access benchmark (2) -2,000,000 3,000,000 8,000,000 13,000,000 18,000,000 23,000,000 28,000,000 33,000,000 0 500 1000 1500 2000 2500 3000 BKA join performance depending on buffer size query_size=1, regular query_size=1, BKA query_size=2, regular query_size=2, BKA query_size=3, regular query_size=3, BKA Buffer size, bytes Querytime,sec Performance without BKA Performance with BKA, given sufficient buffer size
  • 15. 12:49:0915 Batched Key Access summary ● Optimization for big, IO-bound joins – Orders-of-magnitude speedups ● Available in – MariaDB 5.3/5.5 (more advanced) – MySQL 5.6 ● Not fully automatic yet – Needs to be manually enabled – Need to set buffer sizes.
  • 16. 12:49:0916 Subqueries Batched Key Access (MRR) Index Condition Pushdown Extended Keys EXPLAIN UPDATE/ DELETE Subqueries FROM IN Others Engine-independent statistics InnoDB persistent statistics PERFORMANCE_SCHEMA
  • 17. 12:49:0917 Index Condition Pushdown alter table lineitem add index s_r (l_shipdate, l_receiptdate); select count(*) from lineitem where l_shipdate between '1993-01-01' and '1993-02-01' and datediff(l_receiptdate,l_shipdate) > 25 and l_quantity > 40 ● A new feature in MariaDB 5.3/ MySQL 5.6 +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+ | 1 | SIMPLE | lineitem | range | s_r | s_r | 4 | NULL | 158854 | Using index condition; Using where | +----+-------------+----------+-------+---------------+------+---------+------+--------+------------------------------------+ 1.Read index records in the range l_shipdate between '1993-01-01' and '1993-02-01' 2.Check the index condition datediff(l_receiptdate,l_shipdate) > 25 3.Read full table rows 4.Check the WHERE condition l_quantity > 40 ← New! ← Filters out records before table rows are read
  • 18. Index Condition Pushdown - conclusions
    Summary
    ● Applicable to any index-based access (ref, range, etc.)
    ● Checks parts of WHERE after reading the index
    ● Reduces the number of table records to be read
    ● Speedup can be similar to that of “Using index”
      – Great for IO-bound workloads (5x, 10x)
      – Some for CPU-bound workloads (2x)
    Conclusions
    ● Have a selective condition on a column?
      – Put the column into the index, at the end.
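One way to see the effect on a given query is to run it with the optimization switched off and on, and compare the handler counters. This is only a sketch: it reuses the lineitem query and the s_r index from the previous slide, the switch name index_condition_pushdown is taken from the MySQL/MariaDB manuals, and the Handler_icp_% counters are the MariaDB ones.

```sql
-- Baseline: Index Condition Pushdown disabled.
SET SESSION optimizer_switch = 'index_condition_pushdown=off';
FLUSH STATUS;
SELECT COUNT(*) FROM lineitem
WHERE l_shipdate BETWEEN '1993-01-01' AND '1993-02-01'
  AND DATEDIFF(l_receiptdate, l_shipdate) > 25
  AND l_quantity > 40;
SHOW SESSION STATUS LIKE 'Handler_read%';

-- Same query with Index Condition Pushdown enabled.
SET SESSION optimizer_switch = 'index_condition_pushdown=on';
FLUSH STATUS;
SELECT COUNT(*) FROM lineitem
WHERE l_shipdate BETWEEN '1993-01-01' AND '1993-02-01'
  AND DATEDIFF(l_receiptdate, l_shipdate) > 25
  AND l_quantity > 40;
SHOW SESSION STATUS LIKE 'Handler_read%';
SHOW SESSION STATUS LIKE 'Handler_icp%';  -- MariaDB: pushed-condition checks and matches
```

Both runs should return the same count; only the amount of work reported by the counters is expected to differ.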
  • 19. Extended keys
    ● Secondary indexes in InnoDB have an invisible “tail”

    CREATE TABLE tbl (
      pk1 sometype,
      pk2 sometype,
      ...
      col1 sometype,
      col2 sometype,
      ...
      KEY indexA (col1, col2)
      ...
      PRIMARY KEY (pk1, pk2)
    ) ENGINE=InnoDB

    indexA: col1 | col2 | pk1 | pk2

    ● Before: optimizer has limited support for “tail” columns
      – 'Using index' supports it
      – ORDER BY col1, col2, pk1 supports it
    ● After MariaDB 5.5 / MySQL 5.6
      – all parts of the optimizer (ref access, range access, etc.) can use the “tail”
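A concrete case may make the “tail” easier to picture. The table below is hypothetical (its names and types are invented for illustration), and the optimizer_switch flag names are taken from the MariaDB and MySQL manuals, not from the slides.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE visits (
  visit_id INT NOT NULL,
  hotel_id INT NOT NULL,
  customer VARCHAR(64) NOT NULL,
  PRIMARY KEY (visit_id),
  KEY by_customer (customer)  -- InnoDB stores this as (customer, visit_id)
) ENGINE=InnoDB;

-- Use the line that matches the server; the other one is rejected as an unknown flag.
SET SESSION optimizer_switch = 'extended_keys=on';         -- MariaDB 5.5
SET SESSION optimizer_switch = 'use_index_extensions=on';  -- MySQL 5.6

-- With the tail usable, the optimizer may treat by_customer as a two-part
-- key (customer, visit_id) for this condition.
EXPLAIN SELECT hotel_id FROM visits
WHERE customer = 'John Smith' AND visit_id BETWEEN 1000 AND 2000;
```

If the extended key is used, key_len in the EXPLAIN output should cover both columns rather than customer alone.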
  • 21. Better EXPLAIN in MySQL 5.6
    ● EXPLAIN for UPDATE/DELETE/INSERT … SELECT
      – shows the query plan for finding the records to update/delete

    mysql> explain update customer set c_acctbal = c_acctbal - 100 where c_custkey=12354;
    +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
    | id | select_type | table    | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
    +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+
    |  1 | SIMPLE      | customer | range | PRIMARY       | PRIMARY | 4       | NULL |    1 | Using where |
    +----+-------------+----------+-------+---------------+---------+---------+------+------+-------------+

    ● EXPLAIN FORMAT=JSON
      – Produces [big] JSON output
      – Shows more information:
        ● Shows conditions attached to tables
        ● Shows whether “Using temporary; Using filesort” is done to handle GROUP BY or ORDER BY
        ● Shows where subqueries are attached
      – No other known additions
      – Will be in MariaDB 10.0
    The most useful addition!
  • 22. EXPLAIN FORMAT=JSON
    What are the “conditions attached to tables”?

    explain select count(*) from orders, customer
    where
      customer.c_custkey=orders.o_custkey and
      customer.c_mktsegment='BUILDING' and
      orders.o_totalprice > customer.c_acctbal and
      orders.o_orderpriority='1-URGENT'

    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    | id | select_type | table    | type | possible_keys | key         | key_len | ref                         | rows    | Extra       |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    |  1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                        | 1509871 | Using where |
    |  1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3sf10.customer.c_custkey |       7 | Using where |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
  • 23. EXPLAIN FORMAT=JSON (2)
    {
      "query_block": {
        "select_id": 1,
        "nested_loop": [
          {
            "table": {
              "table_name": "customer",
              "access_type": "ALL",
              "possible_keys": [
                "PRIMARY"
              ],
              "rows": 1509871,
              "filtered": 100,
              "attached_condition": "(`dbt3sf10`.`customer`.`c_mktsegment` = 'BUILDING')"
            }
          },
          {
            "table": {
              "table_name": "orders",
              "access_type": "ref",
              "possible_keys": [
                "i_o_custkey"
              ],
              "key": "i_o_custkey",
              "used_key_parts": [
                "o_custkey"
              ],
              "key_length": "5",
              "ref": [
                "dbt3sf10.customer.c_custkey"
              ],
              "rows": 7,
              "filtered": 100,
              "attached_condition": "((`dbt3sf10`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf10`.`orders`.`o_totalprice` > `dbt3sf10`.`customer`.`c_acctbal`))"
            }
          }
        ]
      }
    }

    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    | id | select_type | table    | type | possible_keys | key         | key_len | ref                         | rows    | Extra       |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
    |  1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL                        | 1509871 | Using where |
    |  1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | dbt3sf10.customer.c_custkey |       7 | Using where |
    +----+-------------+----------+------+---------------+-------------+---------+-----------------------------+---------+-------------+
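For reference, JSON output of this kind is requested by adding FORMAT=JSON to EXPLAIN. The sketch below assumes MySQL 5.6 syntax and reuses the two statements shown on the earlier slides.

```sql
-- JSON plan for the two-table join discussed above.
EXPLAIN FORMAT=JSON
SELECT COUNT(*)
FROM orders, customer
WHERE customer.c_custkey = orders.o_custkey
  AND customer.c_mktsegment = 'BUILDING'
  AND orders.o_totalprice > customer.c_acctbal
  AND orders.o_orderpriority = '1-URGENT';

-- The same prefix can be combined with a data-modifying statement.
EXPLAIN FORMAT=JSON
UPDATE customer SET c_acctbal = c_acctbal - 100 WHERE c_custkey = 12354;
```

The attached_condition entries are the part to look at: they show which piece of the WHERE clause is checked at each table, which the tabular “Using where” does not reveal.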
  • 24. EXPLAIN ANALYZE (kind of)
    ● Does EXPLAIN match the reality?
    ● Where is most of the time spent?
    ● MySQL/MariaDB don't have “EXPLAIN ANALYZE” ...

    select count(*) from orders, customer
    where
      customer.c_custkey=orders.o_custkey and
      customer.c_mktsegment='BUILDING' and
      orders.o_orderpriority='1-URGENT'

    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    | id   | select_type | table    | type | possible_keys | key         | key_len | ref                | rows   | Extra       |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    |    1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL               | 149415 | Using where |
    |    1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | customer.c_custkey |      7 | Using index |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
  • 25. Traditional solution: Status variables
    mysql> flush status;
    Query OK, 0 rows affected (0.00 sec)
    mysql> {run query}
    mysql> show status like 'Handler%';
    +----------------------------+--------+
    | Variable_name              | Value  |
    +----------------------------+--------+
    | Handler_commit             | 1      |
    | Handler_delete             | 0      |
    | Handler_discover           | 0      |
    | Handler_icp_attempts       | 0      |
    | Handler_icp_match          | 0      |
    | Handler_mrr_init           | 0      |
    | Handler_mrr_key_refills    | 0      |
    | Handler_mrr_rowid_refills  | 0      |
    | Handler_prepare            | 0      |
    | Handler_read_first         | 0      |
    | Handler_read_key           | 30142  |
    | Handler_read_last          | 0      |
    | Handler_read_next          | 303959 |
    | Handler_read_prev          | 0      |
    | Handler_read_rnd           | 0      |
    | Handler_read_rnd_deleted   | 0      |
    | Handler_read_rnd_next      | 150001 |
    | Handler_rollback           | 0      |
    ...

    Problems:
    ● Only #rows counters
    ● All tables are counted together
  • 26. Newer solution: userstat
    ● In Facebook patch, Percona, MariaDB:

    mysql> set global userstat=1;
    mysql> flush table_statistics;
    mysql> flush index_statistics;
    mysql> {query}
    mysql> show table_statistics;
    +--------------+------------+-----------+--------------+-------------------------+
    | Table_schema | Table_name | Rows_read | Rows_changed | Rows_changed_x_#indexes |
    +--------------+------------+-----------+--------------+-------------------------+
    | dbt3sf1      | orders     |    303959 |            0 |                       0 |
    | dbt3sf1      | customer   |    150000 |            0 |                       0 |
    +--------------+------------+-----------+--------------+-------------------------+
    mysql> show index_statistics;
    +--------------+------------+-------------+-----------+
    | Table_schema | Table_name | Index_name  | Rows_read |
    +--------------+------------+-------------+-----------+
    | dbt3sf1      | orders     | i_o_custkey |    303959 |
    +--------------+------------+-------------+-----------+

    ● Counters are per-table
      – OK as long as you don't have self-joins
    ● Overhead is negligible
    ● Counters are server-wide (other queries affect them, too)
  • 27. Latest addition: PERFORMANCE_SCHEMA
    ● Allows measuring the *time* spent reading each table
    ● Has some visible overhead (Facebook's tests: 7%)
    ● Counters are system-wide
    ● Still no luck with self-joins

    mysql> truncate performance_schema.table_io_waits_summary_by_table;
    mysql> {query}
    mysql> select
             object_schema,
             object_name,
             count_read,
             sum_timer_read, -- this is picoseconds
             sum_timer_read / (1000*1000*1000*1000) as read_seconds -- this is seconds
           from performance_schema.table_io_waits_summary_by_table
           where object_schema = 'dbt3sf1' and object_name in ('orders','customer');
    +---------------+-------------+------------+----------------+--------------+
    | object_schema | object_name | count_read | sum_timer_read | read_seconds |
    +---------------+-------------+------------+----------------+--------------+
    | dbt3sf1       | orders      |     334101 |  5739345397323 |       5.7393 |
    | dbt3sf1       | customer    |     150001 |  1273653046701 |       1.2737 |
    +---------------+-------------+------------+----------------+--------------+
  • 29. What are table/index statistics?
    select count(*) from customer, orders
    where
      customer.c_custkey=orders.o_custkey and
      customer.c_mktsegment='BUILDING';

    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    | id   | select_type | table    | type | possible_keys | key         | key_len | ref                | rows   | Extra       |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+
    |    1 | SIMPLE      | customer | ALL  | PRIMARY       | NULL        | NULL    | NULL               | 148305 | Using where |
    |    1 | SIMPLE      | orders   | ref  | i_o_custkey   | i_o_custkey | 5       | customer.c_custkey |      7 | Using index |
    +------+-------------+----------+------+---------------+-------------+---------+--------------------+--------+-------------+

    Where does rows=7 for orders come from?

    MariaDB > show table status like 'orders'\G
    *************************** 1. row ***************************
               Name: orders
             Engine: InnoDB
            Version: 10
         Row_format: Compact
               Rows: 1495152
    .............

    MariaDB > show keys from orders where key_name='i_o_custkey'\G
    *************************** 1. row ***************************
            Table: orders
       Non_unique: 1
         Key_name: i_o_custkey
     Seq_in_index: 1
      Column_name: o_custkey
        Collation: A
      Cardinality: 212941
         Sub_part: NULL
    .................

    1495152 / 212941 ≈ 7
    “There are on average 7 orders for a given c_custkey”
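The same rows-per-key arithmetic can be reproduced with a single query against information_schema. This is a sketch that assumes the current database holds the orders table and the i_o_custkey index from the slide; avg_rows_per_key is just an illustrative alias.

```sql
-- Estimated table rows divided by index cardinality gives the average
-- number of rows per distinct key value, which the optimizer uses as
-- its estimate for ref access.
SELECT s.index_name,
       t.table_rows,
       s.cardinality,
       t.table_rows / s.cardinality AS avg_rows_per_key
FROM information_schema.tables AS t
JOIN information_schema.statistics AS s
  ON s.table_schema = t.table_schema
 AND s.table_name   = t.table_name
WHERE t.table_schema = DATABASE()
  AND t.table_name   = 'orders'
  AND s.index_name   = 'i_o_custkey'
  AND s.seq_in_index = 1;
```

Both inputs are estimates for InnoDB, so the ratio can shift whenever the statistics are recalculated.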
  • 30. The problem with index statistics and InnoDB
    MySQL 5.5, InnoDB
    ● Statistics are calculated on-the-fly
      – When the table is opened (server restart, DDL)
      – When a sufficient number of records have been updated
      – ...
    ● Calculation uses random sampling
      – @@innodb_stats_sample_pages
    ● Result:
      – Statistics change without warning => query plans change without warning
    ● For example, DBT-3 benchmark
      – 22 analytics queries
      – Plans-per-query: avg=2.8, max=7.
  • 31. Persistent table statistics
    Persistent statistics v1
    ● Percona Server 5.5 (ported to MariaDB 5.5)
      – Needs to be enabled: innodb_use_sys_stats_table=1
    ● Statistics are stored inside InnoDB
      – User-visible through information_schema.innodb_sys_stats (read-only)
    ● Setting innodb_stats_auto_update=OFF prevents unexpected updates
    Persistent statistics v2
    ● MySQL 5.6
      – Enabled by default: innodb_stats_persistent=1
    ● Stored in regular InnoDB tables
      – mysql.innodb_table_stats, mysql.innodb_index_stats
    ● Setting innodb_stats_auto_recalc=OFF prevents unexpected updates
    ● Can also specify persistence/auto-recalc as a table option
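For the MySQL 5.6 variant (v2), the table-level options and the statistics tables can be exercised roughly as follows. This is a sketch: the orders table is borrowed from the earlier slides, and the option and column names are taken from the MySQL 5.6 manual rather than from the talk.

```sql
-- Keep statistics for this table on disk and never recalculate them automatically.
ALTER TABLE orders STATS_PERSISTENT = 1, STATS_AUTO_RECALC = 0;

-- From now on the statistics change only on an explicit request.
ANALYZE TABLE orders;

-- Inspect the stored index statistics that the optimizer will use.
SELECT index_name, stat_name, stat_value, sample_size
FROM mysql.innodb_index_stats
WHERE database_name = DATABASE()
  AND table_name = 'orders';
```

The server-wide equivalents are the innodb_stats_persistent and innodb_stats_auto_recalc variables named on the slide.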
  • 32. Persistent table statistics - summary
    ● Percona, then MySQL
      – Made statistics persistent
      – Made it possible to turn off automatic updates
    ● Remaining issue #1: it's still random sampling
      – DBT-3 benchmark
      – scale=30
      – Re-ran EXPLAINs for benchmark queries
      – Counted different query plans
    ● Remaining issue #2: limited amount of statistics
      – Only on index columns
      – Only AVG(#different_values)
  • 33. Upcoming: Engine-independent statistics
    MariaDB 10.0: Engine-independent statistics
    ● Collected/used on the SQL layer
    ● No auto updates, only ANALYZE TABLE
      – 100% precise statistics
    ● More statistics
      – Index statistics (like before)
      – Table statistics (like before)
      – Column statistics
        ● MIN/MAX values
        ● Number of NULL / not NULL values
        ● Histograms
    ● => Optimizer will be smarter and more reliable
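The slide describes the feature before its release, so the following is only a sketch based on how it is documented for released MariaDB 10.0 versions. The variable names, the ANALYZE syntax and the mysql.column_stats columns are assumptions taken from that documentation, not from the talk, and the histogram size is a placeholder.

```sql
-- Tell the optimizer to prefer engine-independent statistics when they exist.
SET SESSION use_stat_tables = 'preferably';

-- Ask ANALYZE to build histograms as well (placeholder size; 0 disables them).
SET SESSION histogram_size = 100;

-- Collect statistics for all columns and indexes of the table.
ANALYZE TABLE orders PERSISTENT FOR ALL;

-- Look at the collected column statistics (min/max values, NULL ratio, histogram size).
SELECT column_name, min_value, max_value, nulls_ratio, hist_size
FROM mysql.column_stats
WHERE db_name = DATABASE()
  AND table_name = 'orders';
```

Because these statistics are refreshed only by ANALYZE TABLE, query plans stay stable between explicit refreshes.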
  • 34. Conclusions
    ● Lots of new query optimizer features recently
      – Subqueries now just work
      – Big joins are much faster
        ● Need to turn it on
      – More diagnostics
    ● Even more is coming
    ● Releases with features
      – MariaDB 5.5
      – MySQL 5.6
      – (upcoming) MariaDB 10.0