Query Optimizer
in MariaDB 10.4
Sergei Petrunia,
Query Optimizer Developer
MariaDB Corporation
2019 MariaDB Developers Unconference
New York
New Optimizer features in MariaDB 10.4
● Optimizer trace
● Sampling for histogram collection
● Rowid filtering
● New default settings
● Condition Pushdown into IN-subqueries
● Condition Pushdown from HAVING into WHERE
New default settings
New default settings for statistics
-histogram_size=0
+histogram_size=254
-histogram_type=SINGLE_PREC_HB
+histogram_type=DOUBLE_PREC_HB
● Do build histograms when collecting EITS statistics
-optimizer_use_condition_selectivity=1
+optimizer_use_condition_selectivity=4
● Do use condition selectivity
1 Use selectivity of predicates as in MariaDB 5.5.
2 Use selectivity of all range predicates supported by indexes.
3 Use selectivity of all range predicates estimated without histogram.
4 Use selectivity of all range predicates estimated with histogram.
-use_stat_tables=NEVER
+use_stat_tables=PREFERABLY_FOR_QUERIES
● Make use of EITS statistics (incl. histograms if they are available)
– But don’t collect stats unless explicitly told to do so
New default settings
-eq_range_index_dive_limit=10
+eq_range_index_dive_limit=200
● Use index statistics (cardinality) instead of records_in_range for large IN-lists
– Just following MySQL here
-optimize_join_buffer_size=OFF
+optimize_join_buffer_size=ON
● Join buffer will auto-size itself
– (can use ANALYZE for statements to see the size)
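The trade-off behind eq_range_index_dive_limit can be sketched in Python. This is a simplified model, not MariaDB's code: `estimate_in_list_rows`, `dive`, and `rows_per_key` are hypothetical names, and the threshold rule (switch to index statistics once the number of equality ranges reaches the limit, with 0 meaning "always dive") is the behavior documented for MySQL, assumed here to carry over.

```python
from typing import Callable, Sequence


def estimate_in_list_rows(
    values: Sequence[int],
    dive: Callable[[int], int],
    rows_per_key: float,
    dive_limit: int = 200,
) -> float:
    """Estimate how many rows `col IN (values)` matches on an indexed column."""
    use_dives = dive_limit == 0 or len(values) < dive_limit
    if use_dives:
        # Precise, but costs one index lookup per IN-list element.
        return float(sum(dive(v) for v in values))
    # Cheap: average number of rows per distinct key, from index statistics.
    return len(values) * rows_per_key


# Toy, deliberately skewed index: key -> number of rows with that key.
index_rows = {1: 1000, 2: 1, 3: 1, 4: 2}
rows_per_key = sum(index_rows.values()) / len(index_rows)


def dive(value: int) -> int:
    return index_rows.get(value, 0)


with_dives = estimate_in_list_rows([2, 3], dive, rows_per_key, dive_limit=200)
with_stats = estimate_in_list_rows([2, 3], dive, rows_per_key, dive_limit=2)
print(with_dives, with_stats)
```

On skewed data the statistics-based estimate can be far off, which is the price paid for not probing the index once per list element.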
Sampling for histograms
Histograms in MariaDB
● Introduced in MariaDB 10.0
– Manual command to collect, ANALYZE … PERSISTENT FOR …
– Optimizer settings to use them
– Histogram is collected from ALL table data
  ● Other statistics (avg_frequency, avg_length), too.
● Results
– A few users
– Histogram collection is expensive
  ● Cost of full table scan + full index scans, and even more than that
Histograms in MariaDB 10.4
● MariaDB 10.4
– “Bernoulli sampling” - roll the dice for each row
– Controlled with @@analyze_sample_percentage
  ● 100 (the default) – “use all data”
  ● 0 (recommended) – “Determine sample ratio automatically”
● MySQL 8.0 also added histograms
– Uses Bernoulli sampling
– @@histogram_generation_max_mem_size=20MB.
● PostgreSQL has genuine random-jump sampling
– default_statistics_target
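"Roll the dice for each row" can be illustrated with a few lines of Python. This is only a sketch of Bernoulli sampling in general, not the server's implementation; `bernoulli_sample` and its `seed` parameter are hypothetical, and the automatic mode selected by the value 0 is not modeled.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def bernoulli_sample(rows: Iterable[T], percentage: float, seed: int = 0) -> Iterator[T]:
    """Yield each row independently with probability percentage / 100.

    Every row is still read once; only the sampled rows are passed on
    to the (more expensive) statistics collection.
    """
    rng = random.Random(seed)
    probability = percentage / 100.0
    for row in rows:
        if rng.random() < probability:
            yield row


sample = list(bernoulli_sample(range(100_000), percentage=10))
print(len(sample))  # close to 10,000, but not exactly
```

Because the whole input is still scanned, this kind of sampling cannot be faster than a full table scan, which matches the performance notes on the following slide.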
Histogram collection performance
● See MDEV-17886, (TODO: Vicentiu’s data?)
● Both MariaDB and MySQL: ANALYZE for columns is as fast as full table scan.
ANALYZE TABLE tbl_name PERSISTENT FOR COLUMNS (...) INDEXES ();
● “Persistent for ALL” will also scan indexes
ANALYZE TABLE tbl_name PERSISTENT FOR ALL;
● PostgreSQL is much faster with genuine sampling
– Vicentiu has a task in progress for this.
Histogram precision
● MariaDB histograms are very compact
– min/max column values, then 1-byte or 2-byte bounds (SINGLE|DOUBLE_PREC_HB)
– 255 bytes per histogram => 128 or 255 buckets max.
● MySQL
– Histogram is stored as JSON, bounds are stored as values
– 100 Buckets by default, max is 1024
  ● In our tests, more buckets help in some cases
● PostgreSQL
– Histogram bounds stored as values
– 100 buckets by default, up to 10K allowed
● Testing is still in progress :-(, the obtained data varies.
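To make the "min/max plus 1-byte bounds" layout concrete, here is a small Python sketch of an equal-height histogram whose bucket bounds are stored as integers in 0..255, that is, as positions within [min, max]. It is an illustration under assumptions, not MariaDB's storage format or estimation code: `build_histogram` and `selectivity_less_than` are hypothetical, the input must contain at least two distinct values, and there is no interpolation inside a bucket.

```python
from bisect import bisect_left
from typing import List, Tuple

Histogram = Tuple[float, float, List[int]]


def build_histogram(values: List[float], buckets: int, prec: int = 255) -> Histogram:
    """Equal-height histogram: every bucket covers about the same number of rows."""
    data = sorted(values)
    lo, hi = data[0], data[-1]
    bounds = []
    for i in range(1, buckets):
        quantile_value = data[i * len(data) // buckets]
        # Store the bound as a small integer position within [lo, hi].
        bounds.append(round((quantile_value - lo) / (hi - lo) * prec))
    return lo, hi, bounds


def selectivity_less_than(hist: Histogram, x: float, prec: int = 255) -> float:
    """Estimate the fraction of rows with value < x."""
    lo, hi, bounds = hist
    if x <= lo:
        return 0.0
    if x > hi:
        return 1.0
    position = (x - lo) / (hi - lo) * prec
    # Buckets that end below x, divided by the total number of buckets.
    return bisect_left(bounds, position) / (len(bounds) + 1)


# Skewed data: squares of 0..999, so small values are much denser.
values = [float(i * i) for i in range(1000)]
hist = build_histogram(values, buckets=10)
estimate = selectivity_less_than(hist, 300000.0)
actual = sum(1 for v in values if v < 300000.0) / len(values)
uniform_guess = 300000.0 / max(values)
print(estimate, actual, uniform_guess)
```

With only ten buckets the estimate is within one bucket of the true fraction, while the uniform guess without a histogram is far lower. More buckets and finer bounds narrow the gap further, which is where the bucket limits above matter.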
Problem with correlated conditions
● Possible selectivities
– MIN(1/n, 1/m)
– (1/n) * (1/m)
– 0
select ...
from order_items
where shipdate='2015-12-15' AND item_name='christmas light' -- or 'swimsuit'
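The three candidate selectivities can be checked on a toy data set. The Python sketch below uses invented rows in which Christmas lights ship only in December and swimsuits only in June. Here n and m are taken to be the numbers of distinct shipdate and item_name values, which is an assumption about the slide's notation.

```python
# Invented order_items rows: (shipdate, item_name). The columns are correlated.
rows = (
    [("2015-12-15", "christmas light")] * 50
    + [("2015-12-16", "christmas light")] * 50
    + [("2015-06-15", "swimsuit")] * 50
    + [("2015-06-16", "swimsuit")] * 50
)

n = len({shipdate for shipdate, _ in rows})    # distinct shipdate values
m = len({item_name for _, item_name in rows})  # distinct item_name values

independent = (1 / n) * (1 / m)  # assumes the two conditions are independent
upper_bound = min(1 / n, 1 / m)  # assumes one condition implies the other


def actual_selectivity(shipdate: str, item_name: str) -> float:
    matching = sum(1 for row in rows if row == (shipdate, item_name))
    return matching / len(rows)


lights = actual_selectivity("2015-12-15", "christmas light")
swimsuits = actual_selectivity("2015-12-15", "swimsuit")
print(independent, upper_bound, lights, swimsuits)
```

For 'christmas light' the true selectivity equals MIN(1/n, 1/m), for 'swimsuit' it is 0, and the independence estimate (1/n) * (1/m) is wrong in both cases.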
Problem with correlated conditions
● PostgreSQL: Multi-variate statistics
– Detects functional dependencies, col1=F(col2)
– Only used for equality predicates
– Also #DISTINCT(a,b)
● MariaDB: MDEV-11107: Use table check constraints in optimizer
– Stalled?
select ...
from order_items
where shipdate='2015-12-15' AND item_name='christmas light' -- or 'swimsuit'
Histograms: conclusions
● 10.4
– Sampling makes ANALYZE TABLE … PERSISTENT FOR COLUMNS run at full-table-scan speed.
– @@analyze_sample_percentage
● Further directions
– Do real sampling (in progress)
– More space for the histograms (?)
– Handle correlations (how?)
Optimizer trace
Optimizer trace
● Available in MySQL since MySQL 5.6
mysql> set optimizer_trace=1;
mysql> <query>;
mysql> select * from
-> information_schema.optimizer_trace;
{
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `t1`.`col1` AS `col1`,`t1`.`col2` AS `col2` from `t1` where (`t1`.`col1` < 4)"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "(`t1`.`col1` < 4)",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(`t1`.`col1` < 4)"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(`t1`.`col1` < 4)"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(`t1`.`col1` < 4)"
                }
              ]
            }
          },
          {
● Now, a similar feature in MariaDB
The goal is to understand the optimizer
● “Why was query plan X not chosen?”
– Subquery was not converted into semi-join
  ● This would exceed MAX_TABLES
– Subquery materialization was not used
  ● Different collations
– Ref access was not used
  ● Incompatible collations
● What changed between the two hosts/versions
– diff trace_from_host1 trace_from_host2
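Plain diff works on saved trace files. When the two traces differ only in formatting, it can help to normalize the JSON first. The Python sketch below does that with the standard library. `diff_traces` is a hypothetical helper, and the two trace snippets are made-up stand-ins, not real server output.

```python
import difflib
import json
from typing import List


def diff_traces(trace_a: str, trace_b: str) -> List[str]:
    """Return unified-diff lines between two optimizer traces given as JSON text."""

    def normalize(text: str) -> List[str]:
        # Re-serialize so that whitespace and key order do not show up as changes.
        return json.dumps(json.loads(text), indent=2, sort_keys=True).splitlines()

    return list(
        difflib.unified_diff(
            normalize(trace_a),
            normalize(trace_b),
            fromfile="trace_from_host1",
            tofile="trace_from_host2",
            lineterm="",
        )
    )


host1 = '{"steps": [{"transformation": "equality_propagation", "resulting_condition": "(t1.col1 < 4)"}]}'
host2 = '{"steps": [{"transformation": "equality_propagation", "resulting_condition": "(t1.col1 < 5)"}]}'

# Keep only changed lines, skipping the "---" / "+++" file headers.
changes = [line for line in diff_traces(host1, host2) if line.startswith(("+ ", "- "))]
for line in changes:
    print(line)
```

Sorting keys makes the output stable but discards the original key order, so for a first look the raw diff from the slide may be easier to read.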
Developer point of view
● The trace is always compiled in
● RAII-objects to start/end writing a trace
● Disabled trace added ~1-2% overhead
● Intend to add more tracing
– Expect to get more output
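The RAII idea (an object that opens a piece of the trace when it is created and closes it when it goes out of scope) has a rough Python analogue in context managers. The sketch below is only that analogue: the `Trace` class and its methods are hypothetical and are not MariaDB's C++ classes.

```python
from contextlib import contextmanager


class Trace:
    """Collects nested trace nodes; does nothing when tracing is disabled."""

    def __init__(self, enabled: bool):
        self.enabled = enabled
        self.root = {}
        self._stack = [self.root]

    @contextmanager
    def node(self, name: str):
        """Open a nested object on entry and close it on exit, even on error."""
        if not self.enabled:
            yield None
            return
        child = {}
        self._stack[-1][name] = child
        self._stack.append(child)
        try:
            yield child
        finally:
            self._stack.pop()

    def add(self, key: str, value) -> None:
        """Add a key/value pair to the innermost open node."""
        if self.enabled:
            self._stack[-1][key] = value


def optimize(trace: Trace) -> None:
    with trace.node("join_optimization"):
        trace.add("select#", 1)
        with trace.node("condition_processing"):
            trace.add("condition", "WHERE")


enabled_trace = Trace(enabled=True)
optimize(enabled_trace)
disabled_trace = Trace(enabled=False)
optimize(disabled_trace)
print(enabled_trace.root, disabled_trace.root)
```

The calling code looks the same whether tracing is on or off. The disabled path still executes the checks, which is one plausible source of the small overhead mentioned above.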
Rowid filtering
What is a PK-filter: in detail
SELECT *
FROM orders JOIN lineitem ON o_orderkey=l_orderkey
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND
o_totalprice between 200000 and 230000;
The filter for the lineitem table, built with the condition
l_shipdate BETWEEN '1997-01-01' AND '1997-06-30',
is a container holding the primary keys of the lineitem rows whose
l_shipdate value satisfies that condition.
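A toy model of the filter may help. In the Python sketch below the container is a plain set of primary keys, built by scanning only an index on l_shipdate and then consulted during the join before any lineitem row is fetched. The data is invented and the helper names are hypothetical; the container MariaDB actually uses is not described on the slide.

```python
from bisect import bisect_left, bisect_right

# Invented data; column names follow the query on the slide.
orders = [(1, 210000), (2, 150000), (3, 225000)]  # (o_orderkey, o_totalprice)
lineitem = {  # primary key -> (l_orderkey, l_shipdate)
    10: (1, "1997-03-01"),
    11: (1, "1998-01-01"),
    12: (2, "1997-02-15"),
    13: (3, "1997-05-20"),
    14: (3, "1996-12-31"),
}

# Secondary indexes modeled as sorted (key, primary_key) pairs.
idx_shipdate = sorted((row[1], pk) for pk, row in lineitem.items())
idx_orderkey = sorted((row[0], pk) for pk, row in lineitem.items())


def index_range(index, low, high):
    """Return the primary keys of index entries with low <= key <= high."""
    start = bisect_left(index, (low,))
    end = bisect_right(index, (high, float("inf")))
    return [pk for _, pk in index[start:end]]


def join_with_filter():
    # Build the filter from the l_shipdate index alone; no table rows are read.
    pk_filter = set(index_range(idx_shipdate, "1997-01-01", "1997-06-30"))
    result, row_fetches = [], 0
    for o_orderkey, o_totalprice in orders:
        if not 200000 <= o_totalprice <= 230000:
            continue
        # Ref access into lineitem by l_orderkey.
        for pk in index_range(idx_orderkey, o_orderkey, o_orderkey):
            if pk not in pk_filter:
                continue  # rejected by the filter before the row is fetched
            row_fetches += 1
            result.append((o_orderkey, pk, lineitem[pk][1]))
    return result, row_fetches


result, row_fetches = join_with_filter()
print(result, row_fetches)
```

Without the filter the join would fetch all four lineitem rows of the two qualifying orders; with it, only the two rows that also satisfy the l_shipdate condition are fetched.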
What is a PK-filter: in detail
SELECT *
FROM orders JOIN lineitem ON o_orderkey=l_orderkey
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-02-01' AND
o_totalprice > 200000;
1. There is an index i_l_shipdate on lineitem(l_shipdate)
What is a PK-filter: in detail
2.
SELECT *
FROM orders JOIN lineitem ON o_orderkey=l_orderkey
WHERE l_shipdate BETWEEN '1997-01-01' AND '1997-06-30' AND
o_totalprice between 200000 and 230000;
Condition pushdown...
How condition pushdown is made
Original query:
SELECT ...
FROM t1
WHERE (a < 2) AND
a IN
(
SELECT c
FROM t2
WHERE ...
GROUP BY c
);
After pushdown, the condition (a < 2) is also applied inside the subquery as (c < 2):
SELECT ...
FROM t1
WHERE (a < 2) AND
a IN
(
SELECT c
FROM t2
WHERE … AND (c < 2)
GROUP BY c
);
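That the two query forms are equivalent can be checked on a small example. The Python sketch below uses SQLite (through the standard sqlite3 module) purely as a convenient SQL engine; it does not show MariaDB performing the rewrite. The table contents and the `d > 5` predicate, which stands in for the elided subquery condition, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (a INTEGER);
    CREATE TABLE t2 (c INTEGER, d INTEGER);
    INSERT INTO t1 VALUES (0), (1), (2), (3);
    INSERT INTO t2 VALUES (0, 10), (1, 10), (1, 20), (2, 10), (3, 10);
    """
)

# d > 5 is a hypothetical stand-in for the elided part of the subquery's WHERE.
subquery_original = "SELECT c FROM t2 WHERE d > 5 GROUP BY c"
subquery_pushed = "SELECT c FROM t2 WHERE d > 5 AND (c < 2) GROUP BY c"
outer_template = "SELECT a FROM t1 WHERE (a < 2) AND a IN ({}) ORDER BY a"

rows_original = conn.execute(outer_template.format(subquery_original)).fetchall()
rows_pushed = conn.execute(outer_template.format(subquery_pushed)).fetchall()

# The pushed-down condition shrinks what the subquery has to produce.
groups_original = conn.execute(subquery_original).fetchall()
groups_pushed = conn.execute(subquery_pushed).fetchall()

print(rows_original, rows_pushed, len(groups_original), len(groups_pushed))
```

Both forms return the same rows, but the subquery with the pushed-down condition produces fewer groups, which is where the saving comes from when the subquery is materialized.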
Thanks!
