Cost-Based Query Transformation in
Oracle
Presented by Yuanjia Zhang
Problem to Solve
Traditional query optimization (as of 2006) consists of two phases:
● logical phase: query is rewritten based on heuristic rules.
● physical phase: best implementations are chosen based on cost
estimation.
Some decisions made by heuristic rules in the logical phase should instead be made in a cost-based manner.
For example:
● Join Reorder
● Group-By Placement
● Subquery Unnesting
● ...
Problem to Solve
A Subquery Unnesting Example
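[Figure: plan trees before and after unnesting. Before: an Apply over a Join of e1 and j and a SubQuery. After: a Join over e1, j, and an Agg view.]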
Overview
● Transformation in Oracle
○ Heuristic Transformations
○ Cost-Based Transformations
● Cost-Based Transformation Framework
○ Overview
○ State Space Search Techniques
○ Interaction between Transformations
○ Optimization Performance
● Performance Study
Heuristic Transformation in Oracle
Subquery Unnesting
Two categories of subquery unnesting:
● unnesting that generates inline views.
● unnesting that merges a subquery into its outer query.
Note that dept_id in employees is a foreign key that references the primary key of departments.
[Figure: plan trees over d and e; operators shown: Apply (subquery form) and Join (unnested form).]
Heuristic Transformation in Oracle
Join Elimination
Remove a table from a query when constraints on the join columns make the join redundant.
Note that dept_id in employees is a foreign key that references the primary key of departments.
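As a rough illustration of join elimination, the sketch below uses Python's built-in sqlite3 module. The table names employees and departments and the column dept_id come from the slide; the other columns, the sample rows, and both queries are assumptions made for this example, because the slide's SQL is not preserved in this transcript. The sketch only checks that the rewritten query returns the same rows. It does not show how Oracle performs the rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employees (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER REFERENCES departments(dept_id)
    );
    INSERT INTO departments VALUES (10, 'Sales'), (20, 'Research');
    INSERT INTO employees VALUES
        (1, 'Ann', 10), (2, 'Bob', 20), (3, 'Cy', 10), (4, 'Dee', NULL);
""")

# Original form: departments is joined but none of its columns are used.
rows_with_join = conn.execute("""
    SELECT e.emp_name
    FROM employees e JOIN departments d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()

# Rewritten form: the foreign key guarantees exactly one match per non-NULL
# dept_id, so the join only filters out rows whose dept_id is NULL.
rows_without_join = conn.execute("""
    SELECT e.emp_name
    FROM employees e
    WHERE e.dept_id IS NOT NULL
    ORDER BY e.emp_id
""").fetchall()

print(rows_with_join == rows_without_join)
```

The IS NOT NULL filter is needed here only because dept_id is nullable in this hypothetical schema; with a NOT NULL column the rewritten query would need no filter at all.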
Heuristic Transformation in Oracle
Group Pruning
Remove from views the groups that are not needed in the outer query blocks.
Cost-Based Transformation in Oracle
Subquery Unnesting
Two categories of subquery unnesting:
● unnesting that generates inline views.
● unnesting that merges a subquery into its outer query.
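[Figure: plan trees before and after unnesting. Before: an Apply over a Join of e1 and j and a SubQuery. After: a Join over e1, j, and an Agg view.]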
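The first category, unnesting that generates an inline view, can be sketched with Python's built-in sqlite3 module. The schema, the sample rows, and both queries are assumptions for illustration (the slide's SQL is not preserved in this transcript), and the example is simplified to a single table. It compares a correlated aggregate subquery with an equivalent join against a group-by view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        emp_id INTEGER PRIMARY KEY, emp_name TEXT, dept_id INTEGER, salary INTEGER
    );
    INSERT INTO employees VALUES
        (1, 'Ann', 10, 5000), (2, 'Bob', 10, 3000),
        (3, 'Cy',  20, 4000), (4, 'Dee', 20, 6000), (5, 'Eve', 20, 2000);
""")

# Nested form: the subquery is conceptually re-evaluated for each outer row.
nested = conn.execute("""
    SELECT e1.emp_name
    FROM employees e1
    WHERE e1.salary > (SELECT AVG(e2.salary)
                       FROM employees e2
                       WHERE e2.dept_id = e1.dept_id)
    ORDER BY e1.emp_id
""").fetchall()

# Unnested form: the subquery becomes an inline group-by view that is joined
# to the outer table on the former correlation column.
unnested = conn.execute("""
    SELECT e1.emp_name
    FROM employees e1
    JOIN (SELECT dept_id, AVG(salary) AS avg_sal
          FROM employees
          GROUP BY dept_id) v ON e1.dept_id = v.dept_id
    WHERE e1.salary > v.avg_sal
    ORDER BY e1.emp_id
""").fetchall()

print(nested == unnested)
```

Which form is cheaper depends on the data and on the available access paths, which is why the slides classify this kind of unnesting as a cost-based decision.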
Cost-Based Transformation in Oracle
Group-By and Distinct View Merging
Merge a view that contains group-by or distinct into its outer query block.
[Figure: plan trees before and after merging the group-by view; operators shown: Join and Agg; inputs: e1, j, and e2.]
Cost-Based Transformation in Oracle
Join Predicate Pushdown
Push join predicates into a view.
[Figure: plan trees before and after join predicate pushdown; operators shown: Join, Agg, and NestJoin/Apply; inputs: e1, j, and l.]
Cost-Based Transformation in Oracle
Join Factorization
Pull common join tables up to the outer UNION ALL query block.
[Figure: plan trees before and after join factorization; operators shown: Union and Join; inputs: d, e, j, and l.]
Cost-Based Transformation in Oracle
Expensive Predicate Pullup
Pull expensive predicates up from the originating view to the outer query block.
A predicate is considered expensive if it contains
● procedural language,
● user-defined operators,
● subqueries.
This transformation is only considered when a rownum (limit) predicate is specified.
Cost-Based Transformation Framework
The PhysicalOptimization component is used to:
● estimate query tree cost,
● generate the final physical execution plan.
The order of applying transformations matters, so they are applied in the following sequence:
common sub-expression factorization, SPJ view merging, join elimination, subquery unnesting,
group-by (distinct) view merging, group pruning, predicate move around, set operator into join
conversion, group-by placement, predicate pullup, join factorization, disjunction into union-all
expansion, star transformation, and join predicate pushdown.
State Space Search Techniques
Definition of state in search space
Suppose a query consists of N objects (e.g., tables, join edges, predicates, etc.)
and M transformations can each be applied to those N objects.
A state is then represented by an M*N bit matrix, and
there are 2^(M*N) states in total.
With only one transformation, expensive predicate pullup,
a query with two applicable objects (N = 2) has four states:
● [[0, 0]],
● [[0, 1]],
● [[1, 0]],
● [[1, 1]].
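The state definition above can be made concrete with a small enumeration. This is a minimal sketch; the function name enumerate_states and the tuple-of-tuples representation are choices made for this example, not something taken from the paper.

```python
from itertools import product

def enumerate_states(num_transformations, num_objects):
    """Yield every state as an M x N matrix (tuple of row tuples) of 0/1 flags.

    Cell [m][n] is 1 when transformation m is applied to object n.
    """
    cells = num_transformations * num_objects
    for bits in product((0, 1), repeat=cells):
        yield tuple(
            tuple(bits[row * num_objects:(row + 1) * num_objects])
            for row in range(num_transformations)
        )

# One transformation (M = 1) over two objects (N = 2): the four states listed above.
states = list(enumerate_states(1, 2))
for state in states:
    print([list(row) for row in state])

# The count grows as 2^(M*N), e.g. 64 states for M = 2 and N = 3.
print(len(list(enumerate_states(2, 3))))
```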
State Space Search Techniques
How to search in state space
Four different techniques (considering only one transformation):
● Exhaustive: all 2^N states for N objects are considered.
● Two-pass: only 2 states are considered, [[0, 0, ...0]] and [[1, 1, ...1]].
● Linear: a dynamic programming approach that assumes different objects are
independent; N+1 states are considered.
● Iterative:
a. start from an initial state and look for a local minimum state.
b. choose a different initial state and repeat step a until
■ no more new states can be found, or
■ some termination condition has been reached.
c. between N+1 and 2^N states are considered.
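The four techniques can be compared on a toy problem. The sketch below is one interpretation of the slide, not the paper's implementation: the cost function toy_cost is invented, "linear" is written as an object-by-object greedy pass, and "iterative" is written as hill climbing over single-bit flips with caller-supplied start states.

```python
from itertools import product

def toy_cost(state):
    """Hypothetical cost: transforming the last two objects only pays off together."""
    a, b, c = state
    return 100 + 10 * a + 5 * b + 8 * c - 30 * b * c

def exhaustive(cost, n):
    """Evaluate all 2^n states."""
    return min(product((0, 1), repeat=n), key=cost)

def two_pass(cost, n):
    """Compare only 'transform nothing' with 'transform everything'."""
    return min([(0,) * n, (1,) * n], key=cost)

def linear(cost, n):
    """Decide object by object, keeping earlier decisions fixed (n + 1 states)."""
    state = [0] * n
    best = cost(tuple(state))
    for i in range(n):
        state[i] = 1
        candidate = cost(tuple(state))
        if candidate < best:
            best = candidate      # keep the transformation on object i
        else:
            state[i] = 0          # undo it
    return tuple(state)

def iterative(cost, n, starts):
    """Hill-climb from each start state to a local minimum; return the best one."""
    best = None
    for start in starts:
        current = start
        while True:
            neighbours = [
                current[:i] + (1 - current[i],) + current[i + 1:]
                for i in range(n)
            ]
            candidate = min(neighbours, key=cost)
            if cost(candidate) >= cost(current):
                break             # local minimum reached
            current = candidate
        if best is None or cost(current) < cost(best):
            best = current
    return best

results = {
    "exhaustive": exhaustive(toy_cost, 3),
    "two_pass": two_pass(toy_cost, 3),
    "linear": linear(toy_cost, 3),
    "iterative": iterative(toy_cost, 3, [(0, 0, 0), (1, 1, 1)]),
}
for name, state in results.items():
    print(name, state, toy_cost(state))
```

With this particular cost function the cheaper strategies settle on different states than the exhaustive search, which illustrates the trade-off between the number of states costed and plan quality.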
Interaction between Transformations
Interleaving
When two (or more) cost-based transformations apply on the same object such that one
transformation becomes applicable only after the other has been applied, then these
transformations must be interleaved in order for the optimizer to determine the optimal plan.
Starting from S0, applying T1 leads to S1,
and applying T2 and then T3 leads to S2 and S3.
If T3 is not interleaved after T2, the optimizer cannot reach
S3, nor the best state Sfinal shown on the right of the figure.
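A toy cost comparison shows why interleaving matters. The costs are invented, and the two transformations (subquery unnesting followed by group-by view merging) are picked only as a plausible pair in which the second becomes applicable after the first.

```python
# Hypothetical costs for three states of the same query.
COSTS = {
    "original": 100,          # nothing applied
    "unnested": 120,          # unnesting alone makes the plan worse
    "unnested+merged": 60,    # view merging is possible only after unnesting
}

def choose_without_interleaving(costs):
    """Judge unnesting by its direct result only."""
    return min(["original", "unnested"], key=lambda s: costs[s])

def choose_with_interleaving(costs):
    """Judge unnesting together with the transformation it enables."""
    return min(["original", "unnested", "unnested+merged"], key=lambda s: costs[s])

print(choose_without_interleaving(COSTS))
print(choose_with_interleaving(COSTS))
```

Without interleaving, unnesting is rejected because its immediate result is more expensive, so the cheaper merged state is never costed.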
Interaction between Transformations
Juxtaposition
When two or more cost-based transformations apply on the same object in a way that precludes
their sequential application, they must be applied one by one in order for the optimizer to
determine the most optimal plan. This comparison of two or more cost-based transformations is
called juxtaposition.
Starting from S0, we can apply either T1 to reach S1 or T2 to reach S2.
T1 precedes T2 in the sequential order,
and once we are at S1
we most likely can no longer reach S2.
If T2 is not considered alongside T1, the optimizer cannot
reach S2, nor the best state Sfinal shown on the right of the figure.
Interaction between Transformations
Juxtaposition
An example: view merging and join predicate push down must be juxtaposed with each other.
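[Figure: starting from a Join over e1, j, and an Agg view, view merging leads to one alternative plan and JPPD leads to a NestJoin/Apply alternative; operators shown: Join, Agg, and NestJoin/Apply; inputs: e1, j, and e2.]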
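Juxtaposition can be sketched in the same toy style. View merging and JPPD are the pair named on the slide; the cost numbers and function names are assumptions made for this example.

```python
# Hypothetical costs of the three alternatives for one view.
COSTS = {"original": 100, "view_merged": 80, "jppd": 50}

def sequential(costs):
    """Apply transformations in a fixed order, each judged on its own."""
    current = "original"
    if costs["view_merged"] < costs[current]:
        # Once the view is merged it no longer exists, so JPPD cannot be tried.
        current = "view_merged"
    elif costs["jppd"] < costs[current]:
        current = "jppd"
    return current

def juxtaposed(costs):
    """Cost every mutually exclusive alternative and keep the cheapest."""
    return min(costs, key=costs.get)

print(sequential(COSTS))
print(juxtaposed(COSTS))
```

The sequential strategy stops at the first improvement it finds, while the juxtaposed strategy compares both alternatives against the original before committing.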
Optimization Performance
Reuse of Query Sub-Tree Cost Annotations
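Suppose a query has two subqueries, Qs1 and Qs2, and
a transformation T that can be applied to each subquery.
This gives four states: (Qs1, Qs2), (T(Qs1), Qs2), (Qs1, T(Qs2)), and (T(Qs1), T(Qs2)).
The cost information of Qs1, Qs2, T(Qs1), and T(Qs2) can be reused across these states.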
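The reuse of cost annotations amounts to memoizing sub-tree costs. In the sketch below, the cost numbers are invented and a state's total cost is simplified to the sum of its two sub-tree costs; both are assumptions for illustration.

```python
from functools import lru_cache
from itertools import product

# Hypothetical sub-tree costs.
SUBTREE_COSTS = {"Qs1": 40, "T(Qs1)": 25, "Qs2": 30, "T(Qs2)": 45}
costed = []  # records every sub-tree that actually had to be costed

@lru_cache(maxsize=None)
def subtree_cost(subtree):
    """Stand-in for an expensive physical optimization of one sub-tree."""
    costed.append(subtree)
    return SUBTREE_COSTS[subtree]

def state_cost(state):
    """Total cost of a state; bit i says whether T is applied to subquery i + 1."""
    variants = [
        f"T(Qs{i})" if bit else f"Qs{i}"
        for i, bit in enumerate(state, start=1)
    ]
    return sum(subtree_cost(v) for v in variants)

totals = {state: state_cost(state) for state in product((0, 1), repeat=2)}
best = min(totals, key=totals.get)

print(totals)
print("best state:", best)
print("sub-trees costed:", len(costed), "instead of", 2 * len(totals))
```

Four states with two sub-trees each would need eight costings without reuse; with the cache, each of the four distinct sub-trees is costed once.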
Performance Study
Dataset
● 14000 tables representing HR, Financial, Order Entry, CRM, Supply Chain…
● 241000 queries
○ the average number of tables in a query is 8,
○ most of the queries are of simple Sel/Proj/Join type,
○ 8% of these queries have subqueries, GROUP-BY, DISTINCT or UNION ALL.
Result
● 5910 execution plans changed.
● the total run time improved by 20% on average.
● 18% of the affected queries degraded by 40%.
● the top 5% of longest running queries improved by 27%.
● optimization time increased by only 40%.
Any Questions?
Follow PingCAP's official WeChat account
for more technical content
Thank You!
