Randolf Geist – IT-Tage 2015 – Oracle Parallel Execution – Analyse und Troubleshooting
 Independent consultant
 Performance Troubleshooting
 In-house workshops
 Cost-Based Optimizer
 Performance By Design
 Oracle ACE Director
 Member of OakTable Network
 Parallel Execution introduction
 Major challenges
 Parallel unfriendly examples
 Distribution skew examples
 How to measure distribution of work
 Fixing work distribution skew
 Oracle Database Enterprise Edition includes
the powerful Parallel Execution feature that
allows spreading the processing of a single
SQL statement execution across multiple
worker processes
 The feature is fully integrated into the Cost-
Based Optimizer as well as the execution
runtime engine and automatically distributes
the work across the so-called Parallel Workers
 Simple generic parallelization example
Task: Compute sum of 8 numbers
1+8+7+9+6+2+6+3 = ???
Serial steps: 1+8=9, 9+7=16, 16+9=25, ...
n=8 numbers, 7 computation steps required
Serial execution: 7 time units
Simple generic parallelization example
4 workers
3+ time units
1 + 8 = 9
9 + 7 = 16
6 + 2 = 8
6 + 3 = 9
9 + 16 = 25
8 + 9 = 17
25 + 17 = 42
Coordinator
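The two slides above can be replayed in a few lines. This is a minimal sketch in Python, chosen only for illustration; the function names are made up here. It counts the additions of the serial approach and the rounds of a pairwise approach in which all additions of one round could run at the same time. It ignores any coordination cost, which is presumably what the "+" in "3+ time units" stands for.

```python
def serial_sum(numbers):
    """Add the numbers one after another; returns (total, steps)."""
    total = numbers[0]
    steps = 0
    for n in numbers[1:]:
        total += n
        steps += 1
    return total, steps


def pairwise_sum(numbers):
    """Add neighbouring pairs round by round.

    Returns (total, rounds, partial sums of each round). All additions
    within one round are independent, so one worker per pair could do
    them at the same time.
    """
    level = list(numbers)
    rounds = 0
    partials = []
    while len(level) > 1:
        level = [sum(level[i:i + 2]) for i in range(0, len(level), 2)]
        partials.append(level)
        rounds += 1
    return level[0], rounds, partials


numbers = [1, 8, 7, 9, 6, 2, 6, 3]
print(serial_sum(numbers))    # (42, 7)
print(pairwise_sum(numbers))  # (42, 3, [[9, 16, 8, 9], [25, 17], [42]])
```

The partial sums of each round match the boxes on the slide: four sums in the first round, two in the second, and the final 42.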
Parallel Execution doesn’t mean “work smarter”
You actually accept having to “work harder”
Could also be called the “brute force” approach
So the problem with Parallel Execution
might be that it
doesn’t work “hard enough”
 Two major challenges
 Can the given task be divided into sub-tasks that can
efficiently and independently be processed by the
workers? (“Parallel Unfriendly”)
 Can all assigned workers be kept busy all the time?
 Parallel Execution can only reduce runtime as
expected if all workers are kept busy
 Possibly only a few or a single worker will be
active and have to do all the work
 In this case Parallel Execution can actually be
slower than serial execution
 There is a need to measure how busy the
workers are kept
 Note that this measure doesn’t tell you
anything about the efficiency of the actual
operation / execution plan
 But an otherwise efficient Parallel Execution
plan can only scale if the expected number of
workers is kept busy ideally all the time
 Note that it says “can scale” – if your system
cannot scale the required resources (like I/O)
you just end up with more workers waiting
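One rough way to put a number on "how busy the workers are kept" is to compare the total busy time of all workers with the busy time of the busiest one. The sketch below assumes the busy time per worker is already known (for example from sample counts); the function name and the figures are illustrative only, and overhead and resource contention are ignored.

```python
def effective_speedup(busy_seconds_per_worker):
    """Total work divided by the work of the busiest worker.

    Assumes the elapsed time of a parallel step is determined by the
    worker that is busy the longest. A result equal to the number of
    workers means perfectly even work; 1.0 means one worker did it all.
    """
    longest = max(busy_seconds_per_worker)
    if longest == 0:
        return 0.0
    return sum(busy_seconds_per_worker) / longest


print(effective_speedup([10, 10, 10, 10]))  # 4.0, all four workers equally busy
print(effective_speedup([25, 5, 5, 5]))     # 1.6, one worker dominates
print(effective_speedup([40, 0, 0, 0]))     # 1.0, no better than serial
```

As the slide notes, such a figure says nothing about whether the plan itself is efficient; it only shows whether the assigned workers share the work.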
 More reasons why Oracle Parallel Execution
might not reduce runtime as expected:
 Parallel DML/DDL gotchas
 “Downgrade” at execution time (fewer workers
assigned than expected)
 Overhead of Parallel Execution implementation
 Limitations of Parallel Execution implementation
Parallel DML / DDL gotchas
 DML / DDL part can run parallel or serial
 Query part can run parallel or serial
Parallel CTAS but serial query
-------------------------------------------------------------------------
| Id  | Operation               | Name     |    TQ  |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT  |          |        |      |            |
|   1 |  PX COORDINATOR         |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)   | :TQ10001 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT       | T4       |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE          |          |  Q1,01 | PCWP |            |
|   5 |      PX SEND ROUND-ROBIN| :TQ10000 |        | S->P | RND-ROBIN  |
|*  6 |       HASH JOIN         |          |        |      |            |
|   7 |        TABLE ACCESS FULL| T2       |        |      |            |
|   8 |        TABLE ACCESS FULL| T2       |        |      |            |
-------------------------------------------------------------------------
Serial CTAS but parallel query
--------------------------------------------------------------------------
| Id  | Operation                | Name     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT   |          |        |      |            |
|   1 |  LOAD AS SELECT          | T4       |        |      |            |
|   2 |   PX COORDINATOR         |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)   | :TQ10002 |  Q1,02 | P->S | QC (RAND)  |
|*  4 |     HASH JOIN BUFFERED   |          |  Q1,02 | PCWP |            |
|   5 |      PX RECEIVE          |          |  Q1,02 | PCWP |            |
|   6 |       PX SEND HASH       | :TQ10000 |  Q1,00 | P->P | HASH       |
|   7 |        PX BLOCK ITERATOR |          |  Q1,00 | PCWC |            |
|   8 |         TABLE ACCESS FULL| T2       |  Q1,00 | PCWP |            |
|   9 |      PX RECEIVE          |          |  Q1,02 | PCWP |            |
|  10 |       PX SEND HASH       | :TQ10001 |  Q1,01 | P->P | HASH       |
|  11 |        PX BLOCK ITERATOR |          |  Q1,01 | PCWC |            |
|  12 |         TABLE ACCESS FULL| T2       |  Q1,01 | PCWP |            |
--------------------------------------------------------------------------
 Other reasons why Oracle Parallel Execution
might not scale as expected:
 Parallel DML/DDL gotchas
 “Downgrade” at execution time (fewer workers
assigned than expected)
 Overhead of Parallel Execution implementation
 Limitations of Parallel Execution implementation
“Parallel Forced Serial” Example
------------------------------------------------------------------------------
| Id  | Operation                    | Name     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |        |      |            |
|   1 |  PX COORDINATOR FORCED SERIAL|          |        |      |            |
|   2 |   PX SEND QC (RANDOM)        | :TQ10003 |  Q1,03 | P->S | QC (RAND)  |
|   3 |    HASH UNIQUE               |          |  Q1,03 | PCWP |            |
|   4 |     PX RECEIVE               |          |  Q1,03 | PCWP |            |
|   5 |      PX SEND HASH            | :TQ10002 |  Q1,02 | P->P | HASH       |
|*  6 |       HASH JOIN BUFFERED     |          |  Q1,02 | PCWP |            |
|   7 |        PX RECEIVE            |          |  Q1,02 | PCWP |            |
|   8 |         PX SEND HASH         | :TQ10000 |  Q1,00 | P->P | HASH       |
|   9 |          PX BLOCK ITERATOR   |          |  Q1,00 | PCWC |            |
|  10 |           TABLE ACCESS FULL  | T2       |  Q1,00 | PCWP |            |
|  11 |        PX RECEIVE            |          |  Q1,02 | PCWP |            |
|  12 |         PX SEND HASH         | :TQ10001 |  Q1,01 | P->P | HASH       |
|  13 |          PX BLOCK ITERATOR   |          |  Q1,01 | PCWC |            |
|  14 |           TABLE ACCESS FULL  | T2       |  Q1,01 | PCWP |            |
------------------------------------------------------------------------------
select median(id) from t2;
-----------------------------------------------------------------------
| Id  | Operation             | Name     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |        |      |            |
|   1 |  SORT GROUP BY        |          |        |      |            |
|   2 |   PX COORDINATOR      |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)| :TQ10000 |  Q1,00 | P->S | QC (RAND)  |
|   4 |     PX BLOCK ITERATOR |          |  Q1,00 | PCWC |            |
|   5 |      TABLE ACCESS FULL| T2       |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------
create table t3 parallel
as
select * from t2
where rownum <= 10000000;
-----------------------------------------------------------------------------
| Id  | Operation                   | Name     |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT      |          |        |      |            |
|   1 |  PX COORDINATOR             |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)       | :TQ20001 |  Q2,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT           | T3       |  Q2,01 | PCWP |            |
|   4 |     PX RECEIVE              |          |  Q2,01 | PCWP |            |
|   5 |      PX SEND ROUND-ROBIN    | :TQ20000 |        | S->P | RND-ROBIN  |
|*  6 |       COUNT STOPKEY         |          |        |      |            |
|   7 |        PX COORDINATOR       |          |        |      |            |
|   8 |         PX SEND QC (RANDOM) | :TQ10000 |  Q1,00 | P->S | QC (RAND)  |
|*  9 |          COUNT STOPKEY      |          |  Q1,00 | PCWC |            |
|  10 |           PX BLOCK ITERATOR |          |  Q1,00 | PCWC |            |
|  11 |            TABLE ACCESS FULL| T2       |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------
create table t3 parallel
as select * from (select a.*,
lag(filler, 1) over (order by id) as prev_filler
from t2 a);
--------------------------------------------------------------------------------
| Id  | Operation                      | Name     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------
|   0 | CREATE TABLE STATEMENT         |          |        |      |            |
|   1 |  PX COORDINATOR                |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)          | :TQ20001 |  Q2,01 | P->S | QC (RAND)  |
|   3 |    LOAD AS SELECT              | T3       |  Q2,01 | PCWP |            |
|   4 |     PX RECEIVE                 |          |  Q2,01 | PCWP |            |
|   5 |      PX SEND ROUND-ROBIN       | :TQ20000 |        | S->P | RND-ROBIN  |
|   6 |       VIEW                     |          |        |      |            |
|   7 |        WINDOW BUFFER           |          |        |      |            |
|   8 |         PX COORDINATOR         |          |        |      |            |
|   9 |          PX SEND QC (ORDER)    | :TQ10001 |  Q1,01 | P->S | QC (ORDER) |
|  10 |           SORT ORDER BY        |          |  Q1,01 | PCWP |            |
|  11 |            PX RECEIVE          |          |  Q1,01 | PCWP |            |
|  12 |             PX SEND RANGE      | :TQ10000 |  Q1,00 | P->P | RANGE      |
|  13 |              PX BLOCK ITERATOR |          |  Q1,00 | PCWC |            |
|  14 |               TABLE ACCESS FULL| T2       |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------
All these examples have one thing in common:
If the Query Coordinator (non-parallel part)
needs to perform a significant part of the overall
work, Parallel Execution won’t reduce the
runtime as expected
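The statement above is essentially Amdahl's law applied to the Query Coordinator. The following sketch is not from the presentation: it assumes a known fraction of the total work that only the coordinator can perform, ignores all Parallel Execution overhead, and uses a made-up function name.

```python
def max_speedup(serial_fraction, dop):
    """Upper bound on speedup according to Amdahl's law.

    serial_fraction: share of the work (0.0 to 1.0) that only the
                     Query Coordinator can perform
    dop:             degree of parallelism for the remaining work
    """
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / dop)


for fraction in (0.0, 0.1, 0.5):
    print(fraction, round(max_speedup(fraction, 8), 2))
```

With half of the work left to the coordinator, even eight workers cannot make the statement twice as fast, which is why plans with large serial parts above the PX COORDINATOR gain so little.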
 Extended SQL tracing
 Generates one trace file per Parallel Worker process
in the database server trace directory
 For cross-instance Parallel Execution this means files
spread across more than one server
 Each Parallel Worker trace file lists, among others,
the number of rows produced per plan line
 Extended SQL tracing
Allows detecting data distribution skew
Usually requires reproducing the issue
Tedious to collect and skim through dozens of trace
files (no known tool automates that job)
No information about relevance of skew
 DBMS_XPLAN.DISPLAY_CURSOR +
Rowsource statistics
 Based on same internal code instrumentation as
extended SQL trace
 Feeds back rowsource statistics (rows produced on
execution plan line level, timing, I/O stats) into
Library Cache
 Useful for analyzing serial executions
 DBMS_XPLAN.DISPLAY_CURSOR +
Rowsource statistics
Doesn’t cope well with multiple Parallel Execution
Servers / multiple DFO trees or Cross Instance RAC
execution
No information about rows produced per Parallel
Execution Server
=> Therefore not able to detect distribution skew
 Analyzing Data Distribution Skew
 Oracle has long offered a special view called
V$PQ_TQSTAT
 It is populated after a successful Parallel Execution
in the session of the Query Coordinator
 It lists the amount of data sent via the “Table
Queues” / “Virtual Tables”
 In theory this allows exact analysis of which operations
caused an uneven data distribution
 V$PQ_TQSTAT skewed execution example
DFO_NUMBER      TQ_ID SERVER_TYP   NUM_ROWS      BYTES PROCESS
---------- ---------- ---------- ---------- ---------- ----------
         1          0 Producer       500807   54396692 P008
         1          0 Producer       499193   54259853 P009
         1          0 Consumer       500038   54332349 P010
         1          0 Consumer       499962   54324196 P011
         1          1 Producer       499826   57267724 P008
         1          1 Producer       500174   57357253 P009
         1          1 Consumer       499933   57304789 P010
         1          1 Consumer       500067   57320188 P011
         1          2 Producer       462069   52963939 P010
         1          2 Producer       537931   61681312 P011
         1          2 Consumer       500212   57346574 P008
         1          2 Consumer       499788   57298677 P009
         1          3 Producer       500038   57337041 P010
         1          3 Producer       499962   57328456 P011
         1          3 Consumer            0         48 P008
         1          3 Consumer      1000000  114665449 P009
         1          4 Producer            0         24 P008
         1          4 Producer      1000000  116668401 P009
         1          4 Consumer      1000000  116668425 QC
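Reading such output by eye gets tedious with a higher degree of parallelism. The sketch below shows one possible way to summarise it, using the rows from the example above: for each table queue and role it divides the row count of the busiest process by the average per process. The "skew factor" and the function name are illustrative choices made here, not something V$PQ_TQSTAT reports itself.

```python
from collections import defaultdict

# (DFO_NUMBER, TQ_ID, SERVER_TYPE, NUM_ROWS, PROCESS) copied from the example output
ROWS = [
    (1, 0, "Producer", 500807, "P008"), (1, 0, "Producer", 499193, "P009"),
    (1, 0, "Consumer", 500038, "P010"), (1, 0, "Consumer", 499962, "P011"),
    (1, 1, "Producer", 499826, "P008"), (1, 1, "Producer", 500174, "P009"),
    (1, 1, "Consumer", 499933, "P010"), (1, 1, "Consumer", 500067, "P011"),
    (1, 2, "Producer", 462069, "P010"), (1, 2, "Producer", 537931, "P011"),
    (1, 2, "Consumer", 500212, "P008"), (1, 2, "Consumer", 499788, "P009"),
    (1, 3, "Producer", 500038, "P010"), (1, 3, "Producer", 499962, "P011"),
    (1, 3, "Consumer", 0, "P008"), (1, 3, "Consumer", 1000000, "P009"),
    (1, 4, "Producer", 0, "P008"), (1, 4, "Producer", 1000000, "P009"),
    (1, 4, "Consumer", 1000000, "QC"),
]


def skew_factors(rows):
    """Row count of the busiest process divided by the average, per table queue and role."""
    groups = defaultdict(list)
    for dfo, tq, server_type, num_rows, _process in rows:
        groups[(dfo, tq, server_type)].append(num_rows)
    factors = {}
    for key, counts in groups.items():
        average = sum(counts) / len(counts)
        factors[key] = max(counts) / average if average else 0.0
    return factors


for key, factor in sorted(skew_factors(ROWS).items()):
    print(key, round(factor, 3))
```

A factor of 1.0 means an even spread, while a factor equal to the number of processes means a single process received everything. For the data above, the consumers of table queue 3 and the producers of table queue 4 come out at 2.0, matching the 0 versus 1000000 rows visible in the output.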
Easy to identify whether all workers are kept busy all the time or not
Easy to identify if there was a problem with work distribution
Shows actual parallel degree used (“Parallel Downgrade”)
Supports RAC
Reports are not persisted and will be flushed from memory quite quickly on busy systems
No easy identification, and therefore no systematic troubleshooting, of which plan operations cause a work distribution problem
Lacks some precision regarding Parallel Execution details
 Real-Time SQL Monitoring allows detecting
Parallel Execution skew in the following way:
 In the “Activity” tab
 The average active sessions will be less than the DOP of
the operation, which allows detecting both temporal and data
distribution skew
 In the “Parallel” tab
 The DB Time recorded per Parallel Slave will show an
uneven distribution of active work time
 Real-Time SQL Monitoring “Activity” tab –
skewed execution example
 Real-Time SQL Monitoring “Parallel” tab –
skewed execution example
 One very useful approach is using Active
Session History (ASH)
 ASH samples active sessions once a second
 Activity of Parallel Workers over time can
easily be analyzed
 From 11g on the ASH data even contains a
reference to the execution plan line, so a
relation between Parallel Worker activity and
execution plan line based on ASH is possible
 Custom queries on ASH data required for
detailed analysis
 XPLAN_ASH tool runs these queries for a
given SQL_ID
 Advantage of ASH is the availability of
retained historic ASH data via AWR on disk
 Information can be extracted even for SQL
executions as long ago as the retention
configured for AWR
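The kind of analysis described above can be sketched on a handful of samples. The data below is entirely hypothetical: each tuple stands for one ASH sample reduced to the second it was taken, the session that was active, and the execution plan line it was working on (in the real view these would come from columns such as SAMPLE_TIME, SESSION_ID and SQL_PLAN_LINE_ID). A DOP of 4 is assumed. This is only an illustration of the idea, not what XPLAN_ASH actually runs.

```python
from collections import defaultdict

# Hypothetical samples: (sample_second, session_id, plan_line_id)
SAMPLES = (
    # seconds 0-1: all four workers busy on plan line 8
    [(sec, sid, 8) for sec in (0, 1) for sid in (101, 102, 103, 104)]
    # seconds 2-5: only one worker left, busy on plan line 4
    + [(sec, 101, 4) for sec in (2, 3, 4, 5)]
)


def average_active_sessions(samples):
    """Samples per elapsed second; compare the result with the DOP."""
    seconds = [sec for sec, _sid, _line in samples]
    elapsed = max(seconds) - min(seconds) + 1
    return len(samples) / elapsed


def activity_per_plan_line(samples):
    """Map plan line -> (sample count, number of distinct active sessions)."""
    counts = defaultdict(int)
    sessions = defaultdict(set)
    for _sec, sid, line in samples:
        counts[line] += 1
        sessions[line].add(sid)
    return {line: (counts[line], len(sessions[line])) for line in counts}


print(average_active_sessions(SAMPLES))  # 2.0, well below the assumed DOP of 4
print(activity_per_plan_line(SAMPLES))   # {8: (8, 4), 4: (4, 1)}
```

In this made-up case plan line 4 is where only a single session is ever active, so that is the operation to look at first.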
Fixing Work Distribution Skew
 Influence Parallel Distribution: Data volume
estimates, PQ_DISTRIBUTE / PQ_[NO]MAP /
PQ_SKEW (12c+) hint
 Limit impact by influencing join order
 Rewrite queries trying to limit or avoid skew
 Remap skewed data changing the value distribution
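The last option, remapping skewed data, can be illustrated with a small simulation. Everything here is an assumption made for the sketch: a DOP of 4, a modulo operation standing in for the real hash distribution, non-negative original keys, and a single popular join key value of 1. It shows the general idea only and is not necessarily how the remapping is done in the presentation.

```python
from collections import Counter

DOP = 4          # assumed degree of parallelism
POPULAR_KEY = 1  # hypothetical popular join key value


def worker_for(key, dop=DOP):
    """Stand-in for a hash distribution: equal keys always reach the same worker."""
    return key % dop


def remap(key, row_number, dop=DOP):
    """Spread the popular value over dop artificial negative keys; leave others as is."""
    if key == POPULAR_KEY:
        return -(row_number % dop) - 1
    return key


def rows_per_worker(keys, dop=DOP):
    counts = Counter(worker_for(k, dop) for k in keys)
    return [counts.get(w, 0) for w in range(dop)]


# 900 rows carry the popular value, 100 rows carry the distinct values 2..101
keys = [POPULAR_KEY] * 900 + list(range(2, 102))

before = rows_per_worker(keys)
after = rows_per_worker([remap(k, i) for i, k in enumerate(keys)])
print(before)  # [25, 925, 25, 25]
print(after)   # [250, 250, 250, 250]
```

For a join to still return the same result, the other row source has to provide a matching row for each artificial key, which means duplicating the row with the popular value once per artificial key. That extra work is the price of the even distribution.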
Fixing Work Distribution Skew
 Automatic join skew detection in 12c (PQ_SKEW)
 Supports only parallel HASH JOINs
 Supports at present only simple probe row sources
(no result of other joins supported)
 Histogram showing popular values required (or
forced via PQ_SKEW hint)
----------------------------------------------------------------------------------
| Id  | Operation                        | Name     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |          |        |      |            |
|   1 |  SORT AGGREGATE                  |          |        |      |            |
|   2 |   PX COORDINATOR                 |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)           | :TQ10002 |  Q1,02 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE               |          |  Q1,02 | PCWP |            |
|*  5 |      HASH JOIN                   |          |  Q1,02 | PCWP |            |
|   6 |       PX RECEIVE                 |          |  Q1,02 | PCWP |            |
|   7 |        PX SEND HYBRID HASH       | :TQ10000 |  Q1,00 | P->P | HYBRID HASH|
|   8 |         STATISTICS COLLECTOR     |          |  Q1,00 | PCWC |            |
|   9 |          PX BLOCK ITERATOR       |          |  Q1,00 | PCWC |            |
|* 10 |           TABLE ACCESS FULL      | T_1      |  Q1,00 | PCWP |            |
|  11 |       PX RECEIVE                 |          |  Q1,02 | PCWP |            |
|  12 |        PX SEND HYBRID HASH (SKEW)| :TQ10001 |  Q1,01 | P->P | HYBRID HASH|
|  13 |         PX BLOCK ITERATOR        |          |  Q1,01 | PCWC |            |
|* 14 |          TABLE ACCESS FULL       | T_2      |  Q1,01 | PCWP |            |
----------------------------------------------------------------------------------
Fixing Work Distribution Skew
 More information:
 http://oracle-randolf.blogspot.com/2014/08/parallel-execution-skew-summary.html
Q & A

More Related Content

PPTX
Analysing and troubleshooting Parallel Execution IT Tage 2015
PPTX
Adaptive Query Optimization in 12c
PDF
PoC Oracle Exadata - Retour d'expérience
PDF
Managing Statistics for Optimal Query Performance
PDF
Histograms : Pre-12c and Now
PPT
Do You Know The 11g Plan?
PDF
Oracle statistics by example
PDF
Histograms: Pre-12c and now
Analysing and troubleshooting Parallel Execution IT Tage 2015
Adaptive Query Optimization in 12c
PoC Oracle Exadata - Retour d'expérience
Managing Statistics for Optimal Query Performance
Histograms : Pre-12c and Now
Do You Know The 11g Plan?
Oracle statistics by example
Histograms: Pre-12c and now

What's hot (19)

PDF
ThruPut Manager AE+: Automation for Production Control and Capacity Management
PDF
SQLd360
PDF
Histograms in 12c era
PPTX
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL
PDF
SQL Macros - Game Changing Feature for SQL Developers?
PDF
Hash join use memory optimization
PDF
Is your SQL Exadata-aware?
PPT
Thomas+Niewel+ +Oracletuning
PDF
ANALYZE for executable statements - a new way to do optimizer troubleshooting...
PDF
PostgreSQL Portland Performance Practice Project - Database Test 2 Howto
PDF
Riyaj: why optimizer_hates_my_sql_2010
PDF
Xilinx timing closure
PDF
MariaDB: Engine Independent Table Statistics, including histograms
PPT
Informix Warehouse Accelerator (IWA) features in version 12.1
PPT
04 assemblylinebalancing
PDF
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
PPT
PDF
PostgreSQL Portland Performance Practice Project - Database Test 2 Tuning
PPTX
Part1 of SQL Tuning Workshop - Understanding the Optimizer
ThruPut Manager AE+: Automation for Production Control and Capacity Management
SQLd360
Histograms in 12c era
Quickly Locate Poorly Performing DB2 for z/OS Batch SQL
SQL Macros - Game Changing Feature for SQL Developers?
Hash join use memory optimization
Is your SQL Exadata-aware?
Thomas+Niewel+ +Oracletuning
ANALYZE for executable statements - a new way to do optimizer troubleshooting...
PostgreSQL Portland Performance Practice Project - Database Test 2 Howto
Riyaj: why optimizer_hates_my_sql_2010
Xilinx timing closure
MariaDB: Engine Independent Table Statistics, including histograms
Informix Warehouse Accelerator (IWA) features in version 12.1
04 assemblylinebalancing
MySQL/MariaDB query optimizer tuning tutorial from Percona Live 2013
PostgreSQL Portland Performance Practice Project - Database Test 2 Tuning
Part1 of SQL Tuning Workshop - Understanding the Optimizer
Ad

Viewers also liked (18)

PDF
CL PROJECT 2 SITE ANALYSIS
PPTX
food and beverages
PPS
El Capricho de Gaudi
PPTX
benefits of comenius
PPTX
English250
PDF
Telangana Praja Front Manifesto
DOCX
Uas k eamanan komputer
PDF
Certificate (ENG)
PDF
CV_Mohan_revised_2015-July
PDF
Original booking start in Aarohan Luxury Project @ 9910750427.
PDF
UN Causedirect Pres 16122015 Draft 2
PPTX
Trabajo infor memoria rom05
PDF
AMIS_Market_Monitor_Nov 2015
PPS
Las Desamortizaciones
PDF
Presentation Business-Media
DOCX
Verizon Training (2)
PPS
Pantanal-Mato Grosso-Brasil
PPTX
Data types
CL PROJECT 2 SITE ANALYSIS
food and beverages
El Capricho de Gaudi
benefits of comenius
English250
Telangana Praja Front Manifesto
Uas k eamanan komputer
Certificate (ENG)
CV_Mohan_revised_2015-July
Original booking start in Aarohan Luxury Project @ 9910750427.
UN Causedirect Pres 16122015 Draft 2
Trabajo infor memoria rom05
AMIS_Market_Monitor_Nov 2015
Las Desamortizaciones
Presentation Business-Media
Verizon Training (2)
Pantanal-Mato Grosso-Brasil
Data types
Ad

Similar to Randolf Geist – IT-Tage 2015 – Oracle Parallel Execution – Analyse und Troubleshooting (20)

PDF
Parallel Execution With Oracle Database 12c - Masterclass
PDF
Px execution in rac
PPTX
Oracle optimizer bootcamp
PPT
Tracing Parallel Execution (UKOUG 2006)
PPT
Introduction to Parallel Execution
PPTX
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel Queries
PDF
Parallel Query on Exadata
PDF
Oracle Parallel Distribution and 12c Adaptive Plans
PPTX
Data warehouse 26 exploiting parallel technologies
PPT
11 Things About 11gr2
PDF
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdf
PDF
POLARDB for MySQL - Parallel Query
PDF
Oracle 12c Parallel Execution New Features
PDF
APEX Connect 2019 - SQL Tuning 101
PDF
APEX Connect 2019 - successful application development
PDF
Feb14 successful development
PPTX
PLSQL Advanced
PPTX
database slide on modern techniques for optimizing database queries.pptx
PDF
Properly Use Parallel DML for ETL
DOC
Datastage parallell jobs vs datastage server jobs
Parallel Execution With Oracle Database 12c - Masterclass
Px execution in rac
Oracle optimizer bootcamp
Tracing Parallel Execution (UKOUG 2006)
Introduction to Parallel Execution
Christo Kutrovsky - Maximize Data Warehouse Performance with Parallel Queries
Parallel Query on Exadata
Oracle Parallel Distribution and 12c Adaptive Plans
Data warehouse 26 exploiting parallel technologies
11 Things About 11gr2
OracleDatabase12cPXNewFeatures_ITOUG_2018.pdf
POLARDB for MySQL - Parallel Query
Oracle 12c Parallel Execution New Features
APEX Connect 2019 - SQL Tuning 101
APEX Connect 2019 - successful application development
Feb14 successful development
PLSQL Advanced
database slide on modern techniques for optimizing database queries.pptx
Properly Use Parallel DML for ETL
Datastage parallell jobs vs datastage server jobs

Recently uploaded (20)

PDF
natwest.pdf company description and business model
PPTX
Introduction-to-Food-Packaging-and-packaging -materials.pptx
DOCX
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
PPTX
S. Anis Al Habsyi & Nada Shobah - Klasifikasi Hambatan Depresi.pptx
PDF
Tunisia's Founding Father(s) Pitch-Deck 2022.pdf
PPTX
Introduction to Effective Communication.pptx
PDF
oil_refinery_presentation_v1 sllfmfls.pdf
PPTX
lesson6-211001025531lesson plan ppt.pptx
PPT
The Effect of Human Resource Management Practice on Organizational Performanc...
PPTX
An Unlikely Response 08 10 2025.pptx
PPTX
_ISO_Presentation_ISO 9001 and 45001.pptx
PPTX
The spiral of silence is a theory in communication and political science that...
PPTX
Tour Presentation Educational Activity.pptx
PPTX
water for all cao bang - a charity project
DOC
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
PPTX
Impressionism_PostImpressionism_Presentation.pptx
PPTX
ART-APP-REPORT-FINctrwxsg f fuy L-na.pptx
DOCX
"Project Management: Ultimate Guide to Tools, Techniques, and Strategies (2025)"
PDF
Nykaa-Strategy-Case-Fixing-Retention-UX-and-D2C-Engagement (1).pdf
PPTX
Emphasizing It's Not The End 08 06 2025.pptx
natwest.pdf company description and business model
Introduction-to-Food-Packaging-and-packaging -materials.pptx
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
S. Anis Al Habsyi & Nada Shobah - Klasifikasi Hambatan Depresi.pptx
Tunisia's Founding Father(s) Pitch-Deck 2022.pdf
Introduction to Effective Communication.pptx
oil_refinery_presentation_v1 sllfmfls.pdf
lesson6-211001025531lesson plan ppt.pptx
The Effect of Human Resource Management Practice on Organizational Performanc...
An Unlikely Response 08 10 2025.pptx
_ISO_Presentation_ISO 9001 and 45001.pptx
The spiral of silence is a theory in communication and political science that...
Tour Presentation Educational Activity.pptx
water for all cao bang - a charity project
学位双硕士UTAS毕业证,墨尔本理工学院毕业证留学硕士毕业证
Impressionism_PostImpressionism_Presentation.pptx
ART-APP-REPORT-FINctrwxsg f fuy L-na.pptx
"Project Management: Ultimate Guide to Tools, Techniques, and Strategies (2025)"
Nykaa-Strategy-Case-Fixing-Retention-UX-and-D2C-Engagement (1).pdf
Emphasizing It's Not The End 08 06 2025.pptx

Randolf Geist – IT-Tage 2015 – Oracle Parallel Execution – Analyse und Troubleshooting

  • 3.  Independent consultant  Performance Troubleshooting  In-house workshops  Cost-Based Optimizer  Performance By Design  Oracle ACE Director  Member of OakTable Network
  • 4.  Parallel Execution introduction  Major challenges  Parallel unfriendly examples  Distribution skew examples  How to measure distribution of work  Fixing work distribution skew
  • 5.  Oracle Database Enterprise Edition includes the powerful Parallel Execution feature that allows spreading the processing of a single SQL statement execution across multiple worker processes  The feature is fully integrated into the Cost Based Optimizer as well as the execution runtime engine and automatically distributes the work across the so called Parallel Workers
  • 6.  Simple generic parallelization example Task: Compute sum of 8 numbers 1+8=9, 9+7=16, 16+9=25,... 1+8+7+9+6+2+6+3= ??? n=8 numbers, 7 computation steps required Serial execution: 7 time units
  • 7. Simple generic parallelization example 4 workers 3+ time units 1 + 8 = 9 9 + 7 = 16 6 + 2 = 8 6 + 3 = 9 9 + 16 = 25 8 + 9 = 17 25 + 17 = 42 Coordinator
  • 8. Parallel Execution doesn’t mean “work smarter” You’re actually willing to accept to “work harder” Could also be called: “Brute force” approach
  • 9. So with Parallel Execution there might be the problem that it doesn’t work “hard enough”
  • 10.  Two major challenges  Can the given task be divided into sub-tasks that can efficiently and independently be processed by the workers? (“Parallel Unfriendly”)  Can all assigned workers be kept busy all the time?
  • 11.  Parallel Execution can only reduce runtime as expected if all workers are kept busy  Possibly only a few or a single worker will be active and have to do all the work  In this case Parallel Execution can actually be slower than serial execution  There is a need to measure how busy the workers are kept
  • 12.  Note that this measure doesn’t tell you anything about the efficiency of the actual operation / execution plan  But an otherwise efficient Parallel Execution plan can only scale if the expected number of workers is kept busy ideally all the time  Note that it says “can scale” – if your system cannot scale the required resources (like I/O) you just end up with more workers waiting
  • 13.  More reasons why Oracle Parallel Execution might not reduce runtime as expected:  Parallel DML/DDL gotchas  “Downgrade” at execution time (less workers assigned than expected)  Overhead of Parallel Execution implementation  Limitations of Parallel Execution implementation
  • 14. Parallel DML / DDL gotchas  DML / DDL part can run parallel or serial  Query part can run parallel or serial
  • 15. Parallel CTAS but serial query ------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | PX COORDINATOR | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | Q1,01 | P->S | QC (RAND) | | 3 | LOAD AS SELECT | T4 | Q1,01 | PCWP | | | 4 | PX RECEIVE | | Q1,01 | PCWP | | | 5 | PX SEND ROUND-ROBIN| :TQ10000 | | S->P | RND-ROBIN | |* 6 | HASH JOIN | | | | | | 7 | TABLE ACCESS FULL| T2 | | | | | 8 | TABLE ACCESS FULL| T2 | | | | -------------------------------------------------------------------------
  • 16. Serial CTAS but parallel query -------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | -------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | LOAD AS SELECT | T4 | | | | | 2 | PX COORDINATOR | | | | | | 3 | PX SEND QC (RANDOM) | :TQ10002 | Q1,02 | P->S | QC (RAND) | |* 4 | HASH JOIN BUFFERED | | Q1,02 | PCWP | | | 5 | PX RECEIVE | | Q1,02 | PCWP | | | 6 | PX SEND HASH | :TQ10000 | Q1,00 | P->P | HASH | | 7 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 8 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | | 9 | PX RECEIVE | | Q1,02 | PCWP | | | 10 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH | | 11 | PX BLOCK ITERATOR | | Q1,01 | PCWC | | | 12 | TABLE ACCESS FULL| T2 | Q1,01 | PCWP | | --------------------------------------------------------------------------
  • 17.  Other reasons why Oracle Parallel Execution might not scale as expected:  Parallel DML/DDL gotchas  “Downgrade” at execution time (less workers assigned than expected)  Overhead of Parallel Execution implementation  Limitations of Parallel Execution implementation
  • 18. “Parallel Forced Serial” Example ------------------------------------------------------------------------------ | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | | | 1 | PX COORDINATOR FORCED SERIAL| | | | | | 2 | PX SEND QC (RANDOM) | :TQ10003 | Q1,03 | P->S | QC (RAND) | | 3 | HASH UNIQUE | | Q1,03 | PCWP | | | 4 | PX RECEIVE | | Q1,03 | PCWP | | | 5 | PX SEND HASH | :TQ10002 | Q1,02 | P->P | HASH | |* 6 | HASH JOIN BUFFERED | | Q1,02 | PCWP | | | 7 | PX RECEIVE | | Q1,02 | PCWP | | | 8 | PX SEND HASH | :TQ10000 | Q1,00 | P->P | HASH | | 9 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 10 | TABLE ACCESS FULL | T2 | Q1,00 | PCWP | | | 11 | PX RECEIVE | | Q1,02 | PCWP | | | 12 | PX SEND HASH | :TQ10001 | Q1,01 | P->P | HASH | | 13 | PX BLOCK ITERATOR | | Q1,01 | PCWC | | | 14 | TABLE ACCESS FULL | T2 | Q1,01 | PCWP | | ------------------------------------------------------------------------------
  • 19.  Two major challenges  Can the given task be divided into sub-tasks that can efficiently and independently be processed by the workers? (“Parallel Unfriendly”)  Can all assigned workers be kept busy all the time?
  • 20.  Parallel Execution introduction  Major challenges  Parallel unfriendly examples  Distribution skew examples  How to measure distribution of work
  • 21. select median(id) from t2; ----------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ----------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | | | 1 | SORT GROUP BY | | | | | | 2 | PX COORDINATOR | | | | | | 3 | PX SEND QC (RANDOM)| :TQ10000 | Q1,00 | P->S | QC (RAND) | | 4 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 5 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | -----------------------------------------------------------------------
  • 22. create table t3 parallel as select * from t2 where rownum <= 10000000; ----------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | ----------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | PX COORDINATOR | | | | | | 2 | PX SEND QC (RANDOM) | :TQ20001 | Q2,01 | P->S | QC (RAND) | | 3 | LOAD AS SELECT | T3 | Q2,01 | PCWP | | | 4 | PX RECEIVE | | Q2,01 | PCWP | | | 5 | PX SEND ROUND-ROBIN | :TQ20000 | | S->P | RND-ROBIN | |* 6 | COUNT STOPKEY | | | | | | 7 | PX COORDINATOR | | | | | | 8 | PX SEND QC (RANDOM) | :TQ10000 | Q1,00 | P->S | QC (RAND) | |* 9 | COUNT STOPKEY | | Q1,00 | PCWC | | | 10 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 11 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | -----------------------------------------------------------------------------
  • 23. create table t3 parallel as select * from (select a.*, lag(filler, 1) over (order by id) as prev_filler from t2 a); -------------------------------------------------------------------------------- | Id | Operation | Name | TQ |IN-OUT| PQ Distrib | -------------------------------------------------------------------------------- | 0 | CREATE TABLE STATEMENT | | | | | | 1 | PX COORDINATOR | | | | | | 2 | PX SEND QC (RANDOM) | :TQ20001 | Q2,01 | P->S | QC (RAND) | | 3 | LOAD AS SELECT | T3 | Q2,01 | PCWP | | | 4 | PX RECEIVE | | Q2,01 | PCWP | | | 5 | PX SEND ROUND-ROBIN | :TQ20000 | | S->P | RND-ROBIN | | 6 | VIEW | | | | | | 7 | WINDOW BUFFER | | | | | | 8 | PX COORDINATOR | | | | | | 9 | PX SEND QC (ORDER) | :TQ10001 | Q1,01 | P->S | QC (ORDER) | | 10 | SORT ORDER BY | | Q1,01 | PCWP | | | 11 | PX RECEIVE | | Q1,01 | PCWP | | | 12 | PX SEND RANGE | :TQ10000 | Q1,00 | P->P | RANGE | | 13 | PX BLOCK ITERATOR | | Q1,00 | PCWC | | | 14 | TABLE ACCESS FULL| T2 | Q1,00 | PCWP | | --------------------------------------------------------------------------------
  • 24. All these examples have one thing in common: If the Query Coordinator (non-parallel part) needs to perform a significant part of the overall work, Parallel Execution won’t reduce the runtime as expected
  • 25.  Two major challenges  Can the given task be divided into sub-tasks that can efficiently and independently be processed by the workers? (“Parallel Unfriendly”)  Can all assigned workers be kept busy all the time?
  • 28.  Parallel Execution introduction  Major challenges  Parallel unfriendly examples  Distribution skew examples  How to measure distribution of work  Fixing work distribution skew
  • 29. Extended SQL tracing
      - Generates one trace file per Parallel Worker process in the database server trace directory
      - For cross-instance Parallel Execution this means files spread across more than one server
      - Each Parallel Worker trace file lists, among other things, the number of rows produced per plan line
  • 30. Extended SQL tracing
      - Allows detecting data distribution skew
      - Usually requires reproducing the issue
      - Tedious to collect and skim through dozens of trace files (no known tool automates that job)
      - No information about the relevance of the skew
  • 31. DBMS_XPLAN.DISPLAY_CURSOR + Rowsource statistics
      - Based on the same internal code instrumentation as extended SQL trace
      - Feeds rowsource statistics (rows produced per execution plan line, timing, I/O stats) back into the Library Cache
      - Useful for analyzing serial executions
  • 32. DBMS_XPLAN.DISPLAY_CURSOR + Rowsource statistics
      - Doesn’t cope well with multiple Parallel Execution Servers, multiple DFO trees or cross-instance RAC execution
      - No information about rows produced per Parallel Execution Server, and therefore not able to detect distribution skew
  • 33. Analyzing Data Distribution Skew
      - Oracle has long offered a special view called V$PQ_TQSTAT
      - It is populated after a successful Parallel Execution in the session of the Query Coordinator
      - It lists the amount of data sent via the “Table Queues” / “Virtual Tables”
      - In theory this allows an exact analysis of which operations caused an uneven data distribution
  • 34. V$PQ_TQSTAT skewed execution example

      DFO_NUMBER      TQ_ID SERVER_TYP   NUM_ROWS      BYTES PROCESS
      ---------- ---------- ---------- ---------- ---------- ----------
               1          0 Producer       500807   54396692 P008
               1          0 Producer       499193   54259853 P009
               1          0 Consumer       500038   54332349 P010
               1          0 Consumer       499962   54324196 P011
               1          1 Producer       499826   57267724 P008
               1          1 Producer       500174   57357253 P009
               1          1 Consumer       499933   57304789 P010
               1          1 Consumer       500067   57320188 P011
               1          2 Producer       462069   52963939 P010
               1          2 Producer       537931   61681312 P011
               1          2 Consumer       500212   57346574 P008
               1          2 Consumer       499788   57298677 P009
               1          3 Producer       500038   57337041 P010
               1          3 Producer       499962   57328456 P011
               1          3 Consumer            0         48 P008
               1          3 Consumer      1000000  114665449 P009
               1          4 Producer            0         24 P008
               1          4 Producer      1000000  116668401 P009
               1          4 Consumer      1000000  116668425 QC
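Reading such output by eye works for one table queue but gets tedious for larger plans. The following sketch computes, for each DFO_NUMBER / TQ_ID / SERVER_TYPE group, the share of rows each process handled and flags groups where one process clearly exceeds an even share. It uses the NUM_ROWS figures from the sample output above; the function names and the 25% tolerance are assumptions made for illustration, and in practice the same arithmetic could be done directly in a query against the view.

```python
from collections import defaultdict

# (DFO_NUMBER, TQ_ID, SERVER_TYPE, NUM_ROWS, PROCESS) taken from the sample output above.
ROWS = [
    (1, 0, "Producer", 500807, "P008"), (1, 0, "Producer", 499193, "P009"),
    (1, 0, "Consumer", 500038, "P010"), (1, 0, "Consumer", 499962, "P011"),
    (1, 1, "Producer", 499826, "P008"), (1, 1, "Producer", 500174, "P009"),
    (1, 1, "Consumer", 499933, "P010"), (1, 1, "Consumer", 500067, "P011"),
    (1, 2, "Producer", 462069, "P010"), (1, 2, "Producer", 537931, "P011"),
    (1, 2, "Consumer", 500212, "P008"), (1, 2, "Consumer", 499788, "P009"),
    (1, 3, "Producer", 500038, "P010"), (1, 3, "Producer", 499962, "P011"),
    (1, 3, "Consumer", 0, "P008"), (1, 3, "Consumer", 1000000, "P009"),
    (1, 4, "Producer", 0, "P008"), (1, 4, "Producer", 1000000, "P009"),
    (1, 4, "Consumer", 1000000, "QC"),
]


def row_shares(rows):
    """Return {(dfo, tq, server_type): {process: fraction of the group's rows}}."""
    groups = defaultdict(dict)
    for dfo, tq, server_type, num_rows, process in rows:
        groups[(dfo, tq, server_type)][process] = num_rows
    shares = {}
    for key, per_process in groups.items():
        total = sum(per_process.values())
        shares[key] = {p: (n / total if total else 0.0) for p, n in per_process.items()}
    return shares


def skewed_groups(rows, tolerance=0.25):
    """Groups where one process handled more than (1 + tolerance) times an even share.

    The tolerance is an arbitrary illustrative threshold, not an Oracle setting.
    """
    flagged = []
    for key, per_process in row_shares(rows).items():
        if len(per_process) < 2:
            continue  # a single process (for example the QC) cannot be skewed
        even_share = 1.0 / len(per_process)
        if max(per_process.values()) > even_share * (1 + tolerance):
            flagged.append(key)
    return flagged


if __name__ == "__main__":
    for dfo, tq, server_type in skewed_groups(ROWS):
        print(f"DFO {dfo}, TQ {tq}, {server_type}: uneven row distribution")
```

For the sample data this flags the consumers of table queue 3 and the producers of table queue 4, where P009 handles all 1,000,000 rows while P008 receives none.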
  • 35.
      - Easy to identify whether all workers are kept busy all the time or not
      - Easy to identify if there was a problem with work distribution
      - Shows actual parallel degree used (“Parallel Downgrade”)
      - Supports RAC
  • 36.
      - Reports are not persisted and will be flushed from memory quite quickly on busy systems
      - No easy identification and therefore no systematic troubleshooting of which plan operations cause a work distribution problem
      - Lacks some precision regarding Parallel Execution details
  • 37. Real-Time SQL Monitoring allows detecting Parallel Execution skew in the following way:
      - In the “Activity” tab
          - The average active sessions will be less than the DOP of the operation, which allows detecting both temporal and data distribution skew
      - In the “Parallel” tab
          - The DB Time recorded per Parallel Slave will show an uneven distribution of active work time
  • 38. Real-Time SQL Monitoring “Activity” tab – skewed execution example
  • 39. Real-Time SQL Monitoring “Parallel” tab – skewed execution example
  • 40.
      - One very useful approach is using Active Session History (ASH)
      - ASH samples active sessions once a second
      - Activity of Parallel Workers over time can easily be analyzed
      - From 11g on the ASH data even contains a reference to the execution plan line, so a relation between Parallel Worker activity and execution plan line based on ASH is possible
  • 41.
      - Custom queries on ASH data required for detailed analysis
      - XPLAN_ASH tool runs these queries for a given SQL_ID
      - Advantage of ASH is the availability of retained historic ASH data via AWR on disk
      - Information can be extracted even for SQL executions as long ago as the retention configured for AWR
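The kind of aggregation such custom ASH queries perform can be sketched in a few lines. Because ASH takes one sample per active session per second, counting samples per process and plan line approximates the seconds of activity, and counting distinct active processes per second shows how many workers were busy at a time. The tuples below are made-up toy samples, not real ASH output, and the field layout is a simplifying assumption. This is a rough illustration of the idea, not a reproduction of what XPLAN_ASH does.

```python
from collections import Counter

# Hypothetical samples: (second, process, plan_line). One tuple stands for one
# active-session sample, so it represents roughly one second of activity.
SAMPLES = [
    (0, "P000", 14), (0, "P001", 14), (0, "P002", 14), (0, "P003", 14),
    (1, "P000", 14), (1, "P001", 14), (1, "P002", 5), (1, "P003", 5),
    (2, "P003", 5),
    (3, "P003", 5),
]


def activity_per_plan_line(samples):
    """Count samples per (plan_line, process), i.e. approximate seconds of activity."""
    return Counter((plan_line, process) for _second, process, plan_line in samples)


def busy_workers_per_second(samples):
    """Number of distinct active processes in each sampled second."""
    active = {}
    for second, process, _plan_line in samples:
        active.setdefault(second, set()).add(process)
    return {second: len(processes) for second, processes in sorted(active.items())}


if __name__ == "__main__":
    print(busy_workers_per_second(SAMPLES))
    for (plan_line, process), seconds in sorted(activity_per_plan_line(SAMPLES).items()):
        print(f"plan line {plan_line:>2}  {process}  ~{seconds}s")
```

In the toy data all four workers are active for the first two seconds, after which only P003 keeps working on plan line 5. That is the pattern a skewed execution would show.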
  • 42.
      - Parallel Execution introduction
      - Major challenges
      - Parallel unfriendly examples
      - Distribution skew examples
      - How to measure distribution of work
      - Fixing work distribution skew
  • 43. Fixing Work Distribution Skew
      - Influence Parallel Distribution: data volume estimates, PQ_DISTRIBUTE / PQ_[NO]MAP / PQ_SKEW (12c+) hints
      - Limit impact by influencing join order
      - Rewrite queries to limit or avoid skew
      - Remap skewed data, changing the value distribution
  • 44. Fixing Work Distribution Skew
      - Automatic join skew detection in 12c (PQ_SKEW)
      - Supports only parallel HASH JOINs
      - At present supports only simple probe row sources (not the result of other joins)
      - Requires a histogram showing popular values (or forcing via the PQ_SKEW hint)
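Why a popular join key causes skew under hash distribution, and why treating popular values separately helps, can be shown with a toy model. This is a simplified sketch, not Oracle's implementation: `key % workers` stands in for the real hash function, the key list is invented, and the sketch only models the probe side. The matching build-side rows for popular values would additionally have to reach every worker, which is omitted here.

```python
from collections import Counter
from itertools import count

# Invented data: 900 rows share the popular key 1, plus 100 rows with distinct keys.
KEYS = [1] * 900 + list(range(2, 102))


def hash_distribute(keys, workers):
    """Rows per worker when every row goes to the worker chosen by its key."""
    return Counter(key % workers for key in keys)  # modulo as a stand-in hash


def skew_aware_distribute(keys, workers, popular):
    """Rows per worker when popular keys are spread round-robin instead of hashed."""
    round_robin = count()
    load = Counter()
    for key in keys:
        if key in popular:
            load[next(round_robin) % workers] += 1
        else:
            load[key % workers] += 1
    return load


if __name__ == "__main__":
    print("plain hash :", sorted(hash_distribute(KEYS, 4).items()))
    print("skew aware :", sorted(skew_aware_distribute(KEYS, 4, {1}).items()))
```

In this model plain hash distribution sends 925 of the 1,000 rows to a single worker, while spreading the popular value gives each of the four workers 250 rows.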
  • 45.

      ---------------------------------------------------------------------------------
      | Id  | Operation                        | Name     | TQ    |IN-OUT| PQ Distrib |
      ---------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT                 |          |       |      |            |
      |   1 |  SORT AGGREGATE                  |          |       |      |            |
      |   2 |   PX COORDINATOR                 |          |       |      |            |
      |   3 |    PX SEND QC (RANDOM)           | :TQ10002 | Q1,02 | P->S | QC (RAND)  |
      |   4 |     SORT AGGREGATE               |          | Q1,02 | PCWP |            |
      |*  5 |      HASH JOIN                   |          | Q1,02 | PCWP |            |
      |   6 |       PX RECEIVE                 |          | Q1,02 | PCWP |            |
      |   7 |        PX SEND HYBRID HASH       | :TQ10000 | Q1,00 | P->P | HYBRID HASH|
      |   8 |         STATISTICS COLLECTOR     |          | Q1,00 | PCWC |            |
      |   9 |          PX BLOCK ITERATOR       |          | Q1,00 | PCWC |            |
      |* 10 |           TABLE ACCESS FULL      | T_1      | Q1,00 | PCWP |            |
      |  11 |       PX RECEIVE                 |          | Q1,02 | PCWP |            |
      |  12 |        PX SEND HYBRID HASH (SKEW)| :TQ10001 | Q1,01 | P->P | HYBRID HASH|
      |  13 |         PX BLOCK ITERATOR        |          | Q1,01 | PCWC |            |
      |* 14 |          TABLE ACCESS FULL       | T_2      | Q1,01 | PCWP |            |
      ---------------------------------------------------------------------------------
  • 46. Fixing Work Distribution Skew
      - More information: http://oracle-randolf.blogspot.com/2014/08/parallel-execution-skew-summary.html
  • 47. Q & A