Five Tuning Tips For Your Datawarehouse

Five Tuning Tips For Your Data Warehouse Jeff Moss

My First Presentation Yes, my very first presentation For BIRT SIG For UKOUG Useful Advice from friends and colleagues Use graphics where appropriate Find a friendly or familiar face in the audience Imagine your audience is naked! … but like Oracle, be careful when combining advice!

Be Careful Combining Advice! Thanks for the opportunity Mark!

Agenda My background Five tips Partition for success Squeeze your data with data segment compression Make the most of your PGA memory Beware of temporal data affecting the optimizer Find out where your query is at Questions

My Background Independent Consultant 13 years Oracle experience Blog: http://oramossoracle.blogspot.com/ Focused on warehousing / VLDB since 1998 First project UK Music Sales Data Mart Produces BBC Radio 1 Top 40 chart and many more 2 billion row sales fact table 1 Tb total database size Currently working with Eon UK (Powergen) 4Tb Production Warehouse, 8Tb total storage Oracle Product Stack

What Is Partitioning ? “ Partitioning addresses key issues in supporting very large tables and indexes by letting you decompose them into smaller and more manageable pieces called partitions .” Oracle Database Concepts Manual, 10gR2 Introduced in Oracle 8.0 Numerous improvements since Subpartitioning adds another level of decomposition Partitions and Subpartitions are logical containers

Partition To Tablespace Mapping Partitions map to tablespaces Partition can only be in One tablespace Tablespace can hold many partitions Highest granularity is One tablespace per partition Lowest granularity is One tablespace for all the partitions Tablespace volatility Read / Write Read Only P_JAN_2005 P_FEB_2005 P_MAR_2005 P_APR_2005 P_MAY_2005 P_JUN_2005 P_JUL_2005 P_AUG_2005 P_SEP_2005 P_OCT_2005 P_NOV_2005 P_DEC_2005 T_Q1_2005 T_Q2_2005 T_Q3_2005 T_Q4_2005 T_Q1_2006 P_JAN_2006 P_FEB_2006 P_MAR_2006 T_Q3_2005 Read / Write Read Only

Why Partition ? - Performance Improved query performance Pruning or elimination Partition wise joins Read only partitions Quicker checkpointing Quicker backup Quicker recovery … but it depends on mapping of: partition:tablespace:datafile SELECT SUM(sales) FROM part_tab WHERE sales_date BETWEEN ‘01-JAN-2005’ AND ’30-JUN-2005’ Sales Fact Table * Oracle 10gR2 Data Warehousing Manual JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC

Why Partition ? - Manageability Archiving Use a rolling window approach ALTER TABLE … ADD/SPLIT/DROP PARTITION… Easier ETL Processing Build a new dataset in a staging table Add indexes and constraints Collect statistics Then swap the staging table for a partition on the target ALTER TABLE…EXCHANGE PARTITION… Easier Maintenance Table partition move, e.g. to compress data Local Index partition rebuild

Why Partition ? - Scalability Partition is generally consistent and predictable Assuming an appropriate partitioning key is used …and data has an even distribution across the key Read only approach Scalable backups - read only tablespaces are ignored …so partitions in those tablespaces are ignored Pruning allows consistent query performance

Why Partition ? - Availability Offline data impact minimised … depending on granularity Quicker recovery Pruned data not missed EXCHANGE PARTITION Allows offline build Quick swap over P_JAN_2005 P_FEB_2005 P_MAR_2005 P_APR_2005 P_MAY_2005 P_JUN_2005 P_JUL_2005 P_AUG_2005 P_SEP_2005 P_OCT_2005 P_NOV_2005 P_DEC_2005 T_Q1_2005 T_Q2_2005 T_Q3_2005 T_Q4_2005 T_Q1_2006 P_JAN_2006 P_FEB_2006 P_MAR_2006 T_Q3_2005 Read / Write Read Only

Fact Table Partitioning Transaction Date Load Date Easier ETL Processing Each load deals with only 1 partition No use to end user queries! Can’t prune – Full scans! Harder ETL Processing But still uses EXCHANGE PARTITION Useful to end user queries Allows full pruning capability 07-JAN-2005 Customer 1 09-JAN-2005 15-JAN-2005 Customer 2 17-JAN-2005 January Partition February Partition 22-JAN-2005 Customer 3 01-FEB-2005 02-FEB-2005 Customer 4 05-FEB-2005 26-FEB-2005 Customer 5 28-FEB-2005 March Partition 06-MAR-2005 Customer 2 07-MAR-2005 12-MAR-2005 Customer 3 15-MAR-2005 Tran Date Customer Load Date April Partition 21-JAN-2005 Customer 7 04-APR-2005 09-APR-2005 Customer 9 10-APR-2005 07-JAN-2005 Customer 1 09-JAN-2005 15-JAN-2005 Customer 2 17-JAN-2005 21-JAN-2005 Customer 7 04-APR-2005 22-JAN-2005 Customer 3 01-FEB-2005 January Partition February Partition 02-FEB-2005 Customer 4 05-FEB-2005 26-FEB-2005 Customer 5 28-FEB-2005 March Partition 06-MAR-2005 Customer 2 07-MAR-2005 12-MAR-2005 Customer 3 15-MAR-2005 Tran Date Customer Load Date April Partition 09-APR-2005 Customer 9 10-APR-2005

Watch out for… Partition exchange and table statistics 1 Partition stats updated … but Global stats are NOT! Affects queries accessing multiple partitions Solution Gather stats on staging table prior to EXCHANGE Gather stats on partitioned table using GLOBAL Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2

Partitioning Feature: Characteristic Reason Matrix    Partition Truncation     Exchange Partition    Archiving    Pruning (Partition Elimination)   Partition wise joins  Parallel DML     Local Indexes    Read Only Partitions Availability Scalability Manageability Performance Characteristic: Feature:

What Is Data Segment Compression ? Compresses data by eliminating intra block repeated column values Reduces the space required for a segment …but only if there are appropriate repeats! Self contained Lossless algorithm

Where Can Data Segment Compression Be Used ? Can be used with a number of segment types Heap & Nested Tables Range or List Partitions Materialized Views Can’t be used with Subpartitions Hash Partitions Indexes – but they have row level compression IOT External Tables Tables that are part of a Cluster LOBs

How Does Segment Compression Work ? Database Block Symbol Table Row Data Area 100 Call to discuss bill amount TEL NO YES 3 TEL 4 NO 5 YES 2 Call to discuss bill amount 1 100 1 2 3 4 5 101 Call to discuss new product MAIL NO N/A 8 MAIL 9 N/A 7 Call to discuss new product 6 101 6 7 8 4 9 102 Call to discuss new product TEL YES N/A 10 7 3 5 9 10 102 ID DESCRIPTION CONTACT TYPE OUTCOME FOLLOWUP

Pros & Cons Pros Saves space Reduces LIO / PIO Speeds up backup/recovery Improves query response time Transparent To readers … and writers Decreases time to perform some DML Deletes should be quicker Bulk inserts may be quicker Cons Increases CPU load Can only be used on Direct Path operations CTAS Serial Inserts using INSERT /*+ APPEND */ Parallel Inserts (PDML) ALTER TABLE…MOVE… Direct Path SQL*Loader Increases time to perform some DML Bulk inserts may be slower Updates are slower

Ordering Your Data For Maximum Benefits Colocate data to maximise compression benefits For maximum compression Minimise the total space required by the segment Identify most “compressable” column(s) For optimal access We know how the data is to be queried Order the data by Access path columns Then the next most “compressable” column(s) Uniformly distributed Colocated 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5 5 5 5

Get Max Compression Order Package PROCEDURE mgmt_p_get_max_compress_order Argument Name Type In/Out Default? ------------------------------ ----------------------- ------ -------- P_TABLE_OWNER VARCHAR2 IN DEFAULT P_TABLE_NAME VARCHAR2 IN P_PARTITION_NAME VARCHAR2 IN DEFAULT P_SAMPLE_SIZE NUMBER IN DEFAULT P_PREFIX_COLUMN1 VARCHAR2 IN DEFAULT P_PREFIX_COLUMN2 VARCHAR2 IN DEFAULT P_PREFIX_COLUMN3 VARCHAR2 IN DEFAULT BEGIN mgmt_p_get_max_compress_order(p_table_owner => ‘AE_MGMT’ ,p_table_name =>’BIG_TABLE’ ,p_sample_size =>10000); END: / Running mgmt_p_get_max_compress_order... ---------------------------------------------------------------------------------------------------- Table : BIG_TABLE Sample Size : 10000 Unique Run ID: 25012006232119 ORDER BY Prefix: ---------------------------------------------------------------------------------------------------- Creating MASTER Table : TEMP_MASTER_25012006232119 Creating COLUMN Table 1: COL1 Creating COLUMN Table 2: COL2 Creating COLUMN Table 3: COL3 ---------------------------------------------------------------------------------------------------- The output below lists each column in the table and the number of blocks/rows and space used when the table data is ordered by only that column, or in the case where a prefix has been specified, where the table data is ordered by the prefix and then that column. From this one can determine if there is a specific ORDER BY which can be applied to to the data in order to maximise compression within the table whilst, in the case of a a prefix being present, ordering data as efficiently as possible for the most common access path(s). ---------------------------------------------------------------------------------------------------- NAME COLUMN BLOCKS ROWS SPACE_GB ============================== ============================== ============ ============ ======== TEMP_COL_001_25012006232119 COL1 290 10000 .0022 TEMP_COL_002_25012006232119 COL2 345 10000 .0026 TEMP_COL_003_25012006232119 COL3 555 10000 .0042

Data Warehousing Specifics Star Schema compresses better than Normalized More redundant data Focus on… Fact Tables and Summaries in Star Schema Transaction tables in Normalized Schema Performance Impact 1 Space Savings Star schema: 67% Normalized: 24% Query Elapsed Times Star schema: 16.5% Normalized: 10% 1 - Table Compression in Oracle 9iR2: A Performance Analysis

Things To Watch Out For DROP COLUMN is awkward ORA-39726: Unsupported add/drop column operation on compressed tables Uncompress the table and try again - still gives ORA-39726! After UPDATEs data is uncompressed Performance impact Row migration Use appropriate physical design settings PCTFREE 0 - pack each block Large blocksize - reduce overhead / increase repeats per block

PGA Memory: What For ? Sorts Standard sorts [SORT] Buffer [BUFFER] Group By [GROUP BY (SORT)] Connect By [CONNECT-BY (SORT)] Rollup [ROLLUP (SORT)] Window [WINDOW (SORT)] Hash Joins [HASH-JOIN] Indexes Maintenance [IDX MAINTENANCE SOR] Bitmap Merge [BITMAP MERGE] Bitmap Create [BITMAP CREATE] Write Buffers [LOAD WRITE BUFFERS] Serial Process PGA Dedicated Server Cursors Variables Sort Area [] V$SQL_WORKAREA.OPERATION_TYPE

PGA Memory Management: Manual The “old” way of doing things Still available though – even in 10g R2 Configuring ALTER SESSION SET WORKAREA_SIZE_POLICY=MANUAL; Initialisation parameter: WORKAREA_SIZE_POLICY=MANUAL Set memory parameters yourself HASH_AREA_SIZE SORT_AREA_SIZE SORT_AREA_RETAINED_SIZE BITMAP_MERGE_AREA_SIZE CREATE_BITMAP_AREA_SIZE Optimal values depend on the type of work 1 One size does not fit all! 1 - Richmond Shee: If Your Memory Serves You Right

PGA Memory Management: Automatic The “new” way from 9i R1 Default OFF in 9i R1/R2 Enabled by setting at session/instance level: WORKAREA_SIZE_POLICY=AUTO PGA_AGGREGATE_TARGET > 0 Default ON since 10g R1 Oracle dynamically manages the available memory to suit the workload But of course, it’s not perfect! Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005

Auto PGA Parameters: Pre 10gR2 WORKAREA_SIZE_POLICY Set to AUTO PGA_AGGREGATE_TARGET The target for summed PGA across all processes Can be exceeded if too small Over Allocation _PGA_MAX_SIZE Target maximum PGA size for a single process Default is a fixed value of 200Mb Hidden / Undocumented Parameter Usual caveats apply

Auto PGA Parameters : Pre 10gR2 _SMM_MAX_SIZE Limit for a single workarea operation for one process Derived Default LEAST(5% of PGA_AGGREGATE_TARGET , 50% of _PGA_MAX_SIZE) Hits limit of 100Mb When PGA_AGGREGATE_TARGET is >= 2000Mb And _PGA_MAX_SIZE is left at default of 200Mb Hidden / Undocumented Parameter Usual caveats apply

Auto PGA Parameters : Pre 10gR2 _SMM_PX_MAX_SIZE Limit for all the parallel slaves of a single workarea operation Derived Default 30% of PGA_AGGREGATE_TARGET Hidden / Undocumented Parameter Usual caveats apply Parallel slaves still limited _SMM_MAX_SIZE Impacts only when… PGA_AGGREGATE_TARGET: 3000Mb _PGA_MAX_SIZE = 200Mb _SMM_MAX_SIZE = 100Mb _SMM_PX_MAX_SIZE = 900Mb Session 1 100Mb Session 2 100Mb Session 3 100Mb Session 4 100Mb Session 5 100Mb Session 6 100Mb Session 7 100Mb Session 8 100Mb Session 9 75Mb Session 10 75Mb Session 11 75Mb Session 12 75Mb Session 1 75Mb Session 2 75Mb Session 3 75Mb Session 4 75Mb Session 5 75Mb Session 6 75Mb Session 7 75Mb Session 8 75Mb

10gR2 Improvements _SMM_MAX_SIZE now the driver More advanced algorithm _PGA_MAX_SIZE = 2 * _SMM_MAX_SIZE Parallel operations _SMM_PX_MAX_SIZE = 50% * PGA_AGGREGATE_TARGET When DOP <=5 then _smm_max_size is used When DOP > 5 _smm_px_max_size / DOP is used Jože Senegačnik - Advanced Management Of Working Areas In Oracle 9i/10g, presented at UKOUG 2005 10% * PGA_AGGREGATE_TARGET 1000Mb + 100Mb 500Mb – 1000Mb 20% * PGA_AGGREGATE_TARGET <= 500Mb _SMM_MAX_SIZE PGA_AGGREGATE_TARGET

PGA Target Advisor select trunc(pga_target_for_estimate/1024/1024) pga_target_for_estimate , to_char(pga_target_factor * 100,'999.9') ||'%' pga_target_factor , trunc(bytes_processed/1024/1024) bytes_processed , trunc(estd_extra_bytes_rw/1024/1024) estd_extra_bytes_rw , to_char(estd_pga_cache_hit_percentage,'999') || '%' estd_pga_cache_hit_percentage , estd_overalloc_count from v$pga_target_advice / PGA Target For PGA Tgt Estimated Extra Estimated PGA Estimated Estimate Mb Factor Bytes Processed Bytes Read/Written Cache Hit % Overallocation Count -------------- ------- ---------------- ------------------ --------------- -------------------- 5,376 12.5% 5,884,017 7,279,799 45% 113 10,752 25.0% 5,884,017 3,593,510 62% 8 21,504 50.0% 5,884,017 3,140,993 65% 0 32,256 75.0% 5,884,017 3,104,894 65% 0 43,008 100.0% 5,884,017 2,300,826 72% 0 51,609 120.0% 5,884,017 2,189,160 73% 0 60,211 140.0% 5,884,017 2,189,160 73% 0 68,812 160.0% 5,884,017 2,189,160 73% 0 77,414 180.0% 5,884,017 2,189,160 73% 0 86,016 200.0% 5,884,017 2,189,160 73% 0 129,024 300.0% 5,884,017 2,189,160 73% 0 172,032 400.0% 5,884,017 2,189,160 73% 0 258,048 600.0% 5,884,017 2,189,160 73% 0

Beware Of Temporal Data Affecting The Optimizer Slowly Changing Dimensions Cover ranges of time “ From” and “To” DATE columns define applicability Need BETWEEN operator to retrieve rows for a reporting point in time SELECT * FROM d_customer WHERE ’15/01/2005’ BETWEEN valid_from AND valid_to Month 1 1 st Jan, 2004 Month 2 1 st Feb, 2004 CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE 487438 Jeff Moss I & C 839398 Mark Rittman SME D_CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE VALID_FROM VALID_TO 487438 Jeff Moss SME 01/01/2004 31/01/2004 487438 Jeff Moss I & C 01/02/2004 839398 Mark Rittman SME 01/02/2004 CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE 487438 Jeff Moss SME D_CUSTOMER CUSTOMER_ID NAME CUSTOMER_TYPE VALID_FROM VALID_TO 487438 Jeff Moss SME 01/01/2004

Dependent Predicates When multiple predicates exist, individual selectivities are combined using standard probability math 1 : P1 AND P2 S(P1 & P2) = S(P1) * S(P2) P1 OR P2 S(P1 | P2) = S(P1) + S(P2) – [S(P1) * S(P2)] Only valid if the predicates are independent otherwise… Incorrect selectivity estimate Incorrect cardinality estimate Potentially suboptimal execution plan BETWEEN is multiple predicates! Also known as Correlated Columns 2 1 – Wolfgang Breitling, Fallacies Of The Cost Based Optimizer 2 – Jonathan Lewis, Cost-Based Oracle Fundamentals, Chapter 6

Some Test Tables… Consider these 3 test tables… 12 records in an SCD type table TEST_12_DISTINCT_TD TEST_2_DISTINCT_TD TEST_1_DISTINCT_TD

Optimizer Gets Incorrect Cardinality select * from test_1_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 1 Jeff 01-JAN-05 31-DEC-05 2 Mark 01-FEB-05 31-DEC-05 3 Doug 01-MAR-05 31-DEC-05 4 Niall 01-APR-05 31-DEC-05 5 Tom 01-MAY-05 31-DEC-05 6 Jonathan 01-JUN-05 31-DEC-05 7 Lisa 01-JUL-05 31-DEC-05 8 Cary 01-AUG-05 31-DEC-05 9 Mogens 01-SEP-05 31-DEC-05 10 Anjo 01-OCT-05 31-DEC-05 10 rows selected. Execution Plan ---------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 11 | 264 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_1_DISTINCT_TD | 11 | 264 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------

…And Again select * from test_2_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 7 Lisa 01-JUL-05 31-DEC-05 8 Cary 01-AUG-05 31-DEC-05 9 Mogens 01-SEP-05 31-DEC-05 10 Anjo 01-OCT-05 31-DEC-05 4 rows selected. Execution Plan ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 11 | 264 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_2_DISTINCT_TD | 11 | 264 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------

… And Again select * from test_12_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 10 Anjo 01-OCT-05 31-OCT-05 1 row selected. Execution Plan ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4 | 96 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_12_DISTINCT_TD | 4 | 96 | 3 (0)| 00:00:01 | -----------------------------------------------------------------------------------------

Workarounds Ignore it If your query still gets the right plan of course! Hints Force the optimizer to do as you tell it Stored outlines Adjust statistics held against the table Affects any SQL that accesses that object Optimizer Profile (10g) Offline Optimisation 1 Dynamic sampling level 4 or above Samples “ single table predicates that reference 2 or more columns ” Takes extra time during the parse – minimal but often worth it 1 - Jonathan Lewis: Cost-Based Oracle Fundamentals, Chapter 2

Dynamic Sampling With A Hint select /*+ dynamic_sampling(test_1_distinct_td,4) */ * from test_1_distinct_td where to_date('09-OCT-2005','DD-MON-YYYY') between from_date and to_date; KEY NON_KEY_AT FROM_DATE TO_DATE ---------- ---------- --------- --------- 1 Jeff 01-JAN-05 31-DEC-05 2 Mark 01-FEB-05 31-DEC-05 3 Doug 01-MAR-05 31-DEC-05 4 Niall 01-APR-05 31-DEC-05 5 Tom 01-MAY-05 31-DEC-05 6 Jonathan 01-JUN-05 31-DEC-05 7 Lisa 01-JUL-05 31-DEC-05 8 Cary 01-AUG-05 31-DEC-05 9 Mogens 01-SEP-05 31-DEC-05 10 Anjo 01-OCT-05 31-DEC-05 10 rows selected. Execution Plan ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 240 | 3 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| TEST_1_DISTINCT_TD | 10 | 240 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------

Find Out Where Your Query Is At Data Warehouses are big, big, BIG! Big on rows Big on disk storage Big on hardware Big SQL statements issued Lots of data to scan, join and sort Many operations Long running So where is my long running query at ? No solid answers here, just food for thought…

A “Big” Query Execution Plan | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 124 | | 49722 (10)| | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ20006 | 1 | 124 | | 49722 (10)| | 3 | HASH JOIN | | 1 | 124 | | 49722 (10)| | 4 | BUFFER SORT | | | | | | | 5 | PX RECEIVE | | 207K| 9510K| | 25982 (9)| | 6 | PX SEND BROADCAST | :TQ20000 | 207K| 9510K| | 25982 (9)| | 7 | VIEW | | 207K| 9510K| | 25982 (9)| | 8 | WINDOW SORT | | 207K| 10M| 26M| 25982 (9)| | 9 | MERGE JOIN | | 207K| 10M| | 25976 (9)| | 10 | TABLE ACCESS BY INDEX ROWID| AML_T_ANALYSIS_DATE | 1 | 22 | | 2 (0)| | 11 | INDEX UNIQUE SCAN | AML_I_ANL_PK | 1 | | | 0 (0)| | 12 | SORT AGGREGATE | | 1 | 9 | | | | 13 | PX COORDINATOR | | | | | | | 14 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 9 | | | | 15 | SORT AGGREGATE | | 1 | 9 | | | | 16 | PX BLOCK ITERATOR | | 1 | 9 | | 2 (0)| | 17 | TABLE ACCESS FULL | AML_T_ANALYSIS_DATE | 1 | 9 | | 2 (0)| | 18 | FILTER | | | | | | | 19 | FILTER | | | | | | | 20 | TABLE ACCESS FULL | AML_T_BILLING_ACCOUNT_DIM| 82M| 2371M| | 5457 (5)| | 21 | HASH JOIN | | 18M| 1340M| | 23704 (10)| | 22 | HASH JOIN | | 10M| 500M| | 17005 (11)| | 23 | PX RECEIVE | | 10M| 265M| | 11304 (14)| | 24 | PX SEND HASH | :TQ20003 | 10M| 265M| | 11304 (14)| | 25 | BUFFER SORT | | 1 | 124 | | | | 26 | VIEW | AML_V_MD_CUH_SID | 10M| 265M| | 11304 (14)| | 27 | HASH JOIN | | 10M| 337M| | 11304 (14)| | 28 | PX RECEIVE | | 17M| 310M| | 5228 (18)| | 29 | PX SEND HASH | :TQ20001 | 17M| 310M| | 5228 (18)| | 30 | PX BLOCK ITERATOR | | 17M| 310M| | 5228 (18)| | 31 | TABLE ACCESS FULL | AML_T_MEASURE_DIM | 17M| 310M| | 5228 (18)| | 32 | PX RECEIVE | | 34M| 461M| | 5958 (10)| | 33 | PX SEND HASH | :TQ20002 | 34M| 461M| | 5958 (10)| | 34 | PX BLOCK ITERATOR | | 34M| 461M| | 5958 (10)| | 35 | TABLE ACCESS FULL | AML_T_CUSTOMER_DIM | 34M| 461M| | 5958 (10)| | 36 | PX RECEIVE | | 55M| 1212M| | 5562 (3)| | 37 | PX SEND HASH | :TQ20004 | 55M| 1212M| | 5562 (3)| | 38 | PX BLOCK ITERATOR | | 55M| 1212M| | 5562 (3)| | 39 | TABLE ACCESS FULL | AML_T_CUSTOMER_DIM | 55M| 1212M| | 5562 (3)| | 40 | PX RECEIVE | | 94M| 2516M| | 6483 (5)| | 41 | PX SEND HASH | :TQ20005 | 94M| 2516M| | 6483 (5)| | 42 | PX BLOCK ITERATOR | | 94M| 2516M| | 6483 (5)| | 43 | MAT_VIEW ACCESS FULL | AML_M_CD_BAD | 94M| 2516M| | 6483 (5)| Sorts Aggregations Hash joins Merge joins Table scans Materialized View scans Analytics Parallel Query Pruning Temp Space Use

V$ Views To The Rescue ? V$SESSION – Identify your session V$SQL_PLAN – Get the execution plan operations V$SQL_WORKAREA – Get all the work areas which will be required V$SESSION_LONGOPS – Get information on long plan operations V$SQL_WORKAREA_ACTIVE – Get the work area(s) being used right now V$SESSION SID SERIAL# PROGRAM USERNAME SQL_ID SQL_CHILD_NUMBER SQL_ADDRESS SQL_HASH_VALUE V$SQL_PLAN SQL_ID CHILD_NUMBER ADDRESS HASH_VALUE OPERATION ID PARENT_ID V$SESSION_LONGOPS SID SERIAL# OPNAME TARGET MESSAGE SQL_ID SQL_ADDRESS SQL_HASH_VALUE ELAPSED_SECONDS V$SQL_WORKAREA_ACTIVE SQL_ID SQL_HASH_VALUE WORKAREA_ADDRESS OPERATION_ID OPERATION_TYPE POLICY SID QCSID ACTIVE_TIME V$SQL_WORKAREA SQL_ID CHILD_NUMBER WORKAREA_ADDRESS OPERATION_ID OPERATION_TYPE

Problems V$SQL_PLAN Bug Service Request: 4990863.992 Broken in 10gR1, Works in 10gR2 PARENT_ID corruption Can’t link rows in this view to their parents as the values are corrupted due to this bug Shows up in TEMP TABLE TRANSFORMATION operations Multiple Work Areas can be active…or None Some operations are not shown in Long ops V$SESSION sql_id may not be the executing cursor E.g. for refreshing Materialized View * Test case for bug: http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql

References: Papers Table Compression in Oracle 9iR2: A Performance Analysis Table Compression in Oracle 9iR2: An Oracle White Paper “Fallacies Of The Cost Based Optimizer”, Wolfgang Breitling “ Scaling To Infinity, Partitioning In Oracle Data Warehouses”, Tim Gorman Advanced Management Of Working Areas in Oracle 9i/10g, UKOUG 2005, Joze Senegacnik Oracle9i Memory Management: Easier Than Ever, Oracle Open World 2002 , Sushil Kumar Working with Automatic PGA , Christo Kutrovsky Optimising Oracle9i Instance Memory, Ramaswamy, Ramesh Oracle Metalink Note 223730.1 : Automatic PGA Memory Managment in 9i Oracle Metalink Note 147806.1 : Oracle9i New Feature: Automated SQL Execution Memory Management Oracle Metalink Note 148346.1 : Oracle9i Monitoring Automated SQL Execution Memory Management Memory Management and Latching Improvements in Oracle Database 9i and 10g , Oracle Open World 2005, Tanel Pőder If Your Memory Serves You Right… , IOUG Live! 2004, April 2004, Toronto, Canada, Richmond Shee Decision Speed: Table Compression In Action

References: Online Presentation / Code http://www.oramoss.demon.co.uk/presentations/fivetuningtipsforyourdatawarehouse.ppt http://www.oramoss.demon.co.uk/Code/mgmt_p_get_max_compression_order.prc http://www.oramoss.demon.co.uk/Code/test_dml_performance_delete.sql http://www.oramoss.demon.co.uk/Code/test_dml_performance_insert.sql http://www.oramoss.demon.co.uk/Code/test_dml_performance_update.sql http://www.oramoss.demon.co.uk/Code/test_error_v_sql_plan.sql htt p://www.oramoss.demon.co.uk/Code/run_big_query.sql htt p://www.oramoss.demon.co.uk/Code/run_big_query_parallel.sql htt p://www.oramoss.demon.co.uk/Code/get_query_progress.sql

Five Tuning Tips For Your Datawarehouse

More Related Content

What's hot (20)

Similar to Five Tuning Tips For Your Datawarehouse (20)

Recently uploaded (20)

Five Tuning Tips For Your Datawarehouse