Nested loop join technique - part2

Table of Contents
Background
Test Recipes
Test Cases and Results
It’s Number Time
Conclusion
References

Nested Loop Join Technique – Part 2 (What’s the new thing in 11g?)
Background
Oracle introduced several improvements in 11g to optimize the Nested Loop Join (NLJ). In addition, a new join technique, known as table batching, was introduced. Its impact is as good as that of table prefetching (introduced in 9i), which we saw in part 1 of this series. To be precise, the new technique works more efficiently with a sorted non-unique index (with a unique index, we won’t see much difference between the 3 techniques). In most cases, we will no longer see the classic NLJ in an 11g execution plan unless we request it explicitly with a SQL hint. In this exercise, we will look at the improvements Oracle made in 11g and at the difference between the batching and prefetching techniques.
Before we move forward, let’s look at how the execution plan differs between the 3 techniques (classic, prefetching and batching) and how we instruct Oracle to use each of them (details are below). With the batching technique, Oracle creates 2 NLJs. The first NLJ (the inner one) joins the outer (driving) table with the available index of the inner (or, if I may call it so, driven) table. The second NLJ joins that result with the inner table. What confuses me is that the second NLJ carries no information about cost, rows, etc. It looks like the second NLJ is created for clarity (to make the plan more readable compared to prefetching), even though the internal mechanism is different; we will see the details in the attached XLS file.
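The two-step shape of the batching plan can be pictured with a small conceptual model. This is only a sketch: Python dicts stand in for the index and the table, the names `classic_nlj`, `batching_nlj`, `inner_index` and `inner_table` are made up for illustration, and nothing here claims to reproduce how Oracle batches the table visits internally.

```python
# Conceptual model only: dicts stand in for Oracle structures.
# inner_index maps a join key to a list of rowids,
# inner_table maps a rowid to the row stored there.

def classic_nlj(driver_rows, index, table):
    """Single loop: probe the index, then visit the table, row by row."""
    result = []
    for d in driver_rows:
        for rowid in index.get(d["id"], []):
            result.append((d, table[rowid]))
    return result


def batching_nlj(driver_rows, index, table):
    """Two loops, mirroring the two NLJ steps shown in a batching plan."""
    # First (inner) NLJ: driving table joined to the index only.
    pairs = [(d, rowid)
             for d in driver_rows
             for rowid in index.get(d["id"], [])]
    # Second NLJ: the collected rowids are used to visit the inner table.
    return [(d, table[rowid]) for d, rowid in pairs]


# Hypothetical sample data.
driver = [{"id": 1}, {"id": 2}, {"id": 3}]
inner_table = {"r1": {"id": 1, "val": "a"},
               "r2": {"id": 2, "val": "b"},
               "r3": {"id": 2, "val": "c"}}
inner_index = {1: ["r1"], 2: ["r2", "r3"]}

classic_result = classic_nlj(driver, inner_index, inner_table)
batching_result = batching_nlj(driver, inner_index, inner_table)
```

Both functions return the same rows; only the order of the work differs, which is the point the plan shape is making.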
Classic Technique

Prefetching Technique

Batching Technique

Test Recipes
As a starting point, 5 tables will be created with 10,000 rows each and exactly 10 rows per block, using the “MINIMIZE RECORDS_PER_BLOCK” command. The purpose is to get round numbers that are easy to reason about. In addition to those tables, 4 indexes will be created, one on each of the 4 inner tables (every table except DRIVER). Each index will have BLEVEL=2 (I have to use PCTFREE=99 to force it), so the index height is 3 (ROOT -> BRANCH -> LEAF). These are exactly the same steps I followed during the 10g testing.
1. DRIVER, driving (outer) table.

2. T_UNIQ_SORTED, inner table with unique index on ID column and sorted data.
3. T_UNIQ_UNSORTED, inner table with unique index on ID column and scattered (randomly ordered) data.
4. T_NON_UNIQ_SORTED, inner table with non-unique index on ID column and sorted data.
5. T_NON_UNIQ_UNSORTED, inner table with non-unique index on ID column and scattered
data.
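The numbers implied by this setup can be worked out in a few lines. The sketch below only restates the figures given above (10,000 rows, 10 rows per block, BLEVEL=2); the constant and function names are illustrative, and `block_of` assumes rows are stored one after another in insert order.

```python
ROWS_PER_TABLE = 10_000
ROWS_PER_BLOCK = 10      # enforced in the test with MINIMIZE RECORDS_PER_BLOCK
BLEVEL = 2               # as described for the test indexes

# Number of data blocks each table occupies.
data_blocks = ROWS_PER_TABLE // ROWS_PER_BLOCK

# Index height is BLEVEL + 1: ROOT, BRANCH and LEAF levels.
index_height = BLEVEL + 1


def block_of(row_position):
    """Block number holding the row at a given 0-based position (assumed layout)."""
    return row_position // ROWS_PER_BLOCK
```

With these values, `data_blocks` is 1,000 and `index_height` is 3, which is why a single index probe touches three index blocks before it reaches the table.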

Attached file: Create Tables and Indexes.LST

Test Cases and Results
To make a “fair enough” comparison, I follow these steps in this exercise. The idea is to put as many blocks as possible in the buffer cache to minimize or completely remove physical I/O.
1. Flush buffer_cache
2. Warm up the buffer by:
a. Selecting all data from the outer table, DRIVER (full table scan)
b. Scanning the inner table using index access (full index scan)
3. Begin snapper process from a separate session
4. Execute each test case with event 10046 turned on to trace SQL wait events and event 10200 turned on to dump consistent gets activity.

5. End snapper process
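The begin/end steps amount to a before-and-after snapshot of session statistics, where only the difference is attributed to the test case. A minimal sketch of that idea follows; `stat_delta` is a hypothetical helper and the snapshot values are invented for illustration, they are not measurements from this test.

```python
def stat_delta(before, after):
    """Return after - before for every statistic present in the end snapshot."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Hypothetical snapshot values, for illustration only.
before = {"consistent gets": 1_200, "physical reads": 50}
after = {"consistent gets": 35_200, "physical reads": 50}

delta = stat_delta(before, after)
```

A `physical reads` delta of 0 in such a comparison would indicate that the warm-up step did its job and the test case ran entirely from the buffer cache.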

Attached file: DBA series - Nested Loop Join Technique - part2.xlsx

It’s Number Time
The table below gives us enough information to see that there is only a small difference between the 3 techniques in 11g. Oracle made an optimization at the code level that affects all 3 techniques. This is not the case in 10g, where we can see differences in a few statistics. Let’s have a quick look at the session statistics below:
1. The results for all 3 techniques in 11g are equal or very close to the result of prefetching in 10g.
2. “consistent gets” is reduced from 42,000 (10g classic method) to 34,000 (10g prefetching and all techniques in 11g). The same happened for “cache buffers chains”.
3. Even though the “consistent gets” related statistics are similar to each other, “sql execute elapsed time” varies in 11g. The batching technique is the fastest while the classic NLJ is the slowest (it is slower than in 10g as well).
4. In 11g, “buffer is pinned count” varies, and the prefetching technique is able to pin more buffers than the other 2 techniques.
5. The new “consistent gets from cache (fastpath)” statistic in 11g is related to the system parameter “_fastpin_enable” (defaults to 1 in 11g). This parameter controls how Oracle handles repeated access to a particular buffer for optimization.
I ran each of these exercises only once, so I might have missed something here. You are always welcome to rerun them and share the results with me.

Apart from that optimization, I am going to highlight one more statistic that also affects total consistent gets: “SQL*Net roundtrips to/from client”. During the 10g and 11g tests, this statistic always gave similar results. If we look further, we see the same number of consistent gets for the ROOT index block. That number (668) is close to the value of “SQL*Net roundtrips to/from client”. The details are below.

So, where does the 668 come from? It is related to the array size in sqlplus. During the test, I used the default array size, which is 15 in my test environment.
Since I have 10,000 records in my table and the array size is 15, Oracle has to send the result set in:
ceil(10,000 / 15) = 667 times
But there is 1 extra in the result. Don’t worry, it is common in the Oracle world to find a “plus one, +1” in a calculation (for example, see the “_table_scan_cost_plus_one” parameter), so 667 + 1 = 668.
When I reran the test with an array size of 100, it gave me the result below.

Simple calculation: ceil(10,000 / 100) + 1 = 100 + 1 = 101. So we also need to consider the array size or fetch size (when we have bulk operations), since it affects the number of consistent gets.
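The same arithmetic can be written as a small helper for trying other array sizes. This is just the formula from the text, ceil(rows / array size) + 1; the function name `expected_roundtrips` is illustrative, and the "+1" is taken from the observation above rather than derived.

```python
import math


def expected_roundtrips(total_rows, array_size):
    """ceil(total_rows / array_size) plus the one extra roundtrip observed in the test."""
    return math.ceil(total_rows / array_size) + 1


default_case = expected_roundtrips(10_000, 15)    # sqlplus default array size in this test
larger_array = expected_roundtrips(10_000, 100)   # rerun with array size 100
```

For the two cases in this exercise the helper gives 668 and 101, matching the numbers seen in the session statistics.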

Conclusion
1. Table prefetching brings a significant improvement for non-unique indexes in a nested loop join.
2. Oracle made some improvements in 11g and also introduced a new technique for NLJ, which makes a huge difference compared to 10g.
3. In 11g, Oracle introduces a new statistic, “consistent gets from cache (fastpath)”, which is related to the system parameter “_fastpin_enable” (defaults to 1 in 11g). This parameter controls how Oracle handles repeated access to a particular buffer for optimization. However, this optimization pays off mainly for sorted data. For scattered data, Oracle won’t be able to optimize much, because consecutive values might be stored in different data blocks. So an index with a low clustering_factor will benefit from this optimization.
4. Consider the array size or fetch size, since it affects the number of consistent gets.
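The clustering_factor remark in point 3 can be illustrated with a simplified model: walk the keys in index order and count how often the table block changes. The sketch below uses the test's dimensions (10,000 rows, 10 rows per block); the function and variable names are made up, and the shuffled layout is only an assumed stand-in for the "scattered" tables.

```python
import random


def clustering_factor(block_ids_in_key_order):
    """Count table block changes while walking the index in key order."""
    changes = 0
    previous = None
    for block in block_ids_in_key_order:
        if block != previous:
            changes += 1
            previous = block
    return changes


ROWS, ROWS_PER_BLOCK = 10_000, 10

# Sorted data: the row with the n-th key is stored at position n.
sorted_blocks = [pos // ROWS_PER_BLOCK for pos in range(ROWS)]

# Scattered data: keys are stored at random positions (fixed seed for repeatability).
rng = random.Random(42)
positions = list(range(ROWS))
rng.shuffle(positions)
scattered_blocks = [pos // ROWS_PER_BLOCK for pos in positions]

cf_sorted = clustering_factor(sorted_blocks)
cf_scattered = clustering_factor(scattered_blocks)
```

In this model the sorted layout gives a clustering factor equal to the number of blocks (1,000), while the scattered layout comes out close to the number of rows, so consecutive keys almost never share a block and there is little repeated buffer access to optimize.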

References
http://hoopercharles.wordpress.com/2011/01/24/watching-consistent-gets-10200-tracefile-parser/
http://dioncho.wordpress.com/2010/08/16/batching-nlj-optimization-and-ordering/
http://blog.tanelpoder.com/2013/02/18/manual-before-and-after-snapshot-support-in-snapper-v4/
-heri-
