Histograms: Pre-12c & Now
Anju Garg, Corporate Trainer
Histograms are used by the optimizer to compute the selectivity of filter and
join predicates in case of skewed data distribution. Prior to 12c, two types
of histograms could be created: frequency histograms and height-balanced
histograms. 12c introduces top frequency and hybrid histograms which are
designed to overcome the limitations of their precursors. This article discusses
the need for histograms, the interpretation of various types of histograms and
the evolution of histograms from 11g to 12c.
Need for Histograms
When a SQL statement is issued, the optimizer generates the
execution plan it considers optimal, based on the information available
to it. If data is uniformly distributed across the values in a
column and table statistics have been gathered, the optimizer
estimates cardinality (row count) accurately and makes correct
decisions about the access method, join order and join
method to be used. But if the data distribution is skewed, the
optimizer might misestimate the cardinality
and choose a bad execution plan. For example, consider a table
HR.HIST having a skewed data distribution in column ID, as
shown in Figure 1.1.
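The effect of skew can be illustrated with a small Python sketch. It is not part of the original article: the row counts per ID are the ones reported later in Table 2.2, and the uniform estimate (rows divided by distinct values) is the usual assumption an optimizer makes when no histogram exists, stated here as an assumption rather than as Oracle's exact internal formula.

```python
# Row counts per ID in HR.HIST, as decoded from the frequency histogram in Table 2.2.
COUNTS = [4, 2, 1, 2, 1, 2, 3, 50, 3, 2, 6, 6, 6, 3, 5, 3, 3, 1, 1, 5, 2, 1, 1, 2, 3, 2]
FREQ = dict(zip(range(1, 27), COUNTS))  # ID -> number of rows

num_rows = sum(FREQ.values())
num_distinct = len(FREQ)

# Assumption: without a histogram every distinct value is treated as equally likely.
uniform_estimate = num_rows / num_distinct

print(f"rows = {num_rows}, NDV = {num_distinct}")
print(f"uniform estimate per value: {uniform_estimate:.1f} rows")
print(f"actual rows for ID = 8: {FREQ[8]}")
print(f"actual rows for ID = 3: {FREQ[3]}")
```

Under this assumption ID = 8 is underestimated by roughly a factor of ten, while ID = 3 is overestimated, which is exactly the kind of error a histogram is meant to remove.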
Pre-12c Histograms
Prior to Oracle 12c, two types of histograms could be created (as
shown in Figure 1.2):
-	 Frequency histograms
-	 Height-balanced histograms
FIGURE 1.1
FIGURE 1.2
12 www.ukoug.org
SUMMER 15
Technology: Anju Garg
OracleScene
D I G I T A L
Frequency Histograms
A frequency histogram is a frequency distribution which records each distinct value and its exact cardinality. A frequency
histogram is created when
-	 Requested no. of buckets (Nb) >= No. of distinct values (NDV) and
-	 NDV <= 254 (2,048 in 12c).
A frequency histogram with 26 buckets for the ID column can be created as follows:
TABLE 2.1
SQL> exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID', cascade => true);
SQL> select table_name, column_name, histogram, num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS
---------- --------------- --------------- ------------ -----------
HIST ID FREQUENCY 26 26
The histogram can be viewed in DBA_HISTOGRAMS (as shown in Table 2.2).
SQL> select ENDPOINT_VALUE, ENDPOINT_NUMBER
from dba_histograms
where table_name = 'HIST'
and column_name = 'ID';
ENDPOINT_VALUE ENDPOINT_NUMBER
-------------- ---------------
1 4
2 6
3 7
4 9
5 10
6 12
7 15
8 65
9 68
10 70
11 76
12 82
13 88
14 91
15 96
16 99
17 102
18 103
19 104
20 109
21 111
22 112
23 113
24 115
25 118
26 120
TABLE 2.2
FIGURE 2.2
Interpreting Frequency Histogram
It can be seen from Table 2.2 that a frequency histogram with 26 buckets, one for each distinct value, has been created.
•	 ENDPOINT_VALUE - The value in a bucket.
•	 ENDPOINT_NUMBER - Cumulative frequency, i.e. the number of rows with a value less than or equal to ENDPOINT_VALUE.
Thus, the exact count for each distinct value is the difference between its ENDPOINT_NUMBER and that of the preceding value. For example,
ID = 8 has 65 - 15 = 50 rows, so the optimizer makes an accurate estimate of 50 rows for ID = 8 and chooses a full table scan (FTS), as desired (Table 2.3), even though column ID is indexed.
TABLE 2.3
SQL> explain plan for select * from hr.hist where id = 8;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 538080257
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50 | 50200 | 7 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| HIST | 50 | 50200 | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------
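The decoding of cumulative frequencies described above can be sketched in Python. This is an illustration only, not Oracle code: the list below is copied from Table 2.2, and the function name is a hypothetical helper introduced for this example.

```python
# (ENDPOINT_VALUE, ENDPOINT_NUMBER) pairs copied from Table 2.2.
HISTOGRAM = [
    (1, 4), (2, 6), (3, 7), (4, 9), (5, 10), (6, 12), (7, 15), (8, 65),
    (9, 68), (10, 70), (11, 76), (12, 82), (13, 88), (14, 91), (15, 96),
    (16, 99), (17, 102), (18, 103), (19, 104), (20, 109), (21, 111),
    (22, 112), (23, 113), (24, 115), (25, 118), (26, 120),
]


def decode_frequency_histogram(rows):
    """Turn cumulative ENDPOINT_NUMBERs into a {value: row count} mapping."""
    counts = {}
    previous = 0
    for value, cumulative in sorted(rows, key=lambda row: row[1]):
        counts[value] = cumulative - previous  # rows holding exactly this value
        previous = cumulative
    return counts


counts = decode_frequency_histogram(HISTOGRAM)
print("ID = 8 :", counts[8])
print("ID = 15:", counts[15])
print("ID = 3 :", counts[3])
```

The count for ID = 8 agrees with the 50 rows shown in the plan of Table 2.3.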
Thus, prior to 12c, frequency histograms could be used to accurately estimate the frequencies only if NDV <= 254.
Height-balanced Histograms
A height-balanced histogram is created if NDV > 254 or Nb < NDV. This histogram distributes the rows evenly across
all histogram buckets, so all buckets hold almost exactly the same number of rows. A height-balanced histogram is much less
precise and can’t really capture information about more than 127 popular values.
To create a height-balanced histogram, specify no. of buckets = 20 (< NDV (= 26)):
DB11g> exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID size 20', cascade => true);
It can be seen that a height-balanced histogram has been created, as No. of buckets (20) < NDV (26) (Table 2.4).
DB11g> select table_name, column_name, histogram, num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS
---------- --------------- --------------- ------------ -----------
HIST ID HEIGHT BALANCED 26 20
TABLE 2.4
The height-balanced histogram that has been created can be viewed in Table 2.5.
DB11g> select ENDPOINT_VALUE, ENDPOINT_NUMBER
from dba_histograms
where table_name = 'HIST'
and column_name = 'ID';
ENDPOINT_VALUE ENDPOINT_NUMBER
-------------- --------------
1 0
2 1
6 2
8 10
9 11
11 12
12 13
13 14
14 15
15 16
17 17
20 18
24 19
26 20
14 rows selected.
TABLE 2.5 FIGURE 2.3
Interpreting Height-balanced Histogram
•	 Bucket size = Total no. of rows / Nb = 120 / 20 = 6
•	 ENDPOINT_NUMBER - A number uniquely identifying a bucket
•	 For the bucket with ENDPOINT_NUMBER = 0, ENDPOINT_VALUE = the lowest value (1 here)
•	 For buckets with ENDPOINT_NUMBER > 0, ENDPOINT_VALUE = the largest value stored in that bucket
Note that when storing the histogram, Oracle doesn’t store repetitions of end point values. If multiple consecutive buckets
have the same end point, only one entry is stored, with the highest end point number. For example, 8 buckets (3 to 10)
have the value 8 as their end point. The histogram stores only one entry for them, with the highest ENDPOINT_NUMBER, i.e. 10.
The optimizer decides the popularity of a value by the number of buckets having that value as their end point. Since value 8 is the
end point of multiple buckets, it is considered a popular value. The cardinality of a popular value is derived as the product of
bucket size and the number of buckets having the value as their end point. For example, cardinality for value 8 = no. of buckets
having 8 as end point * bucket size, i.e. 8 * 6 = 48 (actual = 50).
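The bucket arithmetic above can be reproduced with a short Python model. It is a sketch under stated assumptions, not Oracle's implementation: the row counts come from Table 2.2, the total row count is assumed to divide evenly by the number of buckets (as it does here), and both function names are hypothetical.

```python
# Row counts per ID in HR.HIST, as decoded from Table 2.2.
COUNTS = [4, 2, 1, 2, 1, 2, 3, 50, 3, 2, 6, 6, 6, 3, 5, 3, 3, 1, 1, 5, 2, 1, 1, 2, 3, 2]
FREQ = dict(zip(range(1, 27), COUNTS))


def height_balanced(freq, num_buckets):
    """Return (bucket_size, stored) where stored lists (ENDPOINT_VALUE, ENDPOINT_NUMBER)."""
    rows = sorted(value for value, n in freq.items() for _ in range(n))
    bucket_size = len(rows) // num_buckets  # assumes an exact division
    stored = [(rows[0], 0)]  # bucket 0 records the lowest value
    for bucket in range(1, num_buckets + 1):
        value = rows[bucket * bucket_size - 1]  # last row of this bucket
        if stored[-1][0] == value:
            stored[-1] = (value, bucket)  # keep only the highest end point number
        else:
            stored.append((value, bucket))
    return bucket_size, stored


def popular_cardinality(stored, bucket_size, value):
    """Buckets ending in value * bucket size, or None if the value is not popular."""
    for i, (end_value, number) in enumerate(stored):
        if end_value == value:
            spanned = number - (stored[i - 1][1] if i else 0)
            return spanned * bucket_size if spanned > 1 else None
    return None


bucket_size, stored = height_balanced(FREQ, 20)
print("bucket size:", bucket_size)
for entry in stored:
    print(entry)
print("estimate for ID = 8:", popular_cardinality(stored, bucket_size, 8))
```

The stored entries match the 14 rows of Table 2.5. For ID = 15 (end point of a single bucket) and ID = 3 (not an end point) the helper returns None, which corresponds to the unpopular case where a density-based estimate is used instead.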
TABLE 2.6
DB11g> explain plan for select * from hr.hist where id = 8;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 538080257
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 48 | 48192 | 7 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| HIST | 48 | 48192 | 7 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=8)
If we search for an unpopular value, i.e. a value which is not an end point or is the end point of only one bucket, the optimizer
calculates the cardinality as (number of rows in table) * density, where density is calculated by the optimizer using an internal
algorithm based on factors such as the number of buckets and the NDV. For example, consider two unpopular values:
ID = 15 occurs 5 times and is the end point of one bucket
ID = 3 occurs once and is not an end point
It can be seen that the number of rows estimated for both unpopular values is the same, i.e. 3 (Table 2.7 and Table 2.8).
TABLE 2.7
DB11g> explain plan for select * from hr.hist where id = 15;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 4058847011
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 3012 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| HIST | 3 | 3012 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | HIST_IDX | 3 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=15)
TABLE 2.8
DB11g> explain plan for select * from hr.hist where id = 3;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 4058847011
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 3012 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| HIST | 3 | 3012 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | HIST_IDX | 3 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=3)
Hence, it can be inferred that a height-balanced histogram decides the cardinality of a value purely from its popularity,
which depends on the number of buckets having the value as their end point.
Issues with Histograms in 11g
- Frequency histograms are accurate, but can be created only for NDV <= 254
- Height-balanced histograms may cause the optimizer to choose a suboptimal plan in cases where a value is the end point of only
one bucket, but almost fills up another bucket. In such a scenario the value might be considered unpopular.
In 12c, frequency histograms can be created for up to
2,048 distinct values, which implies that we can now
have accurate cardinality estimates for a much larger range of
NDVs. Moreover, two new types of histograms have been
introduced: top frequency and hybrid, which aim at
resolving the misestimates caused by height-balanced
histograms.
Top Frequency Histograms
If a small number of distinct values dominate the data set, the database performs a full table scan and creates a top frequency
histogram from that small number of extremely popular distinct values. A top frequency histogram can produce a better
histogram for highly popular values by ignoring statistically insignificant unpopular values. Whether the data is dominated
by popular values is decided using a threshold p, defined as (1 - (1/Nb)) * 100, where Nb = no. of buckets.
If the percentage of rows occupied by the top Nb most frequent values is equal to or greater than threshold p, a top frequency histogram is
created; otherwise a hybrid histogram is created.
Threshold p for 20 buckets can be calculated as:
p = (1 - (1/Nb)) * 100 = (1 - (1/20)) * 100 = 95.0
There are 120 rows in table HR.HIST.
Hence a top frequency histogram will be created if the 20 most popular values occupy at least 95% of the rows, i.e. 114 rows.
As can be seen from Table 3.1, the 20 most frequent IDs account for exactly 114 rows.
Hence, a top frequency histogram is created (Table 3.2) in
this case when statistics are gathered with SIZE 20 (20 buckets) and
ESTIMATE_PERCENT = AUTO_SAMPLE_SIZE (default).
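The threshold rule can be expressed as a small Python function. This is a simplified model of the decision described in this article, assuming 12c with ESTIMATE_PERCENT = AUTO_SAMPLE_SIZE; it ignores the 2,048-bucket limit and any other internal checks, and the function name is hypothetical. The comparison is done in integers to avoid floating-point rounding at the boundary.

```python
# Row counts per ID in HR.HIST, as decoded from Table 2.2.
COUNTS = [4, 2, 1, 2, 1, 2, 3, 50, 3, 2, 6, 6, 6, 3, 5, 3, 3, 1, 1, 5, 2, 1, 1, 2, 3, 2]
FREQ = dict(zip(range(1, 27), COUNTS))


def histogram_type(freq, num_buckets):
    """Model of the 12c choice between frequency, top frequency and hybrid."""
    if len(freq) <= num_buckets:
        return "FREQUENCY"  # one bucket per distinct value is possible
    total_rows = sum(freq.values())
    top_rows = sum(sorted(freq.values(), reverse=True)[:num_buckets])
    # top_rows / total_rows >= 1 - 1/Nb, rearranged into integer arithmetic
    if top_rows * num_buckets >= total_rows * (num_buckets - 1):
        return "TOP-FREQUENCY"
    return "HYBRID"


print(histogram_type(FREQ, 26))  # as many buckets as distinct values
print(histogram_type(FREQ, 20))  # 114 of 120 rows in the top 20 values

# The article later deletes 20 rows with ID = 8, leaving 30 such rows.
after_delete = {**FREQ, 8: 30}
print(histogram_type(after_delete, 20))  # 94 of 100 rows in the top 20 values
```

With 120 rows the top 20 values hold exactly 95% of the rows, so the boundary case falls on the top frequency side; after the delete the share drops to 94% and the model selects a hybrid histogram.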
TABLE 3.2
DB12c> exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID size 20', cascade => true);
DB12c> select table_name, column_name, histogram, num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS
---------- --------------- --------------- ------------ -----------
HIST ID TOP-FREQUENCY 26 20
The top-frequency histogram can be queried from dba_histograms as in Table 3.3.
FIGURE 3.1
TABLE 3.1
SQL> select sum(cnt)
from (select id, count(*) cnt from hr.hist
group by id
order by count(*) desc)
where rownum <= 20;
SUM(CNT)
----------
114
DB12c> select ENDPOINT_VALUE, ENDPOINT_NUMBER
from dba_histograms
where table_name = 'HIST'
and column_name = 'ID';
ENDPOINT_VALUE ENDPOINT_NUMBER
-------------- ---------------
1 4
2 6
4 8
6 10
7 13
8 63
9 66
10 68
11 74
12 80
13 86
14 89
15 94
16 97
17 100
20 105
21 107
24 109
25 112
26 114
20 rows selected
TABLE 3.3 FIGURE 3.2
Interpreting Top Frequency Histogram
•	 ENDPOINT_VALUE represents the key value (ID)
•	 ENDPOINT_NUMBER represents cumulative frequency
•	 Since NDV (26) > Nb (20), only the 20 most frequently occurring values are captured
•	 The frequencies of the 6 least frequent values (the bottom 5% of rows) are not stored
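The points above can be checked with a short Python sketch that builds the top frequency histogram from the row counts of Table 2.2. It is illustrative only: the function name is hypothetical, and ties at the cut-off are broken by value, which is an assumption of this sketch (no tie occurs at the cut-off in this data set).

```python
# Row counts per ID in HR.HIST, as decoded from Table 2.2.
COUNTS = [4, 2, 1, 2, 1, 2, 3, 50, 3, 2, 6, 6, 6, 3, 5, 3, 3, 1, 1, 5, 2, 1, 1, 2, 3, 2]
FREQ = dict(zip(range(1, 27), COUNTS))


def top_frequency(freq, num_buckets):
    """Return [(ENDPOINT_VALUE, ENDPOINT_NUMBER)] for the num_buckets most frequent values."""
    # Most frequent first; ties broken by smaller value (assumption of this sketch).
    kept = sorted(freq, key=lambda value: (-freq[value], value))[:num_buckets]
    rows, cumulative = [], 0
    for value in sorted(kept):
        cumulative += freq[value]  # cumulative frequency over kept values only
        rows.append((value, cumulative))
    return rows


rows = top_frequency(FREQ, 20)
dropped = sorted(set(FREQ) - {value for value, _ in rows})

for row in rows:
    print(row)
print("values not captured:", dropped)
```

The output reproduces Table 3.3, and the six values left out are the ones that occur only once each.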
It can be seen that a top frequency histogram makes an accurate cardinality estimate for both id = 15 (Table 3.4) and 3 (Table 3.5)
which were considered non-popular values in the height-balanced histogram.
TABLE 3.4
DB12c> explain plan for select * from hr.hist where id = 15;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3950962134
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 5020 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| HIST | 5 | 5020 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | HIST_IDX | 5 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=15)
TABLE 3.5
DB12c> explain plan for select * from hr.hist where id = 3;
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------
Plan hash value: 3950962134
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 1004 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| HIST | 1 | 1004 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | HIST_IDX | 1 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=3)
Thus the problem with height-balanced histograms, of not being able to estimate the frequency of unpopular values accurately,
is resolved by top frequency histograms in cases where a small number of distinct values dominate the data set.
This histogram is gathered using a full table scan. The occurrences of popular values are accurately captured at the
expense of not capturing the data for the least frequent values.
Hybrid Histograms
A hybrid histogram is so called because it combines the characteristics of both height-balanced histograms and frequency histograms. As we
saw earlier, the height-balanced histogram may produce inaccurate estimates for:
•	 a value that is not an end point
•	 a value that is an end point of only one bucket
•	 a value that is an end point of multiple buckets and almost fills up the last bucket
A hybrid histogram attempts to overcome the above shortcomings through the following features:
•	For each end point in the histogram, it stores the ENDPOINT_REPEAT_COUNT value, which is the number of times the end point
value is repeated. Thus, it has an accurate frequency of end point values.
•	As compared to a height-balanced histogram where a value having frequency greater than bucket size could be spread across
multiple buckets, a hybrid histogram stores all the occurrences of every value in the same bucket, i.e. a value cannot span
multiple buckets. As a result, it can capture more end points.
•	Similar to a height-balanced histogram, a bucket in a hybrid histogram can contain more than one value.
A side effect of this implementation is variable bucket size. Since each value, possibly with a different frequency, is
contained entirely in one bucket and one bucket can hold more than one value, buckets of different sizes may result.
A histogram with 20 buckets will be created as a hybrid histogram if the rows having the 20 most popular IDs make up less than threshold p
for 20 buckets.
p = (1 - (1/Nb)) * 100 = (1 - (1/20)) * 100 = 95.0
On deleting 20 rows with ID = 8 from table HR.HIST (Table 3.6), the table qualifies for hybrid histogram creation, as the 20 most frequent IDs
now account for 94 rows (Table 3.7), which is less than 95% of the 100 remaining rows, i.e. 95 rows.
TABLE 3.6
DB12c> delete from hr.hist where id = 8 and rownum <= 20;
commit;
select count(*) from hr.hist;
COUNT(*)
----------
100
TABLE 3.7
DB12c> select sum(cnt)
from (select id, count(*) cnt from hr.hist
group by id
order by count(*) desc)
where rownum <= 20;
SUM(CNT)
----------
94
It can be seen that a hybrid histogram with 20 buckets has been created (Table 3.8 and Table 3.9).
TABLE 3.8
DB12c> exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID size 20', cascade => true);
DB12c> select table_name, column_name, histogram, num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS
---------- --------------- --------------- ------------ -----------
HIST ID HYBRID 26 20
DB12c> select ENDPOINT_VALUE, ENDPOINT_NUMBER,
ENDPOINT_REPEAT_COUNT RPT_CNT
from dba_histograms
where table_name = 'HIST'
and column_name = 'ID';
ENDPOINT_VALUE ENDPOINT_NUMBER RPT_CNT
-------------- --------------- ----------
1 4 4
3 7 1
5 10 1
7 15 3
8 45 30
10 50 2
11 56 6
12 62 6
13 68 6
14 71 3
15 76 5
16 79 3
17 82 3
19 84 1
20 89 5
21 91 2
22 92 1
23 93 1
24 95 2
26 100 2
20 rows selected.
TABLE 3.9
Interpreting Hybrid Histogram
•	 ENDPOINT_VALUE: The largest value in a bucket
•	 ENDPOINT_NUMBER: Cumulative frequency.
	 The difference between two consecutive ENDPOINT_NUMBERs gives
	 the bucket size.
•	 ENDPOINT_REPEAT_COUNT: Frequency of the end point value
Based on the above information, the data is arranged in buckets as shown in Figure 3.3. It can be seen that the hybrid histogram
captures more end points (20 = Nb) than the height-balanced histogram (14) and can estimate their cardinality accurately.
Thus, it is evident that hybrid histograms have features of both frequency and height-balanced histograms.
Features similar to frequency histograms:
•	 All occurrences of a value are placed in one bucket
•	 ENDPOINT_NUMBER stores cumulative frequency
Features similar to height-balanced histograms:
•	 One bucket can contain multiple values.
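The interpretation above can be applied to Table 3.9 with a few lines of Python. This is only a reading aid for the dictionary view output, not a model of how Oracle builds the histogram; the function name and the dictionary keys are hypothetical.

```python
# (ENDPOINT_VALUE, ENDPOINT_NUMBER, ENDPOINT_REPEAT_COUNT) copied from Table 3.9.
HYBRID = [
    (1, 4, 4), (3, 7, 1), (5, 10, 1), (7, 15, 3), (8, 45, 30), (10, 50, 2),
    (11, 56, 6), (12, 62, 6), (13, 68, 6), (14, 71, 3), (15, 76, 5),
    (16, 79, 3), (17, 82, 3), (19, 84, 1), (20, 89, 5), (21, 91, 2),
    (22, 92, 1), (23, 93, 1), (24, 95, 2), (26, 100, 2),
]


def describe_buckets(rows):
    """Split each bucket into rows of the end point value and rows of other values."""
    buckets = {}
    previous = 0
    for value, cumulative, repeat_count in rows:
        size = cumulative - previous  # variable bucket size
        buckets[value] = {
            "bucket_size": size,
            "endpoint_rows": repeat_count,       # exact frequency of the end point
            "other_rows": size - repeat_count,   # rows of smaller values in the bucket
        }
        previous = cumulative
    return buckets


buckets = describe_buckets(HYBRID)
for value in (3, 8, 15, 26):
    print(value, buckets[value])
```

For example, the bucket ending in 8 holds 30 rows, all of them with ID = 8, while the bucket ending in 3 holds 3 rows of which only one has ID = 3; the remaining two belong to ID = 2.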
FIGURE 3.3
Summary
•	 In 12c, a frequency histogram can be created for NDV <= 2,048.
•	 Top frequency and hybrid histograms are designed to overcome flaws of height-balanced histograms.
•	 Top frequency and hybrid histograms are created only if ESTIMATE_PERCENT = AUTO_SAMPLE_SIZE.
•	Top frequency histograms accurately estimate the frequencies for only top occurring values if a small number of values
dominate the data set.
•	 Hybrid histograms have features of both frequency and height-balanced histograms
•	Hybrid histograms capture more end points as compared to height-balanced histograms and estimate their frequency
accurately.
ABOUT THE AUTHOR
Anju Garg
Corporate Trainer
Anju Garg is an Oracle ACE Associate with over 12 years of experience in the IT industry in
various roles. Since 2010, she has been involved in teaching and has trained more than
100 DBAs from across the world in various core DBA technologies such as RAC, Data Guard,
Performance Tuning, SQL statement tuning and Database Administration. Anju is
passionate about learning, has a keen interest in RAC and Performance Tuning, and
shares her knowledge via her technical blog.
Blog: http://oracleinaction.com

More Related Content

PDF
Histograms : Pre-12c and Now
PPTX
Adaptive Query Optimization in 12c
PPT
Do You Know The 11g Plan?
PPTX
Adapting to Adaptive Plans on 12c
PPTX
SQL Plan Directives explained
PDF
12c SQL Plan Directives
PDF
Histograms in MariaDB, MySQL and PostgreSQL
PDF
Query Optimizer in MariaDB 10.4
Histograms : Pre-12c and Now
Adaptive Query Optimization in 12c
Do You Know The 11g Plan?
Adapting to Adaptive Plans on 12c
SQL Plan Directives explained
12c SQL Plan Directives
Histograms in MariaDB, MySQL and PostgreSQL
Query Optimizer in MariaDB 10.4

What's hot (19)

PDF
Optimizer Trace Walkthrough
ODP
Basic Query Tuning Primer - Pg West 2009
PPTX
Oracle 12c SPM
PPTX
A few things about the Oracle optimizer - 2013
PDF
Oracle Parallel Distribution and 12c Adaptive Plans
PDF
Explaining the Postgres Query Optimizer
 
PDF
Using histograms to get better performance
PDF
Lessons for the optimizer from running the TPC-DS benchmark
PPTX
Full Table Scan: friend or foe
ODP
The PostgreSQL Query Planner
PDF
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
PDF
SQL Macros - Game Changing Feature for SQL Developers?
PDF
Oracle statistics by example
PDF
Optimizer features in recent releases of other databases
PDF
ANALYZE for Statements - MariaDB's hidden gem
PDF
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
PDF
How the Postgres Query Optimizer Works
 
PPTX
CBO Basics: Cardinality
DOCX
Checking clustering factor to detect row migration
Optimizer Trace Walkthrough
Basic Query Tuning Primer - Pg West 2009
Oracle 12c SPM
A few things about the Oracle optimizer - 2013
Oracle Parallel Distribution and 12c Adaptive Plans
Explaining the Postgres Query Optimizer
 
Using histograms to get better performance
Lessons for the optimizer from running the TPC-DS benchmark
Full Table Scan: friend or foe
The PostgreSQL Query Planner
Star Transformation, 12c Adaptive Bitmap Pruning and In-Memory option
SQL Macros - Game Changing Feature for SQL Developers?
Oracle statistics by example
Optimizer features in recent releases of other databases
ANALYZE for Statements - MariaDB's hidden gem
Common Table Expressions in MariaDB 10.2 (Percona Live Amsterdam 2016)
How the Postgres Query Optimizer Works
 
CBO Basics: Cardinality
Checking clustering factor to detect row migration
Ad

Viewers also liked (16)

PDF
Oracle ACFS High Availability NFS Services (HANFS) Part-I
PPTX
Oracle ACFS High Availability NFS Services (HANFS)
PPT
Adop and maintenance task presentation 151015
PPTX
AOUG_11Nov2016_Challenges_with_EBS12_2
PPTX
E business suite r12.2 changes for database administrators
PDF
Oracle RAC, Data Guard, and Pluggable Databases: When MAA Meets Multitenant (...
PPT
Oracle Weblogic for EBS and obiee (R12.2)
PDF
Policy based cluster management in oracle 12c
PDF
Indexes and Indexing in Oracle 12c
PDF
Oracle E-Business Suite R12.2.5 on Database 12c: Install, Patch and Administer
PDF
EBS-technical_upgrade_best_practices 12.1 or 12.2
PPTX
Oracle EBS Upgrade to 12.2.5.1
PDF
Oracle e-business suite R12 step by step Installation
PPTX
Oracle EBS R12.2 - Deployment and System Administration
PPT
Bill Gates, Who is he?
PDF
Oracle RAC 12c Release 2 - Overview
Oracle ACFS High Availability NFS Services (HANFS) Part-I
Oracle ACFS High Availability NFS Services (HANFS)
Adop and maintenance task presentation 151015
AOUG_11Nov2016_Challenges_with_EBS12_2
E business suite r12.2 changes for database administrators
Oracle RAC, Data Guard, and Pluggable Databases: When MAA Meets Multitenant (...
Oracle Weblogic for EBS and obiee (R12.2)
Policy based cluster management in oracle 12c
Indexes and Indexing in Oracle 12c
Oracle E-Business Suite R12.2.5 on Database 12c: Install, Patch and Administer
EBS-technical_upgrade_best_practices 12.1 or 12.2
Oracle EBS Upgrade to 12.2.5.1
Oracle e-business suite R12 step by step Installation
Oracle EBS R12.2 - Deployment and System Administration
Bill Gates, Who is he?
Oracle RAC 12c Release 2 - Overview
Ad

Similar to Histograms: Pre-12c and now (20)

PDF
On Seeing Double in V$SQL_Thomas_Kytepdf
PPTX
Ground Breakers Romania: Explain the explain_plan
PPT
Myth busters - performance tuning 101 2007
PPTX
Structure Query Language Advance Training
PPTX
PDF
Databricks Sql cheatseet for professional exam
PPT
Less08 Schema
PPTX
Oraclesql
PDF
Polymorphic Table Functions in 18c
ODP
Basic Query Tuning Primer
PPT
11 Things About 11gr2
DOCX
Trig
PDF
【Maclean liu技术分享】拨开oracle cbo优化器迷雾,探究histogram直方图之秘 0321
DOC
SQLQueries
PPT
Oracle tips and tricks
PDF
Advanced tips of dbms statas
PPT
Select To Order By
DOCX
Exploring collections with example
On Seeing Double in V$SQL_Thomas_Kytepdf
Ground Breakers Romania: Explain the explain_plan
Myth busters - performance tuning 101 2007
Structure Query Language Advance Training
Databricks Sql cheatseet for professional exam
Less08 Schema
Oraclesql
Polymorphic Table Functions in 18c
Basic Query Tuning Primer
11 Things About 11gr2
Trig
【Maclean liu技术分享】拨开oracle cbo优化器迷雾,探究histogram直方图之秘 0321
SQLQueries
Oracle tips and tricks
Advanced tips of dbms statas
Select To Order By
Exploring collections with example

Recently uploaded (20)

PDF
Encapsulation theory and applications.pdf
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
DASA ADMISSION 2024_FirstRound_FirstRank_LastRank.pdf
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
project resource management chapter-09.pdf
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Univ-Connecticut-ChatGPT-Presentaion.pdf
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Hybrid model detection and classification of lung cancer
PDF
1 - Historical Antecedents, Social Consideration.pdf
PDF
Microsoft Solutions Partner Drive Digital Transformation with D365.pdf
PDF
A novel scalable deep ensemble learning framework for big data classification...
PDF
August Patch Tuesday
PDF
From MVP to Full-Scale Product A Startup’s Software Journey.pdf
PDF
A comparative study of natural language inference in Swahili using monolingua...
PPTX
Tartificialntelligence_presentation.pptx
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Transform Your ITIL® 4 & ITSM Strategy with AI in 2025.pdf
PDF
Zenith AI: Advanced Artificial Intelligence
Encapsulation theory and applications.pdf
gpt5_lecture_notes_comprehensive_20250812015547.pdf
DASA ADMISSION 2024_FirstRound_FirstRank_LastRank.pdf
Encapsulation_ Review paper, used for researhc scholars
Agricultural_Statistics_at_a_Glance_2022_0.pdf
project resource management chapter-09.pdf
Unlocking AI with Model Context Protocol (MCP)
Univ-Connecticut-ChatGPT-Presentaion.pdf
Building Integrated photovoltaic BIPV_UPV.pdf
Hybrid model detection and classification of lung cancer
1 - Historical Antecedents, Social Consideration.pdf
Microsoft Solutions Partner Drive Digital Transformation with D365.pdf
A novel scalable deep ensemble learning framework for big data classification...
August Patch Tuesday
From MVP to Full-Scale Product A Startup’s Software Journey.pdf
A comparative study of natural language inference in Swahili using monolingua...
Tartificialntelligence_presentation.pptx
MIND Revenue Release Quarter 2 2025 Press Release
Transform Your ITIL® 4 & ITSM Strategy with AI in 2025.pdf
Zenith AI: Advanced Artificial Intelligence

Histograms: Pre-12c and now

  • 1. Header here www.ukoug.org 11 Histograms: Histograms are used by the optimizer to compute the selectivity of filter and join predicates in case of skewed data distribution. Prior to 12c, two types of histograms could be created: frequency histograms and height-balanced histograms. 12c introduces top frequency and hybrid histograms which are designed to overcome the limitations of their precursors. This article discusses the need for histograms, the interpretation of various types of histograms and the evolution of histograms from 11g to 12c. Anju Garg, Corporate Trainer Pre-12c & Now Need for Histograms When a SQL statement is issued, the optimizer generates an optimum execution plan based on the information available to it. If data is uniformly distributed across various values in a column and table statistics have been gathered, the optimizer estimates cardinality (row count) accurately and makes correct decision with respect to access method, join order and join method to be used. But if data distribution is skewed, the optimizer might make an incorrect estimate for the cardinality and choose a bad execution path. For example, consider a table HR.HIST having a skewed data distribution in column ID as shown in Figure 1.1. Pre-12c Histograms Prior to Oracle 12c, two types of histograms could be created (as shown in Figure 1.2): - Frequency histograms - Height-balanced histograms Technology FIGURE 1.1 FIGURE 1.2
  • 2. 12 www.ukoug.org SUMMER 15 Technology: Anju Garg OracleScene D I G I T A L Frequency Histograms A frequency histogram is a frequency distribution which records each different value and its exact cardinality. A frequency histogram is created when - Requested no. of buckets (Nb) = No. of distinct values (NDV) and - NDV = 254 (2,048 in 12c). A frequency histogram with 26 buckets for ID column can be created as under: TABLE 2.1 SQLexec dbms_stats.gather_table_stats - (ownname = ‘HR’,tabname = ‘HIST’,method_opt = ‘FOR COLUMNS ID’, cascade = true); SQL select table_name, column_name, histogram, num_distinct, num_buckets from dba_tab_col_statistics where table_name = ‘HIST’ and column_name = ‘ID’; TABLE_NAME COLUMN_NAME HISTOGRAM NUM_DISTINCT NUM_BUCKETS ---------- --------------- --------------- ------------ ----------- HIST ID FREQUENCY 26 26 The histogram can be viewed from DBA_HISTOGRAMS (as shown in Table 2.2) SQL select ENDPOINT_VALUE, ENDPOINT_NUMBER from dba_histograms where table_name = ‘HIST’ and column_name = ‘ID’; ENDPOINT_VALUE ENDPOINT_NUMBER -------------- --------------- 1 4 2 6 3 7 4 9 5 10 6 12 7 15 8 65 9 68 10 70 11 76 12 82 13 88 14 91 15 96 16 99 17 102 18 103 19 104 20 109 21 111 22 112 23 113 24 115 25 118 26 120 TABLE 2.2 FIGURE 2.2 Interpreting Frequency Histogram It can be seen from Table 2.2 that a frequency histogram with 26 buckets, one for each distinct value, has been created. • ENDPOINT_VALUE - The value in a bucket. • ENDPOINT_NUMBER - Cumulative frequency Thus, you can find out the exact counts for each of the distinct values in the data. For example, the optimizer makes an accurate estimate of 50 rows for ID = 8 and uses FTS access path as desired (Table 2.3) even though column ID is indexed. 
TABLE 2.3 SQLexplain plan for select * from hr.hist where id = 8; select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ---------------------------------------------------------------------------------Plan hash value: 538080257 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 50 | 50200 | 7 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL| HIST | 50 | 50200 | 7 (0)| 00:00:01 | --------------------------------------------------------------------------
  • 3. Thus, prior to 12c, frequency histograms could be used to accurately estimate the frequencies if NDV <= 254.
Height-balanced Histograms
A height-balanced histogram is created if NDV > 254 or Nb < NDV. This histogram distributes the rows evenly across all histogram buckets, so all buckets hold almost exactly the same number of rows. A height-balanced histogram is much less precise and can't really capture information about more than 127 popular values.
To create a height-balanced histogram, specify no. of buckets = 20 (< NDV (= 26)):
DB11g> exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID size 20', cascade => true);
It can be seen that a height-balanced histogram has been created, as no. of buckets (20) < NDV (26) (Table 2.4).
DB11g> select table_name, column_name, histogram, num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
TABLE_NAME COLUMN_NAME     HISTOGRAM       NUM_DISTINCT NUM_BUCKETS
---------- --------------- --------------- ------------ -----------
HIST       ID              HEIGHT BALANCED           26          20
TABLE 2.4
The height-balanced histogram that has been created can be viewed in Table 2.5.
DB11g> select ENDPOINT_VALUE, ENDPOINT_NUMBER
from dba_histograms
where table_name = 'HIST' and column_name = 'ID';
ENDPOINT_VALUE ENDPOINT_NUMBER
-------------- ---------------
             1               0
             2               1
             6               2
             8              10
             9              11
            11              12
            12              13
            13              14
            14              15
            15              16
            17              17
            20              18
            24              19
            26              20
14 rows selected.
TABLE 2.5
FIGURE 2.3
Interpreting Height-balanced Histogram
• Bucket size = Total no. of rows / Nb = 120 / 20 = 6
• ENDPOINT_NUMBER - A number uniquely identifying a bucket
• For the bucket with ENDPOINT_NUMBER = 0, ENDPOINT_VALUE = the lowest value (1 here)
• For buckets with ENDPOINT_NUMBER > 0, ENDPOINT_VALUE = the largest value stored in that bucket
Note that when storing the histogram, Oracle doesn't store repetitions of end point values.
If there are multiple buckets with the same end point, only one entry is stored, with the highest end point number. For example, the 8 buckets numbered 3 to 10 all have the value 8 as their end point. The histogram stores only one entry for them, with the highest ENDPOINT_NUMBER, i.e. 10.
The optimizer decides the popularity of a value by the number of buckets having that value as their end point. Since the value 8 is the end point of multiple buckets, it is considered a popular value. The cardinality of a popular value is derived as the product of the bucket size and the number of buckets having the value as their end point. For example, cardinality for value 8 = no. of buckets having 8 as end point * bucket size, i.e. 8 * 6 = 48 (actual = 50).
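The popularity rule just described can be traced with a short Python sketch over the rows of Table 2.5. It only mirrors the arithmetic given in the text (buckets ending on a value times bucket size) and is not Oracle's code; the names HEIGHT_BALANCED and popular_values are hypothetical.

```python
# (ENDPOINT_VALUE, ENDPOINT_NUMBER) pairs copied from Table 2.5.
HEIGHT_BALANCED = [
    (1, 0), (2, 1), (6, 2), (8, 10), (9, 11), (11, 12), (12, 13),
    (13, 14), (14, 15), (15, 16), (17, 17), (20, 18), (24, 19), (26, 20),
]
TOTAL_ROWS = 120
NUM_BUCKETS = 20


def popular_values(histogram, total_rows, num_buckets):
    """Return {value: estimated rows} for values ending more than one bucket."""
    bucket_size = total_rows / num_buckets  # 120 / 20 = 6
    popular = {}
    previous_number = 0
    for value, endpoint_number in histogram:
        # Repeated end points are stored once, so the jump in ENDPOINT_NUMBER
        # tells how many buckets end on this value.
        buckets_ending_here = endpoint_number - previous_number
        if buckets_ending_here > 1:
            popular[value] = buckets_ending_here * bucket_size
        previous_number = endpoint_number
    return popular


popular = popular_values(HEIGHT_BALANCED, TOTAL_ROWS, NUM_BUCKETS)
print(popular)  # {8: 48.0}
```

Every other value ends at most one bucket, so only ID = 8 is treated as popular in this histogram.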
  • 4. TABLE 2.6
DB11g> explain plan for select * from hr.hist where id = 8;
DB11g> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 538080257
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |    48 | 48192 |     7   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| HIST |    48 | 48192 |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=8)
If we search for an unpopular value, i.e. a value that is not an end point or is the end point of only one bucket, the optimizer calculates the cardinality as (number of rows in table) * density, where density is calculated by the optimizer using an internal algorithm based on factors such as the number of buckets and the NDV. For example, consider two unpopular values:
- ID = 15 occurs 5 times and is the end point of one bucket
- ID = 3 occurs once and is not an end point
It can be seen that the number of rows estimated for both unpopular values is the same, i.e. 3 (Table 2.7 and Table 2.8).
TABLE 2.7
DB11g> explain plan for select * from hr.hist where id = 15;
DB11g> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 4058847011
----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |     3 |  3012 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| HIST     |     3 |  3012 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | HIST_IDX |     3 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=15)
TABLE 2.8
DB11g> explain plan for select * from hr.hist where id = 3;
DB11g> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 4058847011
----------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |     3 |  3012 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| HIST     |     3 |  3012 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | HIST_IDX |     3 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=3)
Hence, it can be inferred that a height-balanced histogram decides the cardinality of a value purely from its popularity, which depends on the number of buckets having the value as their end point.
Issues with Histograms in 11g
- Frequency histograms are accurate, but can be created only for NDV <= 254.
- Height-balanced histograms may cause the optimizer to choose a suboptimal plan in cases where a value is the end point of only one bucket, but almost fills up another bucket. In such a scenario the value might be considered unpopular.
  • 5. In 12c, frequency histograms can be created for up to 2,048 distinct values, which implies that we can now have accurate cardinality estimates for a larger range of NDVs. Moreover, two new types of histograms have been introduced, top frequency and hybrid, which aim at resolving the misestimates caused by height-balanced histograms.
Top Frequency Histograms
If a small number of distinct values dominate the data set, the database performs a full table scan and creates a top frequency histogram using that small number of extremely popular distinct values. A top frequency histogram can produce a better histogram for highly popular values by ignoring statistically insignificant unpopular values.
The decision whether data is dominated by popular values is made based on a threshold p, defined as (1 - (1/Nb)) * 100, where Nb = no. of buckets. If the percentage of rows occupied by the top Nb most frequent values is equal to or greater than the threshold p, a top frequency histogram is created; otherwise a hybrid histogram is created.
Threshold p for 20 buckets can be calculated as:
p = (1 - (1/Nb)) * 100 = (1 - (1/20)) * 100 = 95.0
There are 120 rows in table HR.HIST. Hence a top frequency histogram will be created if the 20 most popular values occupy at least 95% of the rows, i.e. 114 rows. As can be seen from Table 3.1, the 20 most frequent IDs account for exactly 114 rows. Hence, a top frequency histogram is created (Table 3.2) when statistics are gathered with 20 buckets and ESTIMATE_PERCENT = AUTO_SAMPLE_SIZE (the default).
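The threshold test above can be written as a small Python sketch. It encodes only the rules stated in this article (frequency if Nb >= NDV within the 2,048 limit, otherwise top frequency or hybrid depending on p) and assumes ESTIMATE_PERCENT = AUTO_SAMPLE_SIZE; it is not how DBMS_STATS is actually implemented. ROW_COUNTS holds the per-ID row counts derived from Table 2.2, and the function name choose_histogram is hypothetical.

```python
# Rows per ID in HR.HIST, derived from the cumulative counts in Table 2.2.
ROW_COUNTS = {
    1: 4, 2: 2, 3: 1, 4: 2, 5: 1, 6: 2, 7: 3, 8: 50, 9: 3, 10: 2,
    11: 6, 12: 6, 13: 6, 14: 3, 15: 5, 16: 3, 17: 3, 18: 1, 19: 1,
    20: 5, 21: 2, 22: 1, 23: 1, 24: 2, 25: 3, 26: 2,
}


def choose_histogram(row_counts, num_buckets, max_frequency_ndv=2048):
    """Pick a 12c histogram type using only the rules given in the article."""
    ndv = len(row_counts)
    if ndv <= min(num_buckets, max_frequency_ndv):
        return "FREQUENCY"
    total_rows = sum(row_counts.values())
    top_rows = sum(sorted(row_counts.values(), reverse=True)[:num_buckets])
    # top_rows / total_rows * 100 >= (1 - 1/Nb) * 100, rearranged into
    # integer arithmetic so that no floating-point rounding is involved.
    if top_rows * num_buckets >= total_rows * (num_buckets - 1):
        return "TOP-FREQUENCY"
    return "HYBRID"


print(choose_histogram(ROW_COUNTS, 26))  # FREQUENCY
print(choose_histogram(ROW_COUNTS, 20))  # TOP-FREQUENCY (114 of 120 rows = 95%)
# Variant in which ID = 8 has only 30 rows (100 rows in total):
print(choose_histogram({**ROW_COUNTS, 8: 30}, 20))  # HYBRID (94 of 100 rows < 95%)
```

Comparing top_rows * Nb with total_rows * (Nb - 1) is equivalent to comparing the percentage with p, but it keeps the boundary case of exactly 95% unambiguous.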
TABLE 3.1
SQL> select sum(cnt)
from (select id, count(*) cnt from hr.hist group by id order by count(*) desc)
where rownum <= 20;
  SUM(CNT)
----------
       114
TABLE 3.2
DB12c> exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID size 20', cascade => true);
DB12c> select table_name, column_name, histogram, num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
TABLE_NAME COLUMN_NAME     HISTOGRAM       NUM_DISTINCT NUM_BUCKETS
---------- --------------- --------------- ------------ -----------
HIST       ID              TOP-FREQUENCY             26          20
The top frequency histogram can be queried from DBA_HISTOGRAMS as in Table 3.3.
FIGURE 3.1
  • 6. DB12c> select ENDPOINT_VALUE, ENDPOINT_NUMBER
from dba_histograms
where table_name = 'HIST' and column_name = 'ID';
ENDPOINT_VALUE ENDPOINT_NUMBER
-------------- ---------------
             1               4
             2               6
             4               8
             6              10
             7              13
             8              63
             9              66
            10              68
            11              74
            12              80
            13              86
            14              89
            15              94
            16              97
            17             100
            20             105
            21             107
            24             109
            25             112
            26             114
20 rows selected.
TABLE 3.3
FIGURE 3.2
Interpreting Top Frequency Histogram
• ENDPOINT_VALUE represents the key value (ID)
• ENDPOINT_NUMBER represents the cumulative frequency
• Since NDV (26) > Nb (20), only the 20 most frequently occurring values are captured
• Frequencies of the 6 least frequently occurring values (bottom 5%) have not been stored
It can be seen that the top frequency histogram makes an accurate cardinality estimate for both id = 15 (Table 3.4) and id = 3 (Table 3.5), which were considered unpopular values in the height-balanced histogram.
TABLE 3.4
DB12c> explain plan for select * from hr.hist where id = 15;
DB12c> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
Plan hash value: 3950962134
------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     5 |  5020 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| HIST     |     5 |  5020 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | HIST_IDX |     5 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=15)
TABLE 3.5
DB12c> explain plan for select * from hr.hist where id = 3;
DB12c> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------
Plan hash value: 3950962134
------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |          |     1 |  1004 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| HIST     |     1 |  1004 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | HIST_IDX |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=3)
Thus the problem with height-balanced histograms, of not being able to estimate the frequency of unpopular values accurately, is resolved by top frequency histograms in cases where a small number of distinct values dominate the data set. This histogram is gathered using a full table scan. The occurrences of popular values are captured accurately at the expense of not capturing data for the least frequently occurring values.
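To make the trade-off concrete, the following Python sketch lists what Table 3.3 captures and what it leaves out. It is an illustration only: the names are hypothetical, the ID range 1 to 26 is taken from Table 2.2, and nothing here describes how the optimizer itself costs a value that is absent from the histogram.

```python
# (ENDPOINT_VALUE, ENDPOINT_NUMBER) pairs copied from Table 3.3.
TOP_FREQUENCY = [
    (1, 4), (2, 6), (4, 8), (6, 10), (7, 13), (8, 63), (9, 66), (10, 68),
    (11, 74), (12, 80), (13, 86), (14, 89), (15, 94), (16, 97), (17, 100),
    (20, 105), (21, 107), (24, 109), (25, 112), (26, 114),
]
ALL_IDS = range(1, 27)  # the 26 distinct IDs listed in Table 2.2
TOTAL_ROWS = 120

# Per-value row counts for the captured values (difference of cumulative counts).
captured = {}
previous = 0
for value, cumulative in TOP_FREQUENCY:
    captured[value] = cumulative - previous
    previous = cumulative

missing_ids = [i for i in ALL_IDS if i not in captured]
uncaptured_rows = TOTAL_ROWS - TOP_FREQUENCY[-1][1]

print(captured[15])     # 94 - 89 = 5, matching the estimate in Table 3.4
print(missing_ids)      # [3, 5, 18, 19, 22, 23]
print(uncaptured_rows)  # 120 - 114 = 6 rows spread over the 6 missing IDs
```

The 6 uncaptured rows spread over 6 missing IDs are consistent with the 1-row estimate for id = 3 in Table 3.5, although the article does not state the formula the optimizer applies to values outside the histogram.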
  • 7. Hybrid Histograms
A hybrid histogram is so called because it combines the characteristics of both height-balanced histograms and frequency histograms. As we saw earlier, a height-balanced histogram may produce inaccurate estimates for:
• a value that is not an end point
• a value that is an end point of only one bucket
• a value that is an end point of multiple buckets and almost fills up the last bucket
A hybrid histogram attempts to overcome these shortcomings through the following features:
• For each end point in the histogram, it stores the ENDPOINT_REPEAT_COUNT value, which is the number of times the end point value is repeated. Thus, it has an accurate frequency for end point values.
• In a height-balanced histogram, a value with a frequency greater than the bucket size can be spread across multiple buckets. A hybrid histogram instead stores all the occurrences of every value in the same bucket, i.e. a value cannot span multiple buckets. As a result, it can capture more end points.
• Similar to a height-balanced histogram, a bucket in a hybrid histogram can contain more than one value.
A side effect of this implementation is variable bucket size. Since each value, whatever its frequency, is contained entirely in one bucket, and one bucket can hold more than one value, buckets of different sizes may result.
A histogram with 20 buckets will be created as a hybrid histogram if the percentage of rows having the 20 most popular IDs is less than threshold p for 20 buckets.
p = (1 - (1/Nb)) * 100 = (1 - (1/20)) * 100 = 95.0
On deleting 20 rows with ID = 8 from table HR.HIST (Table 3.6), the table qualifies for hybrid histogram creation, as the no. of rows having the 20 most frequent IDs = 94 (Table 3.7), which is less than 95% of the 100 remaining rows, i.e. 95 rows.
TABLE 3.6
DB12c> delete from hr.hist where id = 8 and rownum <= 20;
DB12c> commit;
DB12c> select count(*) from hr.hist;
  COUNT(*)
----------
       100
TABLE 3.7
DB12c> select sum(cnt)
from (select id, count(*) cnt from hr.hist group by id order by count(*) desc)
where rownum <= 20;
  SUM(CNT)
----------
        94
It can be seen that a hybrid histogram with 20 buckets has been created (Table 3.8 and Table 3.9).
TABLE 3.8
DB12c> exec dbms_stats.gather_table_stats -
(ownname => 'HR', tabname => 'HIST', method_opt => 'FOR COLUMNS ID size 20', cascade => true);
DB12c> select table_name, column_name, histogram, num_distinct, num_buckets
from dba_tab_col_statistics
where table_name = 'HIST' and column_name = 'ID';
TABLE_NAME COLUMN_NAME     HISTOGRAM       NUM_DISTINCT NUM_BUCKETS
---------- --------------- --------------- ------------ -----------
HIST       ID              HYBRID                    26          20
DB12c> select ENDPOINT_VALUE, ENDPOINT_NUMBER, ENDPOINT_REPEAT_COUNT RPT_CNT
from dba_histograms
where table_name = 'HIST' and column_name = 'ID';
ENDPOINT_VALUE ENDPOINT_NUMBER    RPT_CNT
-------------- --------------- ----------
             1               4          4
             3               7          1
             5              10          1
             7              15          3
             8              45         30
            10              50          2
            11              56          6
            12              62          6
            13              68          6
            14              71          3
            15              76          5
            16              79          3
            17              82          3
            19              84          1
            20              89          5
            21              91          2
            22              92          1
            23              93          1
            24              95          2
            26             100          2
20 rows selected.
TABLE 3.9
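The rows of Table 3.9 can be decoded with a short Python sketch. In a hybrid histogram, ENDPOINT_NUMBER is a cumulative row count, so the difference between consecutive entries is the bucket size, and ENDPOINT_REPEAT_COUNT is the number of rows holding the end point value itself. The code below is an illustration under that reading, not Oracle's implementation; the names HYBRID and decode_hybrid are hypothetical.

```python
# (ENDPOINT_VALUE, ENDPOINT_NUMBER, ENDPOINT_REPEAT_COUNT) copied from Table 3.9.
HYBRID = [
    (1, 4, 4), (3, 7, 1), (5, 10, 1), (7, 15, 3), (8, 45, 30), (10, 50, 2),
    (11, 56, 6), (12, 62, 6), (13, 68, 6), (14, 71, 3), (15, 76, 5),
    (16, 79, 3), (17, 82, 3), (19, 84, 1), (20, 89, 5), (21, 91, 2),
    (22, 92, 1), (23, 93, 1), (24, 95, 2), (26, 100, 2),
]


def decode_hybrid(histogram):
    """Return one dict per bucket with its size and how the rows split up."""
    buckets = []
    previous = 0
    for value, cumulative, repeat_count in histogram:
        size = cumulative - previous
        buckets.append({
            "endpoint": value,
            "size": size,                       # rows in this bucket
            "endpoint_rows": repeat_count,      # rows equal to the end point
            "other_rows": size - repeat_count,  # rows of smaller values sharing the bucket
        })
        previous = cumulative
    return buckets


buckets = decode_hybrid(HYBRID)
print(buckets[4])  # {'endpoint': 8, 'size': 30, 'endpoint_rows': 30, 'other_rows': 0}
print(buckets[1])  # {'endpoint': 3, 'size': 3, 'endpoint_rows': 1, 'other_rows': 2}
print(sorted({b["size"] for b in buckets}))  # [1, 2, 3, 4, 5, 6, 30]
```

The bucket ending at 3 holds three rows, only one of which has ID = 3; the other two belong to ID = 2, which shares the bucket. The varying sizes show why hybrid buckets are not height-balanced.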
  • 8. Interpreting Hybrid Histogram
• ENDPOINT_VALUE: The largest value in a bucket
• ENDPOINT_NUMBER: Cumulative frequency. The difference between two consecutive ENDPOINT_NUMBERs gives the bucket size.
• ENDPOINT_REPEAT_COUNT: Frequency of the end point value
Based on the above information, the data is arranged in buckets as shown in Figure 3.3.
FIGURE 3.3
It can be seen that the hybrid histogram captures more end points (20 = Nb) than the height-balanced histogram (14) and can estimate their cardinality accurately.
Thus, it is evident that hybrid histograms have features of both frequency and height-balanced histograms.
Features similar to frequency histograms:
• All occurrences of a value are placed in one bucket
• ENDPOINT_NUMBER stores the cumulative frequency
Features similar to height-balanced histograms:
• One bucket can contain multiple values
Summary
• In 12c, a frequency histogram can be created for NDV <= 2,048.
• Top frequency and hybrid histograms are designed to overcome the flaws of height-balanced histograms.
• Top frequency and hybrid histograms are created only if ESTIMATE_PERCENT = AUTO_SAMPLE_SIZE.
• Top frequency histograms accurately estimate the frequencies of only the most frequently occurring values, when a small number of values dominate the data set.
• Hybrid histograms have features of both frequency and height-balanced histograms.
• Hybrid histograms capture more end points than height-balanced histograms and estimate their frequency accurately.
ABOUT THE AUTHOR
Anju Garg, Corporate Trainer
Anju Garg is an Oracle Ace Associate with over 12 years of experience in the IT industry in various roles.
Since 2010, she has been involved in teaching and has trained more than 100 DBAs from across the world in various core DBA technologies such as RAC, Data Guard, Performance Tuning, SQL statement tuning and Database Administration. Anju is passionate about learning and has a keen interest in RAC and Performance Tuning, sharing her knowledge via her technical blog.
Blog: http://oracleinaction.com