Cost Based Optimizer – 2 of 2
Hotsos Enterprises, Ltd.
Grapevine, Texas
Oracle. Performance. Now.
[email_address]
Agenda
  • Cost Based Optimizer and its impact on performance
  • Skewed Data
  • Histograms
  • Impact
      • Performance (Logical I/O Impact)
      • Performance (Join Strategy)
  • Bind Variables
  • Cardinality and Cost
  • Conclusion
Cost Based Optimizer
Cost Based Optimizer (CBO)
  • The CBO is, in reality, complex decision-making software
  • Uses several database initialization parameters
      • These are listed in the 10053 trace file
  • Uses several session-level initialization parameters
      • These are set at the session level and override the database initialization parameters
  • Uses statistics about the objects (tables, indexes)
  • Uses hints to the optimizer
  • Uses statistics about the system (CPU, disk, etc.)
  • Uses this information to decide on the “best way” to generate an execution plan
  • Uses information about the skew of a column, if that information has been gathered
CBO will be part of your life if you keep working with Oracle.
  • The cost-based query optimizer (CBO)…
      • Uses data from a variety of sources
      • Estimates the costs of several execution plans
      • Chooses the plan it estimates to be the least expensive
  • Characteristics
      • Adapts to changing circumstances
      • Frustrating if you don’t know what it considers as input
      • Works great if you know how to use it
      • But produces very poor results if you lie to it
      • The only query optimizer supported by Oracle Corporation from release 10 onward
The cost-based query optimizer chooses the plan that it computes as having the lowest estimated cost.
  • Don’t assume the following are identical
      • CBO’s estimated cost of an execution plan
      • The actual cost of an execution plan
  • CBO’s cost estimate can be imperfect
      • Are your CBO inputs perfect?
      • CBO isn’t perfect, but by 9.2 it’s almost always good enough
  • Without properly collected statistics, the CBO will
      • use RBO if no statistics exist on any object in the statement (prior to release 10)
      • use default statistics if statistics exist for a single object in the statement but not others
      • use dynamic sampling to generate statistics (based on parameter setting and Oracle version)
Cost Based Optimizer
Execution plan changes can result in profoundly different application performance.
  • (A) Table size change
  • (B) Device latency change
  • (C) Execution plan change
  • Type C performance changes are the most profound
[Figure: plots A, B and C, one for each kind of change listed above; recovered axis labels: “size change”, “performance change”.]
Recap
  • The CBO is a complex piece of software
  • It uses several data points to calculate the cost of each candidate execution plan and chooses the plan with the lowest cost
  • It is dynamic and adapts to changing data better than the Rule Based Optimizer
  • A good understanding of the Cost Based Optimizer is imperative for understanding the rationale behind some of its choices
Skewed Data
Skewed Data
  • Skewed data is data whose distribution is not uniform
  • A good example is the owner column of dba_objects
      • The column is highly skewed
      • SELECT owner, COUNT(*) FROM dba_objects GROUP BY owner;
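The GROUP BY query above reports one row count per owner. As a rough illustration of how to read such output, the Python sketch below turns per-value counts into shares of the table and compares them with the share a uniform distribution would give. The owner names and counts are made up for the example and are not taken from any real dba_objects.

```python
from collections import Counter

def value_shares(values):
    """Return (value, count, share_of_rows) tuples, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(value, count, count / total) for value, count in counts.most_common()]

# Hypothetical data: three owners with very different object counts.
owners = ["SYS"] * 900 + ["SYSTEM"] * 80 + ["APP"] * 20

shares = value_shares(owners)
for owner, count, share in shares:
    print(f"{owner:8} {count:5} {share:6.1%}")

# With uniform data every owner would hold the same share of the rows.
uniform_share = 1 / len(shares)
print(f"uniform share would be {uniform_share:.1%}")
```

The larger the gap between the observed shares and the uniform share, the more skewed the column is.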
Some kinds of data skew naturally; some don’t.
  • Guaranteed to be skewed
      • E.g., status attribute (open | closed) of a sales order table
  • Possibly not skewed
      • E.g., sale date attribute of a sales order table
Histograms
What are the costs and benefits of histograms?
  • Benefits of histograms
      • CBO sometimes needs the information to make good decisions
  • Costs of histograms
      • Computing histograms consumes extra computing capacity during statistics collection
      • Some CPU time and extra latching are required during plan determination for the optimizer to consider histograms
Histograms provide the optimizer with better information from which to derive an execution plan for a query.
  • A histogram is a graphic representation of a frequency distribution by means of rectangles whose widths represent class intervals and whose heights represent corresponding frequencies
  • Oracle implements histograms in two ways (NDV = number of distinct values)
      • Height-balanced: created if column NDV > SIZE
      • Frequency: created if column NDV <= SIZE
Types of Histograms
  • Frequency histograms
      • Every distinct value in the column has its own bucket, with a count of how many times that value occurs
  • Height-balanced histograms
      • Every bucket holds approximately the same number of rows; each bucket covers a range of column values, and only the bucket endpoint values are recorded
Frequency Histogram
Height Balanced Histogram
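The two histogram types can be sketched in a few lines of Python. This is a simplified model of the idea described on the slides, not Oracle's implementation: the function names are invented for the example, and the height-balanced version assumes there are at least as many rows as buckets.

```python
from collections import Counter

def frequency_histogram(values):
    """One bucket per distinct value: (value, row count), in value order."""
    return sorted(Counter(values).items())

def height_balanced_endpoints(values, buckets):
    """Split the sorted rows into `buckets` equal-sized groups and return
    the largest value in each group (the bucket endpoint).
    Assumes len(values) >= buckets."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // buckets - 1] for i in range(1, buckets + 1)]

def choose_histogram(values, size):
    """Apply the NDV rule: frequency if NDV <= SIZE, else height-balanced."""
    ndv = len(set(values))
    if ndv <= size:
        return "FREQUENCY", frequency_histogram(values)
    return "HEIGHT BALANCED", height_balanced_endpoints(values, size)

# Two distinct values, 10 buckets requested: NDV <= SIZE.
status = [0] * 2 + [1] * 8
print(choose_histogram(status, 10))

# Ten distinct values, 4 buckets requested: NDV > SIZE.
# The value 5 is "popular": it occupies 11 of the 20 rows.
ids = list(range(1, 11)) + [5] * 10
print(choose_histogram(ids, 4))
```

In the second case the popular value 5 shows up as the endpoint of several buckets, which is how a height-balanced histogram reveals skew.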
Histograms can be gathered by setting the METHOD_OPT parameter.
  • For a specific column: FOR COLUMNS column_x SIZE <n|REPEAT|AUTO|SKEWONLY>
  • For all the columns in a table: FOR ALL COLUMNS
  • For only the columns that have an index: FOR ALL INDEXED COLUMNS

    EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname=>'OP', tabname=>'my_table', method_opt=>'FOR COLUMNS column_x SIZE 10')
Histograms are not useful in all cases.
  • Histograms are not useful for columns with the following characteristics:
      • All (or most) predicates on the column use bind variables
      • The column data is uniformly distributed
      • The column is unique and is used only with equality predicates
      • Data distribution changes frequently and statistics aren't collected to match
Even in the most recent Oracle versions, histogram optimization doesn’t completely work with bind variables.
  • Oracle version 8
      • Use of bind variables prohibits histogram optimization
  • Oracle version 9 and above
      • Oracle query optimizer “peeks” at the bind value to use histogram optimization
      • But only on the initial hard parse of a query
Be prepared for how application developers might have worked around skew problems.
  • The old-fashioned RBO technique
      • Create the index
      • Hard-code the selective query with “status=1”
      • Hard-code the un-selective query with “status+0=1”
  • A CBO technique
      • Create the index
      • Hard-code the selective query with /*+ index(t) */
      • Hard-code the un-selective query with /*+ full(t) */
  • Don’t resort to either of these!
Where Histogram Information is Stored
  • DBA_TAB_HISTOGRAMS
  • DBA_TAB_COL_STATISTICS
Demo Histogram Data Dictionary Tables
Performance Impact in Terms of Logical I/Os
Demo Cardinality
Demo Join Cardinality
Recap
  • Histograms can be really useful when gathered on skewed columns
  • Histograms are specific to your data and version
  • Test it out and prove that gathering histograms is beneficial
  • Be careful with bind variables, because histograms may not be used

Editor's Notes

  • #7: Note that without properly collected statistics, the CBO will do one of two things:
      • If no statistics exist for any object used in the SQL statement, the CBO may use rule-based optimization (prior to v10) or use dynamic sampling.
      • If statistics exist for any single object but not others in the SQL statement, the CBO may use a set of default statistics for the object without statistics or use dynamic sampling.
    CBO default statistics for objects without collected stats (prior to v10; in v10 dynamic sampling is typically used instead of defaults):

    TABLE SETTING                  DEFAULT STATISTICS
    cardinality                    number of blocks * (block size - cache layer) / average row length
    average row length             100 bytes
    number of blocks               100 or actual value based on the extent map
    remote cardinality (distrib)   2000 rows
    remote average row length      100 bytes

    INDEX SETTING                  DEFAULT STATISTICS
    levels                         1
    leaf blocks                    25
    leaf blocks/key                1
    data blocks/key                1
    distinct keys                  100
    clustering factor              800
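The default cardinality formula in the note above is easy to work through by hand. The Python sketch below plugs in the stated defaults (100 blocks, 100-byte rows); the 8192-byte block size and the 24-byte cache layer are assumptions picked only to make the arithmetic concrete, and the integer rounding is likewise a guess rather than documented Oracle behavior.

```python
def default_cardinality(num_blocks=100, block_size=8192, cache_layer=24, avg_row_len=100):
    """Default row-count estimate for a table without statistics:
    number of blocks * (block size - cache layer) / average row length.
    block_size and cache_layer are assumed values, not taken from the notes."""
    return num_blocks * (block_size - cache_layer) // avg_row_len

# 100 blocks * (8192 - 24) usable bytes / 100 bytes per row
print(default_cardinality())

# The same formula with a larger segment read from the extent map.
print(default_cardinality(num_blocks=5000))
```

Whatever the exact constants, the estimate depends only on the segment size, so a nearly empty table with many allocated blocks will be treated as large.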
  • #9: Plot A illustrates a situation in which the execution plan does not change, but the query response time varies significantly as the number of rows in the table changes. This kind of thing occurs when an application chooses a TABLE ACCESS (FULL) execution plan for a growing table. It’s what causes RBO-based applications to appear fast in a small development environment, but then behave poorly in the production environment.
    Plot B illustrates the marginal improvement that’s achievable, for example, by distributing an inefficient application’s workload more uniformly across the disks in a disk array. Notice that the execution plan (or “shape of the performance curve”) isn’t necessarily changed by such an operation (although, if the output of dbms_stats.gather_system_statistics changes as a result of the configuration change, then the plan might change). The performance for a given number of rows might change, however, as the plot here indicates.
    Plot C illustrates what is commonly the most profound type of performance change: an execution plan change. This situation can be caused by a change to any of the CBO inputs. For example, an accidental deletion of a segment’s statistics can change a plan from a nice fast plan (depicted by the green curve, which is O(log n)) to a horrifically slow plan (depicted by the red curve, which is O(n²)). The phenomenon illustrated in plot C is what has happened when a query that was fast last week now runs for 14 hours without completing before you finally give up and kill the session.
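To get a feel for why the plan change in plot C dwarfs the other two effects, the Python sketch below compares the two growth rates named in the note. The numbers are unit-free and purely illustrative; they say nothing about actual response times.

```python
import math

def relative_work(rows):
    """Rough unit-free work estimates for the two curve shapes in plot C."""
    return {"log_n": math.log2(rows), "n_squared": rows ** 2}

for rows in (1_000, 1_000_000):
    work = relative_work(rows)
    print(f"{rows:>9} rows: log2(n) = {work['log_n']:.1f}, n^2 = {work['n_squared']:.0e}")
```

Growing the table a thousandfold roughly doubles the logarithmic figure, while the quadratic figure grows a millionfold.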
  • #15: Since the CBO determines the selectivity of predicates that appear in queries, it is important that there be adequate information for the CBO to make its estimates properly. By gathering histogram data, the CBO can make improved selectivity estimates in the presence of data skew, resulting in optimal execution plans with non-uniform data distributions. The histogram approach provides an efficient and compact way to represent data distributions. Selectivity estimates are used to decide when to use an index and the order in which to join tables. Many table columns are not uniformly distributed. Therefore, the normal calculations for selectivity may not be accurate without the use of histograms.
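The difference a histogram makes to a selectivity estimate can be shown with a small Python model. It assumes the simplest possible rules: without a histogram every distinct value is taken to be equally likely, and with a frequency histogram the stored count is used directly. The table, the status values and the counts are hypothetical, and values missing from the histogram are simply estimated as zero rows, which is a simplification.

```python
def estimate_without_histogram(total_rows, num_distinct):
    """Uniform assumption: each distinct value matches total_rows / NDV rows."""
    return round(total_rows / num_distinct)

def estimate_with_histogram(total_rows, freq_hist, value):
    """Frequency histogram: scale the stored count for the value to the table size.
    Values missing from the histogram are estimated as 0 rows (simplification)."""
    sampled = sum(freq_hist.values())
    return round(total_rows * freq_hist.get(value, 0) / sampled)

# Hypothetical sales order table: almost every order is closed.
total_rows = 10_000
status_hist = {"OPEN": 50, "CLOSED": 9_950}

uniform_estimate = estimate_without_histogram(total_rows, num_distinct=2)
open_estimate = estimate_with_histogram(total_rows, status_hist, "OPEN")
closed_estimate = estimate_with_histogram(total_rows, status_hist, "CLOSED")

print("no histogram, any status:", uniform_estimate)
print("histogram, status = OPEN:", open_estimate)
print("histogram, status = CLOSED:", closed_estimate)
```

Under the uniform assumption both predicates look identical, so the optimizer has no basis for preferring an index for the rare value and a full scan for the common one.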
  • #16: Height-balanced histograms put approximately the same number of values into each interval, so that the endpoints of the interval are determined by the number of values in that interval. Only the last (largest) values in each bucket appear as bucket (end point) values. A height-balanced histogram will be created if the number of histogram buckets (SIZE) specified is smaller than the number of distinct values in the column.
    Frequency histograms (sometimes called value-based histograms) are created when the number of histogram buckets (SIZE) specified is greater than or equal to the number of distinct column values. In frequency histograms, all the individual values in the column have a corresponding bucket, and the bucket number reflects the repetition count of each value.
    The type of histogram is stored in the HISTOGRAM column of the *TAB_COL_STATISTICS views. The column can have values of HEIGHT BALANCED, FREQUENCY, or NONE. The SIZE of a histogram can be set by you or automatically by Oracle when the histogram is collected. The default SIZE (when no SIZE is specified) is 75. The maximum SIZE is 254 in the Oracle versions discussed here.
  • #20: DBMS_STATS Constants
      • SIZE REPEAT: Causes the histogram to be created with the same options as the last time you created it. It reads the data dictionary to figure out what to do.
      • SIZE AUTO: Oracle looks at the data and, using a magical, undocumented and changing algorithm, figures out all by itself which columns to gather stats on, how many buckets to use and so on. It'll collect histograms in memory only for those columns which are used by your applications (those columns appearing in a predicate involving equality, range, or LIKE operators). It knows that a particular column was used by an application because, at parse time, it stores workload information in the SGA. It then stores a histogram in the data dictionary only if the column has skewed data (and is worthy of a histogram).
      • SIZE SKEWONLY: When you collect histograms with the SIZE option set to SKEWONLY, it collects histogram data in memory for all specified columns (if you do not specify any, all columns are used). Once an "in-memory" histogram is computed for a column, it is stored inside the data dictionary only if it has "popular" values (multiple end-points with the same value, which is what is meant by "there is skew in the data").
  • #22: In Oracle version 8, the use of bind variables in a predicate effectively disables the use of histograms. This is because the optimizer needs to know the value (WHERE col = 'x') in order to check the histogram statistics for the selectivity of that value. When a bind variable is used, it is not actually bound into the query until execution time. Since the execution plan is determined in the parse phase, the optimizer won't know the value and thus can't use the histogram to make its decision.
    In Oracle version 9, the optimizer behavior regarding bind variables changed slightly. When a query is initially parsed, the optimizer will "peek" at the value of the bind variable and use the value it finds to make decisions. Does that make the situation better or worse? It depends. Let's say that when the query is initially parsed, a bind variable value of 1 is used in the predicate. If the column has a histogram and the histogram indicates that this value is highly selective (few rows match), then the optimizer will likely choose to use an index on that column if one is available. Everything works well, performance is sub-second and everyone is happy.
    Now, what happens if the query is executed a second time but passes the value 0 in the bind variable, and the value 0 is unselective (lots of rows match)? The original plan is still used and the query will attempt to use the same index. If there are thousands of matching records in the row source, it is likely that the index scan will perform significantly worse than simply doing a full table scan. In this case, everything works but performance stinks and complaints arise.
    So, what do you do? For some, the best solution is to not use bind variables when a column has a limited number of values and the values are skewed, and to just hard-code the value you need. The best way to know what to do is to test different approaches to find what works best for your environment.
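The peeking behavior described in this note can be mimicked with a toy plan cache in Python. This is only a model of the sequence of events, not of Oracle's shared pool: the class name, the 5% index threshold and the histogram counts are all invented for the illustration.

```python
class ToyPlanCache:
    """Chooses a plan on the first (hard) parse by peeking at the bind value,
    then reuses that plan for every later execution of the same SQL text."""

    def __init__(self, freq_hist, index_threshold=0.05):
        self.freq_hist = freq_hist              # value -> row count
        self.total_rows = sum(freq_hist.values())
        self.index_threshold = index_threshold  # assumed cut-off, not an Oracle setting
        self.plans = {}                         # SQL text -> cached plan

    def execute(self, sql_text, bind_value):
        if sql_text not in self.plans:
            # Hard parse: peek at the bind value and consult the histogram.
            share = self.freq_hist.get(bind_value, 0) / self.total_rows
            if share <= self.index_threshold:
                self.plans[sql_text] = "INDEX RANGE SCAN"
            else:
                self.plans[sql_text] = "TABLE ACCESS FULL"
        # Later executions reuse the cached plan regardless of the bind value.
        return self.plans[sql_text]

hist = {1: 10, 0: 9_990}   # hypothetical: status 1 is rare, status 0 is common
sql = "SELECT * FROM orders WHERE status = :status"

cache = ToyPlanCache(hist)
first = cache.execute(sql, 1)    # peeks at the rare value
second = cache.execute(sql, 0)   # common value, but the plan is already cached
print(first, "/", second)

fresh = ToyPlanCache(hist)
reversed_order = fresh.execute(sql, 0)   # peeking at the common value first
print(reversed_order)
```

The plan therefore depends on which value happens to arrive first, which is why the same statement can behave well one week and badly the next.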
  • #23: The RBO workaround is forgivable because it’s all the RBO environment could offer as an option. The CBO technique shown here is particularly bad because it makes the application less flexible and therefore less able to respond appropriately to system changes. Ideally, if you (the developer) already know that data for certain columns tends to skew, you can write code to account for it. A good guideline to follow is to look at the number of distinct values in the column. If the column has only a few distinct values, then hard-coding the value will allow the optimizer to correctly choose the plan based on histogram data. If there are a lot of distinct values, but you know in advance the actual skewed values, you could write conditional code to use a bind variable in all cases except when the known skewed values are requested. In that case, the conditional code would branch to a SQL statement version which hard-codes the skewed value under those circumstances.
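The conditional approach described in this note might look like the following Python sketch. The table name, column name and the set of known skewed values are hypothetical, and the function only builds the statement text and bind dictionary; it does not talk to a database. Literals are emitted only for values on the known list, so nothing supplied by a caller is ever concatenated into the SQL.

```python
# Hypothetical: integer status values known in advance to be heavily skewed.
KNOWN_SKEWED_STATUSES = {0}

def build_status_query(status):
    """Return (sql_text, binds). Known skewed values are hard-coded as literals
    so the optimizer can use the histogram; everything else uses a bind variable."""
    base = "SELECT * FROM orders WHERE status = "
    if status in KNOWN_SKEWED_STATUSES:
        # Safe to inline: the literal comes from the fixed list above.
        return base + str(int(status)), {}
    return base + ":status", {"status": status}

skewed_sql, skewed_binds = build_status_query(0)
normal_sql, normal_binds = build_status_query(1)
print(skewed_sql, skewed_binds)
print(normal_sql, normal_binds)
```

This keeps a single shared cursor for the common case while giving each skewed value its own statement text, and therefore its own plan.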