Exploring Oracle Database
Performance Tuning Best Practices for
DBAs and Developers
* 39 years old
* Married + 3
* 16 years as a DBA, consultant, instructor, architect.
* CEO @ DBcs Ltd.
* Was CTO @ John Bryce Israel
* Oracle Certified Professional
* Microsoft SQL Server Certified Professional
Agenda
• Oracle Database Architecture Overview
• The connection between SQL tuning & Instance tuning
• The connection between database & operating system
• Common bottlenecks - Drill down
• How do you identify the source of the problem?
• Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
• Solutions: where do you start and what order to work?
• Introduction to SQL and Application Tuning
• The Oracle Optimizer:
• Rule Based Optimization (overview)
• Cost Based Optimization
• The Different Modes of the Cost Based Optimizer
• Execution Plans
• Data Access Methods
• Indexes – Types, Classifications, Advantages & Disadvantages
• Sort Usage Guidelines
• When and What to Tune?
• Clustering factor
• Data Types are Important
• Integrity Constraints are Important
• Reasons for Inefficient SQL Performance
• Using Bind Variables
• Restructuring SQL Statements
• Shared SQL and Cursors
• Advanced SQL and Application Topics
"You have to be constantly
evolving and in some cases
DBAs/Programmers don’t do that because
they know how they did it
years ago and they want to
keep doing it that way..."
Quote from Thomas Kyte's book
if you want a 10 step guide to tuning a query, buy a piece of software. You are not needed in this process, anyone
can put a query in, get a query out and run it to see if it is faster. There are tons of these tools on the market. They
work using rules (heuristics) and can tune maybe 1% of the problem queries out there. They APPEAR to be able to
tune a much larger percent but that is only because the people using these tools never look at the outcome -- hence
they continue to make the same basic mistakes over and over and over.
If you want to really be able to tune the other 99% of the queries out there, knowledge of lots of stuff -- physical
storage mechanisms, access paths, how the optimizer works -- that's the only way.
..
..
Think about it for a moment. If there were a 10 step or even 1,000,000 step process by which any query can be
tuned (or even X% of queries for that matter), we would write a program to do it. Oh don't get me wrong, there are
many programs that actually try to do this - Oracle Enterprise Manager with its tuning pack, SQL Navigator and
others. What they do is primarily recommend indexing schemes to tune a query, suggest materialized views, offer to
add hints to the query to try other access plans. They show you different query plans for the same statement and
allow you to pick one. They offer "rules of thumb" (what I generally call ROT since the acronym and the word it
maps to are so appropriate for each other) SQL optimizations - which, if they were universally applicable, the
optimizer would do it as a matter of fact. In fact, the cost based optimizer does that already - it rewrites our queries
all of the time. These tuning tools use a very limited set of rules that sometimes can suggest that index or set of
indexes you really should have thought of during your design.
Oracle Database Architecture Overview
Oracle Database Memory Structures: Overview
[Diagram: the SGA (shared pool, large pool, Java pool, Streams pool, database buffer cache, redo log buffer) and the aggregated PGA, serving the server and background processes]
Database Buffer Cache
• Is a part of the SGA
• Holds copies of data blocks that are read from data files
• Is shared by all concurrent processes
[Diagram: the server process reads blocks from the data files into the database buffer cache in the SGA; the database writer process (DBWn) writes dirty buffers back to the data files]
Redo Log Buffer
• Is a circular buffer in the SGA (based on the number of
CPUs)
• Contains redo entries that have the information to redo
changes made by operations, such as DML and DDL
[Diagram: the server process places redo entries in the redo log buffer in the SGA; the log writer process (LGWR) writes them to the online redo log files]
Shared Pool
• Is part of the SGA
– Contains:
• Library cache
• Shared parts of SQL and
PL/SQL statements
• Data dictionary cache
• Result cache:
• SQL queries
• PL/SQL functions
• Control structures
• Locks
[Diagram: the shared pool within the SGA, containing the library cache, the data dictionary cache (row cache), the result cache, and control structures, accessed by server processes]
Processing a DML Statement: Example
[Diagram: a DML statement travels from the user process to the server process, which parses it against the library cache in the shared pool; the required data and undo blocks are read from the data files into the database buffer cache; the change is recorded in the redo log buffer and applied to the cached buffers; DBWn later writes the dirty buffers to the data files]
COMMIT Processing: Example
[Diagram: on COMMIT, the server process marks the transaction complete and LGWR writes the redo log buffer to the online redo log files; DBWn writes modified buffers to the data files on its own schedule, independently of the commit]
Program Global Area (PGA)
PGA is a memory area that contains:
• Session information
• Cursor information
• SQL execution work areas:
• Sort area
• Hash join area
• Bitmap merge area
• Bitmap create area
• Work area size influences SQL performance.
• Work areas can be automatically or manually
managed.
[Diagram: a server process with its stack space and User Global Area (UGA); the UGA holds the user session data, cursor status, and SQL area]
Background Process Roles
[Diagram: background processes PMON, SMON, DBWn, LGWR, CKPT, ARCn, MMON, MMAN, CJQ0, QMNn, and RCBG surrounding the SGA (database buffer cache, shared pool, redo log buffer)]
Automatic Shared Memory Management
Which size to choose?
[Diagram: with SGA_TARGET set (and STATISTICS_LEVEL at TYPICAL or ALL), the shared pool, large pool, Java pool, Streams pool, and database buffer cache are tuned automatically; the fixed SGA and redo log buffer are not auto-tuned]
Automated SQL Execution Memory Management
Which size to choose?
[Diagram: server and background processes drawing work areas from the aggregated PGA, which is capped by PGA_AGGREGATE_TARGET]
Automatic Memory Management
• Sizing of each memory component is vital for SQL
execution performance.
• It is difficult to manually size each component.
• Automatic memory management automates memory
allocation of each SGA component and aggregated PGA.
[Diagram: with MEMORY_TARGET set (and STATISTICS_LEVEL enabled), MMAN shifts memory among the buffer cache, shared pool, large pool, Java pool, Streams pool, other SGA, and private SQL areas, while tracking untunable PGA and free memory]
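As a concrete illustration, here is a minimal SQL*Plus sketch of switching between the two models; it assumes a test instance started from an SPFILE, and the sizes are placeholders rather than recommendations:
SQL> -- Automatic Memory Management: one target for SGA + PGA combined
SQL> ALTER SYSTEM SET memory_target = 4G SCOPE = SPFILE;
SQL> -- or: Automatic Shared Memory Management plus a separate PGA target
SQL> ALTER SYSTEM SET sga_target = 3G SCOPE = SPFILE;
SQL> ALTER SYSTEM SET pga_aggregate_target = 1G SCOPE = SPFILE;
SQL> -- after a restart, see how MMAN has sized each component
SQL> SELECT component, current_size/1024/1024 AS mb
  2  FROM v$memory_dynamic_components
  3  WHERE current_size > 0;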
The connection between SQL tuning & Instance
tuning
Database tuning is the process of tuning the actual database, which
encompasses the allocated memory, disk usage, CPU, I/O, and underlying
database processes.
Tuning a database also involves the management and manipulation of the
database structure itself, such as the design and layout of tables and indexes.
Additionally, database tuning often involves the modification of the database
architecture in order to optimize the use of the hardware resources available.
There are many other considerations when tuning a database, but these tasks
are normally accomplished by the database administrator.
The objective of database tuning is to ensure that the database has been
designed in a way that best accommodates expected activity within the
database.
The connection between SQL tuning & Instance
tuning
SQL tuning is the process of tuning the SQL statements that access
the database.
These SQL statements include database queries and transactional
operations such as inserts, updates, and deletes.
The objective of SQL statement tuning is to formulate statements
that most effectively access the database in its current state, taking
advantage of database and system resources and indexes.
The connection between SQL tuning & Instance
tuning
Both database tuning and SQL statement tuning must be performed to achieve
optimal results when accessing the database.
A poorly tuned database can waste much of the effort invested in SQL tuning, and
vice versa.
Ideally, it is best to first tune the database, and then ensure that indexes exist
where needed, and then tune the SQL code.
The connection between SQL tuning & Instance
tuning
The connection between database & operating
system
Question: We are in the process of adopting Oracle and we have many choices
of operating system platforms. Which OS is best for Oracle and how do I compare
operating system environments for Oracle databases?
Answer: That's a very common question. Oracle dominates the database world in
part because it runs on over 60 platforms, everything from a Mainframe to a Mac.
Oracle chose Solaris as their preferred OS in 2005, and later decided to work on
their own Linux distro, making an Oracle Linux OS that is custom-tailored to the
needs of a typical database. Oracle leverages the advantages of all OS
platforms with an independent OSI, customized to each platform.
As to which UNIX dialect is "best", it's often related to the server environment. For
example, svmon is only available on IBM AIX. . . .
Some operating systems are better at managing large volumes of data, such as
SUSE, which developed a special kernel just for Oracle.
The connection between database & operating
system
Data integrity features (T10 Protection Information)
Protection Information enables applications or kernel subsystems to attach
metadata to I/O operations, allowing devices that support PI to verify the
integrity of those operations before passing them further down the stack and
physically committing them to disk.
Data Integrity Extensions (DIX) is a hardware feature that enables the exchange
of protection metadata between the host operating system and the HBA,
helping to prevent corrupt data from being written and allowing a full
end-to-end data integrity check.
The connection between database & operating
system
Zero downtime updates
Make updates to the Linux Operating System (OS) kernel, while it is running,
without a reboot or any interruption.
Only Oracle Linux offers this unique capability, making it possible to keep up with
important Linux kernel updates without burdening you with the operational cost
and disruption of rebooting for every update to the
kernel.
Ksplice allows system administrators to deliver valuable patches for both the
Unbreakable Enterprise Kernel as well as the Red Hat compatible kernel with
lower costs, less downtime, increased security, and greater flexibility and control.
The connection between database & operating
system
Btrfs
Btrfs (B-tree file system) is the "next generation file system" for Linux.
Pronounced "Butter FS" or "B-tree FS", it is a GPL-licensed file system first developed by
Oracle's Chris Mason in 2007.
Btrfs provides a number of features that make it a very attractive file system solution for
local disk storage.
Btrfs is designed for:
• Large files and file systems from the ground up
• Simplified administration
• Integrated RAID and volume management
• Snapshots
• Checksums for data and meta-data
The connection between database & operating
system
10 common performance issues
Common bottlenecks - Drill down
"Not every suggestion is a good suggestion,
even if it's from the software provider himself."
Aaron Shilo :-)
Common bottlenecks - Drill down
Once upon a time, Oracle Support had a note called Script: Lists All Indexes that Benefit from a Rebuild (Doc ID
122008.1) which, let's just say, I didn't view in a particularly positive light :-) Mainly because it gave dubious advice
which included that indexes should be rebuilt if:
Deleted entries represent 20% or more of current entries
The index depth is more than 4 levels
It then detailed a script that ran a Validate Structure across all indexes in the database that didn’t belong in
either the SYS or SYSTEM schema.
This script basically read through and sequentially locked all tables (maybe multiple times) in the database in
order to list indexes that might not actually need a rebuild while potentially missing out on some that do. I could
write a script that achieved the same result with far less overheads. For example, SELECT index_name FROM
DBA_INDEXES where index_name like 'A%' and owner not in ('SYS', 'SYSTEM') would achieve a very similar result :-)
Posted by Richard Foote in Doc 122008.1, Doc 989093.1, Index Rebuild, Oracle Indexes
Bad connection management
• The application connects and disconnects for each database
interaction.
• This problem is common with stateless middleware in application
servers.
• It has over two orders of magnitude impact on performance, and is
totally unscalable.
Common bottlenecks - Drill down
Bad use of cursors and the shared pool
• Not using cursors results in repeated parses.
• If bind variables are not used, then there is hard parsing of all SQL
statements.
• This has an order of magnitude impact in performance, and it is totally
unscalable.
• Use cursors with bind variables that open the cursor and execute it
many times.
• Be suspicious of applications generating dynamic SQL.
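As a sketch of the right pattern (the EMPLOYEES table and column names follow the familiar HR sample schema and are placeholders here), PL/SQL binds loop variables automatically, so one cursor is parsed once and executed many times:
DECLARE
  v_cnt PLS_INTEGER;
BEGIN
  FOR i IN 1 .. 1000 LOOP
    -- i becomes a bind variable: all 1000 executions share one
    -- cursor and one parsed plan in the shared pool
    SELECT COUNT(*) INTO v_cnt
    FROM employees
    WHERE employee_id = i;
  END LOOP;
END;
/
-- the anti-pattern: concatenating the literal into dynamic SQL forces
-- a hard parse for every distinct value, e.g.
-- EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM employees WHERE employee_id = ' || i INTO v_cnt;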
Common bottlenecks - Drill down
Bad SQL
• Bad SQL is SQL that uses more resources than appropriate for the
application requirement.
• This can be a decision support systems (DSS) query that runs for
more than 24 hours, or a query from an online application that
takes more than a minute.
• You should investigate SQL that consumes significant system
resources for potential improvement.
• ADDM identifies high load SQL.
• SQL Tuning Advisor can provide recommendations for
improvement.
Common bottlenecks - Drill down
Use of nonstandard initialization parameters
• These might have been implemented based on poor advice or
incorrect assumptions.
• Most databases provide acceptable performance using only the set
of basic parameters.
• In particular, parameters associated with SPIN_COUNT on latches
and undocumented optimizer features can cause a great deal of
problems that can require considerable investigation.
• Likewise, optimizer parameters set in the initialization parameter file
can override proven optimal execution plans.
• For these reasons, schemas, schema statistics, and optimizer
settings should be managed as a group to ensure consistency of
performance.
Common bottlenecks - Drill down
Getting database I/O wrong
• Many sites lay out their databases poorly over the available disks.
• Other sites specify the number of disks incorrectly, because they
configure disks by disk space and not I/O bandwidth.
Common bottlenecks - Drill down
Online redo log setup problems
• Many sites run with too few online redo log files and files that
are too small.
• Small redo log files cause system checkpoints to continuously
put a high load on the buffer cache and I/O system.
• If too few redo log files exist, then the archive cannot keep up,
and the database must wait for the archiver to catch up.
Common bottlenecks - Drill down
All online redo log files should be the same size and configured to switch approximately once an hour during normal
activity. They should switch no more frequently than every 20 minutes during peak activity.
There should be a minimum of four online log groups to prevent LGWR from waiting for a group to be available following a
log switch. A group may be unavailable because a checkpoint has not yet completed or the group has not yet been archived.
http://docs.oracle.com/cd/B12037_01/server.101/b10726/configbp.htm#1006950
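To see how your own logs behave, you can bucket log switches by hour; a minimal sketch against V$LOG_HISTORY:
SQL> SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
  2         COUNT(*) AS switches
  3  FROM v$log_history
  4  WHERE first_time > SYSDATE - 7
  5  GROUP BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
  6  ORDER BY 1;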
Serialization
• Serialization of data blocks in the buffer cache due to lack of free
lists, free list groups, transaction slots (INITRANS), or shortage of
rollback segments.
• This is particularly common on INSERT-heavy applications, in
applications that have raised the block size above 8K, or in
applications with large numbers of active users and few rollback
segments.
• Use automatic segment-space management (ASSM) and
automatic undo management to solve this problem.
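Both remedies are declarative; a minimal sketch, where the tablespace name, file name, and sizes are placeholders:
SQL> -- ASSM replaces manually managed free lists / free list groups
SQL> CREATE TABLESPACE app_data
  2    DATAFILE 'app_data01.dbf' SIZE 1G
  3    EXTENT MANAGEMENT LOCAL
  4    SEGMENT SPACE MANAGEMENT AUTO;
SQL> -- automatic undo replaces manually managed rollback segments
SQL> ALTER SYSTEM SET undo_management = AUTO SCOPE = SPFILE;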
Common bottlenecks - Drill down
Long full table scans
• Long full table scans for high-volume or interactive online operations
could indicate poor transaction design, missing indexes, or poor
SQL optimization.
• Long table scans, by nature, are I/O intensive and unscalable.
Common bottlenecks - Drill down
High amounts of recursive (SYS) SQL
• Large amounts of recursive SQL executed by SYS could indicate space
management activities, such as extent allocations, taking place.
• This is unscalable and impacts user response time.
• Use locally managed tablespaces to reduce recursive SQL due to
extent allocation.
• Recursive SQL executed under another user ID is probably SQL and
PL/SQL, and this is not a problem.
Common bottlenecks - Drill down
Deployment and migration errors
• In many cases, an application uses too many resources because the
schema owning the tables has not been successfully migrated from
the development environment or from an older implementation.
• Examples of this are missing indexes or incorrect statistics.
• These errors can lead to sub-optimal execution plans and poor
interactive user performance.
• When migrating applications of known performance, export the
schema statistics to maintain plan stability using the DBMS_STATS
package .
• Although these errors are not directly detected by ADDM, ADDM
highlights the resulting high load SQL.
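The export step mentioned above, as a hedged DBMS_STATS sketch (the schema and staging table names are placeholders):
SQL> -- on the source system: stage and export the schema statistics
SQL> EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => 'APP', stattab => 'STATS_STAGE')
SQL> EXEC DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'STATS_STAGE')
SQL> -- move STATS_STAGE to the target (e.g. with Data Pump), then:
SQL> EXEC DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'STATS_STAGE')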
Common bottlenecks - Drill down
The Oracle Optimizer:
• Rule Based Optimization (overview)
• Cost Based Optimization
• The Different Modes of the Cost Based
Optimizer
• Execution Plans
• Data Access Methods
• Indexes – Types, Classifications, Advantages &
Disadvantages
• Sort Usage Guidelines
The Oracle Optimizer:
• The optimizer determines the most efficient way to execute
a SQL statement after considering many factors related to
the objects referenced and the conditions specified in the
query.
• This determination is an important step in the processing of
any SQL statement and can greatly affect execution time.
The Oracle Optimizer:
The Oracle Optimizer: SQL Statement Parsing, Overview
[Flowchart: a parse call triggers a syntactic and semantic check, a privileges check, and allocation of a private SQL area. Oracle then checks whether a shared SQL area with the parsed representation of the statement already exists: if yes, it is reused (soft parse); if no, a shared SQL area is allocated and the parse operation (optimization) runs (hard parse). The statement is then executed.]
The Oracle Optimizer: Why Do You Need an Optimizer?
SELECT * FROM emp WHERE job = 'MANAGER';
[Diagram: only 1% of employees are managers. Given the query to optimize, the optimizer uses statistics and schema information to weigh the possible access paths ("use the index" versus "read each row and check"), asks which one is faster, and arrives at a plan: use the index.]
The Oracle Optimizer: Why Do You Need an Optimizer?
SELECT * FROM emp WHERE job = 'MANAGER';
[Diagram: the same query, but now 80% of employees are managers. Weighing the same possible access paths against the statistics, the optimizer instead chooses a full table scan.]
• Using the RBO, the optimizer chooses an execution plan based on the access paths
available and the ranks of these access paths. Oracle's ranking of the access paths is
heuristic. If there is more than one way to execute a SQL statement, then the RBO
always uses the operation with the lower rank. Usually, operations of lower rank execute
faster than those associated with constructs of higher rank.
The list shows access paths and their ranking:
• RBO Path 1: Single Row by Rowid
• RBO Path 2: Single Row by Cluster Join
• RBO Path 3: Single Row by Hash Cluster Key with Unique or Primary Key
• RBO Path 4: Single Row by Unique or Primary Key
• RBO Path 5: Clustered Join
• RBO Path 6: Hash Cluster Key
• RBO Path 7: Indexed Cluster Key
• RBO Path 8: Composite Index
• RBO Path 9: Single-Column Indexes
• RBO Path 10: Bounded Range Search on Indexed Columns
• RBO Path 11: Unbounded Range Search on Indexed Columns
• RBO Path 12: Sort Merge Join
• RBO Path 13: MAX or MIN of Indexed Column
• RBO Path 14: ORDER BY on Indexed Column
• RBO Path 15: Full Table Scan
The Oracle Optimizer: Rule Based Optimization (overview)
The CBO performs the following steps:
•The optimizer generates a set of potential plans for the SQL statement based on
available access paths and hints.
•The optimizer estimates the cost of each plan based on statistics in the data
dictionary for the data distribution and storage characteristics of the tables,
indexes, and partitions accessed by the statement.
•The cost is an estimated value proportional to the expected resource use needed
to execute the statement with a particular plan. The optimizer calculates the cost of
access paths and join orders based on the estimated computer resources, which
includes I/O, CPU, and memory.
•Serial plans with higher costs take more time to execute than those with smaller
costs. When using a parallel plan, however, resource use is not directly related to
elapsed time.
•The optimizer compares the costs of the plans and chooses the one with the
lowest cost.
The Oracle Optimizer: Cost Based Optimization
The following features require use of the CBO:
•Partitioned tables and indexes
•Index-organized tables
•Reverse key indexes
•Function-based indexes
• SAMPLE clauses in a SELECT statement
•Parallel query and parallel DML
•Star transformations and star joins
•Extensible optimizer
•Query rewrite with materialized views
•Enterprise Manager progress meter
•Hash joins
•Bitmap indexes and bitmap join indexes
•Index skip scans
The Oracle Optimizer: Cost Based Optimization
•Consists of two components:
• Estimator
• Plan generator
•Estimator determines cost of optimization suggestions made by the plan
generator:
• Cost: Optimizer’s best estimate of the number of standardized I/Os
made to execute a particular statement optimization
•Plan generator:
• Tries out different statement optimization techniques
• Uses the estimator to cost each optimization suggestion
• Chooses the best optimization suggestion based on cost
• Generates an execution plan for best optimization
The Oracle Optimizer: Cost Based Optimization
•Selectivity is the estimated proportion of a row set retrieved by a particular
predicate or combination of predicates.
•It is expressed as a value between 0.0 and 1.0:
• High selectivity: Small proportion of rows
• Low selectivity: Big proportion of rows
•Selectivity computation:
• If no statistics: Use dynamic sampling
• If no histograms: Assume even distribution of rows
•Statistic information:
• DBA_TABLES and DBA_TAB_STATISTICS (NUM_ROWS)
• DBA_TAB_COL_STATISTICS (NUM_DISTINCT, DENSITY,
HIGH/LOW_VALUE,…)
The Oracle Optimizer: Estimator: Selectivity
Selectivity = (number of rows satisfying a condition) / (total number of rows)
•Expected number of rows retrieved by a particular operation in the
execution plan
•Vital figure to determine join, filters, and sort costs
•Simple example:
• The number of distinct values in DEV_NAME is 203.
• The number of rows in COURSES (original cardinality) is 1018.
• Selectivity = 1/203 = 4.926e-03
• Cardinality = (1/203)*1018 = 5.01 (rounded up to 6)
The Oracle Optimizer: Estimator: Cardinality
SELECT days FROM courses WHERE dev_name = 'ANGEL';
Cardinality = Selectivity * Total number of rows
The Oracle Optimizer: Estimator: Cost
• Cost is the optimizer’s best estimate of the number of
standardized I/Os it takes to execute a particular statement.
• Cost unit is a standardized single block random read:
• 1 cost unit = 1 SRds
• The cost formula combines three different costs units into
standard cost units.
Cost = (#SRds*sreadtim + #MRds*mreadtim + #CPUCycles/cpuspeed) / sreadtim
(single block I/O cost + multiblock I/O cost + CPU cost, normalized to single block reads)
#SRds: Number of single block reads
#MRds: Number of multiblock reads
#CPUCycles: Number of CPU Cycles
Sreadtim: Single block read time
Mreadtim: Multiblock read time
Cpuspeed: Millions of instructions per second
The Oracle Optimizer:The Different Modes of the Cost Based Optimizer
Value: Description
CHOOSE: The optimizer chooses between a cost-based approach and a rule-based approach, depending on whether statistics
are available. This is the default value.
•If the data dictionary contains statistics for at least one of the accessed tables, then the optimizer uses a cost-based
approach and optimizes with a goal of best throughput.
•If the data dictionary contains only some statistics, then the cost-based approach is still used, but the optimizer must
guess the statistics for the tables without any statistics. This can result in suboptimal execution plans.
•If the data dictionary contains no statistics for any of the accessed tables, then the optimizer uses a rule-based
approach.
ALL_ROWS: The optimizer uses a cost-based approach for all SQL statements in the session regardless of the presence of statistics
and optimizes with a goal of best throughput (minimum resource use to complete the entire statement).
FIRST_ROWS_n: The optimizer uses a cost-based approach, regardless of the presence of statistics, and optimizes with a goal of best
response time to return the first n rows; n can equal 1, 10, 100, or 1000.
FIRST_ROWS: The optimizer uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows.
Note: Using heuristics sometimes leads the CBO to generate a plan whose cost is significantly larger than the cost
of a plan without the heuristic. FIRST_ROWS is available for backward compatibility and plan stability.
RULE: The optimizer chooses a rule-based approach for all SQL statements regardless of the presence of statistics.
The Oracle Optimizer: What Is an Execution Plan?
• The execution plan of a SQL statement is composed of small
building blocks called row sources for serial execution plans.
• The combination of row sources for a statement is called the
execution plan.
• By using parent-child relationships, the execution plan can be
displayed in a tree-like structure (text or graphical).
The Oracle Optimizer: Where to Find Execution Plans?
• PLAN_TABLE (EXPLAIN PLAN or SQL*Plus autotrace)
• V$SQL_PLAN (Library Cache)
• V$SQL_PLAN_MONITOR (11g)
• DBA_HIST_SQL_PLAN (AWR)
• STATS$SQL_PLAN (Statspack)
• SQL Management Base (SQL Plan Management Baselines)
• SQL tuning set
• Trace files generated by DBMS_MONITOR
• Event 10053 trace file
• Process state dump trace file since 10gR2
The Oracle Optimizer: How To Read?
SQL> explain plan for
2 select e.empno, e.ename, d.dname
3 from emp e, dept d
4 where e.deptno = d.deptno
5 and e.deptno = 10;
Explained.
SQL> SELECT * FROM table(dbms_xplan.display(null,null,'basic'));
PLAN_TABLE_OUTPUT
------------------------------------------------
Plan hash value: 568005898
------------------------------------------------
| Id | Operation | Name |
------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | NESTED LOOPS | |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPT |
| 3 | INDEX UNIQUE SCAN | PK_DEPT |
| 4 | TABLE ACCESS FULL | EMP |
------------------------------------------------
1. Operation 0 is the root of the tree; it has one child, Operation 1.
2. Operation 1 has two children: Operations 2 and 4.
3. Operation 2 has one child, Operation 3.
The Oracle Optimizer: How To Read?
Operation 0 (SELECT STATEMENT)
└── Operation 1 (NESTED LOOPS)
    ├── Operation 2 (TABLE ACCESS BY INDEX ROWID)
    │   └── Operation 3 (INDEX UNIQUE SCAN)
    └── Operation 4 (TABLE ACCESS FULL)
This is the graphical representation of the execution plan.
Reading the tree: to perform Operation 1, you need the output of Operations 2 and 4.
Operation 2 comes first; to perform it, you must first perform its child, Operation 3.
Operation 4 is then performed for each row returned by Operation 2.
Oracle supports the following access methods:
•Full Table SCAN (FTS)
•Table Access by ROW-ID
•Index Unique Scan
•Index Range Scan
•Index Skip Scan
•Full Index Scan
•Fast Full Index Scans
•Index Joins
•Hash Access
•Cluster Access
•Bitmap Index
The Oracle Optimizer: Data Access Methods
Guidelines for Managing Indexes
• Create indexes after inserting table data
• Index the correct tables and columns
• Order index columns for performance
• Limit the number of indexes for each table
• Drop indexes that are no longer needed
• Understand deferred segment creation
• Estimate index size and set storage parameters
• Specify the tablespace for each index
• Consider parallelizing index creation
• Consider creating indexes with NOLOGGING
• Understand when to use unusable or invisible indexes
• Consider costs and benefits of coalescing or rebuilding indexes
• Consider cost before disabling or dropping constraints
The Oracle Optimizer: Indexes – Types, Classifications,
Advantages & Disadvantages
Index Type: Usage
B-tree: Default, balanced tree index; good for high-cardinality (high degree of distinct values) columns
B-tree cluster: Used with clustered tables
Hash cluster: Used with hash clusters
Function-based: Good for columns that have SQL functions applied to them
Indexed virtual column: Good for columns that have SQL functions applied to them; viable alternative to using a function-based index
Reverse-key: Useful to balance I/O in an index that has many sequential inserts
Key-compressed: Useful for concatenated indexes where the leading column is often repeated; compresses leaf block entries
Bitmap: Useful in data warehouse environments with low-cardinality columns; these indexes aren't appropriate for online transaction processing (OLTP) databases where rows are heavily updated
Bitmap join: Useful in data warehouse environments for queries that join fact and dimension tables
Global partitioned: Global index across all partitions of a partitioned table
Local partitioned: Local index based on individual partitions of a partitioned table
Domain: Specific to an application or cartridge
[Diagram: physical layout of a table and B-tree index]
The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
When you put indexes on a partitioned table, you have the choice between
GLOBAL and LOCAL .
The LOCAL index partitions follow the table partitions :
They have the same partition key & type, get created automatically when new table partitions are added
and get dropped automatically when table partitions are dropped .
Beware: LOCAL indexes are usually not appropriate for OLTP access on the table, because one server process
may then have to scan through many index partitions.
This is the cause of most of the scary performance horror stories you may have heard about partitioning!
A GLOBAL index spans all partitions. It usually has good SELECT performance, but it is more sensitive to
partition maintenance than LOCAL indexes and needs to be rebuilt more often.
The Oracle Optimizer: Optimizer Statistics
• Describe the database and the objects in the database
• Information used by the query optimizer to estimate:
• Selectivity of predicates
• Cost of each execution plan
• Access method, join order, and join method
• CPU and input/output (I/O) costs
• Refreshing optimizer statistics whenever they are stale is as
important as gathering them:
• Automatically gathered by the system
• Manually gathered by the user with DBMS_STATS
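A minimal manual-gathering sketch (the schema and table names are placeholders):
SQL> BEGIN
  2    DBMS_STATS.GATHER_TABLE_STATS(
  3      ownname          => 'APP',
  4      tabname          => 'ORDERS',
  5      estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
  6      method_opt       => 'FOR ALL COLUMNS SIZE AUTO',
  7      cascade          => TRUE);  -- also gathers the table's index statistics
  8  END;
  9  /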
The Oracle Optimizer: Optimizer Statistics
A common misperception is that if no new statistics are gathered (and
assuming nothing else is altered in the database), execution plans must
always remain the same.
That by not collecting statistics, one somehow can ensure and guarantee
the database will simply perform in the same manner and generate the
same execution plans.
This is fundamentally not true.
In fact, quite the opposite can be true.
One might need to collect fresh statistics to make sure vital execution plans
don’t change.
It’s the act of not refreshing statistics that can cause execution plans to
suddenly change.
explain plan changes with no stat change.sql
The Oracle Optimizer: Types of Optimizer Statistics
• Table statistics:
• Number of rows
• Number of blocks
• Average row length
• Index Statistics:
• B*-tree level
• Distinct keys
• Number of leaf blocks
• Clustering factor
• System statistics
• I/O performance and utilization
• CPU performance and utilization
The Oracle Optimizer: Histograms
• The optimizer assumes uniform distributions; this may lead to
suboptimal access plans in the case of data skew.
• Histograms:
• Store additional column distribution information
• Give better selectivity estimates in the case of nonuniform
distributions
• With unlimited resources you could store each different value and
the number of rows for that value.
• This becomes unmanageable for a large number of distinct values
and a different approach is used:
• Frequency histogram (#distinct values ≤ #buckets)
• Height-balanced histogram (#buckets < #distinct values)
• They are stored in DBA_TAB_HISTOGRAMS.
The Oracle Optimizer: Frequency Histograms
[Chart: a frequency histogram with 10 buckets for 10 distinct values (1, 3, 5, 7, 10, 16, 27, 32, 39, 49) over 40,001 rows. ENDPOINT VALUE records the column value; ENDPOINT NUMBER records the cumulative cardinality, so the step height at each value is the number of rows holding that value.]
The Oracle Optimizer: Height-Balanced Histograms
[Chart: a height-balanced histogram with 5 buckets (about 8,000 rows per bucket) over the same 10 distinct values and 40,001 rows. ENDPOINT NUMBER is the bucket number (0 through 5) and ENDPOINT VALUE is the column value at each bucket endpoint (1, 7, 10, 10, 32, 49); value 10 closes two buckets, marking it as a popular value.]
The Oracle Optimizer: Height-Balanced Histograms
In a height-balanced histogram, the ordered column values are divided
into bands so that each band contains approximately the same
number of rows.
The histogram tells you values of the endpoints of each band.
In the example in the slide, assume that you have a column that is
populated with 40,001 numbers.
There will be 8,000 values in each band.
You only have ten distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, and 49.
Value 10 is the most popular value with 16,293 occurrences.
When the number of buckets is less than the number of distinct
values, ENDPOINT_NUMBER records the bucket number and
ENDPOINT_VALUE records the column value that corresponds to
this endpoint.
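As a sketch tied to the example above (COURSES and DEV_NAME are the deck's example names), you can request a histogram explicitly and then inspect its endpoints:
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'COURSES', method_opt => 'FOR COLUMNS dev_name SIZE 254')
SQL> SELECT endpoint_number, endpoint_value
  2  FROM user_tab_histograms
  3  WHERE table_name = 'COURSES'
  4    AND column_name = 'DEV_NAME'
  5  ORDER BY endpoint_number;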
Focusing on benchmark issues: physical IO, logical
reads, shared pool, buffer cache
Buffer cache
For many types of operations, Oracle Database uses the buffer cache to store
data blocks read from disk.
Oracle Database bypasses the buffer cache for particular operations, such as
sorting and parallel reads.
To use the database buffer cache effectively, tune SQL statements for the
application to avoid unnecessary resource consumption.
To meet this goal, verify that frequently executed SQL statements and SQL
statements that perform many buffer gets are well-tuned.
When configuring a new database instance, it is impossible to know the correct
size for the buffer cache.
Typically, a database administrator makes a first estimate for the cache size, then
runs a representative workload on the instance and examines the relevant
statistics to see whether the cache is under-configured or over-configured.
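One way to make that check concrete is V$DB_CACHE_ADVICE, which estimates physical reads at smaller and larger cache sizes under the current workload; a minimal sketch, assuming the default 8 KB block size:
SQL> SELECT size_for_estimate AS cache_mb,
  2         estd_physical_read_factor,
  3         estd_physical_reads
  4  FROM v$db_cache_advice
  5  WHERE name = 'DEFAULT'
  6    AND block_size = 8192;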
Focusing on benchmark issues: physical IO, logical
reads, shared pool, buffer cache
What is a Physical I/O??
Whenever you execute a query, Oracle has to go and fetch data to give you the
result of the query execution.
Here, data means the actual data in data blocks.
Whenever a new data block is requested, it has to be fetched from the physical
datafiles residing on the physical disks.
This fetching of data blocks from the physical disk involves an I/O operation known
as physical I/O.
By virtue of this physical I/O, now the block has been fetched and read into the
memory area called buffer cache.
This is a default action.
We know that a data block might be requested multiple times by multiple queries.
Focusing on benchmark issues: physical IO, logical
reads, shared pool, buffer cache
What is a Logical I/O??
Once a physical I/O has taken place and the block has been read into memory,
the next request for the same data block won't require the block to be fetched from
the disk, thus avoiding a physical I/O.
So to return the results for a select query requesting the same data block,
the block is fetched from memory; this is called a logical I/O.
Whenever the quantum of logical I/O is calculated, two kinds of reads are
considered: consistent reads and current reads.
Jointly, these two statistics are known as the logical I/O performed by Oracle.
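A quick way to watch the split for your own session (the statistic names are as they appear in V$STATNAME):
SQL> SELECT n.name, s.value
  2  FROM v$mystat s
  3  JOIN v$statname n ON n.statistic# = s.statistic#
  4  WHERE n.name IN ('consistent gets', 'db block gets', 'physical reads');
-- 'consistent gets' + 'db block gets' = logical I/O; run this before and
-- after your query and compare the values.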
Focusing on benchmark issues: physical IO, logical
reads, shared pool, buffer cache
Consistent reads
It is a well known fact that whenever a change is induced in a data block, the old
data/entry is written to the UNDO/ROLLBACK segments.
From the fundamentals of UNDO, we also know that this is to provide a read
consistent view of the data block to other users trying to read the same data
block.
Consistent reads mean reading the block in a consistent mode “point in time”.
Here the phrase “point in time” means the time when the query/statement began.
A consistent read might or might not involve any UNDO data.
UNDO data will be applied when it is necessary to roll back a data block to the
required “point in time” when the SQL statement was fired.
If on reading the buffer cache, it is found that the data block is already in the
required state, no UNDO data is required because the block is already
consistent.
Focusing on benchmark issues: physical IO, logical
reads, shared pool, buffer cache
Consistent reads and array size
Consistent reads can also depend on and vary with the array size setting of SQL*Plus.
The default value is 15.
Array size is the number of rows fetched in a single read.
The value of array size is an indicator of the number of network round trips made to fetch
the required data from Oracle.
A careful adjustment of array size value can improve performance by reducing the
network round trips.
A higher array size might be good for query performance (by reducing the network
round trips and also the consistent reads), but too high a value also uses more memory.
However, array size is not a setting restricted to SQL*Plus; it can be set in many
other applications requesting data from the Oracle database.
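A small SQL*Plus sketch of the effect (the table name is a placeholder):
SQL> SET AUTOTRACE TRACEONLY STATISTICS
SQL> SET ARRAYSIZE 15
SQL> SELECT * FROM orders;   -- note the 'consistent gets' count
SQL> SET ARRAYSIZE 500
SQL> SELECT * FROM orders;   -- fewer round trips, fewer consistent gets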
Focusing on benchmark issues: physical IO, logical
reads, shared pool, buffer cache
How do you identify the source of the problem?
Solving database performance issues sometimes requires the use of operating system (OS) utilities.
These tools often provide information that can help isolate database performance problems.
Consider the following situations:
• You’re running multiple databases and multiple applications on one server and
want to use OS utilities to identify which database (and corresponding process) is
consuming the most operating system resources. This approach is invaluable
when one database application is consuming resources to the point of causing
other databases on the box to perform poorly.
• You need to verify if the database server is adequately sized for current application
workload in terms of CPU, memory, disk I/O, and network bandwidth.
An analysis is needed to determine at what point the server will not be able to
handle larger (future) workloads.
• You’ve used database tools to identify system bottlenecks and want to double-check
the analysis via operating system tools.
How do you identify the source of the problem?
In these scenarios, to effectively analyze, tune, and troubleshoot, you’ll
need to employ OS tools to identify resource-intensive processes.
Furthermore, if you have multiple databases and applications
running on one server, when troubleshooting performance issues, it’s
often more efficient to first determine which database and process is
consuming the most resources.
Operating system utilities help pinpoint whether the bottleneck is CPU,
memory, disk I/O, or a network issue.
In Linux/Unix environments, once you have the operating system
identifier, you can then query the database to show
any corresponding database processes and SQL statements.
How do you identify the source of the problem?
Solutions: where do you start and what order to
work?
Mapping a Resource-Intensive Process to a Database Process
Problem
It’s a dark and stormy night, and the system is performing poorly.
You identify an operating system–intensive process on the box.
You want to map an operating system process back to a database process.
If the database process is a SQL process, you want to display the user of the SQL statement and also
the SQL.
Solution
In Linux/Unix environments, if you can identify the resource-intensive operating system process, then
you can easily check to see if that process is associated with a database process. The process consists
of the following:
1. Run an OS command to identify resource-intensive processes and associated IDs.
2. Identify the database associated with the process.
3. Extract details about the process from the database data dictionary views.
4. If it’s a SQL statement, get those details.
5. Generate an execution plan for the SQL statement.
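Steps 2 through 4, in SQL form: given the OS process ID reported by top or ps, a minimal sketch that maps it to the session and its current statement:
SQL> SELECT s.sid, s.serial#, s.username, s.sql_id, q.sql_text
  2  FROM v$process p
  3  JOIN v$session s ON s.paddr = p.addr
  4  LEFT JOIN v$sql q ON q.sql_id = s.sql_id AND q.child_number = 0
  5  WHERE p.spid = '&os_pid';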
Solutions: where do you start and what order to
work?
Introduction to SQL and Application Tuning
Proactive Tuning Methodology
•Simple design
•Data modeling
•Tables and indexes
•Using views
•Writing efficient SQL
•Cursor sharing
•Using bind variables
Introduction to SQL and Application Tuning
Simplicity in Application Design
•Simple tables
•Well-written SQL
•Indexing only as required
•Retrieving only required information
Introduction to SQL and Application Tuning
Data Modeling
•Accurately represent business practices
•Focus on the most frequent and important business
transactions
•Use modeling tools
•Appropriately normalize data (OLTP versus DW)
Introduction to SQL and Application Tuning
Table Design
•Compromise between flexibility and performance:
• Principally normalize
• Selectively denormalize
•Use Oracle performance and management features:
• Default values
• Constraints
• Materialized views
• Clusters
• Partitioning
•Focus on business-critical tables
Introduction to SQL and Application Tuning
Index Design
•Create indexes on the following:
• Primary key (automatically created)
• Unique key (automatically created)
• Foreign keys (good candidates)
•Index data that is frequently queried (select list).
•Use SQL as a guide to index design.
Introduction to SQL and Application Tuning
Using Views
•Simplifies application design
•Is transparent to the developer
•Can cause suboptimal execution plans
Introduction to SQL and Application Tuning
SQL Execution Efficiency
•Good database connectivity
•Minimizing parsing
•Sharing cursors
•Using bind variables
Introduction to SQL and Application Tuning
Writing SQL to Share Cursors
•Create generic code using the following:
• Stored procedures and packages
• Database triggers
• Any other library routines and procedures
•Write to format standards (improves readability):
• Case
• White space
• Comments
• Object references
• Bind variables
Introduction to SQL and Application Tuning
Performance Checklist
•Set initialization parameters and storage options.
•Verify resource usage of SQL statements.
•Validate connections by middleware.
•Verify cursor sharing.
•Validate migration of all required objects.
•Verify validity and availability of optimizer statistics.
Introduction to SQL and Application Tuning
When and What to Tune?
• Clustering factor
• Integrity Constraints are Important
• Reasons for Inefficient SQL Performance
• Using Bind Variables
• Restructuring SQL Statements
• Shared SQL and Cursors
When and What to Tune?
When and What to Tune?
The Clustering Factor
The clustering factor is a number that represents the degree to which
data is randomly distributed in a table.
In simple terms it is the number of “block switches” while reading a table
using an index.
When and What to Tune?
The diagram above shows how scattered the rows of the table are.
The first index entry (from the left of the index) points to the first data block, and the second
index entry points to the second data block.
So while making an index range scan or full index scan, the optimizer has to switch
between blocks and revisit the same block more than once, because the rows are scattered.
The number of times the optimizer makes these switches is what is termed the
"clustering factor".
When and What to Tune?
The image above represents a "good CF".
In an index range scan, the optimizer will not have to jump to the next data
block as often, because most of the index entries point to the same data block.
This helps significantly in reducing the cost of your SELECT statements.
The clustering factor is stored in the data dictionary and can be viewed from dba_indexes
(or user_indexes)
Clustering factor.sql
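As a rough sketch of the rule of thumb (the table name is a placeholder): a clustering factor near the number of table blocks indicates well-clustered data, while one near the number of rows indicates scattered data:
SQL> SELECT i.index_name, i.clustering_factor, t.blocks, t.num_rows
  2  FROM user_indexes i
  3  JOIN user_tables t ON t.table_name = i.table_name
  4  WHERE t.table_name = 'ORDERS';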
Integrity Constraints are Important
Many people think of constraints as a data integrity thing, and it’s true—they are.
But constraints are used by the optimizer as well when determining the optimal
execution plan.
The optimizer takes as inputs
•The query to optimize
•All available database object statistics
•System statistics, if available (CPU speed, single-block I/O speed, and
so on—metrics about the physical hardware)
•Initialization parameters
•Constraints
null columns differ from not nul.sql
fk adds to query performance
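One concrete case, as a hedged sketch (the table and column names are placeholders): a NOT NULL constraint guarantees every row appears in a B-tree index on that column, letting the optimizer answer COUNT(*) from the index rather than a full table scan:
SQL> CREATE INDEX orders_cust_ix ON orders(customer_id);
SQL> -- while customer_id is nullable, COUNT(*) must scan the table
SQL> ALTER TABLE orders MODIFY customer_id NOT NULL;
SQL> -- now the optimizer can use an INDEX FAST FULL SCAN instead
SQL> SELECT COUNT(*) FROM orders;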
When and What to Tune?
• Reasons for inefficient SQL performance
• Stale or missing optimizer statistics
• Missing access structures
• Suboptimal execution plan selection
• Poorly constructed SQL
When and What to Tune?
When and What to Tune?
Richard Morris:
Are there issues that crop up again and again?
Tom Kyte:
Perhaps the biggest issue is the black box approach of development. A developer will learn everything
they can about the procedural language they're using. However, they don't learn about the database
that they're using or other packages that might be involved……
Richard Morris:
Do you think then that poor education is to blame? That somehow it’s got worse over the years rather
than getting better?
Tom Kyte:
No, it hasn’t changed. When I get up on stage at a seminar and I talk about bind variables I start by
saying that for 16 years I’ve been talking about the same thing but each year the problem is the same.
Why? Because universities are trying to teach students theory and algorithms and things like that,
they’re not teaching them how to write production quality code. They don’t teach them how to debug
or how to instrument, they don’t teach them how to defensively program. They just teach them how to
write a compiler in Lisp which frankly doesn’t translate very well into IT.
Using Bind Variables
Oracle automatically notices when applications send similar SQL statements to the database.
The SQL area used to process the first occurrence of the statement is shared, that is, used for
processing subsequent occurrences of that same statement.
Therefore, only one shared SQL area exists for a unique statement.
Because shared SQL areas are shared memory areas, any Oracle process can use a shared SQL
area.
The sharing of SQL areas reduces memory use on the database server, thereby increasing system
throughput.
In evaluating whether statements are similar or identical, Oracle considers SQL statements issued
directly by users and applications as well as recursive SQL statements issued internally by a DDL
statement.
One of the first stages of parsing is to compare the text of the statement with existing statements in
the shared pool to see if the statement can be shared.
If the statement differs textually in any way, then Oracle does not share the statement.
Exceptions to this are possible when the parameter CURSOR_SHARING has been set to SIMILAR
or FORCE.
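A minimal SQL*Plus sketch of text-identical statements sharing one cursor (the EMPLOYEES table is a placeholder):
SQL> VARIABLE sal NUMBER
SQL> EXEC :sal := 5000
SQL> SELECT last_name FROM employees WHERE salary = :sal;
SQL> EXEC :sal := 9000
SQL> SELECT last_name FROM employees WHERE salary = :sal;  -- same text, same shared cursor
-- with literals (salary = 5000, salary = 9000) these would be two different
-- statements and two hard parses.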
When and What to Tune?
ADAPTIVE BINDING
DBAs are always encouraging developers to use bind variables, but when bind variables are used
against columns containing skewed data they sometimes lead to less than optimum execution plans.
This is because the optimizer peeks at the bind variable value during the hard parse of the statement,
so the value of a bind variable when the statement is first presented to the server can affect every
execution of the statement, regardless of subsequent bind variable values.
Oracle uses Adaptive Cursor Sharing to solve this problem by allowing the server to compare the
effectiveness of execution plans between executions with different bind variable values.
If it notices suboptimal plans, it allows certain bind variable values, or ranges of values, to use alternate
execution plans for the same statement.
This functionality requires no additional configuration.
https://oracle-base.com/articles/11g/adaptive-cursor-sharing-11gr1
When and What to Tune?
When and What to Tune?
• Restructuring SQL Statements
1. SELECT COUNT(*) FROM products p
   WHERE prod_list_price <
   1.15 * (SELECT avg(unit_cost) FROM costs c
           WHERE c.prod_id = p.prod_id)
2. SELECT * FROM job_history jh, employees e
   WHERE substr(to_char(e.employee_id),2) =
         substr(to_char(jh.employee_id),2)
3. SELECT * FROM orders WHERE order_id_char = 1205
4. SELECT * FROM employees
   WHERE to_char(salary) = :sal
5. SELECT * FROM parts_old
   UNION
   SELECT * FROM parts_new
Various SQL and PL/SQL techniques to improve
performance
Advanced SQL and Application Topics
שבוע אורקל 2016
שבוע אורקל 2016

More Related Content

PDF
Protein structure prediction with a focus on Rosetta
PPTX
Chapter 3 Cell Transport.pptx
PPT
Metagenomic
PDF
alternative sigma factor .pdf
PPTX
Ribozymes, enzymology by kk sahu
PPT
PPTX
Delphix and Pure Storage partner
Protein structure prediction with a focus on Rosetta
Chapter 3 Cell Transport.pptx
Metagenomic
alternative sigma factor .pdf
Ribozymes, enzymology by kk sahu
Delphix and Pure Storage partner

Viewers also liked (20)

PDF
Enteros StarWest 2012 - Database load testing
PPT
Les 04 config_bu
PPT
Les 02 config
PPT
Les 07 rman_rec
PPT
Les 18 space
PPT
Les 05 create_bu
PPT
Les 09 diag
PPT
Les 19 space_db
PDF
Vi editor commands
PDF
Remote DBA team-1Z0-042 Oracle Sga In Nutshell Oracle Dba Learn By Presentation
PPT
Les 16 resource
PPT
Les 06 rec
PPT
Les 20 dup_db
PPT
Les 13 memory
PDF
Oracle data guard broker 12c
PPT
Xpp b tspitr
PPT
Les 11 fl2
PDF
Server control utility reference
PPT
Les 12 fl_db
PPT
Les 17 sched
Enteros StarWest 2012 - Database load testing
Les 04 config_bu
Les 02 config
Les 07 rman_rec
Les 18 space
Les 05 create_bu
Les 09 diag
Les 19 space_db
Vi editor commands
Remote DBA team-1Z0-042 Oracle Sga In Nutshell Oracle Dba Learn By Presentation
Les 16 resource
Les 06 rec
Les 20 dup_db
Les 13 memory
Oracle data guard broker 12c
Xpp b tspitr
Les 11 fl2
Server control utility reference
Les 12 fl_db
Les 17 sched
Ad

Similar to שבוע אורקל 2016 (20)

PDF
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...
PPT
Oracle Sql Tuning
PDF
Oracle Database 11g SQL Tuning Workshop - Student Guide.pdf
PDF
Oracle SQL tuning pocket reference 1st ed Edition Mark Gurry
PPT
Remote DBA Experts 11g Features
PDF
Oracle Performance Tuning Training.pdf
PDF
2 Day + Oracle Performance Tuning Guide1.pdf
PDF
Oracle dba-concise-handbook
PPTX
Advanced Database Administration 10g
PPS
Oracle Database Overview
PDF
8 tune tusc
PDF
SQL_Tuning_Oracle_10g.pdf
PDF
35 dbatune3
PPTX
Tendencias Storage
PPT
Tunning overview
PDF
Oracle Database 10g The Complete Reference 1st Edition Kevin Loney
PPTX
SQL TUNING 101
PDF
Oracle Database 10g The Complete Reference 1st Edition Kevin Loney
PPTX
SQL Tuning 101
PDF
sqltuning101-170419021007-2.pdf
Exploring Oracle Database Performance Tuning Best Practices for DBAs and Deve...
Oracle Sql Tuning
Oracle Database 11g SQL Tuning Workshop - Student Guide.pdf
Oracle SQL tuning pocket reference 1st ed Edition Mark Gurry
Remote DBA Experts 11g Features
Oracle Performance Tuning Training.pdf
2 Day + Oracle Performance Tuning Guide1.pdf
Oracle dba-concise-handbook
Advanced Database Administration 10g
Oracle Database Overview
8 tune tusc
SQL_Tuning_Oracle_10g.pdf
35 dbatune3
Tendencias Storage
Tunning overview
Oracle Database 10g The Complete Reference 1st Edition Kevin Loney
SQL TUNING 101
Oracle Database 10g The Complete Reference 1st Edition Kevin Loney
SQL Tuning 101
sqltuning101-170419021007-2.pdf
Ad

More from Aaron Shilo (6)

PPTX
Getting to know oracle database objects iot, mviews, clusters and more…
PDF
New fordevelopersinsql server2008
PDF
Sql Explore Hebrew
PDF
Our Services
PPT
resource governor
PDF
Sql Server & PowerShell
Getting to know oracle database objects iot, mviews, clusters and more…
New fordevelopersinsql server2008
Sql Explore Hebrew
Our Services
resource governor
Sql Server & PowerShell

שבוע אורקל 2016

  • 1. Exploring Oracle Database Performance Tuning Best Practices for DBAs and Developers
  • 2. * 39 years old * Married + 3 * 16 years AS a dba, consultant, instructor, architect. * CEO @ DBcs ltd. * Was cto @ johnbryce israel * Oracle certified professional * Microsoft sql server certified professional
  • 3. Agenda • Oracle Database Architecture Overview • The connection between SQL tuning & Instance tuning • The connection between database & operating system • Common bottlenecks - Drill down • How do you identify the source of the problem? • Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache • Solutions: where do you start and what order to work? • Introduction to SQL and Application Tuning • The Oracle Optimizer: • Rule Based Optimization (overview) • Cost Based Optimization • The Different Modes of the Cost Based Optimizer • Execution Plans • Data Access Methods • Indexes – Types, Classifications, Advantages & Disadvantages • Sort Usage Guidelines • When and What to Tune? • Clustering factor • Data Types are Important • Integrity Constrains are Important • Reasons for Inefficient SQL Performance • Using Bind Variables • Restructuring SQL Statements • Shared SQL and Cursors • Advanced SQL and Application Topics
  • 4. "You have to be constantly evolving and in some cases DBAs/Programmers don’t do that because they know how they did it years ago and they want to keep doing it that way..."
  • 5. Quote from Thomas Kyte's book if you want a 10 step guide to tuning a query, buy a piece of software. You are not needed in this process, anyone can put a query in, get a query out and run it to see if it is faster. There are tons of these tools on the market. They work using rules (heuristics) and can tune maybe 1% of the problem queries out there. They APPEAR to be able to tune a much larger percent but that is only because the people using these tools never look at the outcome -- hence they continue to make the same basic mistakes over and over and over. If you want to really be able to tune the other 99% of the queries out there, knowledge of lots of stuff -- physical storage mechanisms, access paths, how the optimizer works -that's the only way. .. .. Think about it for a moment. If there were a 10 step or even 1,000,000 step process by which any query can be tuned (or even X% of queries for that matter), we would write a program to do it. Oh don't get me wrong, there are many programs that actually try to do this - Oracle Enterprise Manager with its tuning pack, SQL Navigator and others. What they do is primarily recommend indexing schemes to tune a query, suggest materialized views, offer to add hints to the query to try other access plans. They show you different query plans for the same statement and allow you to pick one. They offer "rules of thumb" (what I generally call ROT since the acronym and the word is maps to are so appropriate for each other) SQL optimizations - which if they were universally applicable - the optimizer would do it as a matter of fact. In fact, the cost based optimizer does that already - it rewrites our queries all of the time. These tuning tools use a very limited set of rules that sometimes can suggest that index or set of indexes you really should have thought of during your design.
  • 8. Oracle Database Memory Structures: Overview Background process Server process Server process Redo log buffer Database buffer cache Shared pool Large pool Aggregated PGA … Java pool Streams pool SGA
  • 9. Database Buffer Cache • Is a part of the SGA • Holds copies of data blocks that are read from data files • Is shared by all concurrent processes Database writer process Database buffer cache SGA Data files DBWn Server process
  • 10. Redo Log Buffer • Is a circular buffer in the SGA (based on the number of CPUs) • Contains redo entries that have the information to redo changes made by operations, such as DML and DDL Log writer process Redo log buffer SGA Redo log files LGWR Server process
  • 11. Shared Pool • Is part of the SGA – Contains: • Library cache • Shared parts of SQL and PL/SQL statements • Data dictionary cache • Result cache: • SQL queries • PL/SQL functions • Control structures • Locks SGA Library cache Data dictionary cache (row cache) Control structures Result cache Server process Shared pool
  • 12. Processing a DML Statement: Example Database Data files Control files Redo log files User process Shared pool Redo log buffer Server process 3 5 1 Library cache 2 4 Database buffer cache DBWn SGA 2
  • 13. COMMIT Processing: Example Database Data files Control files Redo log files User process SGA Shared pool Redo log buffer Server process 1 3 Library cache Database buffer cache DBWn 2LGWR SGA
  • 14. Program Global Area (PGA) –PGA is a memory area that contains: • Session information • Cursor information • SQL execution work areas: • Sort area • Hash join area • Bitmap merge area • Bitmap create area • Work area size influences SQL performance. • Work areas can be automatically or manually managed. Stack Space User Global Area (UGA) User Session Data Cursor Status SQL Area Server process
• 15. Background Process Roles [diagram: background processes around the SGA (database buffer cache, shared pool, redo log buffer): PMON, SMON, DBWn, LGWR, CKPT, ARCn, MMON, MMAN, CJQ0, QMNn, RCBG]
• 16. Automatic Shared Memory Management: Which size to choose? [diagram: SGA_TARGET + STATISTICS_LEVEL drive the automatically tuned SGA components (database buffer cache, shared pool, large pool, Java pool, Streams pool); the redo log buffer and fixed SGA are sized outside automatic tuning]
• 17. Automated SQL Execution Memory Management: Which size to choose? [diagram: PGA_AGGREGATE_TARGET governs the aggregated PGA shared by the server and background processes]
• 18. Automatic Memory Management • Sizing of each memory component is vital for SQL execution performance. • It is difficult to manually size each component. • Automatic memory management automates memory allocation of each SGA component and the aggregated PGA. [diagram: MEMORY_TARGET + STATISTICS_LEVEL let MMAN shift memory among the buffer cache, shared pool, large pool, Java pool, Streams pool, private SQL areas, other SGA, untunable PGA, and free memory]
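A minimal sketch of the corresponding parameters, assuming an SPFILE; the sizes are illustrative, not recommendations:

-- ASMM: give the instance one SGA budget and let MMAN tune the components
ALTER SYSTEM SET SGA_TARGET = 4G SCOPE=BOTH;

-- AMM: one overall budget covering both the SGA and the aggregated PGA
-- (MEMORY_MAX_TARGET must allow it; with SCOPE=SPFILE this takes effect
-- after a restart)
ALTER SYSTEM SET MEMORY_TARGET = 6G SCOPE=SPFILE;

-- Automatic tuning requires STATISTICS_LEVEL = TYPICAL or ALL
SHOW PARAMETER statistics_level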
  • 19. The connection between SQL tuning & Instance tuning
  • 20. Database tuning is the process of tuning the actual database, which encompasses the allocated memory, disk usage, CPU, I/O, and underlying database processes. Tuning a database also involves the management and manipulation of the database structure itself, such as the design and layout of tables and indexes. Additionally, database tuning often involves the modification of the database architecture in order to optimize the use of the hardware resources available. There are many other considerations when tuning a database, but these tasks are normally accomplished by the database administrator. The objective of database tuning is to ensure that the database has been designed in a way that best accommodates expected activity within the database. The connection between SQL tuning & Instance tuning
  • 21. SQL tuning is the process of tuning the SQL statements that access the database. These SQL statements include database queries and transactional operations such as inserts, updates, and deletes. The objective of SQL statement tuning is to formulate statements that most effectively access the database in its current state, taking advantage of database and system resources and indexes. The connection between SQL tuning & Instance tuning
• 22. Both database tuning and SQL statement tuning must be performed to achieve optimal results when accessing the database. A poorly tuned database can render SQL tuning efforts wasted, and vice versa. Ideally, it is best to tune the database first, then ensure that indexes exist where needed, and then tune the SQL code. The connection between SQL tuning & Instance tuning
  • 23. The connection between database & operating system
• 24. Question: We are in the process of adopting Oracle and we have many choices of operating system platforms. Which OS is best for Oracle, and how do I compare operating system environments for Oracle databases? Answer: That's a very common question. Oracle dominates the database world in part because it runs on over 60 platforms, everything from a mainframe to a Mac. Oracle chose Solaris as its preferred OS in 2005, and later decided to build its own Linux distro, making an Oracle Linux OS that is custom-tailored to the needs of a typical database. Oracle leverages the advantages of all OS platforms with an independent OS interface, customized to each platform. As to which UNIX dialect is "best", it's often related to the server environment. For example, svmon is only available on IBM AIX. ... Some operating systems are better at managing large volumes of data; SUSE, for example, developed a special kernel just for Oracle. The connection between database & operating system
• 25. Data integrity features (T10 Protection Information) Protection Information enables applications or kernel subsystems to attach metadata to I/O operations, allowing devices that support PI to verify the integrity of those operations before passing them further down the stack and physically committing them to disk. Data Integrity Extensions, or DIX, is a hardware feature that enables the exchange of protection metadata between the host operating system and the HBA, and helps prevent corrupt data from being written, allowing a full end-to-end data integrity check. The connection between database & operating system
  • 26. Zero downtime updates Make updates to the Linux Operating System (OS) kernel, while it is running, without a reboot or any interruption. Only Oracle Linux offers this unique capability, making it possible to keep up with important Linux kernel updates without burdening you with the operational cost and disruption of rebooting for every update to the kernel. Ksplice allows system administrators to deliver valuable patches for both the Unbreakable Enterprise Kernel as well as the Red Hat compatible kernel with lower costs, less downtime, increased security, and greater flexibility and control. The connection between database & operating system
  • 27. Btrfs File System Btrfs (B-tree file system) is the “next generation file system” for Linux. Pronounced as “Butter FS” or “B-tree FS”, it is a GPL licensed file system first developed by Oracle’s Chris Mason in 2007. Btrfs provides a number of features that make it a very attractive file system solution for local disk storage. Btrfs is designed for: • Large files and file systems from the ground up • Simplified administration • Integrated RAID and volume management • Snapshots • Checksums for data and meta-data The connection between database & operating system
  • 28. 10 common performance issues Common bottlenecks - Drill down
• 29. "Not every suggestion is a good suggestion, even if it's from the software provider himself." – Aaron Shilo  Common bottlenecks - Drill down Once upon a time, Oracle Support had a note called Script: Lists All Indexes that Benefit from a Rebuild (Doc ID 122008.1) which, let's just say, I didn't view in a particularly positive light :-) Mainly because it gave dubious advice, including that indexes should be rebuilt if: deleted entries represent 20% or more of current entries, or the index depth is more than 4 levels. It then detailed a script that ran a Validate Structure across all indexes in the database that didn't belong in either the SYS or SYSTEM schema. This script basically read through and sequentially locked all tables (maybe multiple times) in the database in order to list indexes that might not actually need a rebuild, while potentially missing out on some that do. I could write a script that achieved the same result with far less overhead. For example, SELECT index_name FROM dba_indexes WHERE index_name LIKE 'A%' AND owner NOT IN ('SYS', 'SYSTEM') would achieve a very similar result  Posted by Richard Foote in Doc 122008.1, Doc 989093.1, Index Rebuild, Oracle Indexes
  • 30. Bad connection management • The application connects and disconnects for each database interaction. • This problem is common with stateless middleware in application servers. • It has over two orders of magnitude impact on performance, and is totally unscalable. Common bottlenecks - Drill down
• 31. Bad use of cursors and the shared pool • Not using cursors results in repeated parses. • If bind variables are not used, then there is hard parsing of all SQL statements. • This has an order of magnitude impact on performance, and it is totally unscalable. • Use cursors with bind variables that open the cursor and execute it many times. • Be suspicious of applications generating dynamic SQL. Common bottlenecks - Drill down
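To make the cursor-sharing point concrete, a minimal sketch in SQL*Plus; the orders table and its columns are hypothetical:

-- Literals: each distinct value hard parses a brand-new cursor
SELECT order_total FROM orders WHERE order_id = 1205;
SELECT order_total FROM orders WHERE order_id = 1206;

-- Bind variable: one shared cursor, parsed once, executed many times
VARIABLE oid NUMBER
EXEC :oid := 1205
SELECT order_total FROM orders WHERE order_id = :oid;
EXEC :oid := 1206
SELECT order_total FROM orders WHERE order_id = :oid;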
  • 32. Bad SQL • Bad SQL is SQL that uses more resources than appropriate for the application requirement. • This can be a decision support systems (DSS) query that runs for more than 24 hours, or a query from an online application that takes more than a minute. • You should investigate SQL that consumes significant system resources for potential improvement. • ADDM identifies high load SQL. • SQL Tuning Advisor can provide recommendations for improvement. Common bottlenecks - Drill down
• 33. Use of nonstandard initialization parameters • These might have been implemented based on poor advice or incorrect assumptions. • Most databases provide acceptable performance using only the set of basic parameters. • In particular, parameters associated with SPIN_COUNT on latches and undocumented optimizer features can cause a great many problems that require considerable investigation. • Likewise, optimizer parameters set in the initialization parameter file can override proven optimal execution plans. • For these reasons, schemas, schema statistics, and optimizer settings should be managed as a group to ensure consistency of performance. Common bottlenecks - Drill down
  • 34. Getting database I/O wrong • Many sites lay out their databases poorly over the available disks. • Other sites specify the number of disks incorrectly, because they configure disks by disk space and not I/O bandwidth. Common bottlenecks - Drill down
• 35. Online redo log setup problems • Many sites run with too few online redo log files and files that are too small. • Small redo log files cause system checkpoints to continuously put a high load on the buffer cache and I/O system. • If too few redo log files exist, then the archiver cannot keep up, and the database must wait for the archiver to catch up. Common bottlenecks - Drill down All online redo log files should be the same size and configured to switch approximately once an hour during normal activity. They should switch no more frequently than every 20 minutes during peak activity. There should be a minimum of four online log groups to prevent LGWR from waiting for a group to be available following a log switch. A group may be unavailable because a checkpoint has not yet completed or the group has not yet been archived. http://docs.oracle.com/cd/B12037_01/server.101/b10726/configbp.htm#1006950
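A quick sanity check against the once-an-hour guideline is to count log switches per hour from V$LOG_HISTORY; a sketch:

-- Log switches per hour over the last day; aim for roughly one per hour
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
       COUNT(*)                               AS switches
FROM   v$log_history
WHERE  first_time > SYSDATE - 1
GROUP  BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER  BY 1;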
  • 36. Serialization • Serialization of data blocks in the buffer cache due to lack of free lists, free list groups, transaction slots (INITRANS), or shortage of rollback segments. • This is particularly common on INSERT-heavy applications, in applications that have raised the block size above 8K, or in applications with large numbers of active users and few rollback segments. • Use automatic segment-space management (ASSM) and automatic undo management to solve this problem. Common bottlenecks - Drill down
  • 37. Long full table scans • Long full table scans for high-volume or interactive online operations could indicate poor transaction design, missing indexes, or poor SQL optimization. • Long table scans, by nature, are I/O intensive and unscalable. Common bottlenecks - Drill down
  • 38. High amounts of recursive (SYS) SQL • Large amounts of recursive SQL executed by SYS could indicate space management activities, such as extent allocations, taking place. • This is unscalable and impacts user response time. • Use locally managed tablespaces to reduce recursive SQL due to extent allocation. • Recursive SQL executed under another user ID is probably SQL and PL/SQL, and this is not a problem. Common bottlenecks - Drill down
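For example, a locally managed tablespace with automatic segment-space management avoids the recursive extent-allocation SQL of dictionary-managed tablespaces; the file path and size below are illustrative:

CREATE TABLESPACE app_data
  DATAFILE '/u01/oradata/ORCL/app_data01.dbf' SIZE 1G
  EXTENT MANAGEMENT LOCAL AUTOALLOCATE
  SEGMENT SPACE MANAGEMENT AUTO;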
• 39. Deployment and migration errors • In many cases, an application uses too many resources because the schema owning the tables has not been successfully migrated from the development environment or from an older implementation. • Examples of this are missing indexes or incorrect statistics. • These errors can lead to sub-optimal execution plans and poor interactive user performance. • When migrating applications of known performance, export the schema statistics to maintain plan stability using the DBMS_STATS package. • Although these errors are not directly detected by ADDM, ADDM highlights the resulting high-load SQL. Common bottlenecks - Drill down
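A sketch of exporting and importing schema statistics with DBMS_STATS; the schema name APP and the statistics table name MY_STATS are placeholders:

-- On the source system: stage the schema statistics in a user table
EXEC DBMS_STATS.CREATE_STAT_TABLE(ownname => 'APP', stattab => 'MY_STATS')
EXEC DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'MY_STATS')

-- Move the MY_STATS table (for example with Data Pump), then on the target:
EXEC DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'APP', stattab => 'MY_STATS')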
  • 41. • Rule Based Optimization (overview) • Cost Based Optimization • The Different Modes of the Cost Based Optimizer • Execution Plans • Data Access Methods • Indexes – Types, Classifications, Advantages & Disadvantages • Sort Usage Guidelines The Oracle Optimizer:
  • 42. • The optimizer determines the most efficient way to execute a SQL statement after considering many factors related to the objects referenced and the conditions specified in the query. • This determination is an important step in the processing of any SQL statement and can greatly affect execution time. The Oracle Optimizer:
• 43. The Oracle Optimizer: SQL Statement Parsing, Overview [flowchart: a parse call triggers a syntactic and semantic check, a privileges check, and allocation of a private SQL area; if a shared SQL area for the statement already exists, it is reused and the statement executes (soft parse); otherwise a shared SQL area is allocated and the parse operation (optimization) produces a parsed representation before execution (hard parse)]
• 44. The Oracle Optimizer: Why Do You Need an Optimizer? [diagram: for the query SELECT * FROM emp WHERE job = 'MANAGER', statistics and schema information show that only 1% of employees are managers; the optimizer weighs the possible access paths (use the index vs. read each row and check), asks which is faster, and settles on a plan: use the index]
• 45. The Oracle Optimizer: Why Do You Need an Optimizer? [diagram: the same query, but statistics show that 80% of employees are managers; weighing the same access paths, the optimizer settles on a full table scan]
  • 46. • Using the RBO, the optimizer chooses an execution plan based on the access paths available and the ranks of these access paths. Oracle's ranking of the access paths is heuristic. If there is more than one way to execute a SQL statement, then the RBO always uses the operation with the lower rank. Usually, operations of lower rank execute faster than those associated with constructs of higher rank. The list shows access paths and their ranking: • RBO Path 1: Single Row by Rowid • RBO Path 2: Single Row by Cluster Join • RBO Path 3: Single Row by Hash Cluster Key with Unique or Primary Key • RBO Path 4: Single Row by Unique or Primary Key • RBO Path 5: Clustered Join • RBO Path 6: Hash Cluster Key • RBO Path 7: Indexed Cluster Key • RBO Path 8: Composite Index • RBO Path 9: Single-Column Indexes • RBO Path 10: Bounded Range Search on Indexed Columns • RBO Path 11: Unbounded Range Search on Indexed Columns • RBO Path 12: Sort Merge Join • RBO Path 13: MAX or MIN of Indexed Column • RBO Path 14: ORDER BY on Indexed Column • RBO Path 15: Full Table Scan The Oracle Optimizer: Rule Based Optimization (overview)
  • 47. The CBO performs the following steps: •The optimizer generates a set of potential plans for the SQL statement based on available access paths and hints. •The optimizer estimates the cost of each plan based on statistics in the data dictionary for the data distribution and storage characteristics of the tables, indexes, and partitions accessed by the statement. •The cost is an estimated value proportional to the expected resource use needed to execute the statement with a particular plan. The optimizer calculates the cost of access paths and join orders based on the estimated computer resources, which includes I/O, CPU, and memory. •Serial plans with higher costs take more time to execute than those with smaller costs. When using a parallel plan, however, resource use is not directly related to elapsed time. •The optimizer compares the costs of the plans and chooses the one with the lowest cost. The Oracle Optimizer: Cost Based Optimization
  • 48. The following features require use of the CBO: •Partitioned tables and indexes •Index-organized tables •Reverse key indexes •Function-based indexes • SAMPLE clauses in a SELECT statement •Parallel query and parallel DML •Star transformations and star joins •Extensible optimizer •Query rewrite with materialized views •Enterprise Manager progress meter •Hash joins •Bitmap indexes and bitmap join indexes •Index skip scans The Oracle Optimizer: Cost Based Optimization
• 49. •The CBO consists of two pieces of code: • Estimator • Plan generator •The estimator determines the cost of optimization suggestions made by the plan generator: • Cost: the optimizer's best estimate of the number of standardized I/Os made to execute a particular statement optimization •The plan generator: • Tries out different statement optimization techniques • Uses the estimator to cost each optimization suggestion • Chooses the best optimization suggestion based on cost • Generates an execution plan for the best optimization The Oracle Optimizer: Cost Based Optimization
• 50. •Selectivity is the estimated proportion of a row set retrieved by a particular predicate or combination of predicates. •It is expressed as a value between 0.0 and 1.0: • High selectivity: small proportion of rows • Low selectivity: big proportion of rows •Selectivity = (number of rows satisfying a condition) / (total number of rows) •Selectivity computation: • If no statistics: use dynamic sampling • If no histograms: assume even distribution of rows •Statistic information: • DBA_TABLES and DBA_TAB_STATISTICS (NUM_ROWS) • DBA_TAB_COL_STATISTICS (NUM_DISTINCT, DENSITY, HIGH/LOW_VALUE, …) The Oracle Optimizer: Estimator: Selectivity
• 51. •Cardinality is the expected number of rows retrieved by a particular operation in the execution plan •It is a vital figure for determining join, filter, and sort costs •Cardinality = selectivity * total number of rows •Simple example, for SELECT days FROM courses WHERE dev_name = 'ANGEL'; • The number of distinct values in DEV_NAME is 203. • The number of rows in COURSES (original cardinality) is 1018. • Selectivity = 1/203 = 4.926e-03 • Cardinality = (1/203)*1018 = 5.01 (rounded up to 6) The Oracle Optimizer: Estimator: Cardinality
• 52. The Oracle Optimizer: Estimator: Cost • Cost is the optimizer's best estimate of the number of standardized I/Os it takes to execute a particular statement. • The cost unit is a standardized single block random read: 1 cost unit = 1 SRd. • The cost formula combines the single block I/O cost, the multiblock I/O cost, and the CPU cost into standard cost units: Cost = (#SRds * sreadtim + #MRds * mreadtim + #CPUCycles / cpuspeed) / sreadtim, where #SRds is the number of single block reads, #MRds the number of multiblock reads, #CPUCycles the number of CPU cycles, sreadtim the single block read time, mreadtim the multiblock read time, and cpuspeed the CPU speed in millions of instructions per second.
• 53. The Oracle Optimizer: The Different Modes of the Cost Based Optimizer
CHOOSE: The optimizer chooses between a cost-based approach and a rule-based approach, depending on whether statistics are available. This is the default value. If the data dictionary contains statistics for at least one of the accessed tables, the optimizer uses a cost-based approach and optimizes with a goal of best throughput. If the data dictionary contains only some statistics, the cost-based approach is still used, but the optimizer must guess the statistics for the tables without any statistics; this can result in suboptimal execution plans. If the data dictionary contains no statistics for any of the accessed tables, the optimizer uses a rule-based approach.
ALL_ROWS: The optimizer uses a cost-based approach for all SQL statements in the session regardless of the presence of statistics and optimizes with a goal of best throughput (minimum resource use to complete the entire statement).
FIRST_ROWS_n: The optimizer uses a cost-based approach, regardless of the presence of statistics, and optimizes with a goal of best response time to return the first n rows; n can equal 1, 10, 100, or 1000.
FIRST_ROWS: The optimizer uses a mix of cost and heuristics to find a best plan for fast delivery of the first few rows. Note: Using heuristics sometimes leads the CBO to generate a plan whose cost is significantly larger than that of a plan without the heuristic. FIRST_ROWS is available for backward compatibility and plan stability.
RULE: The optimizer chooses a rule-based approach for all SQL statements regardless of the presence of statistics.
  • 54. The Oracle Optimizer: What Is an Execution Plan? • The execution plan of a SQL statement is composed of small building blocks called row sources for serial execution plans. • The combination of row sources for a statement is called the execution plan. • By using parent-child relationships, the execution plan can be displayed in a tree-like structure (text or graphical).
  • 55. The Oracle Optimizer: Where to Find Execution Plans? • PLAN_TABLE (EXPLAIN PLAN or SQL*Plus autotrace) • V$SQL_PLAN (Library Cache) • V$SQL_PLAN_MONITOR (11g) • DBA_HIST_SQL_PLAN (AWR) • STATS$SQL_PLAN (Statspack) • SQL Management Base (SQL Plan Management Baselines) • SQL tuning set • Trace files generated by DBMS_MONITOR • Event 10053 trace file • Process state dump trace file since 10gR2
• 56. The Oracle Optimizer: How To Read?
SQL> explain plan for
  2  select e.empno, e.ename, d.dname
  3  from emp e, dept d
  4  where e.deptno = d.deptno
  5  and e.deptno = 10;

Explained.

SQL> SELECT * FROM table(dbms_xplan.display(null,null,'basic'));

PLAN_TABLE_OUTPUT
------------------------------------------------
Plan hash value: 568005898
------------------------------------------------
| Id | Operation                    | Name    |
------------------------------------------------
|  0 | SELECT STATEMENT             |         |
|  1 |  NESTED LOOPS                |         |
|  2 |   TABLE ACCESS BY INDEX ROWID| DEPT    |
|  3 |    INDEX UNIQUE SCAN         | PK_DEPT |
|  4 |   TABLE ACCESS FULL          | EMP     |
------------------------------------------------

1. Operation 0 is the root of the tree; it has one child, Operation 1. 2. Operation 1 has two children: Operations 2 and 4. 3. Operation 2 has one child: Operation 3.
• 57. The Oracle Optimizer: How To Read? The graphical representation of the execution plan:

Operation 0 (SELECT STATEMENT)
  Operation 1 (NESTED LOOPS)
    Operation 2 (TABLE ACCESS BY INDEX ROWID of DEPT)
      Operation 3 (INDEX UNIQUE SCAN of PK_DEPT)
    Operation 4 (TABLE ACCESS FULL of EMP)

Reading the tree: to perform Operation 1, you need the rows from Operations 2 and 4. Operation 2 comes first; to perform it, you first need its child, Operation 3. Operation 4 is then performed for each row that Operation 2 returns, because the parent is a nested loops join.
• 58. Oracle supports the following access methods: •Full Table Scan (FTS) •Table Access by ROWID •Index Unique Scan •Index Range Scan •Index Skip Scan •Full Index Scan •Fast Full Index Scan •Index Joins •Hash Access •Cluster Access •Bitmap Index The Oracle Optimizer: Data Access Methods
  • 59. Guidelines for Managing Indexes • Create indexes after inserting table data • Index the correct tables and columns • Order index columns for performance • Limit the number of indexes for each table • Drop indexes that are no longer needed • Understand deferred segment creation • Estimate index size and set storage parameters • Specify the tablespace for each index • Consider parallelizing index creation • Consider creating indexes with NOLOGGING • Understand when to use unusable or invisible indexes • Consider costs and benefits of coalescing or rebuilding indexes • Consider cost before disabling or dropping constraints The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
• 60. Index Type Usage The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
B-tree: Default, balanced tree index; good for high-cardinality (high degree of distinct values) columns
B-tree cluster: Used with clustered tables
Hash cluster: Used with hash clusters
Function-based: Good for columns that have SQL functions applied to them
Indexed virtual column: Good for columns that have SQL functions applied to them; a viable alternative to using a function-based index
Reverse-key: Useful to balance I/O in an index that has many sequential inserts
Key-compressed: Useful for concatenated indexes where the leading column is often repeated; compresses leaf block entries
Bitmap: Useful in data warehouse environments with low-cardinality columns; these indexes aren't appropriate for online transaction processing (OLTP) databases where rows are heavily updated
Bitmap join: Useful in data warehouse environments for queries that join fact and dimension tables
Global partitioned: Global index across all partitions in a partitioned table
Local partitioned: Local index based on individual partitions in a partitioned table
Domain: Specific for an application or cartridge
  • 61. Physical layout of a table and B-tree index The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages
• 62. The Oracle Optimizer: Indexes – Types, Classifications, Advantages & Disadvantages When you put indexes on a partitioned table, you have the choice between GLOBAL and LOCAL. The LOCAL index partitions follow the table partitions: they have the same partition key and type, get created automatically when new table partitions are added, and get dropped automatically when table partitions are dropped. Beware: LOCAL indexes are usually not appropriate for OLTP access on the table, because one server process may then have to scan through many index partitions. This is the cause of most of the scary performance horror stories you may have heard about partitioning! A GLOBAL index spans all partitions. It usually has good SELECT performance, but it is more sensitive to partition maintenance than LOCAL indexes and needs to be rebuilt more often.
  • 63. The Oracle Optimizer: Optimizer Statistics • Describe the database and the objects in the database • Information used by the query optimizer to estimate: • Selectivity of predicates • Cost of each execution plan • Access method, join order, and join method • CPU and input/output (I/O) costs • Refreshing optimizer statistics whenever they are stale is as important as gathering them: • Automatically gathered by the system • Manually gathered by the user with DBMS_STATS
• 64. The Oracle Optimizer: Optimizer Statistics A common misperception is that if no new statistics are gathered (and, assuming, nothing else is altered in the database), execution plans must always remain the same; that by not collecting statistics, one can somehow ensure and guarantee the database will perform in the same manner and generate the same execution plans. This is fundamentally untrue. In fact, quite the opposite can be true: one might need to collect fresh statistics to make sure vital execution plans don't change. It is the act of not refreshing statistics that can cause execution plans to suddenly change. explain plan changes with no stat change.sql
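A minimal sketch of a manual refresh with DBMS_STATS; the schema and table names are placeholders:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP',
    tabname          => 'ORDERS',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,  -- let Oracle pick the sample
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO',  -- histograms where useful
    cascade          => TRUE);                        -- include the table's indexes
END;
/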
  • 65. The Oracle Optimizer: Types of Optimizer Statistics • Table statistics: • Number of rows • Number of blocks • Average row length • Index Statistics: • B*-tree level • Distinct keys • Number of leaf blocks • Clustering factor • System statistics • I/O performance and utilization • CPU performance and utilization
• 66. The Oracle Optimizer: Histograms • The optimizer assumes uniform distributions; this may lead to suboptimal access plans in the case of data skew. • Histograms: • Store additional column distribution information • Give better selectivity estimates in the case of nonuniform distributions • With unlimited resources you could store each distinct value and the number of rows for that value. • This becomes unmanageable for a large number of distinct values, so a different approach is used: • Frequency histogram (#distinct values ≤ #buckets) • Height-balanced histogram (#buckets < #distinct values) • They are stored in DBA_TAB_HISTOGRAMS.
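A sketch of requesting a histogram on a single skewed column and checking the result; the schema, table, and column names are placeholders:

BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => 'APP',
    tabname    => 'ORDERS',
    method_opt => 'FOR COLUMNS status SIZE 254');  -- up to 254 buckets
END;
/

-- Check what kind of histogram was created
SELECT column_name, histogram, num_buckets
FROM   dba_tab_col_statistics
WHERE  owner = 'APP' AND table_name = 'ORDERS';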
• 67. The Oracle Optimizer: Frequency Histograms [chart: 10 buckets for 10 distinct values (1, 3, 5, 7, 10, 16, 27, 32, 39, 49) over 40,001 rows; ENDPOINT_VALUE is the column value and ENDPOINT_NUMBER the cumulative cardinality, so the difference between consecutive endpoint numbers gives the number of rows for each column value]
• 68. The Oracle Optimizer: Height-Balanced Histograms [chart: 5 buckets, 10 distinct values (1, 3, 5, 7, 10, 16, 27, 32, 39, 49) over 40,001 rows, about 8,000 rows per bucket; ENDPOINT_NUMBER is the bucket number (0 through 5) and ENDPOINT_VALUE the column value at that endpoint: 1, 7, 10, 10, 32, 49; value 10 appears at two endpoints, marking it as a popular value]
  • 69. The Oracle Optimizer: Height-Balanced Histograms In a height-balanced histogram, the ordered column values are divided into bands so that each band contains approximately the same number of rows. The histogram tells you values of the endpoints of each band. In the example in the slide, assume that you have a column that is populated with 40,001 numbers. There will be 8,000 values in each band. You only have ten distinct values: 1, 3, 5, 7, 10, 16, 27, 32, 39, and 49. Value 10 is the most popular value with 16,293 occurrences. When the number of buckets is less than the number of distinct values, ENDPOINT_NUMBER records the bucket number and ENDPOINT_VALUE records the column value that corresponds to this endpoint.
  • 70. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
  • 71. Buffer cache For many types of operations, Oracle Database uses the buffer cache to store data blocks read from disk. Oracle Database bypasses the buffer cache for particular operations, such as sorting and parallel reads. To use the database buffer cache effectively, tune SQL statements for the application to avoid unnecessary resource consumption. To meet this goal, verify that frequently executed SQL statements and SQL statements that perform many buffer gets are well-tuned. When configuring a new database instance, it is impossible to know the correct size for the buffer cache. Typically, a database administrator makes a first estimate for the cache size, then runs a representative workload on the instance and examines the relevant statistics to see whether the cache is under-configured or over-configured. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
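One way to examine that first estimate afterward is the buffer cache advisory; a sketch, assuming DB_CACHE_ADVICE is ON, STATISTICS_LEVEL is at least TYPICAL, and an 8 KB default block size:

-- Predicted physical reads at different candidate cache sizes
SELECT size_for_estimate AS cache_mb,
       size_factor,               -- 1 = the current cache size
       estd_physical_reads
FROM   v$db_cache_advice
WHERE  name = 'DEFAULT'
AND    block_size = 8192
ORDER  BY size_for_estimate;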
• 72. What is a Physical I/O? Whenever you execute a query, Oracle has to go and fetch data to give you the result of the query execution. Here, data means the actual data in data blocks. Whenever a new data block is requested, it has to be fetched from the physical datafiles residing on the physical disks. This fetching of data blocks from the physical disk involves an I/O operation known as physical I/O. By virtue of this physical I/O, the block has now been fetched and read into the memory area called the buffer cache. This is the default behavior. We know that a data block might be requested multiple times by multiple queries. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
• 73. What is a Logical I/O? Once a physical I/O has taken place and the block has been read into memory, the next request for the same data block won't require the block to be fetched from disk, avoiding a physical I/O. To return the results for a query requesting the same data block, the block is fetched from memory instead; this is called a logical I/O. Whenever the quantum of logical I/O is calculated, two kinds of reads are considered: consistent reads and current reads. Jointly, these two statistics are known as the logical I/O performed by Oracle. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
• 74. Consistent reads Whenever a change is made to a data block, the old data/entry is written to the UNDO/ROLLBACK segments. From the fundamentals of UNDO, we also know that this is to provide a read-consistent view of the data block to other users trying to read the same block. Consistent reads mean reading the block in a consistent mode, as of a "point in time". Here the phrase "point in time" means the time when the query/statement began. A consistent read might or might not involve any UNDO data. UNDO data is applied when it is necessary to roll a data block back to the required "point in time" when the SQL statement was fired. If, on reading the buffer cache, it is found that the data block is already in the required state, no UNDO data is required because the block is already consistent. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
• 75. Consistent reads and array size Consistent reads can also depend on and vary with the ARRAYSIZE setting of SQL*Plus. The default value is 15. Array size is the number of rows fetched in a single read, and its value is an indicator of the number of network round trips made to fetch the required data from Oracle. A careful adjustment of the array size value can improve performance by reducing network round trips. A higher array size can be good for query performance (by reducing the network round trips and also the consistent reads), but too high a value also uses more memory. Array size is not a setting restricted to SQL*Plus; it can be set in many other applications requesting data from an Oracle database. Focusing on benchmark issues: physical IO, logical reads, shared pool, buffer cache
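A sketch of observing the effect in SQL*Plus with autotrace; the table name is a placeholder:

SET AUTOTRACE TRACEONLY STATISTICS

SET ARRAYSIZE 15      -- the default: more round trips, more consistent gets
SELECT * FROM big_table;

SET ARRAYSIZE 500     -- fewer round trips, fewer consistent gets, more memory
SELECT * FROM big_table;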
  • 76. How do you identify the source of the problem?
  • 77. Solving database performance issues sometimes requires the use of operating system (OS) utilities. These tools often provide information that can help isolate database performance problems. Consider the following situations: • You’re running multiple databases and multiple applications on one server and want to use OS utilities to identify which database (and corresponding process) is consuming the most operating system resources. This approach is invaluable when one database application is consuming resources to the point of causing other databases on the box to perform poorly. • You need to verify if the database server is adequately sized for current application workload in terms of CPU, memory, disk I/O, and network bandwidth. An analysis is needed to determine at what point the server will not be able to handle larger (future) workloads. • You’ve used database tools to identify system bottlenecks and want to double-check the analysis via operating system tools. How do you identify the source of the problem?
  • 78. In these scenarios, to effectively analyze, tune, and troubleshoot, you’ll need to employ OS tools to identify resource-intensive processes. Furthermore, if you have multiple databases and applications running on one server, when troubleshooting performance issues, it’s often more efficient to first determine which database and process is consuming the most resources. Operating system utilities help pinpoint whether the bottleneck is CPU, memory, disk I/O, or a network issue. In Linux/Unix environments, once you have the operating system identifier, you can then query the database to show any corresponding database processes and SQL statements. How do you identify the source of the problem?
  • 79. Solutions: where do you start and what order to work?
  • 81. Mapping a Resource-Intensive Process to a Database Process Problem It’s a dark and stormy night, and the system is performing poorly. You identify an operating system–intensive process on the box. You want to map an operating system process back to a database process. If the database process is a SQL process, you want to display the user of the SQL statement and also the SQL. Solution In Linux/Unix environments, if you can identify the resource-intensive operating system process, then you can easily check to see if that process is associated with a database process. The process consists of the following: 1. Run an OS command to identify resource-intensive processes and associated IDs. 2. Identify the database associated with the process. 3. Extract details about the process from the database data dictionary views. 4. If it’s a SQL statement, get those details. 5. Generate an execution plan for the SQL statement. Solutions: where do you start and what order to work?
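Steps 2 through 4 can be collapsed into one query once you have the OS process ID from top or ps; a sketch for a dedicated-server Linux/Unix environment (a SQL_ID can map to several child cursors, so expect multiple rows):

-- :os_pid is the PID reported by top/ps for the busy process
SELECT s.sid, s.serial#, s.username, s.status, s.sql_id, q.sql_text
FROM   v$process p
JOIN   v$session s ON s.paddr = p.addr
LEFT   JOIN v$sql  q ON q.sql_id = s.sql_id
WHERE  p.spid = :os_pid;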
  • 82. Introduction to SQL and Application Tuning
  • 83. Proactive Tuning Methodology •Simple design •Data modeling •Tables and indexes •Using views •Writing efficient SQL •Cursor sharing •Using bind variables Introduction to SQL and Application Tuning
  • 84. Simplicity in Application Design •Simple tables •Well-written SQL •Indexing only as required •Retrieving only required information Introduction to SQL and Application Tuning
  • 85. Data Modeling •Accurately represent business practices •Focus on the most frequent and important business transactions •Use modeling tools •Appropriately normalize data (OLTP versus DW) Introduction to SQL and Application Tuning
  • 86. Table Design •Compromise between flexibility and performance: • Principally normalize • Selectively denormalize •Use Oracle performance and management features: • Default values • Constraints • Materialized views • Clusters • Partitioning •Focus on business-critical tables Introduction to SQL and Application Tuning
  • 87. Index Design •Create indexes on the following: • Primary key (automatically created) • Unique key (automatically created) • Foreign keys (good candidates) •Index data that is frequently queried (select list). •Use SQL as a guide to index design. Introduction to SQL and Application Tuning
  • 88. Using Views •Simplifies application design •Is transparent to the developer •Can cause suboptimal execution plans Introduction to SQL and Application Tuning
• 89. SQL Execution Efficiency •Good database connectivity •Minimizing parsing •Sharing cursors •Using bind variables Introduction to SQL and Application Tuning
  • 90. Writing SQL to Share Cursors •Create generic code using the following: • Stored procedures and packages • Database triggers • Any other library routines and procedures •Write to format standards (improves readability): • Case • White space • Comments • Object references • Bind variables Introduction to SQL and Application Tuning
  • 91. Performance Checklist •Set initialization parameters and storage options. •Verify resource usage of SQL statements. •Validate connections by middleware. •Verify cursor sharing. •Validate migration of all required objects. •Verify validity and availability of optimizer statistics. Introduction to SQL and Application Tuning
  • 92. When and What to Tune?
• 93. • Clustering factor • Integrity Constraints are Important • Reasons for Inefficient SQL Performance • Using Bind Variables • Restructuring SQL Statements • Shared SQL and Cursors When and What to Tune?
• 94. When and What to Tune? The Clustering Factor The clustering factor is a number that represents the degree to which data is randomly distributed in a table. In simple terms, it is the number of "block switches" made while reading a table using an index.
• 95. When and What to Tune? The diagram above illustrates how scattered the rows of the table are. The first index entry (from the left of the index) points to the first data block, and the second index entry points to the second data block. During an index range scan or full index scan, the optimizer has to switch between blocks and may have to revisit the same block more than once because the rows are scattered. The number of times the optimizer makes these block switches is the "clustering factor".
• 96. When and What to Tune? The image above represents a good clustering factor. During an index range scan, the optimizer does not have to jump to the next data block, as most of the index entries point to the same data block. This helps significantly in reducing the cost of your SELECT statements. The clustering factor is stored in the data dictionary and can be viewed from dba_indexes (or user_indexes). Clustering factor.sql
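A sketch of eyeballing it: a clustering factor near the table's block count indicates well-clustered rows, while a value near the row count indicates scattered rows; the table name is a placeholder:

SELECT i.index_name,
       i.clustering_factor,
       t.blocks,    -- CF near this value: rows well clustered for this index
       t.num_rows   -- CF near this value: rows scattered across blocks
FROM   user_indexes i
JOIN   user_tables  t ON t.table_name = i.table_name
WHERE  t.table_name = 'ORDERS';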
• 97. Integrity Constraints are Important Many people think of constraints as a data integrity thing, and it's true, they are. But constraints are also used by the optimizer when determining the optimal execution plan. The optimizer takes as inputs: •The query to optimize •All available database object statistics •System statistics, if available (CPU speed, single-block I/O speed, and so on: metrics about the physical hardware) •Initialization parameters •Constraints null columns differ from not nul.sql fk adds to query performance When and What to Tune?
  • 98. • Reasons for inefficient SQL performance • Stale or missing optimizer statistics • Missing access structures • Suboptimal execution plan selection • Poorly constructed SQL When and What to Tune?
• 99. When and What to Tune? Richard Morris: Are there issues that crop up again and again? Tom Kyte: Perhaps the biggest issue is the black box approach to development. A developer will learn everything they can about the procedural language they're using. However, they don't learn about the database that they're using or other packages that might be involved… Richard Morris: Do you think then that poor education is to blame? That somehow it's got worse over the years rather than getting better? Tom Kyte: No, it hasn't changed. When I get up on stage at a seminar and I talk about bind variables, I start by saying that for 16 years I've been talking about the same thing, but each year the problem is the same. Why? Because universities are trying to teach students theory and algorithms and things like that; they're not teaching them how to write production quality code. They don't teach them how to debug or how to instrument, they don't teach them how to defensively program. They just teach them how to write a compiler in Lisp, which frankly doesn't translate very well into IT.
• 100. Using Bind Variables Oracle automatically notices when applications send similar SQL statements to the database. The SQL area used to process the first occurrence of the statement is shared; that is, it is used for processing subsequent occurrences of that same statement. Therefore, only one shared SQL area exists for a unique statement. Because shared SQL areas are shared memory areas, any Oracle process can use a shared SQL area. The sharing of SQL areas reduces memory use on the database server, thereby increasing system throughput. In evaluating whether statements are similar or identical, Oracle considers SQL statements issued directly by users and applications as well as recursive SQL statements issued internally by a DDL statement. One of the first stages of parsing is to compare the text of the statement with existing statements in the shared pool to see if the statement can be shared. If the statement differs textually in any way, then Oracle does not share the statement. Exceptions to this are possible when the parameter CURSOR_SHARING has been set to SIMILAR or FORCE. When and What to Tune?
• 101. ADAPTIVE BINDING DBAs are always encouraging developers to use bind variables, but when bind variables are used against columns containing skewed data they sometimes lead to less than optimum execution plans. This is because the optimizer peeks at the bind variable value during the hard parse of the statement, so the value of a bind variable when the statement is first presented to the server can affect every execution of the statement, regardless of subsequent bind variable values. Oracle uses Adaptive Cursor Sharing to solve this problem by allowing the server to compare the effectiveness of execution plans between executions with different bind variable values. If it notices suboptimal plans, it allows certain bind variable values, or ranges of values, to use alternate execution plans for the same statement. This functionality requires no additional configuration. https://oracle-base.com/articles/11g/adaptive-cursor-sharing-11gr1 When and What to Tune?
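A sketch of watching adaptive cursor sharing at work via V$SQL; the &sql_id substitution value is a placeholder:

-- Run the statement several times with different bind values first
SELECT child_number,
       is_bind_sensitive,  -- Y: a peeked bind value influenced the plan
       is_bind_aware,      -- Y: the cursor is chosen based on bind value ranges
       executions, buffer_gets
FROM   v$sql
WHERE  sql_id = '&sql_id';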
• 102. When and What to Tune? • Restructuring SQL Statements: five examples of statements that are candidates for rewriting:
1. SELECT COUNT(*) FROM products p WHERE prod_list_price < 1.15 * (SELECT AVG(unit_cost) FROM costs c WHERE c.prod_id = p.prod_id)
2. SELECT * FROM job_history jh, employees e WHERE SUBSTR(TO_CHAR(e.employee_id), 2) = SUBSTR(TO_CHAR(jh.employee_id), 2)
3. SELECT * FROM orders WHERE order_id_char = 1205
4. SELECT * FROM employees WHERE TO_CHAR(salary) = :sal
5. SELECT * FROM parts_old UNION SELECT * FROM parts_new
• 103. Various SQL and PL/SQL techniques to improve performance Advanced SQL and Application Topics