Oracle Performance Tuning DE(v1.2)-part2.ppt

Performance Tuning
Kapil Goyal

Agenda
 CPU time
 DB Time
 Reading Statspack/AWR Report
 Report Drilldown Example
 parse cpu to parse elapsed ratio
 Execute to Parse Ratio
 Latches
 Clustering factor
 Statistics Restore
 awrsqrpt.sql
 Wait events

 “CPU time” is not a wait event. It is
the time spent on CPU doing the actual
work.

Is it good to have CPU time on top? Comments?
“Generally” it is good for it to be on top: it means we are not waiting but doing
the actual work on CPU.
As a general rule, systems where CPU time is dominant usually need less tuning
than ones where wait time is dominant. However, heavy CPU usage can also
be caused by badly written SQL statements.

If someone asks for your thoughts on the data below, what would you say about CPU time?
 The data is not sufficient
 The interval is not provided
 What if the interval is 60 minutes?

 We had 60*60 = 3,600 CPU seconds to use in that interval
if it is a single-CPU machine.
 If I tell you there were 32 CPUs, that means
60*60*32 = 115,200 CPU seconds to use in the 1-hour
interval, “assuming” only one database is running on the box
and there is no application load other than the Oracle database.
 (14,659/115,200)*100 = 12.72% of total CPU
 So we are not CPU bound. “Hopefully”
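
The arithmetic above can be sketched as a small helper. This is only an illustration in Python: the function name `cpu_utilization_pct` and its parameters are hypothetical, and it assumes, as the slide does, that the database is the only load on the box.

```python
def cpu_utilization_pct(cpu_seconds_used, interval_minutes, cpu_count):
    """Percentage of the available CPU seconds consumed in the interval."""
    available = interval_minutes * 60 * cpu_count  # CPU seconds available
    return cpu_seconds_used / available * 100


# Figures from the slide: 14,659 CPU seconds used in a 60-minute interval.
print(round(cpu_utilization_pct(14659, 60, 32), 1))  # 32 CPUs
print(round(cpu_utilization_pct(14659, 60, 1), 1))   # single CPU
```

With 32 CPUs the result is about 12.7%. With a single CPU the same figure would exceed 100%, which is why the CPU time alone says nothing until the interval and CPU count are known.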

DB Time =
DB Wait Time +
DB CPU Time

 DB Time
= Work done by Database on behalf of ALL
users
= Sum of time spent by ALL users (Waiting on
Non-idle events + on CPU + waiting for CPU)

SQL> @time_model
Session altered.
"All time in Sec"
Enter value for date: 24-oct-10
old 13: trunc(sn.begin_interval_time) ='&date' and
new 13: trunc(sn.begin_interval_time) ='24-oct-10' and
                                    sql exec              PL/SQL exec  parse time     backgro
Date time                DB time  elpsd time      DB CPU  elpsd time      elapsd   elasped t
-------------------- ----------- ----------- ----------- ----------- ----------- -----------
10/24/10_10_30_11_00      871.53      778.80      247.37       10.60       14.87         118
10/24/10_11_00_11_30      894.91      800.41      300.38        7.87       12.93         117
10/24/10_11_30_12_00     1079.13      973.75      279.78       11.55       16.51         116
10/24/10_12_00_12_30      996.94      890.78      317.13        9.76       15.32         123
10/24/10_12_30_13_00      908.64      808.08      235.90        9.14       17.43         120
10/24/10_13_00_13_30     1218.89     1103.56      386.49       16.94       17.37         226
10/24/10_13_30_14_00      955.42      862.83      241.37        8.67       16.80         121
10/24/10_14_00_14_30     1038.87      905.51      390.07       15.52       25.70         125
10/24/10_14_30_15_00     1666.30     1541.61      369.93       16.15       17.37         145
10/24/10_15_00_15_30   146897.14   145730.59    14712.71      371.41       64.79      12,499
10/24/10_15_30_16_00    29360.88    28438.56     4858.87      731.45      197.13         666
10/24/10_16_00_16_30   239069.20   237550.33    13714.38       31.96       30.84       1,793
10/24/10_16_30_17_00   226198.80   224807.06     2388.79        5.10       33.19       1,143
10/24/10_17_00_17_30   135058.86   126855.07    14737.44      178.26      282.42       2,627
10/24/10_17_30_18_00    23471.87    22723.00     3765.50      217.70      171.49         202
10/24/10_18_00_18_30      945.55      847.94      319.97        8.45       15.89         118
When was the database busiest?
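
One way to answer the question is to rank the intervals by DB time. The following is a minimal Python sketch using a few rows copied from the output above. The tuple layout and variable names are assumptions made for illustration, and the non-CPU share relies on DB Time = DB Wait Time + DB CPU Time as stated earlier.

```python
# (interval, DB time in seconds, DB CPU in seconds), copied from the report
rows = [
    ("10/24/10_14_30_15_00", 1666.30, 369.93),
    ("10/24/10_15_00_15_30", 146897.14, 14712.71),
    ("10/24/10_16_00_16_30", 239069.20, 13714.38),
    ("10/24/10_16_30_17_00", 226198.80, 2388.79),
    ("10/24/10_17_00_17_30", 135058.86, 14737.44),
]

# The busiest interval is the one with the highest DB time.
interval, db_time, db_cpu = max(rows, key=lambda r: r[1])

# Share of DB time that was not spent on CPU.
non_cpu_pct = (db_time - db_cpu) / db_time * 100

print(interval, round(non_cpu_pct, 1))
```

In this sample the 16:00 to 16:30 interval has the highest DB time, and most of that time was spent off CPU, which suggests looking at wait events for that window.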

Reading a Statspack or AWR Report
 Start at summary data at the top:
   Top 5 Timed Events
   Wait Events and Background Wait Events
   Wait Event Histogram (only in Statspack or 11g AWR)
   Load Profile (useful with baseline)
   Instance Efficiency (useful with baseline)
   Time Model
 Drill down to specific sections:
   Indicated by top wait event
CONFIRM

Report Drilldown Examples
 If the top timed event is related to I/O waits, look at:
   SQL ordered by Reads
   SQL ordered by Elapsed
   Tablespace IO Stats
   File IO Stats
   File IO Histogram
 If the top timed event is related to CPU usage, look at:
   Load Profile
   Time Model
   SQL ordered by CPU
   SQL ordered by Gets

Load Profile Section
 Allows characterization of the application
 Can point toward potential problems:
   High hard parse rate
   High I/O rate
   High login rate
 Is more useful if you have a comparable baseline
 Answers “What has changed?”
   Txn/sec change implies changed workload.
   Redo size/txn change implies changed transaction mix.
   Physical reads/txn change implies changed SQL or plan.

Hit Ratio:
 Never tune the system by ratios; tune it
by user response time.
 If the ratios are 100% but end users are
complaining, listen to the users and tune
the system from their perspective.

Parse CPU to Parse Elapsed Ratio?
 If you spend 1 CPU second parsing
but the total elapsed (wall-clock) time is
5 seconds, you are waiting on some
resource to complete the parsing.
 A ratio of 100% means parse CPU time = parse
elapsed time, so there are no waits and no contention.

 (8879/110582)*100=8.03%
How does Oracle calculates it?

What does this ratio mean?
 Parse CPU to Parse Elapsd %: 8.03
 It is a percentage: 8.03% means 0.0803.
 Take the reciprocal: 1/0.0803 = 12.45
 This means 12.45 seconds of wall-clock
time elapse for every CPU second
spent parsing. BAD
 It indicates resource contention while
parsing.
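
The same calculation as a short Python sketch. The function name is hypothetical; the two inputs are the parse CPU and parse elapsed figures used on the slide.

```python
def parse_cpu_to_elapsed_pct(parse_cpu, parse_elapsed):
    """Parse CPU time as a percentage of parse elapsed time."""
    return parse_cpu / parse_elapsed * 100


pct = parse_cpu_to_elapsed_pct(8879, 110582)

# Wall-clock units elapsed for every unit of CPU spent parsing.
elapsed_per_cpu_unit = 100 / pct

print(round(pct, 2), round(elapsed_per_cpu_unit, 2))
```

A value close to 100% would mean almost no waiting during parsing; the lower the percentage, the more wall-clock time each CPU second of parsing costs.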

Execute to Parse Ratio?
 This ratio measures how many times
statements are executed as opposed
to parsed.
 If it is 99.99%, then for 1 parse
there are 10,000 executes.
 If it is 90%, then for 1 parse there
are 10 executes.
 For OLTP it is good to be near 99%; for DSS it
can be lower because “generally” all SQL
statements/reports are unique.

How does Oracle calculate it?
 EXECUTE to PARSE = (1 - parse/execute)
 1 - 915,652/9,944,590 = 1 - 0.0921 = 0.9079
 As a percentage: 0.9079*100 = 90.79%

What does this ratio mean?
 EXECUTE to PARSE % = 90.79
 1 - parse/execute = 0.9079
 parse/execute = 1 - 0.9079
 parse/execute = 0.0921
 parse/execute = 921/10000
 For parse = 1, execute = 10000/921 = 10.86
 So 1 parse for every ~11 executes.
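
The two directions of this calculation (counts to ratio, and ratio back to executes per parse) can be sketched in Python. Both function names are hypothetical and only illustrate the arithmetic on the slides.

```python
def execute_to_parse_pct(parses, executes):
    """Execute to Parse % = (1 - parse/execute) * 100."""
    return (1 - parses / executes) * 100


def executes_per_parse(ratio_pct):
    """Invert the ratio: how many executes happen for each parse."""
    return 1 / (1 - ratio_pct / 100)


print(round(execute_to_parse_pct(915652, 9944590), 2))

# The rule-of-thumb values quoted earlier, plus the worked example.
for pct in (90, 90.79, 99.99):
    print(pct, round(executes_per_parse(pct), 2))
```

Because the relationship is a reciprocal, small changes near 100% correspond to very large changes in executes per parse, which is worth keeping in mind when comparing two reports.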

Buffer Cache Hit Ratio
 Hit ratio = (LIO - PIO) / LIO
 Hit ratio = 1 - physical reads / (db block gets + consistent gets)

Is it always good to have a buffer cache
hit ratio of 99.99%?
 Not really; it depends.
 A high hit ratio can be the result of very
inefficient SQL statements
 that do lots of logical I/O.

Magic script to increase the cache hit
ratio:
declare
  m varchar2(1);
begin
  -- Each iteration only adds logical I/O; it does no useful work.
  for i in 1..10000000 loop
    select dummy into m from dual;
  end loop;
end;
/
Note: Never run it on a production box.
SQL to check the hit ratio:
SELECT (1 - (pr.value / (dbg.value + cg.value))) * 100 AS hit_ratio_pct
FROM   v$sysstat pr,
       v$sysstat dbg,
       v$sysstat cg
WHERE  pr.name  = 'physical reads'
AND    dbg.name = 'db block gets'
AND    cg.name  = 'consistent gets';
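
To see why the "magic script" raises the ratio without making anything faster, here is a small Python sketch of the same formula. The starting counts are made-up illustrative numbers, not taken from any report, and the sketch assumes each loop iteration adds one logical read and no physical reads.

```python
def buffer_cache_hit_pct(physical_reads, db_block_gets, consistent_gets):
    """Hit ratio = (1 - physical reads / logical reads) * 100."""
    logical_reads = db_block_gets + consistent_gets
    return (1 - physical_reads / logical_reads) * 100


# Hypothetical starting point: 1,000 physical reads out of 10,000 logical reads.
before = buffer_cache_hit_pct(1_000, 2_000, 8_000)

# Same physical reads, plus 10,000,000 pointless logical reads from the loop.
after = buffer_cache_hit_pct(1_000, 2_000, 8_000 + 10_000_000)

print(round(before, 2), round(after, 2))
```

The physical I/O is identical in both cases; only the denominator grew. This is the reason a high hit ratio on its own does not prove the SQL is efficient.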

What is a latch?
 A low-level serialization mechanism that protects shared memory
structures.
 A latch is a type of lock that can be acquired and freed
very quickly.
 Latches are typically used to prevent more than one process
from executing the same piece of code at a given time.
 A process acquires a latch when working with a structure in
the SGA (System Global Area). It continues to hold the latch
for as long as it works with the structure. The latch is
released when the process is finished with the structure.

What is a mutex?
 Mutual exclusion: generally used in concurrent
programming to avoid simultaneous access to a
shared resource.
 In Oracle: a kind of replacement for the library cache
latch and library cache pin.
 Mutexes are faster and lighter than latches: they
use less CPU (~5 times less) and fewer bytes
(perhaps 16-28 bytes for a mutex versus 110 bytes for a
regular latch structure), which makes the system more
scalable.

 V$MUTEX_SLEEP and V$MUTEX_SLEEP_HISTORY are
the two views for mutexes.
 Only SLEEPS are recorded in the database, not GETS,
which means less overhead.
 The init.ora parameter _kks_use_mutex_pin=true enables them
(default in >10.2.0.2 environments).
 Mutexes are also used for the underlying structure of V$SQLSTATS.
 select count(*) from v$sqlstats will be faster than select
count(*) from v$sql.

Latch request modes?
 A latch request can be made in one of two
modes: "willing-to-wait" or "no wait".

 A request in "willing-to-wait" mode (e.g. shared pool and
library cache latches) will loop, wait, and
request again until the latch is obtained.
 If it can’t get the latch after spinning
(_spin_count), it will sleep and wake up after one
hundredth of a second (one cs).
 It will then start this process again, spinning up to
_spin_count times and then sleeping for twice as
long (two cs).
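
The spin-then-sleep behaviour described above can be sketched as a toy simulation in Python. This is not Oracle's implementation: the function names, the `max_sleeps` guard, and the assumption that the sleep keeps doubling after the second attempt are all illustrative. No real sleeping is done; the sketch only records the sleep lengths in centiseconds.

```python
def acquire_willing_to_wait(try_get, spin_count, max_sleeps=10):
    """Spin up to spin_count times, then sleep and retry.

    Returns (total_spins, sleeps_cs) once try_get() succeeds.
    """
    total_spins = 0
    sleeps_cs = []
    next_sleep_cs = 1  # first sleep is one hundredth of a second
    while True:
        for _ in range(spin_count):
            total_spins += 1
            if try_get():
                return total_spins, sleeps_cs
        if len(sleeps_cs) >= max_sleeps:
            raise TimeoutError("latch not obtained")
        sleeps_cs.append(next_sleep_cs)  # a real process would sleep here
        next_sleep_cs *= 2               # assumed: sleep twice as long next time


def make_busy_latch(free_after):
    """Hypothetical latch that becomes free on the free_after-th request."""
    state = {"requests": 0}

    def try_get():
        state["requests"] += 1
        return state["requests"] >= free_after

    return try_get


spins, sleeps = acquire_willing_to_wait(make_busy_latch(2500), spin_count=1000)
print(spins, sleeps)
```

In this run the process spins through two full rounds, sleeping 1 cs and then 2 cs, and obtains the latch partway through the third round. A larger `spin_count` trades sleeps for CPU spent spinning.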

 Some latches are “no wait” (e.g. the “redo
copy” latch). A request of this type does not
wait for the latch to become available; it
times out immediately and retries to obtain the
latch.

 _spin_count can be modified to change the
spinning behavior of latches.
 From 9iR2 onwards, latches can be assigned to
different classes, and only that class modified
to have a different _spin_count, without
impacting all the latches (with Oracle Support's
help).
