SlideShare a Scribd company logo
ASH Architecture and Advanced Usage
Graham Wood, Uri Shaft, John Beresniewicz
Oracle America
The following is intended to outline our
general product direction. It is intended for
information purposes only, and may not be
incorporated into any contract. It is not a
commitment to deliver any material, code, or
functionality, and should not be relied upon in
making purchasing decisions.
The development, release, and timing of any
features or functionality described for
Oracle’s products remains at the sole
discretion of Oracle.
Agenda
• ASH Fundamentals and Architecture
• Estimating DB Time Using ASH
• The ASH Fix-up Mechanism
• Estimating Event Counts Using ASH
• Additional ASH Arcana
• Lessons Learned
Motivation for ASH in 10i
Performance diagnostics tool with:
• Always on (unlike SQL trace)
• History of database activity
• Fine grained detail
Other requirements:
• Very small performance overhead
• Uses no locks of any kind
• Works well even when system is overloaded
Active Session History (time based)
A sample of activity taken at regular intervals
t1
Session 1
Session 2
Session 3
Session Time Duration Activity SQL Object
1 t1 null CPU SQL1 null
2 t1 100ms I/O SQL2 EMP
3 t1 5ms I/O SQL1 EMP
3 t2 3sec Row lock SQL1 DEPT
t2 t3
ASH Architecture
Every
1 hour
or
out-of-space
AWR
Circular buffer
in SGA
(2MB per CPU)
DBA_HIST_ACTIVE_SESS_HISTORY
V$ACTIVE_SESSION_HISTORY
Session
state
objects
MMON Lite
(MMNL)
V$SESSION
V$SESSION_WAIT
Variable
length rows
Direct-path
INSERTS
Write
1 out of 10
samples
ASH Architecture
Readers go
unlatched
Writer goes
one direction
Every
1 hour
or
out-of-space
AWR
Circular buffer
in SGA
(2MB per CPU)
DBA_HIST_ACTIVE_SESS_HISTORY
V$ACTIVE_SESSION_HISTORY
MMON Lite
(MMNL)
Session
state
objects
V$SESSION
V$SESSION_WAIT
Readers go the
opposite way
Indexe
d
on time
Indexe
d
on time
ASH Pros and Cons
PROS
• Supports the DB Time method of performance analysis
• Historically for large or systemic problems
• Recently or “now” (V$SESSION still available) for emergency
• Always available (modulo licensing restrictions)
• Minimal cost to server performance
CONS
• Queries become estimates
• We query a sample and not the full data set
• Difficult to form semantically meaningful queries
• Probability distribution of a query result matters
Database Time and Active Sessions
Database Time = Total time spent by sessions in the
database server actively working (on CPU) or actively
waiting (non-idle wait)
Active Sessions = The number of sessions active
(working or waiting) in the database server at a
particular time
Active Sessions is the rate of change of Database
Time over time, also referred to as load
ASH and Database Time


 i
S
T
DBT
DB Time is approximated by multiplying sampling interval by the
total count of the samples
DBT » T · (rows in ASH)
Each ASH sample represents 1 second of DB Time
Basic ASH Math
COUNT(*) = DB Time
GROUP BY ?
Active Sessions and DB Time
• The number of active sessions at any time is the rate
of change of the “DB Time function” at that time.
• DB Time is the integral of the Active Session function.


1
0
t
t
ions
ActiveSess
DBtime
ions
ActiveSess
t
DBtime 

 /
time
t0 t1
Active Sessions and DB Time
t = 1 sec
Active sessions
DB Time
ASH DB Time Query (basic ASH math)
• What are the top SQL by database time in the last 5
minutes?
SELECT * FROM
(SELECT NVL(SQL_ID,'NULL') as SQL_ID
,SUM(1) as DBtime_secs
FROM V$ACTIVE_SESSION_HISTORY
WHERE sample_time > SYSDATE - 5/24/60
GROUP BY SQL_ID
ORDER BY 2 DESC
)
WHERE rownum < 6;
NOTE: SUM(:T) can be substituted for SUM(1) to account
for non-default sampling intervals
ASH Math: EM Top Activity
ASH Math: ASH Analytics
Multi-dimensional DB Time Analysis
18 Copyright © 2011, Oracle and/or its affiliates. All rights
reserved.
Insert Information Protection Policy Classification from Slide 8
Common Mistakes
ASH = Complete record of database activity
• SUM(time_waited) = total time spent in database
• COUNT(*) = total count of wait events
• AVG(time_waited) = average latency of waits
ASH = Random sample of wait events
• AVG(time_waited) = estimate of average wait latency
• Top SQL by SUM(time_waited) = Top SQL by time
spent in the server
Queries Over Samples
• ASH is a time sample of the true workload
• One can imagine multiple ASH mechanisms each
producing different results from the exact same true
workload
• ASH rows are random variables extracted from the true data
• It is a time sample: the probability an event is
sampled is proportional to the event duration
• A 10ms event is 10 times more likely to be in ASH than a 1ms
event
Query Results Are Estimates
• ASH does not contain all the data, just a sample
• Any aggregate over ASH data is a random variable
• It estimates something or is a sample of something.
• Any query over samples is an unbiased estimator of a
property value if the expected value of the query
equals that property value
Compare Estimated to Actual DB Time
Bad ASH Math
• SQL observed using 9 secs of CPU every 10 secs
The ASH “Fix-up”
• ASH columns may be unknown at sampling time
• TIME_WAITED: session is still waiting
• PLAN_HASH: session is still optimizing SQL
• GC events: event details unknown at event initiation
• Certain time model bit vector columns
• ASH “fixes up” data during subsequent sampling
• TIME_WAITED fixed up in first sample after event completes
• Long events: last sample gets fixed up time_waited (all others stay 0)
• Querying current ASH may return un-fixed rows
• Should not be a problem generally
ASH Fix-up 1: Sampling
Session State Objects
ASH Sampler
MMNL
ON CPU
ON CPU
ON CPU
WAITING
WAITING
IO WAIT
V$ASH
TIME_WAITED
PLAN_HASH
TIME_WAITED
Fix-up List
Post-wait callback
registered by session
ASH Fix-up 2: Fixing Up
Callback
Session State Objects
ASH Sampler
MMON
ON CPU
ON CPU
ON CPU
TIME_WAITED=n
WAITING
V$ASH
TIME_WAITED=n
PLAN_HASH
Fix-up List
Estimating Event Counts With ASH
:T = ASH sampling interval
TIME_WAITED = length of sampled event
N = Number of events the ASH sample
represents
:T ~ TIME_WAITED * N
OR
N ~ :T / TIME_WAITED
Why Event Count = :T / Time_Waited?
250 ms
Sally
250 ms
250 ms
250 ms
250 ms
User1
User2
User3
User4
tN tN+:T
:T elapsed time (1000 ms)
ASH sample time
Sally’s row in ASH estimates 1000/250 ms = 4 waits during time interval
ASH Event Count Query (1)
• Top 5 objects by User I/O requests last 15 minutes
SELECT W.*,O.object_name FROM
(SELECT current_obj#
,ROUND(SUM(100000/time_waited)) as est_IOwaits
,SUM(1) as est_DBtime
FROM V$ACTIVE_SESSION_HISTORY
WHERE sample_time > SYSDATE - 15/24/60
AND time_waited > 0
AND event IN (‘db file sequential read’
,’db file scattered read’)
GROUP BY current_obj#
ORDER BY 2 DESC
) W
, DBA_OBJECTS O
WHERE o.object_id = W.current_obj#
AND ROWNUM < 6;
Long Events
• Events that are longer than :T are always sampled
• No sampler “bias” for these
• Event may be sampled multiple times
• Only final (“fixed-up”) ASH row for long events has
TIME_WAITED > 0
• The event count of long events is known:
• 1 for each row with TIME_WAITED >= :T
• The DB Time estimate is still COUNT(*)
ASH Event Count Query (2)
• Top objects by I/O event counts adjusted for long
events
SELECT * FROM
(SELECT current_obj#
,ROUND(SUM(CASE WHEN time_waited >= 1000000 THEN 1
ELSE 1000000 / time_waited
END)) as est_IOwaits
FROM V$ACTIVE_SESSION_HISTORY
WHERE sample_time > SYSDATE - 15/24/60
AND time_waited > 0
AND wait_class = 'User I/O'
GROUP BY current_obj#
ORDER BY 2 DESC
)
WHERE ROWNUM < 6;
Estimating Event Latencies
1. We estimate total DB time on events
2. We can now estimate total event occurrences
3. Therefore, we can compute average latency
Est_avg_latency_ms = est_Dbtime_ms / est_waits
User I/O Events Ordered by Latency
SELECT event
,ROUND(est_DBtime_ms/est_waits,1)
as est_avg_latency_ms
FROM
(SELECT event
,ROUND(SUM(GREATEST(1,1000000/time_waited)))
as est_waits
,SUM(1000) as est_DBtime_ms
FROM V$ACTIVE_SESSION_HISTORY
WHERE sample_time > SYSDATE - 15/1440
AND time_waited > 0
AND wait_class = 'User I/O'
GROUP BY event
)
ORDER BY 2 DESC;
• Note preferred syntax for computing event counts
DEMO?
Additional ASH Arcana
• ASH Samples “ON CPU”
• Challenges of SQL_ID tracking
• Sampling CPU-bound systems
• Current_OBJ# and when to use it
• Blocking session information
ON CPU and ASH
• ASH row status “ON CPU” derived, not observed
• Session is in a database call
• Session is NOT in a wait event (idle or non-idle)
• Un-instrumented waits => “ON CPU”
• These are bugs and should be rare, but have happened
• Session on run queue may be WAITING or ON CPU
• Depends on state prior to going onto run queue
Evolution of SQL_ID
•Objective is to get lowest level user level SQL
•Prior to 11.1.0.7 various schemes tried
•Varying levels of success, but each had problems
•11.1.0.7 and above: it is the currently executing SQL
• except recursions inside the server code, triggers
•PL/SQL has both top level and current level calls
•ENTRY_LEVEL and columns for applications that start with
generic PL/SQL and end up doing interesting things underneath
– need to find what the applications initially called.
Why does ASH work when the server
is CPU bound?
1. ASH sampler is very efficient, and does not lock
• Therefore, in almost all cases it takes a single CPU slice to
finish a full sample.
2. After a sample is done, the sampler computes next
scheduled sample time and sleeps until then
3. Upon scheduled wake-up, it waits for CPU (runq)
and samples again
• Thus, CPU bound samples are shifted by one runq but stay
about 1 per second
ASH Sampler and Run-queue
S_t0 S_t2
S_t1
Run queue Run queue
A_t1
A_t0
Run queue
A_t2
Sleep until next
time
Sleep until next
Sample Sample Sample
If run queue times are consistent sampling interval will be preserved but
sample times shifted
What Is CURRENT_OBJ#
It is the ID of the segment we operate on when ASH is
sampled.
However, this is only valid in specific wait events:
•I/O events on data blocks
•Cluster (global cache) events on data blocks
•Row Locks
•Table Locks
•Buffer busy (and associated RAC events)
DANGER: the column is not cleared when it is invalid.
Blocking Session
BLOCKING_SESSION_STATUS
BLOCKING_SESSION
BLOCKING_SESSION_SERIAL#
BLOCKING_INST_ID
Prior to 11.2:
Finding blocking session in most cases can create a lock/crash
Therefore, blocking session only in same instance, only for some cases
From 11.2:
We copy information from Hang Manager.
Therefore, events that are long enough (3 seconds in same instance, 10
seconds in RAC) get the hang information copied into ASH.
Lessons Learned
• ASH is a time-based and not event-based sample of
database activity
• ASH is an excellent representation of activity history
• How to estimate and rank DB Time spent over ASH
dimensions using basic ASH Math
• ASH estimates underlie key DB Time performance
analysis use cases exposed by EM
• The ASH fix-up is a critical (and unique) mechanism
• How to estimate event counts and latencies using
TIME_WAITED
DEMO?
ASH Archit ecture and Advanced Usage.pdf

More Related Content

PPT
active_session_history_oracle_performance.ppt
PDF
Ash architecture and advanced usage rmoug2014
PPTX
The Practical Use of Active Session History
PDF
con9577-ash-deep-dive-oow2013-2031468.pdf
PDF
Average Active Sessions RMOUG2007
PDF
200603ash.pdf Performance Tuning Oracle DB
PDF
OOW13 JB KP ASH Deep Dive
PPTX
Further Adventures With ASH
active_session_history_oracle_performance.ppt
Ash architecture and advanced usage rmoug2014
The Practical Use of Active Session History
con9577-ash-deep-dive-oow2013-2031468.pdf
Average Active Sessions RMOUG2007
200603ash.pdf Performance Tuning Oracle DB
OOW13 JB KP ASH Deep Dive
Further Adventures With ASH

Similar to ASH Archit ecture and Advanced Usage.pdf (20)

PPT
ash_feel_the_power_kyle_hailey_8951122.ppt
PPTX
Practical ASH
PPTX
ASH and AWR on DB12c
PDF
DB Time, Average Active Sessions, and ASH Math - Oracle performance fundamentals
PDF
ash-1-130820122404-phpapp01.pdf
PPT
Intro to ASH
PPT
AWR, ADDM, ASH, Metrics and Advisors.ppt
PPT
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
PPTX
SQL Tuning, takes 3 to tango
PDF
Ash Outliers UKOUG2011
PPT
Ash masters : advanced ash analytics on Oracle
PPT
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
PDF
Ash and awr deep dive hotsos
PDF
Donatone_Grid Performance(2)11111111.pdg
PDF
Ash and awr performance data2
PPT
Rmoug ashmaster
PDF
Average Active Sessions - OaktableWorld 2013
PDF
Awr + 12c performance tuning
PPT
Oracle Open World Thursday 230 ashmasters
PDF
AWR & ASH Analysis
ash_feel_the_power_kyle_hailey_8951122.ppt
Practical ASH
ASH and AWR on DB12c
DB Time, Average Active Sessions, and ASH Math - Oracle performance fundamentals
ash-1-130820122404-phpapp01.pdf
Intro to ASH
AWR, ADDM, ASH, Metrics and Advisors.ppt
Your tuning arsenal: AWR, ADDM, ASH, Metrics and Advisors
SQL Tuning, takes 3 to tango
Ash Outliers UKOUG2011
Ash masters : advanced ash analytics on Oracle
Performance Tuning With Oracle ASH and AWR. Part 1 How And What
Ash and awr deep dive hotsos
Donatone_Grid Performance(2)11111111.pdg
Ash and awr performance data2
Rmoug ashmaster
Average Active Sessions - OaktableWorld 2013
Awr + 12c performance tuning
Oracle Open World Thursday 230 ashmasters
AWR & ASH Analysis
Ad

More from tricantino1973 (6)

DOCX
Oracle-GoldenGate-18c-Workshop-Lab-16.docx
DOCX
Oracle-GoldenGate-18c-Workshop-Lab-17.docx
DOCX
GoldenGate-12c-Advanced-Workshop-Lab-Exercise-1.docx
PDF
upgrade_migrate_consolidate_19999c (1).pdf
PDF
freetools-170503222740.pdforacleeeeeeeee
PPT
les_01_arch_presentation_asm_oracle_rac_
Oracle-GoldenGate-18c-Workshop-Lab-16.docx
Oracle-GoldenGate-18c-Workshop-Lab-17.docx
GoldenGate-12c-Advanced-Workshop-Lab-Exercise-1.docx
upgrade_migrate_consolidate_19999c (1).pdf
freetools-170503222740.pdforacleeeeeeeee
les_01_arch_presentation_asm_oracle_rac_
Ad

Recently uploaded (20)

PDF
Bioremediation Technologies best for safe envirnment.pdf
PDF
WKA #29: "FALLING FOR CUPID" TRANSCRIPT.pdf
PPTX
the Honda_ASIMO_Presentation_Updated.pptx
PPTX
wegen seminar ppt.pptxhkjbkhkjjlhjhjhlhhvg
PDF
Between the Reels and the Revolution Enzo Zelocchi’s Unscripted Path Through ...
PDF
Rakshabandhan – Celebrating the Bond of Siblings - by Meenakshi Khakat
PPTX
genderandsexuality.pptxjjjjjjjjjjjjjjjjjjjj
PPTX
providenetworksystemadministration.pptxhnnhgcbdjckk
PDF
Songlyrics.net-website for lyrics song download
PDF
A New Kind of Director for a New Kind of World Why Enzo Zelocchi Matters More...
PDF
Gregerorgio Juba_20250803_191224_0000.pdf
PPTX
The story of Nomuzi and the way she was living
PPTX
asdmadsmammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm.pptx
PDF
oppenheimer and the story of the atomic bomb
PPTX
History ATA Presentation.pptxhttps://www.slideshare.net/slideshow/role-of-the...
PDF
EVs U-5 ONE SHOT Notes_c49f9e68-5eac-4201-bf86-b314ef5930ba.pdf
PPTX
E8 ssssssssssssssssssssssssssssssssssQ1 0101 PS.pptx
PPT
business model and some other things that
DOCX
Lambutchi Calin Claudiu had a discussion with the Buddha about the restructur...
PDF
TAIPANQQ SITUS MUDAH MENANG DAN MUDAH MAXWIN SEGERA DAFTAR DI TAIPANQQ DAN RA...
Bioremediation Technologies best for safe envirnment.pdf
WKA #29: "FALLING FOR CUPID" TRANSCRIPT.pdf
the Honda_ASIMO_Presentation_Updated.pptx
wegen seminar ppt.pptxhkjbkhkjjlhjhjhlhhvg
Between the Reels and the Revolution Enzo Zelocchi’s Unscripted Path Through ...
Rakshabandhan – Celebrating the Bond of Siblings - by Meenakshi Khakat
genderandsexuality.pptxjjjjjjjjjjjjjjjjjjjj
providenetworksystemadministration.pptxhnnhgcbdjckk
Songlyrics.net-website for lyrics song download
A New Kind of Director for a New Kind of World Why Enzo Zelocchi Matters More...
Gregerorgio Juba_20250803_191224_0000.pdf
The story of Nomuzi and the way she was living
asdmadsmammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm.pptx
oppenheimer and the story of the atomic bomb
History ATA Presentation.pptxhttps://www.slideshare.net/slideshow/role-of-the...
EVs U-5 ONE SHOT Notes_c49f9e68-5eac-4201-bf86-b314ef5930ba.pdf
E8 ssssssssssssssssssssssssssssssssssQ1 0101 PS.pptx
business model and some other things that
Lambutchi Calin Claudiu had a discussion with the Buddha about the restructur...
TAIPANQQ SITUS MUDAH MENANG DAN MUDAH MAXWIN SEGERA DAFTAR DI TAIPANQQ DAN RA...

ASH Archit ecture and Advanced Usage.pdf

  • 1. ASH Architecture and Advanced Usage Graham Wood, Uri Shaft, John Beresniewicz Oracle America
  • 2. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  • 3. Agenda • ASH Fundamentals and Architecture • Estimating DB Time Using ASH • The ASH Fix-up Mechanism • Estimating Event Counts Using ASH • Additional ASH Arcana • Lessons Learned
  • 4. Motivation for ASH in 10i Performance diagnostics tool with: • Always on (unlike SQL trace) • History of database activity • Fine grained detail Other requirements: • Very small performance overhead • Uses no locks of any kind • Works well even when system is overloaded
  • 5. Active Session History (time based) A sample of activity taken at regular intervals t1 Session 1 Session 2 Session 3 Session Time Duration Activity SQL Object 1 t1 null CPU SQL1 null 2 t1 100ms I/O SQL2 EMP 3 t1 5ms I/O SQL1 EMP 3 t2 3sec Row lock SQL1 DEPT t2 t3
  • 6. ASH Architecture Every 1 hour or out-of-space AWR Circular buffer in SGA (2MB per CPU) DBA_HIST_ACTIVE_SESS_HISTORY V$ACTIVE_SESSION_HISTORY Session state objects MMON Lite (MMNL) V$SESSION V$SESSION_WAIT Variable length rows Direct-path INSERTS Write 1 out of 10 samples
  • 7. ASH Architecture Readers go unlatched Writer goes one direction Every 1 hour or out-of-space AWR Circular buffer in SGA (2MB per CPU) DBA_HIST_ACTIVE_SESS_HISTORY V$ACTIVE_SESSION_HISTORY MMON Lite (MMNL) Session state objects V$SESSION V$SESSION_WAIT Readers go the opposite way Indexe d on time Indexe d on time
  • 8. ASH Pros and Cons PROS • Supports the DB Time method of performance analysis • Historically for large or systemic problems • Recently or “now” (V$SESSION still available) for emergency • Always available (modulo licensing restrictions) • Minimal cost to server performance CONS • Queries become estimates • We query a sample and not the full data set • Difficult to form semantically meaningful queries • Probability distribution of a query result matters
  • 9. Database Time and Active Sessions Database Time = Total time spent by sessions in the database server actively working (on CPU) or actively waiting (non-idle wait) Active Sessions = The number of sessions active (working or waiting) in the database server at a particular time Active Sessions is the rate of change of Database Time over time, also referred to as load
  • 10. ASH and Database Time    i S T DBT DB Time is approximated by multiplying sampling interval by the total count of the samples DBT » T · (rows in ASH) Each ASH sample represents 1 second of DB Time
  • 11. Basic ASH Math COUNT(*) = DB Time GROUP BY ?
  • 12. Active Sessions and DB Time • The number of active sessions at any time is the rate of change of the “DB Time function” at that time. • DB Time is the integral of the Active Session function.   1 0 t t ions ActiveSess DBtime ions ActiveSess t DBtime    /
  • 13. time t0 t1 Active Sessions and DB Time t = 1 sec Active sessions DB Time
  • 14. ASH DB Time Query (basic ASH math) • What are the top SQL by database time in the last 5 minutes? SELECT * FROM (SELECT NVL(SQL_ID,'NULL') as SQL_ID ,SUM(1) as DBtime_secs FROM V$ACTIVE_SESSION_HISTORY WHERE sample_time > SYSDATE - 5/24/60 GROUP BY SQL_ID ORDER BY 2 DESC ) WHERE rownum < 6; NOTE: SUM(:T) can be substituted for SUM(1) to account for non-default sampling intervals
  • 15. ASH Math: EM Top Activity
  • 16. ASH Math: ASH Analytics
  • 18. 18 Copyright © 2011, Oracle and/or its affiliates. All rights reserved. Insert Information Protection Policy Classification from Slide 8
  • 19. Common Mistakes ASH = Complete record of database activity • SUM(time_waited) = total time spent in database • COUNT(*) = total count of wait events • AVG(time_waited) = average latency of waits ASH = Random sample of wait events • AVG(time_waited) = estimate of average wait latency • Top SQL by SUM(time_waited) = Top SQL by time spent in the server
  • 20. Queries Over Samples • ASH is a time sample of the true workload • One can imagine multiple ASH mechanisms each producing different results from the exact same true workload • ASH rows are random variables extracted from the true data • It is a time sample: the probability an event is sampled is proportional to the event duration • A 10ms event is 10 times more likely to be in ASH than a 1ms event
  • 21. Query Results Are Estimates • ASH does not contain all the data, just a sample • Any aggregate over ASH data is a random variable • It estimates something or is a sample of something. • Any query over samples is an unbiased estimator of a property value if the expected value of the query equals that property value
  • 22. Compare Estimated to Actual DB Time
  • 23. Bad ASH Math • SQL observed using 9 secs of CPU every 10 secs
  • 24. The ASH “Fix-up” • ASH columns may be unknown at sampling time • TIME_WAITED: session is still waiting • PLAN_HASH: session is still optimizing SQL • GC events: event details unknown at event initiation • Certain time model bit vector columns • ASH “fixes up” data during subsequent sampling • TIME_WAITED fixed up in first sample after event completes • Long events: last sample gets fixed up time_waited (all others stay 0) • Querying current ASH may return un-fixed rows • Should not be a problem generally
  • 25. ASH Fix-up 1: Sampling Session State Objects ASH Sampler MMNL ON CPU ON CPU ON CPU WAITING WAITING IO WAIT V$ASH TIME_WAITED PLAN_HASH TIME_WAITED Fix-up List Post-wait callback registered by session
  • 26. ASH Fix-up 2: Fixing Up Callback Session State Objects ASH Sampler MMON ON CPU ON CPU ON CPU TIME_WAITED=n WAITING V$ASH TIME_WAITED=n PLAN_HASH Fix-up List
  • 27. Estimating Event Counts With ASH :T = ASH sampling interval TIME_WAITED = length of sampled event N = Number of events the ASH sample represents :T ~ TIME_WAITED * N OR N ~ :T / TIME_WAITED
  • 28. Why Event Count = :T / Time_Waited? 250 ms Sally 250 ms 250 ms 250 ms 250 ms User1 User2 User3 User4 tN tN+:T :T elapsed time (1000 ms) ASH sample time Sally’s row in ASH estimates 1000/250 ms = 4 waits during time interval
  • 29. ASH Event Count Query (1) • Top 5 objects by User I/O requests last 15 minutes SELECT W.*,O.object_name FROM (SELECT current_obj# ,ROUND(SUM(100000/time_waited)) as est_IOwaits ,SUM(1) as est_DBtime FROM V$ACTIVE_SESSION_HISTORY WHERE sample_time > SYSDATE - 15/24/60 AND time_waited > 0 AND event IN (‘db file sequential read’ ,’db file scattered read’) GROUP BY current_obj# ORDER BY 2 DESC ) W , DBA_OBJECTS O WHERE o.object_id = W.current_obj# AND ROWNUM < 6;
  • 30. Long Events • Events that are longer than :T are always sampled • No sampler “bias” for these • Event may be sampled multiple times • Only final (“fixed-up”) ASH row for long events has TIME_WAITED > 0 • The event count of long events is known: • 1 for each row with TIME_WAITED >= :T • The DB Time estimate is still COUNT(*)
  • 31. ASH Event Count Query (2) • Top objects by I/O event counts adjusted for long events SELECT * FROM (SELECT current_obj# ,ROUND(SUM(CASE WHEN time_waited >= 1000000 THEN 1 ELSE 1000000 / time_waited END)) as est_IOwaits FROM V$ACTIVE_SESSION_HISTORY WHERE sample_time > SYSDATE - 15/24/60 AND time_waited > 0 AND wait_class = 'User I/O' GROUP BY current_obj# ORDER BY 2 DESC ) WHERE ROWNUM < 6;
  • 32. Estimating Event Latencies 1. We estimate total DB time on events 2. We can now estimate total event occurrences 3. Therefore, we can compute average latency Est_avg_latency_ms = est_Dbtime_ms / est_waits
  • 33. User I/O Events Ordered by Latency SELECT event ,ROUND(est_DBtime_ms/est_waits,1) as est_avg_latency_ms FROM (SELECT event ,ROUND(SUM(GREATEST(1,1000000/time_waited))) as est_waits ,SUM(1000) as est_DBtime_ms FROM V$ACTIVE_SESSION_HISTORY WHERE sample_time > SYSDATE - 15/1440 AND time_waited > 0 AND wait_class = 'User I/O' GROUP BY event ) ORDER BY 2 DESC; • Note preferred syntax for computing event counts
  • 34. DEMO?
  • 35. Additional ASH Arcana • ASH Samples “ON CPU” • Challenges of SQL_ID tracking • Sampling CPU-bound systems • Current_OBJ# and when to use it • Blocking session information
  • 36. ON CPU and ASH • ASH row status “ON CPU” derived, not observed • Session is in a database call • Session is NOT in a wait event (idle or non-idle) • Un-instrumented waits => “ON CPU” • These are bugs and should be rare, but have happened • Session on run queue may be WAITING or ON CPU • Depends on state prior to going onto run queue
  • 37. Evolution of SQL_ID •Objective is to get lowest level user level SQL •Prior to 11.1.0.7 various schemes tried •Varying levels of success, but each had problems •11.1.0.7 and above: it is the currently executing SQL • except recursions inside the server code, triggers •PL/SQL has both top level and current level calls •ENTRY_LEVEL and columns for applications that start with generic PL/SQL and end up doing interesting things underneath – need to find what the applications initially called.
  • 38. Why does ASH work when the server is CPU bound? 1. ASH sampler is very efficient, and does not lock • Therefore, in almost all cases it takes a single CPU slice to finish a full sample. 2. After a sample is done, the sampler computes next scheduled sample time and sleeps until then 3. Upon scheduled wake-up, it waits for CPU (runq) and samples again • Thus, CPU bound samples are shifted by one runq but stay about 1 per second
  • 39. ASH Sampler and Run-queue S_t0 S_t2 S_t1 Run queue Run queue A_t1 A_t0 Run queue A_t2 Sleep until next time Sleep until next Sample Sample Sample If run queue times are consistent sampling interval will be preserved but sample times shifted
  • 40. What Is CURRENT_OBJ# It is the ID of the segment we operate on when ASH is sampled. However, this is only valid in specific wait events: •I/O events on data blocks •Cluster (global cache) events on data blocks •Row Locks •Table Locks •Buffer busy (and associated RAC events) DANGER: the column is not cleared when it is invalid.
  • 41. Blocking Session BLOCKING_SESSION_STATUS BLOCKING_SESSION BLOCKING_SESSION_SERIAL# BLOCKING_INST_ID Prior to 11.2: Finding blocking session in most cases can create a lock/crash Therefore, blocking session only in same instance, only for some cases From 11.2: We copy information from Hang Manager. Therefore, events that are long enough (3 seconds in same instance, 10 seconds in RAC) get the hang information copied into ASH.
  • 42. Lessons Learned • ASH is a time-based and not event-based sample of database activity • ASH is an excellent representation of activity history • How to estimate and rank DB Time spent over ASH dimensions using basic ASH Math • ASH estimates underlie key DB Time performance analysis use cases exposed by EM • The ASH fix-up is a critical (and unique) mechanism • How to estimate event counts and latencies using TIME_WAITED
  • 43. DEMO?