RELATING THE TIME
REQUIRED TO OBSERVE A
CERTAIN NUMBER OF EVENTS
Asoka Korale, Ph.D.
C.Eng. MIESL
MOTIVATIONS FOR RELATING TIME
AND EVENTS
APPLICATIONS OF RELATING TIME
AND EVENTS
Call Centers
Traffic Management
Transportation and Logistics
Packet Switching
Production Scheduling
Forecasting / Relating Time based Ev
INSIGHTS FROM RELATING TIME
AND EVENTS
• Relate an interval of observation to a sum of inter-arrival time random
variables
• Relate the interval of observation to
• the total number of events observed in the interval
• the uncertainty associated with the average number of events in the
interval
• the sum of the number of inter-arrival time intervals that compose
the interval
• Establish a probabilistic relationship for the time taken to observe a
number of events
• Relate the uncertainty in the interval of observation to a number of
events
NOVEL STOCHASTIC RELATIONSHIP
BETWEEN
TIME AND EVENTS
RELATE TIME TAKEN TO OBSERVE A CERTAIN NUMBER OF EVENTS
UNCERTAINTY ASSOCIATED WITH
EVENTS OVER TIME
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times; labels: events, inter-arrival time random variables, distribution of inter-arrival times, time interval Z to observe N events]
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
• The total time (Z) to observe a number of events (N) is a sum of the same number of inter-arrival time intervals
• Each inter-arrival time interval is a random variable ($\Delta t_i$)
• The total uncertainty in the time interval (Z) reflects the uncertainty associated with each individual random variable ($\Delta t_i$)
• The dependence between random variables impacts the total uncertainty associated with the sum
• The total uncertainty in the interval (Z) leads to the variance in the number of events observed in such an interval
A SUM OF INTER-ARRIVAL TIME
RANDOM VARIABLES
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times]
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
• Each event inter-arrival time $\Delta t_i$ is a random variable
• each such random variable has a certain uncertainty associated with it
• N inter-arrival time random variables are required to observe N events
• The total time Z taken to observe N events is a sum of N inter-arrival time random
variables
• The uncertainty associated with this sum of random variables translates into a number
of events
• a number of events associated with the uncertainty in the total time taken to observe
the events
• The distribution of the inter-arrival times may be estimated from historical data
RELATING TIME AND
EVENTS
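[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times]
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
When the inter-arrival times are independent and drawn from a single distribution (IID), Z has
mean and variance
$E(Z) = N\mu_{\Delta t}$
$\mathrm{Var}(Z) = N\sigma_{\Delta t}^{2}$
where
$E(\Delta t_i) = \mu_{\Delta t}$
$\mathrm{Var}(\Delta t_i) = \sigma_{\Delta t}^{2}$
When the events are correlated, the variance of the sum of inter-arrival times also
features the covariance between each pair of random variables that compose the sum
$\mathrm{Var}(Z) = \sum_{i} \mathrm{Var}(\Delta t_i) + \sum_{i \neq j} \mathrm{Cov}(\Delta t_i, \Delta t_j)$
The uncertainty in Z translates into a number of events $\tilde{N}$
$\tilde{N} = \frac{\sqrt{\mathrm{Var}(Z)}}{\mu_{\Delta t}} = \frac{\sqrt{N\sigma_{\Delta t}^{2}}}{\mu_{\Delta t}}$
$\hat{N} = N \pm k\,\tilde{N}$
• to observe $\hat{N}$ events in a time interval of length Z
• scale the variance (or standard deviation) via the constant k
• $\tilde{N}$ is a measure of the degree of uncertainty in N, that is, a measure of its deviation
from the mean.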
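The IID relations above can be checked numerically. The following is a minimal sketch, assuming exponential inter-arrival times with mean 0.5 s (so variance 0.25) and N = 20 events; the function names, the seed and the trial count are illustrative choices and not part of the original slides.

```python
import math
import random
import statistics


def simulate_total_time(n_events, mean_dt, trials, seed=0):
    """Draw `trials` samples of Z, the sum of n_events exponential inter-arrival times."""
    rng = random.Random(seed)
    return [
        sum(rng.expovariate(1.0 / mean_dt) for _ in range(n_events))
        for _ in range(trials)
    ]


def uncertainty_in_events(n_events, var_dt, mean_dt):
    """Standard deviation of Z expressed as a number of events: sqrt(N * var) / mean."""
    return math.sqrt(n_events * var_dt) / mean_dt


N, MEAN_DT = 20, 0.5
VAR_DT = MEAN_DT ** 2  # variance of an exponential distribution with mean 0.5

z = simulate_total_time(N, MEAN_DT, trials=20000)
sample_mean = statistics.fmean(z)    # compare with N * mean = 10
sample_var = statistics.variance(z)  # compare with N * variance = 5
n_tilde = uncertainty_in_events(N, VAR_DT, MEAN_DT)
print(sample_mean, sample_var, n_tilde)
```

With these assumed parameters the sample mean and variance of Z should land close to 10 and 5, and the uncertainty expressed in events is sqrt(5) / 0.5, roughly 4.5 events.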
NOVEL STOCHASTIC MODEL OF AN
M/M/1 QUEUE SYSTEM
BY RELATING TIME AND EVENTS VIA A SUM OF INTER-ARRIVAL TIME RANDOM
VARIABLES
Birth – Death process model of an M/M/1
Queue System
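Deterministic approach –
• rates are deterministic – usually measured over an
interval of time
[Figure: birth-death state transition diagram with states n = 0, 1, 2, …, n-1, n and state probabilities P0, P1, P2, …, Pn-1, Pn; arrivals at rate λ move the system up one state, departures at rate µ move it down one state]
Balance equations: $\lambda P_{n-1} = \mu P_n$
Probability distribution of state: $P_n = (\lambda/\mu)^{n} P_0$
Use the sum to solve for $P_0$: $\sum_{n=0}^{N} P_n = 1$
Traffic intensity: $\rho = \lambda/\mu$
Probability of state: $P_n = \frac{\rho - 1}{\rho^{N+1} - 1}\,\rho^{n}$
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times]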
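The deterministic balance equations can be evaluated directly. The sketch below is a minimal illustration for a queue with a finite number of states N; the function names and the example rates (λ = 2, µ = 4, N = 3) are assumptions chosen for illustration.

```python
def state_probabilities(arrival_rate, service_rate, capacity):
    """Solve the balance equations: P_n = rho**n * P_0, with the P_n summing to 1."""
    rho = arrival_rate / service_rate
    weights = [rho ** n for n in range(capacity + 1)]  # P_n / P_0
    total = sum(weights)                               # equals 1 / P_0
    return [w / total for w in weights]


def closed_form(arrival_rate, service_rate, capacity, n):
    """Closed form (rho - 1) / (rho**(N+1) - 1) * rho**n, valid for rho != 1."""
    rho = arrival_rate / service_rate
    return (rho - 1.0) / (rho ** (capacity + 1) - 1.0) * rho ** n


probs = state_probabilities(arrival_rate=2.0, service_rate=4.0, capacity=3)
for n, p in enumerate(probs):
    print(n, round(p, 4), round(closed_form(2.0, 4.0, 3, n), 4))
```

Normalising the geometric weights and evaluating the closed form give the same probabilities, which is a quick consistency check on the formula for the probability of state.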
approach
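Deterministic Approach / Stochastic Approach
[Figure: birth-death state transition diagram with states n = 0, 1, 2, …, n-1, n, state probabilities P0, P1, P2, …, Pn-1, Pn, arrival rate λ and departure rate µ]
$\lambda_{n-1} P_{n-1} = \mu_n P_n$
$P_n = (\lambda_{n-1}/\mu_n)(\lambda_{n-2}/\mu_{n-1}) \cdots (\lambda_0/\mu_1)\,P_0$
$\sum_{n=0}^{N} P_n = 1$
$\rho_n = \lambda_{n-1}/\mu_n$
Instantaneous arrival and departure rates (superscript i marks an instantaneous quantity; A and D mark arrivals and departures):
$\lambda_i^{i} = 1/\Delta t_i^{A}$
$\mu_i^{i} = 1/\Delta t_i^{D}$
$\lambda_i = E\{\lambda_i^{i}\} = E\{1/\Delta t_i^{A}\}$
$\mu_i = E\{\mu_i^{i}\} = E\{1/\Delta t_i^{D}\}$
Instantaneous probability of state:
$P_n^{i} = (\Delta t_n^{D}/\Delta t_{n-1}^{A}) \cdots (\Delta t_1^{D}/\Delta t_0^{A})\,P_0$
Expected probability of state converges to the deterministic result:
$P_n = (\lambda_{n-1}/\mu_n)(\lambda_{n-2}/\mu_{n-1}) \cdots (\lambda_0/\mu_1)\,P_0$
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times]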
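The instantaneous probability of state is a running product of ratios of inter-departure to inter-arrival times. The sketch below is one hedged reading of that product: the function name and the list layout are assumptions, and P_0 is fixed by requiring the probabilities to sum to one, as in the deterministic case.

```python
def instantaneous_state_probabilities(dt_arrival, dt_departure):
    """Running product of dt_departure / dt_arrival ratios, normalised to sum to 1.

    dt_arrival[n - 1] holds the inter-arrival time with index n - 1 and
    dt_departure[n - 1] holds the inter-departure time with index n, for n = 1..N.
    """
    if len(dt_arrival) != len(dt_departure):
        raise ValueError("need one inter-departure time per inter-arrival time")
    weights = [1.0]  # P_0 / P_0
    for dt_a, dt_d in zip(dt_arrival, dt_departure):
        weights.append(weights[-1] * dt_d / dt_a)  # P_n / P_0
    total = sum(weights)
    return [w / total for w in weights]


# Constant intervals reduce to the deterministic case with rho = 0.25 / 0.5 = 0.5.
inst = instantaneous_state_probabilities([0.5, 0.5, 0.5], [0.25, 0.25, 0.25])
print([round(p, 4) for p in inst])
```

When every interval equals its mean, the ratios collapse to λ/µ and the result matches the deterministic state probabilities for the same traffic intensity.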
Probability of observing a particular sequence of
events
Let $Z = P(\Delta t_1, \Delta t_2, \ldots, \Delta t_N)$
[Figure: timeline of events E1, E2, …, EN-1, EN separated by inter-arrival times]
The probability of a sequence is the product of the individual probabilities of observing each
particular inter-arrival time
$P(Z) = \prod_{i=1}^{N} P(\Delta t_i)$
When the inter-arrival times are independent, the expectation of the product is the product
of the expectations
$E\{P(Z)\} = E\left\{\prod_{i=1}^{N} P(\Delta t_i)\right\} = \prod_{i=1}^{N} E\{P(\Delta t_i)\}$
This holds when inter-arrival times are independent, which is consistent with an M/M/1
scenario
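As a small illustration of the product rule for independent inter-arrival times, the sketch below evaluates the joint density of a sequence under an exponential distribution, which is the inter-arrival distribution of an M/M/1 arrival stream. The rate of 2 events per second and the sample intervals are hypothetical values.

```python
import math


def exponential_density(dt, rate):
    """Density of a single exponential inter-arrival time."""
    return rate * math.exp(-rate * dt)


def sequence_density(intervals, rate):
    """Joint density of independent inter-arrival times: product of the individual densities."""
    density = 1.0
    for dt in intervals:
        density *= exponential_density(dt, rate)
    return density


intervals = [0.4, 0.7, 0.2, 0.5]  # hypothetical observed inter-arrival times (s)
joint = sequence_density(intervals, rate=2.0)
closed = 2.0 ** len(intervals) * math.exp(-2.0 * sum(intervals))
print(joint, closed)
```

The product of the individual densities agrees with the closed form rate^N · exp(-rate · sum of intervals), which holds only because the intervals are treated as independent.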
ANOMALY DETECTION IN AN M/M/1
QUEUE SYSTEM
CHARACTERIZING PERFORMANCE OF A SOFTWARE COMPONENT
Anomaly Detection Scheme
• A system of components
• Each component a queue / server
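[Figure: a system of components Comp 1, Comp 2, Comp 3, …, Comp N]
• Component Load → Distribution of No of Messages in System
→ Arrivals – Departures in ∆T
[Figure: load trigger threshold; transition matrix M(I) with rows labelled State N and columns labelled State N+1, both indexed by Comp 1 … Comp N, with anomaly counts as entries]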
• Dispersion of anomaly across component sy
Estimating Load on a Software
Component
• Treat system as a network of components
• inter-arrival times help to characterize the performance best
• Model each component as queue – server system
• Queue – buffering messages into the component
• Server – processing all messages within a component
• No of messages in “system” (in queuing parlance)
• those waiting and in service – difference between arrivals and departures
• account for multiple queues within a component
--------------------------------------------------------------------------
• Common approach - threshold based alert system
• Thresholds commonly measure performance - at
• component level
• system level
• Typically Thresholds use – latencies, queue lengths,
Performance Measures - Software
Component
• Variation in the number of messages in “system” (in queuing parlance)
• Performance measures –
• Variance, Mean - of messages in the system
• Variance / Mean - of messages in the system
• Estimate Performance measure from the Distribution of
• no of messages
• Variance / Mean
• Threshold setting –
• detect an outlier
• a certain number of standard deviations from mean
• The time behavior of the distribution in the arrivals and departures will imp
→ envision time dependent thresholds
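The performance measures listed above can be computed from a sample of the number of messages in the system. This is a minimal sketch: the sample values, the function names and the choice of k = 3 standard deviations are assumptions for illustration only.

```python
import statistics


def load_measures(counts):
    """Mean, variance and variance-to-mean ratio of the number of messages in the system."""
    mean = statistics.fmean(counts)
    variance = statistics.variance(counts)
    return {"mean": mean, "variance": variance, "ratio": variance / mean}


def outlier_bounds(counts, k=3.0):
    """Lower and upper thresholds set k standard deviations from the mean."""
    mean = statistics.fmean(counts)
    std = statistics.stdev(counts)
    return mean - k * std, mean + k * std


history = [4, 6, 5, 7, 5, 6, 4, 5, 6, 5]  # hypothetical messages-in-system samples
measures = load_measures(history)
low, high = outlier_bounds(history, k=3.0)
print(measures, low, high)
```

A new observation outside the interval from `low` to `high` would be flagged as an outlier under this static rule; a time-dependent variant would recompute the bounds per time period.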
Characterizing Variation in the load
The number of events in the system at the end of a common time interval $\Delta T$ is the difference
between those that arrive and those that depart
Total number of arrivals in time interval $\Delta T$ is $N^{A}$:
$Z^{A} = \Delta T = \Delta t_1^{A} + \Delta t_2^{A} + \cdots + \Delta t_N^{A}$
Total number of departures in time interval $\Delta T$ is $N^{D}$:
$Z^{D} = \Delta T = \Delta t_1^{D} + \Delta t_2^{D} + \cdots + \Delta t_N^{D}$
Number of arrivals associated with the composition of $N^{A}$ events in time interval $\Delta T$:
$\tilde{N}^{A} = k^{A}\,\frac{\sqrt{\mathrm{Var}(Z^{A})}}{\mu_{\Delta t,A}} = k^{A}\,\frac{\sqrt{N^{A}\sigma_{A}^{2}}}{\mu_{\Delta t,A}}$
Number of departures associated with the composition of $N^{D}$ events in time interval $\Delta T$:
$\tilde{N}^{D} = k^{D}\,\frac{\sqrt{\mathrm{Var}(Z^{D})}}{\mu_{\Delta t,D}} = k^{D}\,\frac{\sqrt{N^{D}\sigma_{D}^{2}}}{\mu_{\Delta t,D}}$
Average number of events in the system at the end of time interval $\Delta T$:
$\bar{N} = E\{N^{A}\} - E\{N^{D}\}$
Variance in the number of events in the system at the end of time interval $\Delta T$:
$\mathrm{Var}\{N^{A} - N^{D}\} = \mathrm{Var}\{N^{A}\} + \mathrm{Var}\{N^{D}\}$
The variance arises from the individual uncertainties associated with
the individual random variables that compose the sum $\Delta T$
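The mean and variance relations for the number in the system can be explored with a small simulation. This sketch assumes two independent event streams with exponential inter-event times (means 0.5 s for arrivals and 0.8 s for departures) observed over a 10 s window; independence of the two streams is an assumption of the sketch, and all names and parameters are illustrative.

```python
import random
import statistics


def count_events(window, mean_dt, rng):
    """Count events whose cumulative inter-event time falls within the window."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.expovariate(1.0 / mean_dt)
        if elapsed > window:
            return count
        count += 1


def simulate_counts(window, mean_dt_arr, mean_dt_dep, trials, seed=1):
    """Simulate arrival and departure counts per window and their difference."""
    rng = random.Random(seed)
    arrivals = [count_events(window, mean_dt_arr, rng) for _ in range(trials)]
    departures = [count_events(window, mean_dt_dep, rng) for _ in range(trials)]
    net = [a - d for a, d in zip(arrivals, departures)]
    return arrivals, departures, net


arrivals, departures, net = simulate_counts(10.0, 0.5, 0.8, trials=5000)
mean_net = statistics.fmean(net)
var_net = statistics.variance(net)
var_sum = statistics.variance(arrivals) + statistics.variance(departures)
print(mean_net, var_net, var_sum)
```

Under these assumptions the mean difference should be close to 20 - 12.5 = 7.5, and the variance of the difference should be close to the sum of the two variances.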
Components
• Model the anomaly state (yes 1 / no 0) at each
component - interface
• Track anomalies across system and across
time via a transition matrix (M)
• Update transition matrix entries at each
change of state
• the difference between matrix M(I+1) and
M(I) will provide system state at M(I-1) and
also the
• The transition matrix gives insight in to how
[Figure: transition matrices M(I) and M(I+1), each with rows labelled State N and columns labelled State N+1, both indexed by Comp 1 … Comp N; one entry in the Comp 1 row increases from 1 in M(I) to 2 in M(I+1)]
• update when system state changes
• record anomaly on a link-component
RESULTS:
ANOMALY DETECTION IN AN M/M/1
QUEUE SYSTEM
TO CHARACTERIZE PERFORMANCE OF A SOFTWARE COMPONENT
Test Scenarios and Validation of model
Test Scenarios:
• Different offered load and service discipline
• Poisson arrivals (exponential inter-arrival time with independent increments)
• Exponential service time (independent increments)
Summary Results:
• Behavior of number in system
• Average number in system = difference in mean arrivals and departures
• Variance of number in system = sum of variances in arrivals and departures

Inter-Arrival Time (s)                        Scenario I   Scenario II   Scenario III
Arrivals - Mean Inter-Arrival Time            0.50         0.51          0.79
Arrivals - Variance Inter-Arrival Time        0.26         0.26          0.62
Departures - Mean Inter-Arrival Time          0.50         0.80          1.00
Departures - Variance Inter-Arrival Time      0.24         0.64          1.05

Number Over Window                            Scenario I   Scenario II   Scenario III
Mean Arrivals                                 19.93        19.79         12.67
Variance in Arrivals                          19.35        18.52         11.36
Mean Departures                               20.05        12.39         9.99
Variance in Departures                        18.71        10.98         10.41
Mean (Arrivals - Departures)                  -0.13        7.39          2.68
Variance (Arrivals - Departures)              37.57        29.45         21.38
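The two summary relations can be checked against the reported figures. The sketch below copies the values from the "Number Over Window" table and measures how far the reported mean and variance of the difference are from the difference of means and the sum of variances; the variable and function names are illustrative.

```python
# Columns: mean arrivals, variance in arrivals, mean departures,
# variance in departures, reported mean of difference, reported variance of difference.
RESULTS = {
    "I": (19.93, 19.35, 20.05, 18.71, -0.13, 37.57),
    "II": (19.79, 18.52, 12.39, 10.98, 7.39, 29.45),
    "III": (12.67, 11.36, 9.99, 10.41, 2.68, 21.38),
}


def check_scenario(mean_arr, var_arr, mean_dep, var_dep, mean_net, var_net):
    """Gaps between the reported values and the two summary relations."""
    return {
        "mean_gap": abs((mean_arr - mean_dep) - mean_net),
        "var_gap": abs((var_arr + var_dep) - var_net),
    }


checks = {name: check_scenario(*row) for name, row in RESULTS.items()}
for name, gaps in checks.items():
    print(name, round(gaps["mean_gap"], 2), round(gaps["var_gap"], 2))
```

The mean gaps are at the level of rounding, and the variance gaps stay below about half an event squared in all three scenarios.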
Arrivals / Departures Process
• Exponential service time with mean 0.5
seconds
• Distribution of number of arrivals in an
interval of 10s
• The number of arrivals equivalent to the sum
of a number of inter-arrival times
• which is a sum of random variables
• the sum converges to a normal distribution
Characterizing component load
• Use distribution of the average number of
events in the system to characterize the
load
• Variance in the number of events in
system
set thresholds to trigger at a probability level
Variation in the Variance
• Use cumulative distribution in the variance to
characterize the impact of variation in the
variance with window length
• Longer windows feature a larger number of
events – each event an inter-arrival time random
variable
• The uncertainty scales with the number of
random variables in the sum
• Longer intervals have larger uncertainty
associated with the composition of the time
interval –
• rightward shifting – flattening curves
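$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
$\tilde{N} = \frac{\sqrt{\mathrm{Var}(Z)}}{\mu_{\Delta t}} = \frac{\sqrt{N\sigma_{\Delta t}^{2}}}{\mu_{\Delta t}}$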
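A short worked example of how the uncertainty grows with window length, assuming IID inter-arrival times with mean 0.5 s and variance 0.25 (the exponential case; these values are assumptions for illustration). A 10 s window holds about N = 20 events and a 40 s window about N = 80:

```latex
\tilde{N}_{10\,\mathrm{s}} = \frac{\sqrt{20 \times 0.25}}{0.5} = \frac{\sqrt{5}}{0.5} \approx 4.47
\qquad
\tilde{N}_{40\,\mathrm{s}} = \frac{\sqrt{80 \times 0.25}}{0.5} = \frac{\sqrt{20}}{0.5} \approx 8.94
```

Quadrupling the window length doubles the uncertainty expressed in events, since it scales with the square root of N.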
IMPROVEMENTS AND FUTURE
WORK
MESSAGE SCHEDULING AND THRESHOLD OPTIMIZATION
SCHEDULING AND IMPACT ON
PERFORMANCE
• Introduce load balancing to intelligently route messages –
• Particularly in components with multiple queues
• Assign messages
• to queue with lowest load
• to queue that is most likely to process it fastest / most efficiently
• Characterizing
• processing time of messages as a function of
• Type of messages – and expected processing time
• messages in the queue …
• Model inter-arrival times – on a per queue basis –
• see appendix: on relating events and time taken to observe them
• Account for time dependence of statistics
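The two assignment rules proposed above can be sketched as follows. Both functions are hypothetical illustrations: the first routes to the queue with the fewest waiting messages, the second to the queue with the smallest expected completion time, taken here as (queue length + 1) × expected processing time per message.

```python
def lowest_load_queue(queue_lengths):
    """Index of the queue with the fewest messages waiting."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)


def fastest_queue(queue_lengths, expected_processing_time):
    """Index of the queue expected to finish a new message soonest."""
    completion = [
        (length + 1) * service
        for length, service in zip(queue_lengths, expected_processing_time)
    ]
    return min(range(len(completion)), key=completion.__getitem__)


lengths = [3, 1, 4]            # hypothetical messages waiting per queue
processing = [0.2, 1.0, 0.1]   # hypothetical expected processing time per message (s)
print(lowest_load_queue(lengths), fastest_queue(lengths, processing))
```

The two rules can disagree: here the shortest queue is index 1, while the queue expected to process the message fastest is index 2 because its per-message processing time is much smaller.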
ALERT THRESHOLDS
OPTIMIZATION
• Critical Stats guide uses a fixed set of thresholds
• Consider component load stat – use variation of number of messages in
• stat – based on existing / recorded measurements
• Performance at component level –
• irrespective of input conditions
• based on maximum design spec of component
• depending on input conditions – traffic / trading / time dependent
• set thresholds to account for behavior that also depends on
• expected / normal traffic
• Determine threshold values based on Normal / Abnormal behavior
• amount of load that is historically observed
• Consider time based thresholds –
• if feasible – as offered load is time varying
• Tune anomaly threshold – based on time varying load
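A time-based threshold could be derived from historical load grouped by time period. The following is a minimal sketch under assumed inputs: the period labels, the sample counts and the factor k = 2 are hypothetical, and the population standard deviation is used for simplicity.

```python
import statistics
from collections import defaultdict


def thresholds_by_period(samples, k=3.0):
    """Map each period to mean + k * standard deviation of its historical counts."""
    grouped = defaultdict(list)
    for period, count in samples:
        grouped[period].append(count)
    return {
        period: statistics.fmean(counts) + k * statistics.pstdev(counts)
        for period, counts in grouped.items()
    }


history = [
    ("peak", 18), ("peak", 22), ("peak", 20),
    ("off_peak", 4), ("off_peak", 6), ("off_peak", 5),
]
thresholds = thresholds_by_period(history, k=2.0)
print(thresholds)
```

With this scheme a load of 10 messages would trigger an alert in the off-peak period but not in the peak period, which is the behavior a time-varying offered load calls for.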
THANK YOU
APPENDIX
EVENT (INTER) ARRIVAL
TIME PROCESS
EVENT INTER –
ARRIVAL TIME
• Introduce a Feature to characterize the “Time property” in the Event based Model
• Each Event has a time stamp and between Events – an Event “Inter - Arrival time”
• Modeling this “time interval” will give insights in to “Time Patterns” of the Events in
characterizing Trading behavior
• Natural to consider basic statistics related to Inter- Arrival Time
• Descriptive Statistics – means, variances, Higher Order Statistics
• But they don’t necessarily capture the characteristics in the pattern of Event
Inter - Arrival Times
• Also fitting Distributions and estimating their characteristics may not be very viable /
reliable
• Data dependent, too little data to estimate, degree of fit issues
[Figure: sequence of events E1, E2, E3, E4, …, EN-1, EN with event types A, B, C, B, …, A, C and inter-arrival times t1, t2, t3, …, tN-1 between consecutive events]
MODELING THE - EVENT INTER-
ARRIVAL TIME
• This Time Series captures the time patterns in the placing of Market Orders and Trading
Event
• We characterize and quantify these patterns through Statistical Analysis that captures its
important properties
• The Randomness in the Event Inter - Arrival times – via Entropy
• Autocorrelation – measures degree of correlation between samples of inter arrival
times
[Figure: sequence of events E1, E2, E3, E4, …, EN-1, EN with event types A, B, C, B, …, A, C and inter-arrival times t1, t2, t3, …, tN-1; below it, the time series of event inter-arrival times plotted against sample number 1, 2, 3, …, N-1]
ti - Event inter-arrival time
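The two statistics named above can be sketched for a series of event time stamps. The slides do not specify the estimators, so this is only one possible reading: entropy is estimated from a fixed-width histogram and autocorrelation from the sample autocovariance at a given lag. The time stamps, the bin width and the function names are hypothetical.

```python
import math
from collections import Counter


def inter_arrival_times(timestamps):
    """Differences between consecutive event time stamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def histogram_entropy(values, bin_width):
    """Shannon entropy in bits of the values after binning to a fixed width."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


def autocorrelation(values, lag=1):
    """Sample autocorrelation of the series at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / var


timestamps = [0.0, 1.0, 3.0, 4.0, 6.0, 7.0, 9.0]  # hypothetical event times (s)
dts = inter_arrival_times(timestamps)
entropy = histogram_entropy(dts, bin_width=1.0)
acf1 = autocorrelation(dts, lag=1)
print(dts, entropy, acf1)
```

For this alternating pattern of short and long gaps the entropy is 1 bit (two equally likely bins) and the lag-1 autocorrelation is strongly negative, which a mean and variance alone would not reveal.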
A DISTRIBUTION INDEPENDENT OF
MEASUREMENT (TIME) WINDOW
• Observe the distribution of the time between each pair of events
• call it the event inter arrival time
• The distribution of this quantity does not change, as it is not dependent on a
window of measurement.
• purely a function of the event arrival (generative) process
• the process will depend on the particular quantity (orders, trades, etc.) we are
observing
• The underlying distribution however is fixed for a particular data set
RELATING NUMBER OF EVENTS OBSERVED TO
INTERVALS OF TIME
[Figure: timeline of events E1, E2, E3, …, EN-1, EN indexed by event number and separated by inter-arrival times]
$Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
Let Z be the sum of N IID random variables drawn from the distribution of the
inter-arrival time $\Delta t$
Let the mean and variance of the distribution of the inter-arrival time
be
$E(\Delta t) = \mu_{\Delta t}$, $\mathrm{Var}(\Delta t) = \sigma_{\Delta t}^{2}$
$E(Z) = N\mu_{\Delta t}$
$\mathrm{Var}(Z) = N\sigma_{\Delta t}^{2}$
For large N
Z is a random variable and is the time taken to observe N events.
Its expected value (average) is E(Z)
A measure of the uncertainty in Z (about its mean) is its standard deviation
RELATING NUMBER OF EVENTS TO
INTERVALS OF TIME
[Figure: timeline of events E1, E2, E3, …, EN-1, EN indexed by event number and separated by inter-arrival times]
• The uncertainty in Z can be translated into an average number of events
• The total time and the total number of IID events observed in that time are
related probabilistically via the distribution of the inter-arrival time
• So we may estimate an average number of events $\tilde{N}$ associated with this uncertainty
$\sigma_{Z}^{2} = N\sigma_{\Delta t}^{2}$
$\tilde{N} = \frac{\sigma_{Z}}{\mu_{\Delta t}} = \frac{\sqrt{N\sigma_{\Delta t}^{2}}}{\mu_{\Delta t}}$
Thus we may set a threshold “T” for the number of events observed in an interval of
length $E(Z) = N\mu_{\Delta t}$ to detect outliers
$T > N + a\ factor \times \tilde{N}$
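The outlier threshold can be sketched in a few lines. The parameter values below (N = 20 events, mean inter-arrival time 0.5 s, variance 0.25, factor 3) are assumptions for illustration, and the function names are hypothetical.

```python
import math


def event_threshold(n_events, mean_dt, var_dt, factor):
    """Threshold N + factor * sqrt(N * var) / mean on the number of events."""
    n_tilde = math.sqrt(n_events * var_dt) / mean_dt
    return n_events + factor * n_tilde


def exceeds_threshold(observed, n_events, mean_dt, var_dt, factor=3.0):
    """True when the observed count in an interval of length N * mean exceeds the threshold."""
    return observed > event_threshold(n_events, mean_dt, var_dt, factor)


threshold = event_threshold(n_events=20, mean_dt=0.5, var_dt=0.25, factor=3.0)
print(threshold, exceeds_threshold(35, 20, 0.5, 0.25), exceeds_threshold(30, 20, 0.5, 0.25))
```

Here the interval length is 20 × 0.5 = 10 s and the threshold is about 33.4 events, so a count of 35 would be flagged while a count of 30 would not.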
Improving predictability and performance by relating the number of events and the time over which to observe them

  • 1. RELATING THE TIME REQUIRED TO OBSERVE A CERTAIN NUMBER OF EVENTS Asoka Korale, Ph.D. C.Eng. MIESL
  • 2. MOTIVATIONS FOR RELATING TIME AND EVENTS
  • 3. APPLICATIONS OF RELATING TIME AND EVENTS Call CentersTraffic Management Transportation and Logistics Packet Switching Production Scheduling Forecasting / Relating Time based Ev
  • 4. INSIGHTS FROM RELATING TIME AND EVENTS• Relate an interval of observation to a sum of inter-arrival time random variables • Relate the interval of observation to • the total number of events observed in the interval • the uncertainty associated with the average number of events in the interval • the sum of the number of inter-arrival time intervals that compose the interval • Establish a probabilistic relationship for the time taken to observe a number of events • Relate the uncertainty in the interval of observation to a number of events
  • 5. NOVEL STOCHASTIC RELATIONSHIP BETWEEN TIME AND EVENTS RELATE TIME TAKEN TO OBSERVE A CERTAIN NUMBER OF EVENTS
  • 6. UNCERTAINTY ASSOCIATED WITH EVENTS OVER TIME E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁 𝑍 = ∆𝑡1 + ∆𝑡2 + ⋯ + ∆𝑡 𝑁  total time (Z) to observe a number of events (N) is a sum of a similar number of inter-arrival time – time intervals  each inter-arrival time – time interval a random variable (∆𝑡i)  total uncertainty in the time interval (Z) a reflection of the uncertainty associated with each individual random variable (∆𝑡i)  the dependence between random variables impacts the total uncertainty associated with the sum  total uncertainty in the interval (Z) – leads to the variance in the number of events observed in such an interval time interval Z to observe N events inter-arrival time random variables distribution of inter-arrival times events
  • 7. A SUM OF INTER-ARRIVAL TIME RANDOM VARIABLES E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁 𝑍 = ∆𝑡1 + ∆𝑡2 + ⋯ + ∆𝑡 𝑁 • Each event inter-arrival time ∆𝑡i is a random variable • each such random variable has associated with it a certain uncertainty • An N number of inter-arrival time random variables are required to observe an equivalent number of events • The total time Z taken to observe N events is a sum of N inter-arrival time random variables • The uncertainty associated with this sum of random variables – translates in to a number of events • a number of events associated with the uncertainty in the total time taken to observe the events • The distribution of the inter-arrival times may be estimated from historical data
  • 8. RELATING TIME AND EVENTS E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁 𝑍 = ∆𝑡1 + ∆𝑡2 + ⋯ + ∆𝑡 𝑁 when the inter-arrival times are drawn from a single distribution and are independent (IID), Z has mean and variance E(Z) = 𝑁𝜇∆𝑡 Var Z = 𝑁𝜎∆𝑡 2 E(∆𝑡𝑖) = 𝜇∆𝑡 Var ∆𝑡𝑖 = 𝜎∆𝑡 2 when the events are correlated the variance of the sum of a number of inter-arrival times will feature the covariance between each pair of random variables that compose the sum 𝑉𝑎𝑟 𝑍 = ∀𝑖 𝑉𝑎𝑟(∆𝑡𝑖) + ∀𝑖,𝑗 𝑖≠𝑗 𝐶𝑜𝑣(∆𝑡𝑖∆𝑡𝑗) 𝑁 = 𝑉𝑎𝑟(𝑍) 𝜇∆𝑡 = 𝑁𝜎∆𝑡 2 𝜇∆𝑡 𝑁 = 𝑁 ± 𝑘 ∗ 𝑁 • to observe 𝑁 number of events in a time interval of length Z • scale the variance (or standard deviation) via constant k • a measure of the degree of the uncertainty in N - a measure of its deviation from the mean. where where
  • 9. NOVEL STOCHASTIC MODEL OF AN M/M/1 QUEUE SYSTEM BY RELATING TIME AND EVENTS VIA A SUM OF INTER-ARRIVAL TIME RANDOM VARIABLES
  • 10. Birth – Death process model of an M/M/1 Queue System Deterministic approach – • rates are deterministic – usually measured over an interval of time λ > n=0 Po < µ λ > λ > < µ < µ n=n Pn n= n-1 Pn-1 n=1 P1 n=2 P2 λ > < µ λ𝑃𝑛−1 = μ𝑃𝑛 𝑃𝑛 = (λ/μ) 𝑛 𝑃0 𝑛=0 𝑁 𝑃𝑛 = 1ρ = λ/μ 𝑃𝑛 = 𝜌 − 1 [ 𝜌 𝑁+1 − 1]ρ 𝑛 balance equations traffic intensity probability distribution of state use sum to solve for Po probability of state E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁
  • 11. approach Deterministic Approach Stochastic Approach λ > n=0 Po < µ λ > λ > < µ < µ n=n Pnn= n-1 Pn-1 n=1 P1 n=2 P2 λ > < µ λ 𝑛−1 𝑃𝑛−1 = 𝜇 𝑛 𝑃𝑛 𝑃𝑛 = λ 𝑛−1/μ 𝑛 λ 𝑛−2/μ 𝑛−1 … (λ 𝑜/μ1)𝑃0 𝑛=0 𝑁 𝑃𝑛 = 1 ρ 𝑛 = λ 𝑛−1/μ 𝑛 λ𝑖 𝑖 = 1/∆𝑡𝑖 𝐴 𝜇𝑖 𝑖 = 1/∆𝑡𝑖 𝐷 λ𝑖 = 𝐸{λ𝑖 𝑖 } = 𝐸{1/∆𝑡𝑖 𝐴 } 𝜇𝑖 = 𝐸{𝜇𝑖 𝑖 } = 𝐸{1/∆𝑡𝑖 𝐷 } instantaneous arrivals and departure rates 𝑃𝑛 𝑖 = ∆ 𝑡 𝑛 𝐷 ∆ 𝑡 𝑛−1 𝐴 . . . (∆ 𝑡1 𝐷 ∆ 𝑡0 𝐴 )𝑃0 𝑃𝑛 = λ 𝑛−1/μ 𝑛 λ 𝑛−2/μ 𝑛−1 … (λ 𝑜/μ1)𝑃0 expected probability of state converges to deterministic result instantaneous probability of state E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁
  • 12. Probability of observing a particular sequence of events when inter-arrival times are independent the expectation of the product it the product of the expectations Let Z = 𝑃(∆𝑡1, ∆𝑡2, … , ∆𝑡 𝑁) E1 E2 EN-1 EN ∆𝑡1 ∆𝑡2 ∆𝑡 𝑁 𝑃 𝑍 = 𝑖=1 𝑁 𝑃(∆𝑡𝑖) 𝐸 𝑃 𝑍 = 𝐸 𝑖=1 𝑁 )𝑃(∆𝑡𝑖 = 𝑖=1 𝑁 }𝐸{𝑃(∆𝑡𝑖) probability of a sequence is the product of the individual probabilities of observing a particular inter-arrival time when inter-arrival times are independent – consistent with an M/M/1 scenario
  • 13. ANOMALY DETECTION IN AN M/M/1 QUEUE SYSTEM CHARACTERIZING PERFORMANCE OF A SOFTWARE COMPONENT
  • 14. Anomaly Detection Scheme • A system of components • Each component a queue / server Comp 1 Comp 2 Comp 3 Comp N • Component Load  Distribution of No of Messages in System  Arrivals – Departures in ∆𝑇 load trigger threshold M (I) State N+1 Comp 1 Comp 2 Comp 3 Comp N Comp 1 1 1 State N Comp 2 Comp • Dispersion of anomaly across component sy
  • 15. Estimating Load on a Software Component • Treat system as a network of components • inter-arrival times help to characterize the performance best • Model each component as queue – server system • Queue – buffering messages into the component • Server – processing all messages within a component • No of messages in “system” (in queuing parlance) • those waiting and in service – difference between arrivals and departures • account for multiple queues within a component -------------------------------------------------------------------------- • Common approach - threshold based alert system • Thresholds commonly measure performance - at • component level • system level • Typically Thresholds use – latencies, queue lengths,
  • 16. Performance Measures - Software Component • Variation in the number of messages in “system” (in queuing parlance) • Performance measures – • Variance, Mean - of messages in the system • Variance / Mean - of messages in the system • Estimate Performance measure from the Distribution of • no of messages • Variance / Mean • Threshold setting – • detect an outlier • a certain number of standard deviations from mean • The time behavior of the distribution in the arrivals and departures will imp  envision time dependent thresholds
  • 17. Characterizing Variation in the load 𝑍 𝐴 = ∆𝑇 = ∆𝑡1 𝐴 + ∆𝑡2 𝐴 + ⋯ + ∆𝑡 𝑁 𝐴 𝑍 𝐷 = ∆𝑇 = ∆𝑡1 𝐷 + ∆𝑡2 𝐷 + ⋯ + ∆𝑡 𝑁 𝐷 𝑁 𝐴 = 𝑘 𝐴 𝑉𝑎𝑟(𝑍 𝐴) 𝜇∆𝑡,𝐴 = 𝑘 𝐴 𝑁 𝐴 𝜎 𝐴 2 𝜇∆𝑡,𝐴 𝑁 𝐷 = 𝑘 𝐷 𝑉𝑎𝑟(𝑍 𝐷) 𝜇∆𝑡,𝐷 = 𝑘 𝐷 𝑁 𝐷 𝜎 𝐷 2 𝜇∆𝑡,𝐷 𝑁 = 𝐸{𝑁 𝐴 } − 𝐸{𝑁 𝐷 } 𝑉𝑎𝑟{𝑁 𝐴 − 𝑁 𝐷 } = 𝑉𝑎𝑟{𝑁 𝐴 } + Var{𝑁 𝐷 } No of events in the system at the end of a common time interval ∆𝑇 is the difference between those that arrive and those that depart total number of arrivals in time interval ∆𝑇 is 𝑁 𝐴 total number of arrivals in time interval ∆𝑇 is 𝑁 𝐷 number of arrivals associated with the composition of 𝑁 𝐴 events in time interval ∆𝑇 number of departures associated with the composition of 𝑁 𝐷 events in time interval ∆𝑇 average number of events in the system at the end of time interval ∆𝑇 variance in the number of events in the system at the end of time interval ∆𝑇 The variance arises due to the contribution of the individual uncertainties associated with the individual random variables that compose the sum ∆𝑇
  • 18. Components • Model the anomaly state (yes 1 / no 0) at each component - interface • Track anomalies across system and across time via a transition matrix (M) • Update transition matrix entries at each change of state • the difference between matrix M(I+1) and M(I) will provide system state at M(I-1) and also the • The transition matrix gives insight in to how M (I) State N+1 Comp 1 Comp 2 Comp 3 Comp N Comp 1 1 1 State N Comp 2 Comp 3 1 Comp N Comp 1 Comp 2 Comp 3 Comp N M (I+1) State N+1 Comp 1 Comp 2 Comp 3 Comp N Comp 1 2 1 State N Comp 2 Comp 3 1 update when system state changes record anomaly on a link-component
  • 19. RESULTS: ANOMALY DETECTION IN AN M/M/1 QUEUE SYSTEM TO CHARACTERIZE PERFORMANCE OF A SOFTWARE COMPONENT
  • 20. Test Scenarios and Validation of model Test Scenarios: • Different offered load and service discipline • Poisson arrivals (exponential service time with independent increments) Exponential service time (independent increments) Summary Results: • Behavior of number in system • Average number in system = difference in mean arrivals and departures • Variance of number in system = sum of variances in arrivals and departures Inter-Arrival Time (s) Scenario I Scenario II Scenario III Arrivals - Mean Inter-Arrival Time 0.50 0.51 0.79 Arrivals - Variance Inter-Arrival Time 0.26 0.26 0.62 Departures - Mean Inter-Arrival Time 0.50 0.80 1.00 Departures - Variance Inter-Arrival Time 0.24 0.64 1.05 Number Over Window Scenario I Scenario II Scenario III Mean Arrivals 19.93 19.79 12.67 Variance in Arrivals 19.35 18.52 11.36 Mean Departures 20.05 12.39 9.99 Variance in Departures 18.71 10.98 10.41 Mean (Arrivals - Departures) -0.13 7.39 2.68 Variance (Arrivals - 37.57 29.45 21.38
  • 21. Arrivals / Departures Process • Exponential service time with mean 0.5 seconds • Distribution of number of arrivals in an interval of 10s • The number of arrivals equivalent to the sum of a number of inter-arrival times • which is a sum of random variables • the sum converges to a normal
  • 22. Characterizing component load • Use distribution of the average number of events in the system into characterize the load • Variance in the number of events in system set thresholds to trigger at a probability level
  • 23. Variation in the Variance • Use cumulative distribution in the variance to characterize the impact of variation in the variance with window length • Longer windows feature a larger number of events – each event a inter-arrival time random variable • The uncertainty scales with the number of random variables in the sum • Longer intervals have larger uncertainty associated with the composition of the time interval – • rightward shifting – flattening curves 𝑍 = ∆𝑡1 + ∆𝑡2 + ⋯ + ∆𝑡 𝑁 𝑁 = 𝑉𝑎𝑟(𝑍) 𝜇∆𝑡 = 𝑁𝜎∆𝑡 2 𝜇∆𝑡
  • 24. IMPROVEMENTS AND FUTURE WORK: MESSAGE SCHEDULING AND THRESHOLD OPTIMIZATION
  • 25. SCHEDULING AND IMPACT ON PERFORMANCE
    • Introduce load balancing to route messages intelligently
      • particularly in components with multiple queues
    • Assign messages
      • to the queue with the lowest load
      • to the queue most likely to process them fastest / most efficiently
    • Characterize the processing time of messages as a function of
      • the type of message and its expected processing time
      • the messages in the queue …
    • Model inter-arrival times on a per-queue basis
      • see appendix on relating events and the time taken to observe them
    • Account for the time dependence of the statistics
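The routing idea on this slide could be sketched as follows. This is a hypothetical illustration, not the deck's implementation: the queue names, message types, and expected processing times are invented, and "load" is taken to be the summed expected processing time of the messages already waiting.

```python
def pick_queue(queues, expected_time):
    """Return the name of the queue with the smallest expected backlog.

    queues: dict mapping queue name -> list of waiting message types
    expected_time: dict mapping message type -> expected processing time (s)
    """
    def backlog(name):
        # Expected time to clear everything already waiting in this queue.
        return sum(expected_time[msg] for msg in queues[name])
    return min(queues, key=backlog)

# Hypothetical example: q1 holds more messages, but they are cheap to process.
queues = {"q1": ["A", "A"], "q2": ["B"]}
expected_time = {"A": 0.5, "B": 2.0}
chosen = pick_queue(queues, expected_time)
print(chosen)
```

Weighting by expected processing time, as opposed to counting messages, is what lets the router prefer a longer queue of cheap messages over a shorter queue of expensive ones.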
  • 26. ALERT THRESHOLDS OPTIMIZATION
    • The Critical Stats guide uses a fixed set of thresholds
    • Consider the component load stat: use the variation in the number of messages in the stat, based on existing / recorded measurements
    • Performance at component level
      • irrespective of input conditions: based on the maximum design spec of the component
      • depending on input conditions: traffic / trading / time dependent
      • set thresholds to account for behavior that also depends on expected / normal traffic
    • Determine threshold values based on normal / abnormal behavior
      • the amount of load that is historically observed
    • Consider time-based thresholds, if feasible, as the offered load is time varying
    • Tune the anomaly threshold based on the time-varying load
  • 29. EVENT INTER-ARRIVAL TIME
    • Introduce a feature to characterize the “time property” in the event-based model
    • Each event has a time stamp, and between consecutive events there is an event “inter-arrival time”
    • Modeling this time interval gives insight into the “time patterns” of the events when characterizing trading behavior
    • It is natural to consider basic statistics of the inter-arrival time
      • descriptive statistics: means, variances, higher-order statistics
      • but these do not necessarily capture the characteristics of the pattern of event inter-arrival times
    • Fitting distributions and estimating their characteristics may also not be viable / reliable
      • data dependent, too little data to estimate, degree-of-fit issues
    [Figure: sequence of events E1, E2, E3, E4, …, EN-1, EN, labeled by event type (A, B, C, B, …, A, C) and event number, separated by inter-arrival times t1, t2, t3, …, tN-1]
  • 30. MODELING THE EVENT INTER-ARRIVAL TIME
    • This time series captures the time patterns in the placing of market orders and trading events
    • We characterize and quantify these patterns through statistical analysis that captures their important properties
      • the randomness in the event inter-arrival times, via entropy
      • autocorrelation, which measures the degree of correlation between samples of inter-arrival times
    [Figure: events E1 … EN (by event type and event number) mapped to the time series of event inter-arrival times t1, t2, t3, …, tN-1, indexed by sample number 1 … N-1; ti is the i-th event inter-arrival time]
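The two statistics named on this slide can be sketched in a few lines. The slide does not say how entropy is estimated, so the histogram-based Shannon entropy below, its bin width, and all function names are assumptions made for illustration. The timestamps are invented.

```python
import math
from collections import Counter

def inter_arrival_times(timestamps):
    """Differences between consecutive event time stamps."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def shannon_entropy(values, bin_width):
    """Entropy (bits) of the values after binning them into fixed-width bins."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())

def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag (undefined for a constant series)."""
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / total

# Hypothetical time stamps with strictly alternating gaps of 1 and 2.
gaps = inter_arrival_times([0, 1, 3, 4, 6, 7, 9])
entropy_bits = shannon_entropy(gaps, bin_width=1)
acf_lag1 = autocorrelation(gaps, 1)
acf_lag2 = autocorrelation(gaps, 2)
print(gaps, entropy_bits, acf_lag1, acf_lag2)
```

In this toy series the entropy is low (only two gap values occur) and the autocorrelation alternates in sign with the lag, which is the kind of timing pattern that means and variances alone would not reveal.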
  • 31. A DISTRIBUTION INDEPENDENT OF MEASUREMENT (TIME) WINDOW
    • Observe the distribution of the time between each pair of consecutive events
      • call it the event inter-arrival time
    • The distribution of this quantity does not change, as it does not depend on a window of measurement
      • it is purely a function of the event arrival (generative) process
      • the process depends on the particular quantity (orders, trades, etc.) being observed
    • The underlying distribution, however, is fixed for a particular data set
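  • 32. RELATING NUMBER OF EVENTS OBSERVED TO INTERVALS OF TIME
    [Figure: events E1, E2, E3, …, EN-1, EN by event number, separated by inter-arrival times $\Delta t_1, \Delta t_2, \Delta t_3, \ldots, \Delta t_N$]
    • Let the mean and variance of the distribution of the inter-arrival time $\Delta t$ be $E(\Delta t) = \mu_{\Delta t}$ and $\mathrm{Var}(\Delta t) = \sigma_{\Delta t}^2$
    • Let $Z$ be the sum of $N$ IID random variables drawn from the distribution of the inter-arrival time: $Z = \Delta t_1 + \Delta t_2 + \cdots + \Delta t_N$
    • For large $N$: $E(Z) = N\mu_{\Delta t}$ and $\mathrm{Var}(Z) = N\sigma_{\Delta t}^2$
    • $Z$ is a random variable: the time taken to observe $N$ events. Its expected value (average) is $E(Z)$
    • A measure of the uncertainty in $Z$ (about its mean) is its standard deviation, $\sigma_z = \sqrt{N}\,\sigma_{\Delta t}$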
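As a worked illustration of the relations on this slide, take the Scenario I arrival figures from slide 20 (mean inter-arrival time 0.50 s, variance 0.26 s²). The choice of N = 20 events is an assumption made here for illustration only.

```latex
\begin{align*}
E(Z) &= N\mu_{\Delta t} = 20 \times 0.50 = 10\ \text{s} \\
\mathrm{Var}(Z) &= N\sigma_{\Delta t}^2 = 20 \times 0.26 = 5.2\ \text{s}^2 \\
\sigma_z &= \sqrt{\mathrm{Var}(Z)} = \sqrt{5.2} \approx 2.28\ \text{s}
\end{align*}
```

So observing 20 events takes 10 s on average, with an uncertainty of roughly 2.3 s about that mean.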
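  • 33. RELATING NUMBER OF EVENTS TO INTERVALS OF TIME
    [Figure: events E1, E2, E3, …, EN-1, EN by event number, separated by inter-arrival times $\Delta t_1, \Delta t_2, \Delta t_3, \ldots, \Delta t_N$]
    • The uncertainty in $Z$ can be translated into an average number of events
      • the total time and the total number of IID events observed in that time are related probabilistically via the distribution of the inter-arrival time
      • so we may estimate an average number of events associated with this uncertainty
    • $E(Z) = N\mu_{\Delta t}$ and $\sigma_z^2 = N\sigma_{\Delta t}^2$
    • $\hat{N} = \sigma_z / \mu_{\Delta t} = \sqrt{N\sigma_{\Delta t}^2} / \mu_{\Delta t}$, where $\hat{N}$ denotes the number of events equivalent to the uncertainty $\sigma_z$
    • Thus we may set a threshold “T” on the number of events observed in an interval of length $E(Z) = N\mu_{\Delta t}$ to detect outliers: $T > N + (\text{a factor}) \times \hat{N}$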
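The threshold on this slide can be computed as in the sketch below. The function name, the variable n_hat, and the factor of 3 are assumptions made for illustration; the slide leaves the factor unspecified. The inputs are the Scenario I arrival figures from slide 20 with N = 20 events (a 10 s interval).

```python
import math

def event_count_threshold(n_events, mean_gap, var_gap, factor):
    """Outlier threshold on the event count in an interval of length n_events * mean_gap."""
    sigma_z = math.sqrt(n_events * var_gap)  # std dev of the time to observe n_events
    n_hat = sigma_z / mean_gap               # that uncertainty expressed as a number of events
    return n_events + factor * n_hat

# N = 20, mean gap 0.50 s, gap variance 0.26 s^2, factor 3 (assumed).
threshold = event_count_threshold(20, 0.50, 0.26, 3.0)
print(f"flag the interval if more than {threshold:.1f} events are observed")
```

A larger factor makes the detector more conservative; the appropriate value depends on how many false alerts are acceptable.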

Editor's Notes

  • #11, #12, #13, #18, #21, #22, #23, #24: The element by element difference of the prices provides insights into the underlying random processes …