Threads concurrency identifying performance deviations in thread pools(1)

Identifying Performance Deviations
in Thread Pools
Mark D. Syer, Bram Adams and Ahmed E. Hassan
(Software Analysis and Intelligence Lab (SAIL) School of Computing, Queen's University, Canada)

Outline
● Introduction
● Thread Pools
● Methodology
● Case study
● Results
● Evaluation
● Conclusion

Introduction
● Ultra-Large-Scale (ULS) systems need high
concurrency and speed.
● Analysing system performance is difficult
○ Time Consuming
○ Significant manual review of the data and logs
○ Lack of tool support, need heavy instrumentation
○ Data gathered from hardware sensors is hard to interpret

Main contributions
To analyze performance deviations in systems
designed using thread pools:
1. A top-down methodology for identifying and ranking
the most deviating thread behaviour
2. A qualitative and quantitative evaluation of the proposed
methodology on a large-scale industrial ULS system

Thread Pools
● Advantages
○ Avoids thread create/destroy overhead
○ System becomes more responsive
● Difficulties
○ Too many threads → resource thrashing →
performance degradation
○ Hard to configure and test
○ Synchronization errors → idle threads/deadlocks
○ Thread leakage
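A minimal sketch of the thread-pool idea, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The task function `handle_request` and the pool size of 4 are illustrative assumptions, not part of the system studied in the slides.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def handle_request(request_id):
    # Hypothetical task: record which pooled worker thread served the request.
    return request_id, threading.current_thread().name

# A fixed pool of 4 workers is reused for all 20 tasks, so no thread is
# created or destroyed per task. The pool size is the hard-to-tune knob.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="pool") as pool:
    results = list(pool.map(handle_request, range(20)))

workers_used = {name for _, name in results}
print(len(results), "tasks served by", len(workers_used), "worker thread(s)")
```

Setting `max_workers` too high is one way to reach the "too many threads" difficulty listed above; the right value depends on the workload.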

Motivational Example
● Analyse the system under a 5x larger workload
● Machine-level macro threads for each thread
pool
● Identify majority and deviating behaviours
using clusters in a dendrogram

Methodology (cont.)
A. Performance Data
a. Resource usage metrics of the pooled resources,
e.g. CPU, memory, number of open files
b. Choosing resource metrics: weigh accuracy, the overhead of
performance monitoring, and data redundancy

Methodology (cont.)
B. Metric Abstraction
a. Group threads into higher-level (macro) abstractions
by space or time,
e.g. in a cluster of machines, all pooled threads
executing on one node are aggregated into one macro thread
b. Identify the deviations at the higher level
c. Repeat the methodology for deviations at the lower level
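The grouping step can be sketched as follows. The sample rows, the node names, and the choice of summation as the aggregation function are all assumptions made for illustration; the slides do not say how metrics are combined.

```python
from collections import defaultdict

# Hypothetical samples: (node, thread_id, cpu_percent, memory_mb).
samples = [
    ("node-1", "t1", 10.0, 100.0),
    ("node-1", "t2", 20.0, 150.0),
    ("node-2", "t3", 5.0, 80.0),
    ("node-2", "t4", 15.0, 120.0),
    ("node-2", "t5", 40.0, 300.0),
]

def aggregate_by_node(rows):
    """Sum the metrics of all pooled threads on a node into one macro thread.

    ASSUMPTION: summation is used here; another aggregate may be appropriate.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for node, _thread_id, cpu, mem in rows:
        totals[node][0] += cpu
        totals[node][1] += mem
    return {node: tuple(vals) for node, vals in totals.items()}

macro_threads = aggregate_by_node(samples)
print(macro_threads)
```

Deviations would first be looked for among the macro threads, and only the threads inside a deviating macro thread examined afterwards.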

Methodology (cont.)
C. Distance Calculation Between Covariance
Matrices
Gives the level of dissimilarity, or distance, between two
abstractions
Remember your Maths?

Methodology (cont.)
Covariance matrix
● Off-diagonal entry (n, 1): covariance between X(n) and X(1),
i.e. how much the variables X(n) and X(1) change together
● Diagonal entry (i, i): variance of metric i
Ref: Wikipedia (Dec. 2014), Covariance matrix. Link: http://en.wikipedia.org/wiki/Covariance_matrix
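As a small illustration, a covariance matrix for one thread can be computed with NumPy's `np.cov`. The sample values and the three metric columns (CPU, memory, open files) are made up for this sketch.

```python
import numpy as np

# Hypothetical samples for one thread.
# Columns: CPU (%), memory (MB), number of open files. Rows: observations.
samples = np.array([
    [10.0, 100.0, 5.0],
    [20.0, 110.0, 6.0],
    [30.0, 130.0, 9.0],
    [40.0, 160.0, 8.0],
])

# rowvar=False tells NumPy that each column is a variable (metric).
# The result is 3x3 and uses the sample estimate (divides by n - 1).
cov = np.cov(samples, rowvar=False)

print(cov[0, 0])  # variance of the CPU metric
print(cov[0, 1])  # covariance between CPU and memory
```

The matrix size depends only on the number of metrics, not on the number of observations.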

Methodology (cont.)
Example:
● Metrics XC, XM, XH (CPU, memory, number of open files) for
threads A and B
● A is instrumented 100 times, B 1000 times
● Calculate a distance metric (Förstner & Moonen method):
a one-dimensional distance for each pair of covariance
matrices
● Distance value indicates similarity (a smaller distance means more similar behaviour)
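A sketch of the Förstner & Moonen distance, which is the square root of the sum of squared logarithms of the generalized eigenvalues of the two covariance matrices. The random data standing in for threads A and B is an assumption; only the sample counts (100 and 1000) come from the slide.

```python
import numpy as np

def forstner_distance(cov_a, cov_b):
    """Förstner & Moonen distance between two covariance matrices.

    d(A, B) = sqrt(sum_i ln^2(lambda_i)), where lambda_i are the eigenvalues
    of A^-1 B. Both matrices are assumed symmetric positive definite.
    """
    eigvals = np.linalg.eigvals(np.linalg.solve(cov_a, cov_b)).real
    return float(np.sqrt(np.sum(np.log(eigvals) ** 2)))

# Hypothetical data: thread A has 100 samples, thread B has 1000 samples,
# each with 3 metrics. B's metrics are deliberately scaled differently.
rng = np.random.default_rng(0)
thread_a = rng.normal(size=(100, 3))
thread_b = rng.normal(size=(1000, 3)) * np.array([1.0, 2.0, 3.0])

cov_a = np.cov(thread_a, rowvar=False)
cov_b = np.cov(thread_b, rowvar=False)

print(forstner_distance(cov_a, cov_b))
```

Both covariance matrices are 3x3 even though the threads have different numbers of samples, which is what makes the comparison possible.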

Methodology (cont.)
D. Hierarchical Clustering
● Starts with each abstraction in its own cluster and
proceeds to find and merge the closest pair of clusters
● Uses Ward's method of clustering
Ref: Large Scale Gene Expression Data Analysis I. Link: http://compbio.uthsc.edu/microarray/lecture1.htm
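The merge loop can be sketched in plain Python using the Lance-Williams update for Ward's method. The distance matrix below is invented, and note that Ward's update formally assumes Euclidean distances, so applying it to other distance measures is an assumption of this sketch.

```python
import math

def ward_linkage(dist):
    """Agglomerative clustering with Ward's Lance-Williams update.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns a list of merges (cluster_a, cluster_b, merge_distance), where
    clusters are frozensets of the original item indices.
    """
    n = len(dist)
    clusters = {frozenset([i]): 1 for i in range(n)}  # cluster -> size
    d = {}
    for i in range(n):
        for j in range(i + 1, n):
            d[frozenset([frozenset([i]), frozenset([j])])] = dist[i][j]
    merges = []
    while len(clusters) > 1:
        pair = min(d, key=d.get)          # closest pair of clusters
        a, b = tuple(pair)
        d_ab = d.pop(pair)
        na, nb = clusters.pop(a), clusters.pop(b)
        merged = a | b
        for c, nc in clusters.items():    # update distances to the new cluster
            d_ac = d.pop(frozenset([a, c]))
            d_bc = d.pop(frozenset([b, c]))
            total = na + nb + nc
            d[frozenset([merged, c])] = math.sqrt(
                ((na + nc) * d_ac**2 + (nb + nc) * d_bc**2 - nc * d_ab**2) / total
            )
        clusters[merged] = na + nb
        merges.append((a, b, d_ab))
    return merges

# Hypothetical distances between four abstractions; item 3 is far from the rest.
dist = [
    [0.0, 0.1, 0.2, 3.0],
    [0.1, 0.0, 0.2, 3.1],
    [0.2, 0.2, 0.0, 2.9],
    [3.0, 3.1, 2.9, 0.0],
]
merges = ward_linkage(dist)
for a, b, height in merges:
    print(sorted(a), sorted(b), round(height, 3))
```

An item that joins only in the last merge, at a large merge distance, is a candidate deviation.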

Methodology (cont.)
E. Cluster Visualization

Methodology (cont.)
F. Ranking Clusters
● Recursive
● Top to bottom
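One possible reading of a recursive, top-to-bottom ranking is sketched below. The slides do not state the ranking criterion, so treating the smaller branch at each split as the deviating one is purely an assumption, as are the nested-tuple dendrogram and the labels `m1` to `m5`.

```python
def leaves(node):
    """Return the leaf labels under a dendrogram node."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)

def rank_deviations(node, ranking=None):
    """Walk the dendrogram from the root downwards.

    ASSUMPTION (not from the slides): at each split the smaller branch is
    reported as deviating, and the walk continues into the larger (majority)
    branch. Splits with equally sized branches are not reported.
    """
    if ranking is None:
        ranking = []
    if isinstance(node, str):
        return ranking
    left, right = node
    minority, majority = sorted((left, right), key=lambda n: len(leaves(n)))
    if len(leaves(minority)) < len(leaves(majority)):
        ranking.append(leaves(minority))
    return rank_deviations(majority, ranking)

# Hypothetical dendrogram as nested pairs; strings are macro threads.
dendrogram = ((("m1", ("m2", "m3")), "m4"), "m5")
ranking = rank_deviations(dendrogram)
print(ranking)
```

Branches that split off nearest the root come first in the ranking, so they are reviewed first.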

Case Study
● Performance data
○ CPU, Virtual Bytes, Private Bytes, Handles,
MicroThreads
● Metric Abstraction
● Hierarchical Clustering and Ranking

Quantitative Evaluation
● Validating the ability of the methodology to
identify and rank deviations
● Identified important deviations
● Inject synthetic deviations into the
performance data
● Verify the methodology (precision/recall)
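Precision and recall for injected deviations can be computed as below. The thread identifiers and the sets of injected and flagged threads are hypothetical and are not results from the study.

```python
def precision_recall(flagged, injected):
    """Precision and recall of flagged deviations against injected ones."""
    flagged, injected = set(flagged), set(injected)
    true_positives = len(flagged & injected)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(injected) if injected else 0.0
    return precision, recall

# Hypothetical example: three deviations were injected, four threads flagged.
injected = {"t3", "t7", "t9"}
flagged = {"t3", "t7", "t12", "t15"}

precision, recall = precision_recall(flagged, injected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision is the share of flagged threads that really were injected deviations; recall is the share of injected deviations that were flagged.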

Quantitative Evaluation (cont.)

Conclusion
● A methodology for automatically identifying
deviating behaviour in ULS systems
● Ranks the most deviating thread behaviour
at different abstractions (wave, thread)
● It is possible to use this methodology for
other applications
