Understanding high-dimensional networks for continuous
variables using ECL
Kshitij Khare and Syed Rahman
Department of Statistics
University of Florida
Motivation
• Availability of high-dimensional data from various applications
• Number of variables (p) much larger than (or sometimes comparable to) the sample size
(n)
• Examples:
Biology: gene expression data
Environmental science: climate data on spatial grid
Finance: returns on thousands of stocks
Goal: Understanding relationships between variables
• Common goal in many applications: Understand complex network of relationships
between variables
• Covariance matrix: a fundamental quantity to help understand multivariate relationships
• Even if estimating the covariance matrix is not the end goal, it is a crucial first step
before further analysis
Quick recap: What is a covariance matrix?
• The covariance of two variables/features (say two stock prices) is a measure of linear
dependence between these variables
• Positive covariance indicates similar behavior, negative covariance indicates opposite behavior, and zero covariance indicates a lack of linear dependence
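For reference, the covariance of two variables X and Y is defined as

$$\mathrm{Cov}(X, Y) = \mathbb{E}\big[(X - \mathbb{E}X)(Y - \mathbb{E}Y)\big],$$

which is positive when X and Y tend to move together and negative when they tend to move in opposite directions.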
Let's say we have five stock prices S1, S2, S3, S4, S5. The covariance matrix of these five stocks is the 5 × 5 symmetric matrix whose (i, j) entry is the covariance between Si and Sj, with the variances on the diagonal:

$$\Sigma = \begin{pmatrix}
\mathrm{Var}(S_1) & \mathrm{Cov}(S_1, S_2) & \cdots & \mathrm{Cov}(S_1, S_5) \\
\mathrm{Cov}(S_2, S_1) & \mathrm{Var}(S_2) & \cdots & \mathrm{Cov}(S_2, S_5) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(S_5, S_1) & \mathrm{Cov}(S_5, S_2) & \cdots & \mathrm{Var}(S_5)
\end{pmatrix}$$
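As a concrete illustration (simulated numbers, not from the talk), a minimal NumPy sketch that estimates such a covariance matrix from daily prices of five hypothetical stocks:

import numpy as np

rng = np.random.default_rng(0)

# Simulated daily prices for five hypothetical stocks (rows = days, columns = stocks).
n_days, n_stocks = 250, 5
prices = np.cumsum(rng.normal(0, 1, size=(n_days, n_stocks)), axis=0) + 100

# Sample covariance matrix: 5 x 5, symmetric, variances on the diagonal.
# np.cov treats rows as variables by default, hence rowvar=False for our layout.
sigma_hat = np.cov(prices, rowvar=False)
print(sigma_hat.shape)                      # (5, 5)
print(np.allclose(sigma_hat, sigma_hat.T))  # True: covariance matrices are symmetric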
Challenges in high-dimensional estimation
• If p = 1000, we need to estimate roughly 1 million covariance parameters
• If the sample size n is much smaller than (or even of the same order as) p, this is not viable
• The sample covariance matrix (the classical estimator) can perform very poorly in high-dimensional situations; it is not even invertible when n < p, as the sketch below illustrates
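To see this degeneracy concretely, a small NumPy sketch (illustrative sizes, not from the talk): with n = 20 observations of p = 50 variables, the sample covariance matrix has rank at most n − 1 and is therefore singular:

import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 50                      # fewer samples than variables

X = rng.normal(size=(n, p))
S = np.cov(X, rowvar=False)        # p x p sample covariance matrix

print(np.linalg.matrix_rank(S))    # at most n - 1 = 19, far below p = 50
print(np.linalg.det(S))            # 0 (up to floating point): S is not invertible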
Is there a way out?
• Reliably estimate a small number of parameters in the covariance matrix, or an appropriate function of the covariance matrix
• Set insignificant parameters to zero
• Sparsity pattern (pattern of 0s) can be represented by graphs/networks
Directed acyclic graph models: Sparsity in the Cholesky Parameter
• Set entries of L to be zero: this corresponds to assuming certain conditional independences
• The sparsity pattern in L can be represented by a directed graph
• Build a graph from a sparse L
Directed acyclic graph models: Sparsity in the Cholesky Parameter
• Consider the Cholesky decomposition of Σ⁻¹ = LᵗL, where L is lower triangular with positive diagonal entries. For example:

$$\underbrace{\begin{pmatrix}
4.29 & 0.65 & 0.76 & 0.80 \\
0.65 & 4.25 & 0.76 & 0.80 \\
0.76 & 0.76 & 4.16 & 0.80 \\
0.80 & 0.80 & 0.80 & 4.00
\end{pmatrix}}_{\Sigma^{-1}} =
\underbrace{\begin{pmatrix}
2.0 & 0.2 & 0.3 & 0.4 \\
0 & 2.0 & 0.3 & 0.4 \\
0 & 0 & 2.0 & 0.4 \\
0 & 0 & 0 & 2.0
\end{pmatrix}}_{L^t}
\underbrace{\begin{pmatrix}
2.0 & 0 & 0 & 0 \\
0.2 & 2.0 & 0 & 0 \\
0.3 & 0.3 & 2.0 & 0 \\
0.4 & 0.4 & 0.4 & 2.0
\end{pmatrix}}_{L}$$

• Entries of L have a concrete and direct interpretation in terms of appropriate conditional covariances
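A quick NumPy check of the factorization above, reconstructing Σ⁻¹ from the L on this slide:

import numpy as np

# Lower-triangular Cholesky parameter L from the slide.
L = np.array([
    [2.0, 0.0, 0.0, 0.0],
    [0.2, 2.0, 0.0, 0.0],
    [0.3, 0.3, 2.0, 0.0],
    [0.4, 0.4, 0.4, 2.0],
])

precision = L.T @ L                 # Sigma^{-1} = L^t L
print(np.round(precision, 2))
# [[4.29 0.65 0.76 0.8 ]
#  [0.65 4.25 0.76 0.8 ]
#  [0.76 0.76 4.16 0.8 ]
#  [0.8  0.8  0.8  4.  ]]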
Directed acyclic graph models: Sparsity in the Cholesky Parameter
• Consider the Cholesky decomposition of Σ⁻¹ = LᵗL
• Set entries of L to be zero
• The sparsity pattern in L can be represented by a directed graph; with variables A, B, C, D:

$$L = \begin{array}{c|cccc}
 & A & B & C & D \\ \hline
A & 2 & 0 & 0 & 0 \\
B & 0.2 & 2 & 0 & 0 \\
C & 0.3 & 0 & 2 & 0 \\
D & 0 & 0.4 & 0.4 & 2
\end{array}$$

The nonzero below-diagonal entries (L_BA, L_CA, L_DB, L_DC) give the edges of the directed graph on {A, B, C, D}, oriented from earlier to later variable: A → B, A → C, B → D, C → D.
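A small sketch of reading the directed graph off a sparse L (assuming, as above, that a nonzero below-diagonal entry L[i, j] corresponds to an edge from variable j to variable i):

import numpy as np

labels = ["A", "B", "C", "D"]
L = np.array([
    [2.0, 0.0, 0.0, 0.0],
    [0.2, 2.0, 0.0, 0.0],
    [0.3, 0.0, 2.0, 0.0],
    [0.0, 0.4, 0.4, 2.0],
])

# Each nonzero strictly-below-diagonal entry L[i, j] contributes edge j -> i.
edges = [(labels[j], labels[i])
         for i in range(L.shape[0])
         for j in range(i)
         if L[i, j] != 0.0]
print(edges)   # [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')]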
STATISTICAL CHALLENGE
How do we estimate a covariance matrix with a sparse Cholesky factor based on data?
Convex Sparse Cholesky Selection (CSCS)
• Obtain a sparse estimate for L by minimizing the objective function

$$Q_{\mathrm{CSCS}}(L) = \underbrace{\mathrm{tr}(L^t L S) - 2\log|L|}_{\text{negative log-likelihood}} + \underbrace{\lambda \sum_{1 \le j < i \le p} |L_{ij}|}_{\substack{\text{penalty term to} \\ \text{induce sparsity/zeros}}}$$

where S is the sample covariance matrix
• λ (chosen by the user) controls the level of sparsity in the estimator
• The larger the λ, the sparser the estimator
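A direct NumPy transcription of this objective can make it concrete (a sketch for intuition only; the actual estimator minimizes it via the coordinatewise algorithm shown later):

import numpy as np

def q_cscs(L, S, lam):
    """CSCS objective: tr(L^t L S) - 2 log|L| + lam * sum_{j<i} |L_ij|.

    L : lower-triangular (p x p) with positive diagonal
    S : sample covariance matrix (p x p)
    """
    fit = np.trace(L.T @ L @ S)
    # L is triangular, so log|L| is the sum of the log-diagonal entries.
    logdet = np.sum(np.log(np.diag(L)))
    penalty = lam * np.sum(np.abs(np.tril(L, k=-1)))
    return fit - 2.0 * logdet + penalty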
CSCS method: Comparison with other methods

Property                                              SparseCholesky   SparseDAG   CSCS
---------------------------------------------------------------------------------------
No constraints on sparsity pattern                          +              +         +
No constraints on D                                         +                        +
Convergence guarantee to acceptable global minimum                         +         +
Asymptotic consistency (n, p → ∞)                                          +         +

CSCS outperforms and improves on the existing methods!
Breaking up the objective function row-wise
• Q_CSCS(L) breaks up as a sum of independent functions of the rows of L
• If F(x, y, z) = F1(x) + F2(y) + F3(z), then to minimize F(x, y, z), we can minimize F1(x) with respect to x, F2(y) with respect to y, and F3(z) with respect to z

$$Q_{\mathrm{CSCS}}(L) = \underbrace{Q_1(L_{1\cdot})}_{\substack{\text{function of entries of} \\ \text{1st row of } L}} + \dots + \underbrace{Q_p(L_{p\cdot})}_{\substack{\text{function of entries of} \\ \text{pth row of } L}}$$
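Writing the i-th term out explicitly (this follows directly from the objective two slides back, using $\mathrm{tr}(L^t L S) = \sum_i L_{i\cdot} S L_{i\cdot}^t$ and $\log|L| = \sum_i \log L_{ii}$ for triangular L):

$$Q_i(L_{i\cdot}) = L_{i\cdot}\, S\, L_{i\cdot}^t - 2\log L_{ii} + \lambda \sum_{j < i} |L_{ij}|$$

Each Q_i is convex and involves only row i, which is what makes the row-wise (and hence parallel) estimation in Algorithm 2 below possible.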
Call center data
• The data come from one call center at a major U.S. northeastern financial organization
• For each day, a 17-hour period was divided into p = 102 ten-minute intervals, and the number of calls arriving at the service queue during each interval was counted
• n = 239 days remain after using the singular value decomposition to screen out outliers, which include holidays and days when the recording equipment was faulty
• Hence 10,506 parameters need to be estimated
GOAL
Build a predictor to forecast the calls coming in during the second half of the day based on the
number of calls during the first half of the day
Call Center data
• The best mean squared error forecast of y_i^(2), the calls coming in during the second half of day i, using y_i^(1), the calls coming in during the first half of the day, is

$$\hat{y}_i^{(2)} = \mu_2 + \Sigma_{21}\,\Sigma_{11}^{-1}\left(y_i^{(1)} - \mu_1\right),$$

where µ1, µ2, Σ21 and Σ11 must be estimated (a NumPy sketch of this forecast rule follows the slide)
• To evaluate the performance of the sample covariance versus CSCS, we use the Average Absolute Forecast Error at interval t, averaged over the 34 test days i = 206, ..., 239:

$$AE_t = \frac{1}{34}\sum_{i=206}^{239}\left|\hat{y}_{it}^{(2)} - y_{it}^{(2)}\right|$$

where ŷ_it^(2) denotes the t-th entry of ŷ_i^(2)
• For CSCS, λ is chosen using 5-fold cross-validation
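A minimal NumPy sketch of the conditional-mean forecast rule above (the half-day split index is an illustrative assumption; the talk plugs in either the sample covariance or the CSCS estimate for sigma):

import numpy as np

def forecast_second_half(y1, mu, sigma, split=51):
    """Best-MSE forecast of the second half-day given the first.

    y1    : first-half counts for one day, shape (split,)
    mu    : estimated mean vector for the full day, shape (p,)
    sigma : estimated covariance matrix for the full day, shape (p, p)
    split : number of intervals in the first half (51 of p = 102 here)
    """
    mu1, mu2 = mu[:split], mu[split:]
    sigma11 = sigma[:split, :split]
    sigma21 = sigma[split:, :split]
    # y_hat2 = mu2 + Sigma21 Sigma11^{-1} (y1 - mu1); solve() avoids an explicit inverse.
    return mu2 + sigma21 @ np.linalg.solve(sigma11, y1 - mu1)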
Call center data

[Figure: Average Absolute Forecast Error (AE) across time intervals 50–100, comparing the sample covariance matrix (S) with CSCS; λ chosen by cross-validation]

CSCS outperforms the sample covariance matrix 46 out of 51 times!
Call center data

[Figure: Timing comparison between the parallel and non-parallel versions of CSCS]
Algorithm 1

Algorithm 1: Cyclic coordinatewise algorithm for h_{k,A,λ}

Input: Fix k, A, λ
Input: Fix maximum number of iterations: r_max
Input: Fix initial estimate: x̂^(0)
Input: Fix convergence threshold: ε

Set r ← 1
Set converged ← FALSE
Set x̂^current ← x̂^(0)
repeat
    x̂^old ← x̂^current
    for j ← 1, 2, ..., k − 1 do
        x̂_j^current ← T_j(j, λ, A, x̂^old)
    end for
    x̂_k^current ← T_k(λ, A, x̂^old)
    x̂^(r) ← x̂^current
    // Convergence checking
    if ‖x̂^current − x̂^old‖ < ε then
        converged ← TRUE
    else
        r ← r + 1
    end if
until converged = TRUE or r > r_max
Return final estimate: x̂^(r)
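A Python skeleton of this cyclic sweep (an illustrative sketch only: the coordinate update maps T_j and T_k have closed forms in the CSCS paper, which the slide does not reproduce, so they are passed in as callables here):

import numpy as np

def cyclic_coordinatewise(x0, A, lam, T_offdiag, T_diag, max_iter=100, tol=1e-5):
    """Algorithm 1 skeleton: cyclically update coordinates until convergence.

    T_offdiag(j, lam, A, x_old) -> new value for coordinate j (j < k)
    T_diag(lam, A, x_old)       -> new value for the final coordinate k
    """
    x = np.asarray(x0, dtype=float).copy()
    k = x.size
    for _ in range(max_iter):
        x_old = x.copy()
        for j in range(k - 1):               # coordinates 1, ..., k-1
            x[j] = T_offdiag(j, lam, A, x_old)
        x[k - 1] = T_diag(lam, A, x_old)     # diagonal coordinate k
        # Stop when the largest coordinate change falls below the threshold,
        # mirroring the MaxAbsDff check in the ECL version below.
        if np.max(np.abs(x - x_old)) < tol:
            break
    return x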
Algorithm 1 in ECL

// One row subproblem: LOOP applies OuterBody (one full coordinatewise sweep)
// until maxIter sweeps have run or the largest coordinate change (MaxAbsDff)
// falls below tol.
CSCS_h1(DATASET(xElement) xx0, DATASET(DistElem) A, UNSIGNED k,
        REAL lambda, UNSIGNED maxIter = 100, REAL tol = 0.00001) := FUNCTION
  out := LOOP(xx0,
              (COUNTER <= maxIter AND MaxAbsDff(ROWS(LEFT)) > tol),
              OuterBody(ROWS(LEFT), A, COUNTER, k, lambda));
  // Keep only the solution entries and return them sorted by position.
  RETURN SORT(PROJECT(out(typ = xType.x), DistElem), x, y);
END;
CSCS algorithm

Algorithm 2: CSCS Algorithm

Input: Data Y1, Y2, ..., Yn and λ
Input: Fix maximum number of iterations: r_max
Input: Fix initial estimate: L̂^(0)
Input: Fix convergence threshold: ε

for i ← 1, 2, ..., p do            // can be done in parallel
    (η̂_i)^(0) ← ith row of L̂^(0)
    Set η̂_i to be the minimizer of the objective function Q_CSCS,i, obtained by
        running Algorithm 1 with k = i, A = S_i, λ, r_max, x̂^(0) = (η̂_i)^(0)
end for
Construct L̂ by setting its ith row (up to the diagonal) to η̂_i
Return final estimate: L̂
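Because the p row subproblems share no state, Algorithm 2 is embarrassingly parallel. A hedged Python sketch, reusing cyclic_coordinatewise from the Algorithm 1 sketch above and taking S_i to be the leading block of S, as in the row-wise objective Q_i:

import numpy as np

def cscs(S, lam, T_offdiag, T_diag, max_iter=100, tol=1e-5):
    """Algorithm 2 skeleton: estimate each row of L independently.

    The loop over rows carries no shared state, so it can be replaced by a
    parallel map (e.g. multiprocessing.Pool.map) for the parallel version.
    """
    p = S.shape[0]
    L_hat = np.eye(p)                     # initial estimate L^(0) = identity
    for i in range(p):                    # can be done in parallel
        eta0 = L_hat[i, : i + 1]          # row i of L^(0), up to the diagonal
        S_i = S[: i + 1, : i + 1]         # leading block of S relevant to row i
        L_hat[i, : i + 1] = cyclic_coordinatewise(
            eta0, S_i, lam, T_offdiag, T_diag, max_iter=max_iter, tol=tol)
    return L_hat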
CSCS in ECL

CSCS2(DATASET(Elem) YY, REAL lambda3, REAL tol2 = 0.00001,
      UNSIGNED maxIter2 = 100) := FUNCTION
  nobs  := MAX(YY, x);   // number of observations n
  pvars := MAX(YY, y);   // number of variables p
  // Sample covariance matrix S = (1/n) Y^t Y
  S  := Mat.Scale(Mat.Mul(Mat.Trans(YY), YY), (1/nobs));
  // Tag S with node ids and distribute it so the row subproblems run in parallel
  S1 := DISTRIBUTE(NORMALIZE(S, (pvars DIV 2),
          TRANSFORM(DistElem, SELF.nid := COUNTER, SELF := LEFT)), nid);
  // Initial estimate L^(0) = identity, distributed the same way
  L  := Identity(pvars);
  LL := DISTRIBUTE(L, nid);
  // Special-case the (1,1) entry, computed directly from S(1,1)
  L11 := PROJECT(CHOOSEN(S1(x = 1 AND y = 1 AND nid = 1), 1),
           TRANSFORM(DistElem, SELF.x := 1, SELF.y := 1,
                     SELF.value := 1/LEFT.value, SELF.nid := 1, SELF := []));
  newL := LL(x <> 1) + L11;
  // Remaining rows: LOOP over rows, each solved by the coordinatewise algorithm
  newLL := LOOP(newL, COUNTER < pvars,
                OuterOuterBody(ROWS(LEFT), S1, COUNTER, lambda3,
                               maxIter2, tol2));
  RETURN newLL;
END;