HELLEBORECAPITAL
Introduction
The standard methodology
Exploring dependence between returns
Copula-based dependence coefficients (clustering distances)
Empirical convergence rates
Beyond dependence: a (copula,margins) representation
Clustering CDS: algorithms, distances,
stability and convergence rates
CMStatistics 2016, University of Seville, Spain
Gautier Marti, Frank Nielsen, Philippe Donnat
HELLEBORECAPITAL
December 9, 2016
Gautier Marti Clustering CDS: algorithms, distances, stability and convergence r
1 Introduction
2 The standard methodology
3 Exploring dependence between returns
4 Copula-based dependence coefficients (clustering distances)
5 Empirical convergence rates
6 Beyond dependence: a (copula,margins) representation
Introduction
Goal: Finding groups of ’homogeneous’ assets that can help to:
• build alternative measures of risk,
• elaborate trading strategies. . .
But, we need a high confidence in these clusters (networks).
So, we need appropriate AND fast converging methodologies [8]:
• to be consistent yet efficient (bias–variance tradeoff),
• to avoid non-stationarity of the time series (too large a sample).
A good model selection criterion:
Minimum sample size to reach a given ’accuracy’.
The standard methodology - description
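The methodology widely adopted in empirical studies [7]:
Let $N$ be the number of assets.
Let $P_i(t)$ be the price at time $t$ of asset $i$, $1 \le i \le N$.
Let $r_i(t)$ be the log-return at time $t$ of asset $i$:
$$r_i(t) = \log P_i(t) - \log P_i(t-1).$$
For each pair $i, j$ of assets, compute their correlation:
$$\rho_{ij} = \frac{\langle r_i r_j \rangle - \langle r_i \rangle \langle r_j \rangle}{\sqrt{\left(\langle r_i^2 \rangle - \langle r_i \rangle^2\right)\left(\langle r_j^2 \rangle - \langle r_j \rangle^2\right)}},$$
where $\langle \cdot \rangle$ denotes the average over time.
Convert the correlation coefficients $\rho_{ij}$ into distances:
$$d_{ij} = \sqrt{2(1 - \rho_{ij})}.$$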
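The three steps above (log-returns, Pearson correlation, distance) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `correlation_distances` and the simulated price paths are assumptions made for the example.

```python
import numpy as np

def correlation_distances(prices):
    """prices: array of shape (T + 1, N), one column per asset."""
    log_returns = np.diff(np.log(prices), axis=0)           # r_i(t)
    rho = np.corrcoef(log_returns, rowvar=False)            # Pearson rho_ij
    dist = np.sqrt(2.0 * np.clip(1.0 - rho, 0.0, None))     # d_ij = sqrt(2 (1 - rho_ij))
    np.fill_diagonal(dist, 0.0)                             # remove rounding noise on the diagonal
    return rho, dist

# Hypothetical data: 5 simulated price paths over 250 trading days.
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=(251, 5)), axis=0))
rho, dist = correlation_distances(prices)
```

Since $\rho_{ij} \in [-1, 1]$, the distances lie in $[0, 2]$: 0 for perfectly correlated assets and 2 for perfectly anti-correlated ones.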
From all the distances dij , compute a minimum spanning tree:
Figure: A minimum spanning tree of stocks (from [1]); stocks from the
same industry (represented by color) tend to cluster together
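As a rough sketch of this step, a minimum spanning tree can be extracted from the distance matrix with SciPy's `minimum_spanning_tree`. The helper `mst_edges` and the two-factor toy returns are assumptions for illustration only; they are not taken from the slides or from [1].

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_edges(dist):
    """Return the N-1 edges (i, j, d_ij) of a minimum spanning tree, shortest first."""
    # SciPy treats exact zeros as missing edges, so identical assets
    # (d_ij = 0) would need a tiny positive offset before this call.
    tree = minimum_spanning_tree(dist).tocoo()
    edges = zip(tree.row.tolist(), tree.col.tolist(), tree.data.tolist())
    return sorted(edges, key=lambda edge: edge[2])

# Toy data (assumed): 6 assets, 250 days, two groups driven by separate factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(250, 2))
returns = factors[:, [0, 0, 0, 1, 1, 1]] + 0.5 * rng.normal(size=(250, 6))
rho = np.corrcoef(returns, rowvar=False)
dist = np.sqrt(2.0 * np.clip(1.0 - rho, 0.0, None))
np.fill_diagonal(dist, 0.0)

edges = mst_edges(dist)
for i, j, d in edges:
    print(f"{i} - {j}: {d:.3f}")
```

In this toy setting the tree should link assets of the same group first and join the two groups through a single, longer edge.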
The standard methodology - limitations
• MST clustering is equivalent to Single Linkage clustering:
  • chaining phenomenon
  • not stable to noise / small perturbations [11]
• Use of the Pearson correlation:
  • can take the value 0 even when the variables are strongly dependent
  • not invariant to monotone transformations of the variables
  • not robust to outliers
Is it still useful for financial time series? For stocks? For CDS?
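The three limitations of the Pearson correlation listed above can be checked numerically. The snippet below is an illustrative sketch on simulated data; the variables and the outlier placed at (50, 50) are arbitrary choices, and Spearman's rank correlation is used only as a point of comparison.

```python
import numpy as np
from scipy.stats import rankdata

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def spearman(a, b):
    # Spearman's coefficient is the Pearson correlation of the ranks.
    return pearson(rankdata(a), rankdata(b))

rng = np.random.default_rng(42)
x = rng.normal(size=5000)

# 1) Strong dependence, yet Pearson is close to 0: x**2 is a function of x.
pearson_square = pearson(x, x ** 2)

# 2) A monotone transformation (exp) changes Pearson but leaves Spearman unchanged.
y = x + 0.5 * rng.normal(size=x.size)
pearson_raw, pearson_exp = pearson(x, y), pearson(np.exp(x), np.exp(y))
spearman_raw, spearman_exp = spearman(x, y), spearman(np.exp(x), np.exp(y))

# 3) One outlier flips the sign of Pearson on 100 perfectly anti-correlated points.
x_out = np.append(x[:100], 50.0)
y_out = np.append(-x[:100], 50.0)
pearson_outlier, spearman_outlier = pearson(x_out, y_out), spearman(x_out, y_out)
```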
Copulas
Sklar’s Theorem [13]
For $(X_i, X_j)$ having continuous marginal cdfs $F_{X_i}, F_{X_j}$, its joint cumulative distribution $F$ is uniquely expressed as
$$F(X_i, X_j) = C(F_{X_i}(X_i), F_{X_j}(X_j)),$$
where $C$ is known as the copula of $(X_i, X_j)$.
The copula, a joint distribution with uniform marginals, encodes all the dependence between $X_i$ and $X_j$.
From ranks to empirical copula
$r_i, r_j$ are the rank statistics of $X_i, X_j$ respectively, i.e. $r_i^t$ is the rank of $X_i^t$ in $\{X_i^1, \ldots, X_i^T\}$:
$$r_i^t = \sum_{k=1}^{T} \mathbf{1}\{X_i^k \le X_i^t\}.$$
Deheuvels’ empirical copula [3]
Any copula $\hat{C}$ defined on the lattice $L = \{(t_i/T, t_j/T) : t_i, t_j = 0, \ldots, T\}$ by
$$\hat{C}\left(\frac{t_i}{T}, \frac{t_j}{T}\right) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}\{r_i^t \le t_i,\ r_j^t \le t_j\}$$
is an empirical copula.
$\hat{C}$ is a consistent estimator of $C$ with uniform convergence [4].
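The definition above translates fairly directly into code. The following is a minimal sketch (the function name `empirical_copula` and the simulated sample are assumptions): it places one unit of mass at each pair of ranks and accumulates along both axes to obtain the values of the empirical copula on the lattice.

```python
import numpy as np
from scipy.stats import rankdata

def empirical_copula(x_i, x_j):
    """Return a (T+1, T+1) array whose entry [t_i, t_j] is C_hat(t_i/T, t_j/T)."""
    T = len(x_i)
    # method="max" gives r^t = #{k : X^k <= X^t}, as in the rank definition.
    r_i = rankdata(x_i, method="max").astype(int)
    r_j = rankdata(x_j, method="max").astype(int)
    counts = np.zeros((T + 1, T + 1))
    np.add.at(counts, (r_i, r_j), 1.0)                 # one unit of mass per observation
    # Cumulating over both axes counts {t : r_i^t <= t_i and r_j^t <= t_j}.
    return counts.cumsum(axis=0).cumsum(axis=1) / T

rng = np.random.default_rng(0)
T = 200
x_i = rng.normal(size=T)
x_j = 0.7 * x_i + rng.normal(size=T)
C_hat = empirical_copula(x_i, x_j)
```

With continuous data (no ties), the margins of `C_hat` are uniform on the lattice: `C_hat[k, T]` equals `k / T`.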
Clustering of bivariate empirical copulas
Generate the $\binom{N}{2}$ bivariate empirical copulas
Find clusters of copulas using optimal transport [10, 9]
Compute and display the clusters’ centroids [2]
Some code available at www.datagrapple.com/Tech.
Copula-centers for stocks (CAC 40)
Figure: Stocks: More mass in the bottom-left corner, i.e. lower tail
dependence. Stock prices tend to plummet together.
Copula-centers for Credit Default Swaps (XO index)
Figure: Credit default swaps: More mass in the top-right corner, i.e.
upper tail dependence. The cost of insurance against entities’ default
tends to soar in stressed markets.
Dependence as relative distances between copulas
$C$: copula of $(X_i, X_j)$;
$|u - v|/\sqrt{2}$: distance from $(u, v)$ to the diagonal.
Spearman’s $\rho_S$:
$$\rho_S(X_i, X_j) = 12 \int_0^1 \int_0^1 \left(C(u, v) - uv\right) du\, dv = 1 - 6 \int_0^1 \int_0^1 (u - v)^2\, dC(u, v)$$
Many correlation coefficients can be expressed as distances to the Fréchet–Hoeffding bounds or to independence [6]. Some are explicitly built this way (e.g. [12, 5, 9]).
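The second expression for Spearman's coefficient, an average squared distance to the diagonal under the copula, can be checked on a sample. In this sketch the integral against $dC$ is replaced by an average over the normalised ranks, which is an approximation assumed for the example, and the result is compared with the Pearson correlation of the ranks.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(1)
T = 2000
x = rng.normal(size=T)
y = 0.6 * x + 0.8 * rng.normal(size=T)

# Normalised ranks (u_t, v_t) play the role of a sample drawn from the copula C.
u, v = rankdata(x) / T, rankdata(y) / T

# 1 - 6 * E[(U - V)^2], with the expectation replaced by a sample mean.
rho_from_distance = 1.0 - 6.0 * np.mean((u - v) ** 2)

# Reference value: Pearson correlation of the ranks.
rho_reference = float(np.corrcoef(rankdata(x), rankdata(y))[0, 1])
```

The two values agree up to a term of order $1/T^2$, because the classical finite-sample formula divides by $T(T^2 - 1)$ rather than $T^3$.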
A metric space for copulas: Optimal Transport
The Target/Forget Dependence Coefficient (TFDC)
Now, we can define our bespoke dependence coefficient:
Build the forget-dependence copulas $\{C_l^F\}_l$
Build the target-dependence copulas $\{C_k^T\}_k$
Compute the empirical copula $C_{ij}$ from $x_i, x_j$
$$\mathrm{TFDC}(C_{ij}) = \frac{\min_l D(C_l^F, C_{ij})}{\min_l D(C_l^F, C_{ij}) + \min_k D(C_{ij}, C_k^T)}$$
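The coefficient can be sketched as follows, under several assumptions that are not stated on the slide: copulas are represented by point clouds of normalised ranks, the distance $D$ is an optimal transport cost computed as an assignment problem between equal-size clouds, the only forget-dependence copula is an independent sample, and the only target-dependence copula is the comonotone one (all mass on the diagonal). All function names are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from scipy.stats import rankdata

def pseudo_observations(x, y):
    """Normalised ranks: a point-cloud view of the empirical copula."""
    n = len(x)
    return np.column_stack([rankdata(x) / n, rankdata(y) / n])

def ot_distance(points_a, points_b):
    """Optimal transport cost between two equal-size, uniformly weighted clouds."""
    cost = cdist(points_a, points_b)             # Euclidean ground cost
    rows, cols = linear_sum_assignment(cost)     # optimal one-to-one matching
    return cost[rows, cols].mean()

def tfdc(c_ij, forget, targets):
    d_forget = min(ot_distance(c_f, c_ij) for c_f in forget)
    d_target = min(ot_distance(c_ij, c_t) for c_t in targets)
    return d_forget / (d_forget + d_target)

rng = np.random.default_rng(0)
n = 300
grid = np.arange(1, n + 1) / n
targets = [np.column_stack([grid, grid])]                                  # assumed target: comonotone
forget = [pseudo_observations(rng.normal(size=n), rng.normal(size=n))]    # assumed forget: independence

x = rng.normal(size=n)
tfdc_comonotone = tfdc(pseudo_observations(x, np.exp(x)), forget, targets)
tfdc_independent = tfdc(pseudo_observations(x, rng.normal(size=n)), forget, targets)
```

By construction the coefficient is 1 when the empirical copula coincides with a target copula and 0 when it coincides with a forget copula; on finite samples of independent variables it stays small but positive.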
Spearman vs. TFDC
[Plot: Spearman & TFDC values as a function of a. x-axis: discontinuity position a; y-axis: estimated positive dependence; curves: TFDC, Spearman]
Figure: Empirical copulas for $(X, Y)$ where
$$X = Z\,\mathbf{1}\{Z < a\} + \epsilon_X\,\mathbf{1}\{Z > a\}, \quad Y = Z\,\mathbf{1}\{Z < a + 0.25\} + \epsilon_Y\,\mathbf{1}\{Z > a + 0.25\},$$
$a = 0, 0.05, \ldots, 0.95, 1$, and where $Z$ is uniform on $[0, 1]$ and $\epsilon_X, \epsilon_Y$ are independent noises (left).
TFDC and Spearman coefficients estimated between $X$ and $Y$ as a function of $a$ (right).
For $a = 0.75$, the Spearman coefficient yields a negative value, yet $X = Y$ over $[0, a]$.
Process: Recovering a simulated ground-truth [8]
A simulation & benchmark process that needs to be refined:
Extract (using a large sample) a filtered correlation matrix R
Generate samples of size T = 10, . . . , 20, . . . from a relevant
distribution (parameterized by R)
Compute the ratio of the number of correct clusterings obtained over the number of trials as a function of T
[Plots: empirical rates of convergence for Single Linkage, Average Linkage and Ward. x-axis: sample size (100 to 500); y-axis: score (0 to 1); curves: Gaussian - Pearson, Gaussian - Spearman, Student - Pearson, Student - Spearman]
A full comparative study will be posted online at www.datagrapple.com/Tech.
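The benchmark process described above can be sketched on a toy ground truth. Everything specific here is an assumption made for illustration: the block-structured matrix `R` stands in for the filtered correlation matrix, samples are Gaussian only, the distance is $\sqrt{2(1-\rho)}$ with Pearson correlation, and a trial counts as correct only when the recovered partition matches the true one exactly.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def block_correlation(n_clusters, cluster_size, rho_in, rho_out):
    """Toy ground truth: constant correlation inside and between clusters."""
    labels = np.repeat(np.arange(n_clusters), cluster_size)
    same = labels[:, None] == labels[None, :]
    R = np.where(same, rho_in, rho_out).astype(float)
    np.fill_diagonal(R, 1.0)
    return R, labels

def same_partition(a, b):
    # Two labelings agree when they group exactly the same pairs together.
    return np.array_equal(a[:, None] == a[None, :], b[:, None] == b[None, :])

def success_rate(R, truth, T, method="average", n_trials=50, seed=0):
    """Share of trials in which clustering a sample of size T recovers the truth."""
    rng = np.random.default_rng(seed)
    k = len(np.unique(truth))
    hits = 0
    for _ in range(n_trials):
        sample = rng.multivariate_normal(np.zeros(len(R)), R, size=T)
        rho = np.corrcoef(sample, rowvar=False)
        dist = np.sqrt(2.0 * np.clip(1.0 - rho, 0.0, None))
        Z = linkage(squareform(dist, checks=False), method=method)
        found = fcluster(Z, t=k, criterion="maxclust")
        hits += same_partition(found, truth)
    return hits / n_trials

R, truth = block_correlation(3, 4, rho_in=0.7, rho_out=0.1)
scores = {T: success_rate(R, truth, T) for T in (10, 50, 250)}
print(scores)
```

Swapping `method` for `"single"` or `"ward"`, or replacing the Gaussian sampler and the Pearson correlation, would give the other curves of the comparison; those variants are not implemented here.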
ON CLUSTERING FINANCIAL TIME SERIES
GAUTIER MARTI, PHILIPPE DONNAT AND FRANK NIELSEN
NOISY CORRELATION MATRICES
Let $X$ be the matrix storing the standardized returns of $N = 560$ assets (credit default swaps) over a period of $T = 2500$ trading days.
Then, the empirical correlation matrix of the returns is
$$C = \frac{1}{T} X X^\top.$$
We can compute the empirical density of its eigenvalues
$$\rho(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda},$$
where $n(\lambda)$ counts the number of eigenvalues of $C$ less than $\lambda$.
From random matrix theory, the Marchenko-Pastur distribution gives the limit distribution as $N \to \infty$, $T \to \infty$ and $T/N$ fixed. It reads:
$$\rho(\lambda) = \frac{T/N}{2\pi} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda},$$
where $\lambda_{\min}^{\max} = 1 + N/T \pm 2\sqrt{N/T}$, and $\lambda \in [\lambda_{\min}, \lambda_{\max}]$.
Figure 1: Marchenko-Pastur density vs. empirical density of the correlation matrix eigenvalues (x-axis: λ; y-axis: ρ(λ))
Notice that the Marchenko-Pastur density fits the empirical density well, meaning that most of the information contained in the empirical correlation matrix amounts to noise: only 26 eigenvalues are greater than $\lambda_{\max}$.
The highest eigenvalue corresponds to the ‘market’; the 25 others can be associated with ‘industrial sectors’.
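The Marchenko-Pastur bounds and density are easy to evaluate for the dimensions quoted above ($N = 560$, $T = 2500$). The sketch below uses pure Gaussian noise as a stand-in for the CDS returns, which are not available here, so it only illustrates the noise benchmark; the function names are assumptions.

```python
import numpy as np

def marchenko_pastur_bounds(N, T):
    q = N / T
    return 1 + q - 2 * np.sqrt(q), 1 + q + 2 * np.sqrt(q)

def marchenko_pastur_density(lam, N, T):
    """Density for lam > 0; it is zero outside [lam_min, lam_max]."""
    lam_min, lam_max = marchenko_pastur_bounds(N, T)
    lam = np.asarray(lam, dtype=float)
    inside = np.clip((lam_max - lam) * (lam - lam_min), 0.0, None)
    return (T / N) / (2 * np.pi) * np.sqrt(inside) / lam

N, T = 560, 2500
lam_min, lam_max = marchenko_pastur_bounds(N, T)

# Sanity check: the density should integrate to 1 over its support (trapezoid rule).
grid = np.linspace(lam_min, lam_max, 20001)
density = marchenko_pastur_density(grid, N, T)
mass = float(np.sum(0.5 * (density[1:] + density[:-1]) * np.diff(grid)))

# Pure-noise benchmark (assumed data): almost no eigenvalue should exceed lam_max.
rng = np.random.default_rng(0)
X = rng.standard_normal((N, T))
eigenvalues = np.linalg.eigvalsh(X @ X.T / T)
n_above = int(np.sum(eigenvalues > lam_max))
```

On real returns, the eigenvalues counted by `n_above` are the ones read as carrying information (26 in the study above); on pure noise the count stays at or near zero.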
CLUSTERING TIME SERIES
Given a correlation matrix of the returns,
Figure 2: An empirical and noisy correlation matrix
one can re-order assets using a hierarchical clustering algorithm to make the hierarchical correlation pattern blatant,
Figure 3: The same noisy correlation matrix re-ordered by a hierarchical clustering algorithm
and finally filter the noise according to the correlation pattern:
Figure 4: The resulting filtered correlation matrix
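The re-ordering step can be sketched with SciPy's hierarchical clustering: the leaves of the dendrogram give a permutation of the assets that brings correlated ones next to each other. The linkage method, the distance and the three-sector toy returns below are assumptions for illustration; the filtering step is not implemented because the poster does not specify it.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform

def reorder_by_clustering(corr, method="average"):
    """Permute rows and columns of corr following the dendrogram leaf order."""
    dist = np.sqrt(2.0 * np.clip(1.0 - corr, 0.0, None))
    np.fill_diagonal(dist, 0.0)
    order = leaves_list(linkage(squareform(dist, checks=False), method=method))
    return corr[np.ix_(order, order)], order

# Toy returns (assumed): 15 assets in 3 hidden sectors, listed in shuffled order.
rng = np.random.default_rng(0)
labels = rng.permutation(np.repeat(np.arange(3), 5))
T = 1000
market = rng.normal(size=(T, 1))
sector = rng.normal(size=(T, 3))[:, labels]
noise = rng.normal(size=(T, labels.size))
returns = np.sqrt(0.1) * market + np.sqrt(0.6) * sector + np.sqrt(0.3) * noise

corr = np.corrcoef(returns, rowvar=False)
reordered, order = reorder_by_clustering(corr)
print(labels[order])   # sectors should now appear as contiguous blocks
```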
BEYOND CORRELATION
Sklar’s Theorem. For any random vector $X = (X_1, \ldots, X_N)$ having continuous marginal cumulative distribution functions $F_i$, its joint cumulative distribution $F$ is uniquely expressed as
$$F(X_1, \ldots, X_N) = C(F_1(X_1), \ldots, F_N(X_N)),$$
where $C$, the multivariate distribution of uniform marginals, is known as the copula of $X$.
Figure 5: ArcelorMittal and Société générale prices are projected on dependence ⊕ distribution space; notice their
heavy-tailed exponential distribution.
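Let $\theta \in [0, 1]$. Let $(X, Y) \in V^2$. Let $G = (G_X, G_Y)$, where $G_X$ and $G_Y$ are respectively the marginal cdfs of $X$ and $Y$. We define the following distance
$$d_\theta^2(X, Y) = \theta\, d_1^2(G_X(X), G_Y(Y)) + (1 - \theta)\, d_0^2(G_X, G_Y),$$
where $d_1^2(G_X(X), G_Y(Y)) = 3\,\mathbb{E}\left[|G_X(X) - G_Y(Y)|^2\right]$, and
$$d_0^2(G_X, G_Y) = \frac{1}{2} \int_{\mathbb{R}} \left(\sqrt{\frac{dG_X}{d\lambda}} - \sqrt{\frac{dG_Y}{d\lambda}}\right)^2 d\lambda.$$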
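A rough empirical version of this distance can be written as follows. The estimators are assumptions chosen for the sketch: the marginal cdfs are replaced by normalised ranks in $d_1^2$, and the densities in $d_0^2$ are replaced by histograms on a common grid (a discretised Hellinger distance). The function name and the number of bins are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

def gnpr_squared_distance(x, y, theta, bins=50):
    """Empirical d_theta^2 for two series of equal length T."""
    T = len(x)
    # Dependence part: 3 E[|G_X(X) - G_Y(Y)|^2], cdfs replaced by normalised ranks.
    d1_sq = 3.0 * np.mean((rankdata(x) / T - rankdata(y) / T) ** 2)
    # Distribution part: squared Hellinger distance between histogram estimates.
    edges = np.histogram_bin_edges(np.concatenate([x, y]), bins=bins)
    p = np.histogram(x, bins=edges)[0] / T
    q = np.histogram(y, bins=edges)[0] / T
    d0_sq = 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)
    return theta * d1_sq + (1.0 - theta) * d0_sq

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 2.0 * x + 1.0    # same ranks as x, different distribution

d_same = gnpr_squared_distance(x, x, theta=0.5)
d_dependence_only = gnpr_squared_distance(x, y, theta=1.0)
d_distribution_only = gnpr_squared_distance(x, y, theta=0.0)
```

With $\theta = 1$ only the dependence matters, so `x` and `2x + 1` are at distance 0; with $\theta = 0$ only the margins matter and the two series are separated.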
CLUSTERING RESULTS & STABILITY
[Plot: Standard Deviations Histogram. x-axis: standard deviation in basis points; y-axis: number of occurrences]
Figure 6: (Top) The returns correlation structure appears more clearly using rank correlation; (Bottom) Clusters of returns distributions can be partly described by the returns volatility
Figure 7: Stability test on Odd/Even trading days subsampling: our approach (GNPR) yields more stable clusters with respect to this perturbation than standard approaches (using Pearson correlation or L2 distances).
References
[1] Ricardo Coelho, Przemyslaw Repetowicz, Stefan Hutzler, and Peter Richmond. Investigation of Cluster Structure in the London Stock Exchange.
[2] Marco Cuturi and Arnaud Doucet. Fast computation of Wasserstein barycenters. In Proceedings of the 31st International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, pages 685–693, 2014.
[3] Paul Deheuvels. La fonction de dépendance empirique et ses propriétés. Un test non paramétrique d’indépendance. Acad. Roy. Belg. Bull. Cl. Sci.(5), 65(6):274–292, 1979.
[4] Paul Deheuvels. A non-parametric test for independence. Publications de l’Institut de Statistique de l’Université de Paris, 26:29–50, 1981.
[5] Fabrizio Durante and Roberta Pappada. Cluster analysis of time series via Kendall distribution. In Strengthening Links Between Data Analysis and Soft Computing, pages 209–216. Springer, 2015.
[6] Eckhard Liebscher et al. Copula-based dependence measures. Dependence Modeling, 2(1):49–64, 2014.
[7] Rosario N Mantegna. Hierarchical structure in financial markets. The European Physical Journal B-Condensed Matter and Complex Systems, 11(1):193–197, 1999.
[8] Gautier Marti, Sébastien Andler, Frank Nielsen, and Philippe Donnat. Clustering financial time series: How long is enough? Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9-15 July 2016, pages 2583–2589, 2016.
[9] Gautier Marti, Sébastien Andler, Frank Nielsen, and Philippe Donnat. Exploring and measuring non-linear correlations: Copulas, lightspeed transportation and clustering. NIPS 2016 Time Series Workshop, 55, 2016.
[10] Gautier Marti, Sébastien Andler, Frank Nielsen, and Philippe Donnat. Optimal transport vs. Fisher-Rao distance between copulas for clustering multivariate time series. In IEEE Statistical Signal Processing Workshop, SSP 2016, Palma de Mallorca, Spain, June 26-29, 2016, pages 1–5, 2016.
[11] Gautier Marti, Philippe Very, Philippe Donnat, and Frank Nielsen. A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series. In 14th IEEE International Conference on Machine Learning and Applications, ICMLA 2015, Miami, FL, USA, December 9-11, 2015, pages 32–37, 2015.
[12] Barnabás Póczos, Zoubin Ghahramani, and Jeff G. Schneider. Copula-based kernel dependency measures. In Proceedings of the 29th International Conference on Machine Learning, ICML 2012, Edinburgh, Scotland, UK, June 26 - July 1, 2012, 2012.
[13] A Sklar. Fonctions de répartition à n dimensions et leurs marges. Université Paris 8, 1959.

More Related Content

PDF
Clustering Financial Time Series: How Long is Enough?
PDF
Optimal Transport between Copulas for Clustering Time Series
PDF
A closer look at correlations
PDF
On clustering financial time series - A need for distances between dependent ...
PDF
Optimal Transport vs. Fisher-Rao distance between Copulas
PDF
Clustering Financial Time Series using their Correlations and their Distribut...
PDF
Some contributions to the clustering of financial time series - Applications ...
PDF
On the stability of clustering financial time series
Clustering Financial Time Series: How Long is Enough?
Optimal Transport between Copulas for Clustering Time Series
A closer look at correlations
On clustering financial time series - A need for distances between dependent ...
Optimal Transport vs. Fisher-Rao distance between Copulas
Clustering Financial Time Series using their Correlations and their Distribut...
Some contributions to the clustering of financial time series - Applications ...
On the stability of clustering financial time series

What's hot (20)

PDF
A review of two decades of correlations, hierarchies, networks and clustering...
PDF
Clustering Random Walk Time Series
PDF
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
PDF
ABC in Varanasi
PDF
A Maximum Entropy Approach to the Loss Data Aggregation Problem
PDF
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
PDF
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
PDF
MCQMC 2020 talk: Importance Sampling for a Robust and Efficient Multilevel Mo...
PDF
Numerical smoothing and hierarchical approximations for efficient option pric...
PDF
Bayesian model choice in cosmology
PDF
SwingOptions
PDF
31 Machine Learning Unsupervised Cluster Validity
PDF
Using Vector Clocks to Visualize Communication Flow
PDF
Econophysics III: Financial Correlations and Portfolio Optimization - Thomas ...
PDF
Affine Term Structure Model with Stochastic Market Price of Risk
PDF
Dependent processes in Bayesian Nonparametrics
PDF
Affine cascade models for term structure dynamics of sovereign yield curves
PDF
11.the comparative study of finite difference method and monte carlo method f...
PDF
Uncertain Volatility Models
PDF
Pricing interest rate derivatives (ext)
A review of two decades of correlations, hierarchies, networks and clustering...
Clustering Random Walk Time Series
Autoregressive Convolutional Neural Networks for Asynchronous Time Series
ABC in Varanasi
A Maximum Entropy Approach to the Loss Data Aggregation Problem
Cari2020 Parallel Hybridization for SAT: An Efficient Combination of Search S...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
MCQMC 2020 talk: Importance Sampling for a Robust and Efficient Multilevel Mo...
Numerical smoothing and hierarchical approximations for efficient option pric...
Bayesian model choice in cosmology
SwingOptions
31 Machine Learning Unsupervised Cluster Validity
Using Vector Clocks to Visualize Communication Flow
Econophysics III: Financial Correlations and Portfolio Optimization - Thomas ...
Affine Term Structure Model with Stochastic Market Price of Risk
Dependent processes in Bayesian Nonparametrics
Affine cascade models for term structure dynamics of sovereign yield curves
11.the comparative study of finite difference method and monte carlo method f...
Uncertain Volatility Models
Pricing interest rate derivatives (ext)
Ad

Similar to Clustering CDS: algorithms, distances, stability and convergence rates (12)

PDF
On Clustering Financial Time Series - Beyond Correlation
PPTX
Clusters (4).pptx
DOCX
Pricing: CDS, CDO, Copula funtion
PDF
Credit Correlation Life After Copulas Alexander Lipton Andrew Rennie
PDF
"Building Diversified Portfolios that Outperform Out-of-Sample" by Dr. Marcos...
PDF
Real time clustering of time series
PDF
Dissertation (2)
PDF
Pairs Trading: Optimizing via Mixed Copula versus Distance Method for S&P 5...
PPTX
TYPES OF CLUSTERING.pptx
PDF
Dependence Modeling Vine Copula Handbook Dorota Kurowicka Harry Joe
PPTX
Aj Copulas V4
PPTX
Unsupervised Learning-Clustering Algorithms.pptx
On Clustering Financial Time Series - Beyond Correlation
Clusters (4).pptx
Pricing: CDS, CDO, Copula funtion
Credit Correlation Life After Copulas Alexander Lipton Andrew Rennie
"Building Diversified Portfolios that Outperform Out-of-Sample" by Dr. Marcos...
Real time clustering of time series
Dissertation (2)
Pairs Trading: Optimizing via Mixed Copula versus Distance Method for S&P 5...
TYPES OF CLUSTERING.pptx
Dependence Modeling Vine Copula Handbook Dorota Kurowicka Harry Joe
Aj Copulas V4
Unsupervised Learning-Clustering Algorithms.pptx
Ad

More from Gautier Marti (9)

PDF
Using Large Language Models in 10 Lines of Code
PDF
What deep learning can bring to...
PDF
A quick demo of Top2Vec With application on 2020 10-K business descriptions
PDF
cCorrGAN: Conditional Correlation GAN for Learning Empirical Conditional Dist...
PDF
How deep generative models can help quants reduce the risk of overfitting?
PDF
Generating Realistic Synthetic Data in Finance
PDF
Applications of GANs in Finance
PDF
My recent attempts at using GANs for simulating realistic stocks returns
PDF
Takeaways from ICML 2019, Long Beach, California
Using Large Language Models in 10 Lines of Code
What deep learning can bring to...
A quick demo of Top2Vec With application on 2020 10-K business descriptions
cCorrGAN: Conditional Correlation GAN for Learning Empirical Conditional Dist...
How deep generative models can help quants reduce the risk of overfitting?
Generating Realistic Synthetic Data in Finance
Applications of GANs in Finance
My recent attempts at using GANs for simulating realistic stocks returns
Takeaways from ICML 2019, Long Beach, California

Recently uploaded (20)

PPTX
Qualitative Qantitative and Mixed Methods.pptx
PPTX
IB Computer Science - Internal Assessment.pptx
PDF
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
PDF
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
PDF
[EN] Industrial Machine Downtime Prediction
PDF
Introduction to Data Science and Data Analysis
PPTX
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
PPTX
Data_Analytics_and_PowerBI_Presentation.pptx
PPTX
Introduction to Basics of Ethical Hacking and Penetration Testing -Unit No. 1...
PPTX
Introduction to machine learning and Linear Models
PPTX
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
PDF
Business Analytics and business intelligence.pdf
PPTX
IBA_Chapter_11_Slides_Final_Accessible.pptx
PPTX
Introduction to Knowledge Engineering Part 1
PDF
Mega Projects Data Mega Projects Data
PPTX
STUDY DESIGN details- Lt Col Maksud (21).pptx
PPT
Reliability_Chapter_ presentation 1221.5784
PPTX
MODULE 8 - DISASTER risk PREPAREDNESS.pptx
PDF
Introduction to the R Programming Language
PDF
Galatica Smart Energy Infrastructure Startup Pitch Deck
Qualitative Qantitative and Mixed Methods.pptx
IB Computer Science - Internal Assessment.pptx
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
[EN] Industrial Machine Downtime Prediction
Introduction to Data Science and Data Analysis
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
Data_Analytics_and_PowerBI_Presentation.pptx
Introduction to Basics of Ethical Hacking and Penetration Testing -Unit No. 1...
Introduction to machine learning and Linear Models
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
Business Analytics and business intelligence.pdf
IBA_Chapter_11_Slides_Final_Accessible.pptx
Introduction to Knowledge Engineering Part 1
Mega Projects Data Mega Projects Data
STUDY DESIGN details- Lt Col Maksud (21).pptx
Reliability_Chapter_ presentation 1221.5784
MODULE 8 - DISASTER risk PREPAREDNESS.pptx
Introduction to the R Programming Language
Galatica Smart Energy Infrastructure Startup Pitch Deck

Clustering CDS: algorithms, distances, stability and convergence rates

  • 1. HELLEBORECAPITAL Introduction The standard methodology Exploring dependence between returns Copula-based dependence coefficients (clustering distances) Empirical convergence rates Beyond dependence: a (copula,margins) representation Clustering CDS: algorithms, distances, stability and convergence rates CMStatistics 2016, University of Seville, Spain Gautier Marti, Frank Nielsen, Philippe Donnat HELLEBORECAPITAL December 9, 2016 Gautier Marti Clustering CDS: algorithms, distances, stability and convergence r
  • 2. HELLEBORECAPITAL Introduction The standard methodology Exploring dependence between returns Copula-based dependence coefficients (clustering distances) Empirical convergence rates Beyond dependence: a (copula,margins) representation 1 Introduction 2 The standard methodology 3 Exploring dependence between returns 4 Copula-based dependence coefficients (clustering distances) 5 Empirical convergence rates 6 Beyond dependence: a (copula,margins) representation Gautier Marti Clustering CDS: algorithms, distances, stability and convergence r
  • 3. HELLEBORECAPITAL Introduction The standard methodology Exploring dependence between returns Copula-based dependence coefficients (clustering distances) Empirical convergence rates Beyond dependence: a (copula,margins) representation Introduction Goal: Finding groups of ’homogeneous’ assets that can help to: • build alternative measures of risk, • elaborate trading strategies. . . But, we need a high confidence in these clusters (networks). So, we need appropriate AND fast converging methodologies [8]: to be consistent yet efficient (bias–variance tradeoff), to avoid non-stationarity of the time series (too large sample). A good model selection criterion: Minimum sample size to reach a given ’accuracy’. Gautier Marti Clustering CDS: algorithms, distances, stability and convergence r
  • 5. The standard methodology - description
      The methodology widely adopted in empirical studies [7]:
      Let $N$ be the number of assets. Let $P_i(t)$ be the price at time $t$ of asset $i$, $1 \le i \le N$. Let $r_i(t)$ be the log-return at time $t$ of asset $i$: $r_i(t) = \log P_i(t) - \log P_i(t-1)$.
      For each pair $i, j$ of assets, compute their correlation, where $\langle \cdot \rangle$ denotes the average over time: $\rho_{ij} = \frac{\langle r_i r_j \rangle - \langle r_i \rangle \langle r_j \rangle}{\sqrt{(\langle r_i^2 \rangle - \langle r_i \rangle^2)(\langle r_j^2 \rangle - \langle r_j \rangle^2)}}$.
      Convert the correlation coefficients $\rho_{ij}$ into distances: $d_{ij} = \sqrt{2(1 - \rho_{ij})}$.
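The three steps on this slide (log-returns, pairwise correlation, conversion to a distance) can be sketched in plain Python. This is only an illustration: the price series are made-up toy numbers, the function names are hypothetical, and the distance uses the square-root form $d = \sqrt{2(1-\rho)}$ of [7].

```python
import math

def log_returns(prices):
    """Log-returns r(t) = log P(t) - log P(t-1) of one price series."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]

def pearson(x, y):
    """Pearson correlation written with time averages <.>."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    mxy = sum(a * b for a, b in zip(x, y)) / n
    vx = sum(a * a for a in x) / n - mx * mx
    vy = sum(b * b for b in y) / n - my * my
    return (mxy - mx * my) / math.sqrt(vx * vy)

def correlation_distance(rho):
    """d = sqrt(2 (1 - rho)); max(...) guards against rounding when rho is ~1."""
    return math.sqrt(max(0.0, 2.0 * (1.0 - rho)))

# Hypothetical toy prices for three assets (illustration only).
prices = {
    "A": [100.0, 101.0, 100.5, 102.0, 103.5, 103.0],
    "B": [50.0, 50.6, 50.2, 51.1, 51.9, 51.5],
    "C": [80.0, 79.0, 79.8, 78.5, 77.9, 78.6],
}
returns = {name: log_returns(p) for name, p in prices.items()}
names = sorted(returns)
dist = {(i, j): correlation_distance(pearson(returns[i], returns[j]))
        for i in names for j in names}
```

In this toy data A and B move together while C moves against them, so `dist[("A", "B")]` comes out smaller than `dist[("A", "C")]`; the distance ranges from 0 (perfect correlation) to 2 (perfect anti-correlation).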
  • 6. The standard methodology - description
      From all the distances $d_{ij}$, compute a minimum spanning tree.
      Figure: A minimum spanning tree of stocks (from [1]); stocks from the same industry (represented by color) tend to cluster together.
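As a rough illustration of the minimum spanning tree step, here is a small Prim's algorithm over a dense distance matrix. The slides do not say which MST algorithm was used, so Prim's is an assumption, and the 4 × 4 matrix is a hypothetical example.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense symmetric distance matrix (list of lists).

    Returns the N - 1 tree edges as (i, j, d_ij) tuples.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the closest node that is not in the tree yet.
        u = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        # Update the cheapest links of the remaining nodes.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges

# Hypothetical distances between four assets: {0, 1} and {2, 3} are close pairs.
toy = [
    [0.0, 0.3, 1.2, 1.3],
    [0.3, 0.0, 1.1, 1.4],
    [1.2, 1.1, 0.0, 0.4],
    [1.3, 1.4, 0.4, 0.0],
]
tree = minimum_spanning_tree(toy)
```

The tree keeps the two short within-pair links and a single bridge between the pairs. Cutting the longest edges of such a tree is what yields Single Linkage clusters.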
  • 7. The standard methodology - limitations
      • MST clustering is equivalent to Single Linkage clustering:
          • chaining phenomenon
          • not stable to noise / small perturbations [11]
      • Use of the Pearson correlation:
          • can take the value 0 even though the variables are strongly dependent
          • not invariant to monotone transformations of the variables
          • not robust to outliers
      Is it still useful for financial time series? For stocks? For CDS?
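Two of the Pearson limitations listed here are easy to check numerically. The sketch below uses made-up deterministic data: a variable that is a function of another yet has zero linear correlation with it, and a monotone transformation that changes Pearson's coefficient but leaves the rank (Spearman) correlation at 1. The helper names are hypothetical.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(x):
    """Rank of each value (1 = smallest); assumes no ties."""
    order = sorted(range(len(x)), key=lambda k: x[k])
    r = [0] * len(x)
    for position, k in enumerate(order, start=1):
        r[k] = position
    return r

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# 1) Zero correlation despite full dependence: y is a function of x.
x = [k / 10.0 for k in range(-10, 11)]      # symmetric grid on [-1, 1]
y = [v * v for v in x]
rho_dependent = pearson(x, y)

# 2) A monotone transformation changes Pearson but not the rank correlation.
u = [0.5 * k for k in range(1, 11)]
w = [math.exp(v) for v in u]
pearson_after = pearson(u, w)
spearman_after = spearman(u, w)
```

Here `rho_dependent` is zero up to rounding, `pearson_after` is clearly below 1 although `w` is an increasing function of `u`, and `spearman_after` equals 1.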
  • 10. Copulas
      Sklar's Theorem [13]: For $(X_i, X_j)$ having continuous marginal cdfs $F_{X_i}, F_{X_j}$, its joint cumulative distribution $F$ is uniquely expressed as $F(X_i, X_j) = C(F_{X_i}(X_i), F_{X_j}(X_j))$, where $C$ is known as the copula of $(X_i, X_j)$.
      The copula, i.e. the joint distribution of the uniform marginals, encodes all the dependence.
  • 11. From ranks to empirical copula
      $r_i, r_j$ are the rank statistics of $X_i, X_j$ respectively, i.e. $r_i^t$ is the rank of $X_i^t$ in $\{X_i^1, \ldots, X_i^T\}$: $r_i^t = \sum_{k=1}^{T} \mathbf{1}\{X_i^k \le X_i^t\}$.
      Deheuvels' empirical copula [3]: any copula $\hat{C}$ defined on the lattice $L = \{(\frac{t_i}{T}, \frac{t_j}{T}) : t_i, t_j = 0, \ldots, T\}$ by $\hat{C}(\frac{t_i}{T}, \frac{t_j}{T}) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}\{r_i^t \le t_i, r_j^t \le t_j\}$ is an empirical copula.
      $\hat{C}$ is a consistent estimator of $C$ with uniform convergence [4].
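A direct, unoptimized transcription of the rank and empirical-copula definitions may help make them concrete. The sample values are made up, the function names are hypothetical, and the triple loop costs O(T³), so this is only meant for tiny inputs.

```python
def ranks(x):
    """r^t = #{k : x[k] <= x[t]}, the rank of x[t] within the sample."""
    return [sum(1 for xk in x if xk <= xt) for xt in x]

def empirical_copula(x, y):
    """Return a (T+1) x (T+1) table with table[ti][tj] = C_hat(ti/T, tj/T)."""
    T = len(x)
    rx, ry = ranks(x), ranks(y)
    table = [[0.0] * (T + 1) for _ in range(T + 1)]
    for ti in range(T + 1):
        for tj in range(T + 1):
            # Count observations whose two ranks are below (ti, tj).
            count = sum(1 for a, b in zip(rx, ry) if a <= ti and b <= tj)
            table[ti][tj] = count / T
    return table

# Hypothetical sample of T = 5 joint observations without ties.
x = [0.3, -1.2, 2.5, 0.9, -0.4]
y = [1.1, -0.7, 3.0, 0.2, -0.1]
C = empirical_copula(x, y)
```

Without ties, the table has the expected copula-like boundary behaviour on the lattice: `C[0][k] == 0`, `C[T][T] == 1`, and `C[k][T] == k / T`.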
  • 12. Clustering of bivariate empirical copulas
      Generate the $\binom{N}{2}$ bivariate empirical copulas.
      Find clusters of copulas using optimal transport [10, 9].
      Compute and display the clusters' centroids [2].
      Some code available at www.datagrapple.com/Tech.
  • 13. Copula-centers for stocks (CAC 40)
      Figure: Stocks: more mass in the bottom-left corner, i.e. lower tail dependence. Stock prices tend to plummet together.
  • 14. Copula-centers for Credit Default Swaps (XO index)
      Figure: Credit default swaps: more mass in the top-right corner, i.e. upper tail dependence. The cost of insurance against entities' default tends to soar in stressed markets.
  • 16. Dependence as relative distances between copulas
      $C$ is the copula of $(X_i, X_j)$; $|u - v| / \sqrt{2}$ is the distance from $(u, v)$ to the diagonal.
      Spearman's $\rho_S$: $\rho_S(X_i, X_j) = 12 \int_0^1 \int_0^1 (C(u, v) - uv) \, du \, dv = 1 - 6 \int_0^1 \int_0^1 (u - v)^2 \, dC(u, v)$.
      Many correlation coefficients can be expressed as distances to the Fréchet–Hoeffding bounds or to independence [6]. Some are explicitly built this way (e.g. [12, 5, 9]).
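The second expression for Spearman's coefficient can be checked numerically by integrating $(u - v)^2$ against the empirical copula, which puts mass $1/T$ on each point $(r_i^t / T, r_j^t / T)$. The sketch below compares this with the textbook rank formula; the data are made up and the function names are hypothetical. The two values differ by a factor $(T^2 - 1)/T^2$ because the empirical margins are discrete rather than exactly uniform.

```python
def ranks(x):
    """r^t = #{k : x[k] <= x[t]}; assumes no ties."""
    return [sum(1 for xk in x if xk <= xt) for xt in x]

def spearman_from_copula(x, y):
    """1 - 6 * integral of (u - v)^2 dC_hat, mass 1/T on each (r_i/T, r_j/T)."""
    T = len(x)
    rx, ry = ranks(x), ranks(y)
    return 1.0 - 6.0 * sum(((a - b) / T) ** 2 for a, b in zip(rx, ry)) / T

def spearman_classic(x, y):
    """Textbook formula 1 - 6 * sum(d^2) / (T (T^2 - 1)), valid without ties."""
    T = len(x)
    rx, ry = ranks(x), ranks(y)
    return 1.0 - 6.0 * sum((a - b) ** 2 for a, b in zip(rx, ry)) / (T * (T * T - 1))

x = [float(k) for k in range(200)]
y_up = [v ** 3 for v in x]      # increasing function of x
y_down = [-v for v in x]        # decreasing function of x
```

For `y_up` both functions return 1, since all the mass sits on the diagonal $u = v$. For `y_down` the classic formula gives -1 and the copula version agrees up to a term of order $1/T^2$.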
  • 17. A metric space for copulas: Optimal Transport
  • 18. The Target/Forget Dependence Coefficient (TFDC)
      Now, we can define our bespoke dependence coefficient:
      Build the forget-dependence copulas $\{C_l^F\}_l$.
      Build the target-dependence copulas $\{C_k^T\}_k$.
      Compute the empirical copula $C_{ij}$ from $x_i, x_j$.
      $\mathrm{TFDC}(C_{ij}) = \frac{\min_l D(C_l^F, C_{ij})}{\min_l D(C_l^F, C_{ij}) + \min_k D(C_{ij}, C_k^T)}$
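The TFDC ratio itself is simple once a distance $D$ between copulas is available. The sketch below is a toy version under strong assumptions that are not taken from the slides: copulas are represented by their cdf on a coarse lattice, $D$ is replaced by a plain mean absolute difference between cdf tables (a stand-in for the optimal-transport distance), the only forget-dependence copula is independence $uv$, and the only target-dependence copula is the comonotone copula $\min(u, v)$. All names are hypothetical.

```python
def copula_table(cdf, n=20):
    """Evaluate a copula cdf on the lattice {(a/n, b/n) : a, b = 0..n}."""
    return [[cdf(a / n, b / n) for b in range(n + 1)] for a in range(n + 1)]

def stand_in_distance(c1, c2):
    """Mean absolute difference between two cdf tables.

    NOT the optimal-transport distance of the slides; a simple placeholder.
    """
    cells = [abs(p - q) for row1, row2 in zip(c1, c2) for p, q in zip(row1, row2)]
    return sum(cells) / len(cells)

def tfdc(c_ij, forget, target, distance=stand_in_distance):
    """Relative closeness to the target copulas versus the forget copulas.

    Undefined (division by zero) if c_ij is both a forget and a target copula.
    """
    d_forget = min(distance(cf, c_ij) for cf in forget)
    d_target = min(distance(c_ij, ct) for ct in target)
    return d_forget / (d_forget + d_target)

independence = copula_table(lambda u, v: u * v)       # assumed forget copula
comonotone = copula_table(lambda u, v: min(u, v))     # assumed target copula
mixture = copula_table(lambda u, v: 0.5 * u * v + 0.5 * min(u, v))

forget, target = [independence], [comonotone]
```

With these choices the coefficient is 0 for the independence copula, 1 for the comonotone copula, and 0.5 for their equal-weight mixture. Swapping in another `distance` function, or other forget and target sets, only requires changing the arguments.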
  • 19. Spearman vs. TFDC
      Plot: Spearman & TFDC values as a function of $a$ (x-axis: discontinuity position $a$, from 0 to 1; y-axis: estimated positive dependence, from 0 to 1; curves: TFDC, Spearman).
      Figure: Empirical copulas for $(X, Y)$ where $X = Z \mathbf{1}\{Z < a\} + \epsilon_X \mathbf{1}\{Z > a\}$, $Y = Z \mathbf{1}\{Z < a + 0.25\} + \epsilon_Y \mathbf{1}\{Z > a + 0.25\}$, $a = 0, 0.05, \ldots, 0.95, 1$, and where $Z$ is uniform on $[0, 1]$ and $\epsilon_X, \epsilon_Y$ are independent noises (left). TFDC and Spearman coefficients estimated between $X$ and $Y$ as a function of $a$ (right). For $a = 0.75$, the Spearman coefficient yields a negative value, yet $X = Y$ over $[0, a]$.
  • 21. Process: Recovering a simulated ground-truth [8]
      A simulation & benchmark process that needs to be refined:
      Extract (using a large sample) a filtered correlation matrix $R$.
  • 22. Process: Recovering a simulated ground-truth [8]
      A simulation & benchmark process that needs to be refined:
      Generate samples of size $T = 10, \ldots, 20, \ldots$ from a relevant distribution (parameterized by $R$).
  • 23. Process: Recovering a simulated ground-truth [8]
      A simulation & benchmark process that needs to be refined:
      Compute the ratio of the number of correct clusterings obtained over the number of trials, as a function of $T$.
      Plots (x-axis: sample size, from 100 to 500; y-axis: score, from 0 to 1): empirical rates of convergence for Single Linkage, for Average Linkage and for Ward, each with four curves: Gaussian - Pearson, Gaussian - Spearman, Student - Pearson, Student - Spearman.
      A full comparative study will be posted online at www.datagrapple.com/Tech.
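A stripped-down version of this benchmark can be written in a few lines. The sketch below does not reproduce the study: it assumes a known block structure instead of a filtered correlation matrix $R$ extracted from data, draws Gaussian samples from a one-factor-per-cluster model, and only tests Single Linkage with Pearson correlation. All names and parameter values are hypothetical.

```python
import math
import random

def sample_returns(T, labels, rho, rng):
    """Gaussian factor model: correlation rho within a cluster, 0 across clusters."""
    n_clusters = max(labels) + 1
    series = [[] for _ in labels]
    for _ in range(T):
        factors = [rng.gauss(0.0, 1.0) for _ in range(n_clusters)]
        for i, k in enumerate(labels):
            series[i].append(math.sqrt(rho) * factors[k]
                             + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0))
    return series

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def single_linkage(series, n_clusters):
    """Single Linkage cut at n_clusters: merge closest pairs (Kruskal + union-find)."""
    n = len(series)
    root = list(range(n))
    def find(i):
        while root[i] != i:
            i = root[i]
        return i
    # Sorting by 1 - rho gives the same order as sorting by sqrt(2 (1 - rho)).
    pairs = sorted((1.0 - pearson(series[i], series[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    remaining = n
    for _, i, j in pairs:
        if remaining == n_clusters:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            root[rj] = ri
            remaining -= 1
    return [find(i) for i in range(n)]

def same_partition(a, b):
    n = len(a)
    return all((a[i] == a[j]) == (b[i] == b[j]) for i in range(n) for j in range(n))

def recovery_score(T, labels, rho=0.8, trials=30, seed=0):
    """Fraction of trials in which the clustering equals the ground truth."""
    rng = random.Random(seed)
    hits = sum(same_partition(single_linkage(sample_returns(T, labels, rho, rng),
                                             max(labels) + 1), labels)
               for _ in range(trials))
    return hits / trials

truth = [0, 0, 0, 1, 1, 1]      # assumed ground truth: two clusters of three assets
scores = {T: recovery_score(T, truth) for T in (5, 20, 200)}
```

Plotting `scores` against `T` gives one empirical convergence curve. Repeating the loop with another linkage, another correlation coefficient, or a heavier-tailed sampler would give the other curves of the comparison.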
  • 25. ON CLUSTERING FINANCIAL TIME SERIES. Gautier Marti, Philippe Donnat and Frank Nielsen
      NOISY CORRELATION MATRICES
      Let $X$ be the matrix storing the standardized returns of $N = 560$ assets (credit default swaps) over a period of $T = 2500$ trading days. Then, the empirical correlation matrix of the returns is $C = \frac{1}{T} X X^\top$.
      We can compute the empirical density of its eigenvalues $\rho(\lambda) = \frac{1}{N} \frac{dn(\lambda)}{d\lambda}$, where $n(\lambda)$ counts the number of eigenvalues of $C$ less than $\lambda$.
      From random matrix theory, the Marchenko-Pastur distribution gives the limit distribution as $N \to \infty$, $T \to \infty$ with $T/N$ fixed. It reads: $\rho(\lambda) = \frac{T/N}{2\pi} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda}$, where $\lambda_{\max}, \lambda_{\min} = 1 + N/T \pm 2\sqrt{N/T}$ and $\lambda \in [\lambda_{\min}, \lambda_{\max}]$.
      Figure 1: Marchenko-Pastur density vs. empirical density of the correlation matrix eigenvalues (x-axis: $\lambda$; y-axis: $\rho(\lambda)$).
      Notice that the Marchenko-Pastur density fits the empirical density well, meaning that most of the information contained in the empirical correlation matrix amounts to noise: only 26 eigenvalues are greater than $\lambda_{\max}$. The highest eigenvalue corresponds to the 'market'; the 25 others can be associated with 'industrial sectors'.
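To get a feel for the numbers, the Marchenko-Pastur edges and density can be evaluated for the poster's $N = 560$ and $T = 2500$. The sketch below assumes unit-variance (standardized) returns, as stated above; the function names are hypothetical.

```python
import math

def mp_edges(N, T):
    """Lower and upper edges 1 + N/T -/+ 2 sqrt(N/T) of the Marchenko-Pastur law."""
    q = N / T
    return 1.0 + q - 2.0 * math.sqrt(q), 1.0 + q + 2.0 * math.sqrt(q)

def mp_density(lam, N, T):
    """Marchenko-Pastur density for unit-variance entries (zero outside the edges)."""
    lam_min, lam_max = mp_edges(N, T)
    if not lam_min <= lam <= lam_max:
        return 0.0
    return (T / N) / (2.0 * math.pi) * math.sqrt((lam_max - lam) * (lam - lam_min)) / lam

N, T = 560, 2500
lam_min, lam_max = mp_edges(N, T)

# Midpoint-rule sanity check that the density integrates to about 1.
steps = 20000
h = (lam_max - lam_min) / steps
total = sum(mp_density(lam_min + (k + 0.5) * h, N, T) for k in range(steps)) * h
```

For these values the noise band is roughly $[0.28, 2.17]$, so eigenvalues of the empirical correlation matrix above about 2.17 are the ones that stand out from pure noise.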
      CLUSTERING TIME SERIES
      Given a correlation matrix of the returns (Figure 2: An empirical and noisy correlation matrix), one can re-order assets using a hierarchical clustering algorithm to make the hierarchical correlation pattern blatant (Figure 3: The same noisy correlation matrix re-ordered by a hierarchical clustering algorithm), and finally filter the noise according to the correlation pattern (Figure 4: The resulting filtered correlation matrix).
      BEYOND CORRELATION
      Sklar's Theorem. For any random vector $X = (X_1, \ldots, X_N)$ having continuous marginal cumulative distribution functions $F_i$, its joint cumulative distribution $F$ is uniquely expressed as $F(X_1, \ldots, X_N) = C(F_1(X_1), \ldots, F_N(X_N))$, where $C$, the multivariate distribution of uniform marginals, is known as the copula of $X$.
      Figure 5: ArcelorMittal and Société générale prices are projected on dependence ⊕ distribution space; notice their heavy-tailed exponential distribution.
      Let $\theta \in [0, 1]$. Let $(X, Y) \in \mathcal{V}^2$. Let $G = (G_X, G_Y)$, where $G_X$ and $G_Y$ are respectively the marginal cdfs of $X$ and $Y$. We define the following distance: $d_\theta^2(X, Y) = \theta \, d_1^2(G_X(X), G_Y(Y)) + (1 - \theta) \, d_0^2(G_X, G_Y)$, where $d_1^2(G_X(X), G_Y(Y)) = 3 \, \mathbb{E}[|G_X(X) - G_Y(Y)|^2]$ and $d_0^2(G_X, G_Y) = \frac{1}{2} \int_{\mathbb{R}} \left( \sqrt{\frac{dG_X}{d\lambda}} - \sqrt{\frac{dG_Y}{d\lambda}} \right)^2 d\lambda$.
      CLUSTERING RESULTS & STABILITY
      Figure 6 (includes a histogram of standard deviations; x-axis: standard deviation in basis points; y-axis: number of occurrences): (Top) The returns correlation structure appears more clearly using rank correlation; (Bottom) Clusters of returns distributions can be partly described by the returns volatility.
      Figure 7: Stability test on Odd/Even trading days subsampling: our approach (GNPR) yields more stable clusters with respect to this perturbation than standard approaches (using Pearson correlation or L2 distances).
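A rough empirical version of the distance $d_\theta$ can be sketched as follows. The estimators are assumptions made for illustration, not the poster's implementation: the cdf values $G_X(X)$, $G_Y(Y)$ are replaced by normalized ranks, and $d_0^2$ is computed as a squared Hellinger distance between two histograms on a shared grid. Function names, the number of bins and the data are hypothetical.

```python
import math

def ranks(x):
    """r^t = #{k : x[k] <= x[t]}; assumes no ties."""
    return [sum(1 for xk in x if xk <= xt) for xt in x]

def d1_squared(x, y):
    """3 * E[(G_X(X) - G_Y(Y))^2], with the cdfs replaced by normalized ranks."""
    T = len(x)
    rx, ry = ranks(x), ranks(y)
    return 3.0 * sum(((a - b) / T) ** 2 for a, b in zip(rx, ry)) / T

def d0_squared(x, y, bins=10):
    """Squared Hellinger distance between histogram estimates of the two margins."""
    lo, hi = min(min(x), min(y)), max(max(x), max(y))
    width = (hi - lo) / bins or 1.0      # avoid a zero bin width for constant data
    def histogram(z):
        counts = [0] * bins
        for v in z:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [c / len(z) for c in counts]
    p, q = histogram(x), histogram(y)
    return 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def d_theta_squared(x, y, theta):
    """theta weighs the dependence part against the distribution part."""
    return theta * d1_squared(x, y) + (1.0 - theta) * d0_squared(x, y)

x = [(7 * k % 50) / 50.0 for k in range(50)]    # 50 distinct made-up values
y_wide = [10.0 * v for v in x]                  # same ranks, different margin
y_opp = [-v for v in x]                         # reversed ranks
```

The two parts separate as intended: `y_wide` is at dependence distance 0 from `x` but at a positive distribution distance, while `y_opp` is close to the maximal dependence distance of 1. With $\theta = 1$ only the dependence matters, and with $\theta = 0$ only the margins do.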
  • 26. References
      [1] Ricardo Coelho, Przemyslaw Repetowicz, Stefan Hutzler, and Peter Richmond. Investigation of Cluster Structure in the London Stock Exchange.
      [2] Marco Cuturi and Arnaud Doucet. Fast computation of Wasserstein barycenters. In Proceedings of the 31th International Conference on Machine Learning, ICML 2014, Beijing, China, 21-26 June 2014, pages 685–693, 2014.
      [3] Paul Deheuvels. La fonction de dépendance empirique et ses propriétés. Un test non paramétrique d'indépendance. Acad. Roy. Belg. Bull. Cl. Sci. (5), 65(6):274–292, 1979.
      [4] Paul Deheuvels. A non-parametric test for independence. Publications de l'Institut de Statistique de l'Université de Paris, 26:29–50, 1981.
      [5] Fabrizio Durante and Roberta Pappada. Cluster analysis of time series via Kendall distribution. In Strengthening Links Between Data Analysis and Soft Computing, pages 209–216. Springer, 2015.
      [6] Eckhard Liebscher et al. Copula-based dependence measures. Dependence Modeling, 2(1):49–64, 2014.
      [7] Rosario N. Mantegna. Hierarchical structure in financial markets. The European Physical Journal B - Condensed Matter and Complex Systems, 11(1):193–197, 1999.
      [8] Gautier Marti, Sébastien Andler, Frank Nielsen, and Philippe Donnat. Clustering financial time series: How long is enough? In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9-15 July 2016, pages 2583–2589, 2016.
      [9] Gautier Marti, Sébastien Andler, Frank Nielsen, and Philippe Donnat. Exploring and measuring non-linear correlations: Copulas, lightspeed transportation and clustering. NIPS 2016 Time Series Workshop, 55, 2016.
      [10] Gautier Marti, Sébastien Andler, Frank Nielsen, and Philippe Donnat. Optimal transport vs. Fisher-Rao distance between copulas for clustering multivariate time series. In IEEE Statistical Signal Processing Workshop, SSP 2016, Palma de Mallorca, Spain, June 26-29, 2016, pages 1–5, 2016.
      [11] Gautier Marti, Philippe Very, Philippe Donnat, and Frank Nielsen. A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series. In 14th IEEE International Conference on Machine Learning and Applications, ICMLA 2015, Miami, FL, USA, December 9-11, 2015, pages 32–37, 2015.
      [12] Barnabás Póczos, Zoubin Ghahramani, and Jeff G. Schneider. Copula-based kernel dependency measures. In Proceedings of the 29th International Conference on Machine Learning, ICML 2012, Edinburgh, Scotland, UK, June 26 - July 1, 2012, 2012.
      [13] A. Sklar. Fonctions de répartition à n dimensions et leurs marges. Université Paris 8, 1959.