Gautier Marti, COMPLEX NETWORKS 2022
What deep learning can bring to
two decades of correlation, hierarchies, networks and clustering in financial markets
From seminal paper (1999)
to recent state of the art (2020)
Conclusion from sota review:
Deep Learning is not a widely used tool (yet?)
Quant trader concerns
Problems poorly addressed by the literature
• Which datasets are relevant to build financial networks between companies, and to predict what?
• We cannot use future data, so we must use a rolling or expanding window: How long is enough?
• (Too) many clustering and network methods are available: Which one should we use, and why?
• Datasets are very expensive, IP-protected, and not very suitable for academic research; this explains the focus on stock returns...
• Many studies are full-sample without out-of-sample validation: Prediction is not the focus.
• No well-defined benchmarks: This makes it hard to compare methods.
How long is enough?
for my rolling window...
• Deep Learning for simulations, and finding 'laws' in large amounts of data.
How much data is necessary?
One possible criterion to choose amongst methods
The Hierarchical Correlation Block Model (HCBM)
is a convenient assumption to do some math
(matrix concentration inequalities) but it is
challenging to obtain practical results.
Within this model, simulations help to choose the best method:
Ward + Spearman correlation with at least 200 days of past returns.
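A minimal sketch of this kind of simulation study, under stated assumptions: a flat block correlation matrix stands in for the HCBM, returns are Gaussian, and accuracy is measured with the adjusted Rand index. The block sizes, correlation levels and function names are illustrative choices, not values from the talk.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def block_correlation(n_clusters=5, size=10, rho_in=0.6, rho_out=0.2):
    """Toy block model (assumption): rho_in inside a cluster, rho_out across."""
    labels = np.repeat(np.arange(n_clusters), size)
    corr = np.where(labels[:, None] == labels[None, :], rho_in, rho_out)
    np.fill_diagonal(corr, 1.0)
    return corr, labels


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two flat labelings (1.0 = same partition)."""
    _, a = np.unique(a, return_inverse=True)
    _, b = np.unique(b, return_inverse=True)
    table = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(table, (a, b), 1)                      # contingency table
    pairs = lambda x: (x * (x - 1) / 2).sum()        # number of pairs per cell
    index = pairs(table)
    sum_a, sum_b = pairs(table.sum(axis=1)), pairs(table.sum(axis=0))
    expected = sum_a * sum_b / (len(a) * (len(a) - 1) / 2)
    return (index - expected) / ((sum_a + sum_b) / 2 - expected)


def clustering_accuracy(corr, labels, T, rng):
    """Simulate T days of returns, cluster with Ward on Spearman correlation."""
    returns = rng.multivariate_normal(np.zeros(len(corr)), corr, size=T)
    rho, _ = spearmanr(returns)                      # N x N rank correlations
    dist = np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="ward")
    found = fcluster(Z, t=len(np.unique(labels)), criterion="maxclust")
    return adjusted_rand_index(labels, found)


rng = np.random.default_rng(0)
corr, labels = block_correlation()
for T in (20, 50, 100, 200, 500):
    scores = [clustering_accuracy(corr, labels, T, rng) for _ in range(20)]
    print(T, round(float(np.mean(scores)), 3))
```

Repeating the loop for other linkage methods or correlation estimators gives one accuracy curve per method, which is the comparison criterion suggested on the slide.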
Many challenges to overcome...
before implementing the 'simulator'
• The simulator module:
• Financial time series simulators:
Generative Adversarial Networks for Financial Trading
Strategies Fine-Tuning and Combination (2019)
• Financial correlations simulator:
CorrGAN: Sampling Realistic Financial Correlation Matrices
Using Generative Adversarial Networks (2019)
• Both at the same time?
=> It does not exist yet (TTBOMK)
From simulations...
to supervised learning of clustering accuracy
• For a given fuzzy HCBM model, one can collect
X := noisy estimates (empirical correlation
matrices from the simulated time series of
length T), y := clustering accuracy wrt model.
• How can we go from
(empirical correlation matrix, T)
to an expected clustering accuracy?
=> supervised learning.
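The data-collection and learning loop described above can be sketched as follows. This is a hedged illustration only: a toy block model with randomly drawn intra-cluster correlation stands in for the fuzzy HCBM, the Rand index is used as the accuracy measure, and ordinary least squares on three hand-picked features plays the role of the supervised learner. All names and parameter ranges are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def sample_pair(rng, n_clusters=4, size=8):
    """Draw one (features, accuracy) pair from a randomly drawn toy block model."""
    labels = np.repeat(np.arange(n_clusters), size)
    same = labels[:, None] == labels[None, :]
    rho_in = rng.uniform(0.3, 0.8)      # assumed range of intra-cluster correlation
    T = int(rng.integers(20, 400))      # assumed range of window lengths
    true_corr = np.where(same, rho_in, 0.1)
    np.fill_diagonal(true_corr, 1.0)
    returns = rng.multivariate_normal(np.zeros(len(labels)), true_corr, size=T)
    emp = np.corrcoef(returns, rowvar=False)         # noisy estimate
    dist = np.sqrt(np.clip(2.0 * (1.0 - emp), 0.0, None))
    Z = linkage(squareform(dist, checks=False), "ward")
    found = fcluster(Z, n_clusters, "maxclust")
    upper = np.triu_indices(len(labels), 1)
    # Rand index: share of asset pairs on which clustering and model agree
    y = ((found[:, None] == found[None, :]) == same)[upper].mean()
    x = [1.0, emp[upper].mean(), emp[upper].std(), np.log(T)]
    return x, y


rng = np.random.default_rng(0)
pairs = [sample_pair(rng) for _ in range(300)]
X = np.array([x for x, _ in pairs])
y = np.array([target for _, target in pairs])

# Ordinary least squares as the simplest possible supervised learner
train, test = slice(0, 200), slice(200, 300)
coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
mse_train = np.mean((X[train] @ coef - y[train]) ** 2)
mse_test = np.mean((X[test] @ coef - y[test]) ** 2)
print(f"train MSE {mse_train:.4f}, test MSE {mse_test:.4f}")
```

The held-out error is the quantity to watch: it says how well the mapping from (empirical correlation matrix, T) to accuracy generalises to matrices the learner has not seen.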
What is a relevant feature space to describe empirical correlation matrices?
For example:
- correlation coefficients summary statistics
- percentage of variance explained by the first k eigenvalues
- first eigenvector summary statistics
- minimum spanning tree statistics (centrality, average shortest path length)
- cophenetic correlation coefficient
- condition number
- ...
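As a rough illustration of such a feature space, the sketch below computes one descriptor per item in the list using NumPy and SciPy. The function name, the specific statistics retained (for instance, maximum degree as the centrality measure) and the use of Ward linkage for the cophenetic coefficient are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.sparse.csgraph import minimum_spanning_tree, shortest_path
from scipy.spatial.distance import squareform


def correlation_features(corr, k=3):
    """Hand-crafted descriptors of a correlation matrix (illustrative selection)."""
    n = len(corr)
    upper = np.triu_indices(n, 1)
    off = corr[upper]
    eigvals, eigvecs = np.linalg.eigh(corr)          # ascending eigenvalues
    first_vec = np.abs(eigvecs[:, -1])               # abs() removes sign ambiguity
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    # Note: zero distances (perfect correlation) are treated as missing edges
    tree = minimum_spanning_tree(dist).toarray()
    degree = ((tree + tree.T) > 0).sum(axis=0)
    hops = shortest_path(tree, directed=False, unweighted=True)
    coph_corr, _ = cophenet(linkage(condensed, "ward"), condensed)
    return {
        "corr_mean": float(off.mean()),
        "corr_std": float(off.std()),
        "var_explained_top_k": float(eigvals[-k:].sum() / eigvals.sum()),
        "first_eigvec_mean": float(first_vec.mean()),
        "first_eigvec_std": float(first_vec.std()),
        "mst_max_degree": float(degree.max()),
        "mst_avg_path_length": float(hops[upper].mean()),
        "cophenetic_corr": float(coph_corr),
        "condition_number": float(np.linalg.cond(corr)),
    }


rng = np.random.default_rng(0)
returns = rng.standard_normal((250, 12))
returns[:, :6] += rng.standard_normal((250, 1))     # common factor, first 6 assets
features = correlation_features(np.corrcoef(returns, rowvar=False))
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```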
A poor choice of a somewhat arbitrary feature space may bias learning and results...
Deep learning provides an end-to-end approach from
raw empirical matrices to target variables (clustering accuracy).
- CNN (seeing the correlation matrix as an image)
- GNN/GCN (the correlation matrix as a network)
We plan to investigate using convolutional and graph neural networks,
and compare predictive results with standard machine learning approaches.
https://marti.ai/qfin/2020/08/17/empirical-matrices-portfolio-comparisons.html
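To make the two views concrete, here is a small NumPy sketch of how a correlation matrix might be prepared for each family of models. It is not the architecture planned in the talk: reordering assets by hierarchical clustering before treating the matrix as an image, the 0.3 threshold, and the symmetric normalisation with self-loops (a common choice for graph convolutional networks) are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def as_image(corr):
    """CNN view: reorder assets so that neighbouring pixels are related assets."""
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    order = leaves_list(linkage(squareform(dist, checks=False), "average"))
    return corr[np.ix_(order, order)][None, :, :]    # shape (1, N, N), one channel


def as_graph(corr, threshold=0.3):
    """GCN view: thresholded adjacency, normalised as D^-1/2 (A + I) D^-1/2."""
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 1.0)                       # self-loops
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


def gcn_layer(norm_adj, features, weights):
    """One graph-convolution step: ReLU(A_hat @ H @ W)."""
    return np.maximum(norm_adj @ features @ weights, 0.0)


rng = np.random.default_rng(0)
returns = rng.standard_normal((300, 8))
returns[:, ::2] += rng.standard_normal((300, 1))    # every other asset shares a factor
corr = np.corrcoef(returns, rowvar=False)

image = as_image(corr)
norm_adj = as_graph(corr)
hidden = gcn_layer(norm_adj, corr, rng.standard_normal((8, 4)))  # rows of corr as node features
print(image.shape, hidden.shape)
```

The reordering step matters for the image view because the order of assets in a correlation matrix is arbitrary, whereas convolutions assume that adjacent pixels are related; the graph view does not depend on asset order.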
Application to clustering...
for quants
• One can use the predictive model to
determine the smallest possible
window in order to get a valid
clustering, given what the empirical
correlation matrices look like.
• It should be useful for:
• statistical arbitrage
• risk factors and risk models
• portfolio allocation methods
(HRP, HCAA, HERC)
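The window-selection idea in the first bullet above could look like the sketch below. The predictor here is a hypothetical stand-in (`toy_predictor`) for a trained model mapping (empirical correlation matrix, T) to expected accuracy; its functional form, the candidate windows and the 0.9 threshold are assumptions made for the example.

```python
import numpy as np


def smallest_valid_window(returns, predict_accuracy, windows, min_accuracy=0.9):
    """Return the shortest trailing window whose predicted accuracy is high enough.

    returns: (T, N) array of returns, most recent day last.
    predict_accuracy: callable (corr, T) -> expected clustering accuracy in [0, 1].
    """
    for T in sorted(windows):
        if T > len(returns):
            break                                    # not enough history
        corr = np.corrcoef(returns[-T:], rowvar=False)
        if predict_accuracy(corr, T) >= min_accuracy:
            return T
    return None                                      # no candidate is long enough


def toy_predictor(corr, T):
    """Hypothetical stand-in for a trained model: accuracy grows with T / N."""
    return 1.0 - np.exp(-T / (4.0 * len(corr)))


rng = np.random.default_rng(0)
returns = rng.standard_normal((500, 20))
window = smallest_valid_window(returns, toy_predictor, [50, 100, 200, 400])
print(window)  # 200 with this toy predictor
```

With a real model, the predicted accuracy would depend on the structure visible in the empirical matrix and not only on the ratio T / N, so the selected window could change over time.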
Clustering of global CDS based on Hellebore Capital's proprietary data
Other potential emerging applications
Number of clusters, hierarchies
and their automatic detection
• Automated detection of:
• flat clustering
• hierarchical clustering
• together with the relevant number of
clusters or hierarchical levels.
• Not all clusters found by standard methods
are true clusters! Filtering criteria are ad hoc
and not stable for trading/risk systems.
• A task similar to Object Detection and
Recognition with Deep Learning in
Computer Vision
New open PiT datasets
for empirical financial networks research
• Networks from text instead of
correlation of stock returns
• Use of novel large language models
easily available from Hugging Face
to build networks of similar
products & services companies
(cf. Hoberg and Phillips Text Based
Industry Classifications for early
work using crude NLP techniques)
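Once each company's business description has been turned into an embedding vector, building the network itself is straightforward. The sketch below assumes the embeddings already exist (the four 2-dimensional vectors are made-up placeholders, not output from any real language model) and links each company to its k nearest peers by cosine similarity; the function name and the k-nearest-neighbour rule are illustrative choices.

```python
import numpy as np


def similarity_network(embeddings, k=2):
    """Link each company to its k most similar peers by cosine similarity."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)                   # ignore self-similarity
    neighbours = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros(sim.shape, dtype=bool)
    adj[np.arange(len(sim))[:, None], neighbours] = True
    return adj | adj.T                               # symmetrise


# Placeholder embeddings for four hypothetical companies
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
network = similarity_network(embeddings, k=1)
print(network.astype(int))
```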
Illustrations from
Text-Based Representations of Market Structures, Gerard Hoberg
Why clustering at all?
end-to-end deep learning
• End-to-end approach with a particular
downstream task in mind can, maybe,
recover the 'optimal' clustering, which
is then used implicitly...
• Is it better than relying on expert
knowledge to find a good combination
of relevant distance, clustering algo.,
hyper-params, sufficient rolling
window, and post-processing of the
signals based on clusters obtained?
