Topological Data Analysis and
Persistent Homology
Supervisor: Prof. Francesco Vaccarino, Candidate: Carla Federica Melia
POLITECNICO DI TORINO
DIPARTIMENTO DI SCIENZE MATEMATICHE
Corso di Laurea Magistrale in Ingegneria Matematica
Graduation Session of December 2018
A.Y. 2017/18
Introduction
• Topological Data Analysis (TDA) is a branch of applied mathematics that draws on notions and techniques from several scientific fields, such as statistics, computer science and algebraic topology.
• Its tools make it possible to infer robust features of the “shape” of complex datasets that may be corrupted by noise or incompleteness.
Topological Data Analysis and Persistent Homology - Politecnico di Torino - 2018
The Shape of Data
TDA aims to infer statistically significant information about the shape of the data. [1]
Finding Loops and Voids in the Universe
[15] [16]
Periodic Systems
Periodic behaviors and attractors [7] [8]
Effective Brain Networks Analysis
[17] [5]
Objectives 1/3
• This thesis focuses on the Persistent Homology (PH) technique.
• Its purposes are:
1) to provide a satisfactory explanation of the fundamentals of TDA and PH [2]
Objectives 2/3
2) to analyze the robustness and reliability of the inferred features [3]
Objectives 3/3
3) to implement some TDA techniques on selected case studies
From PCD to Complex
• The input is a finite set of elements equipped with a notion of distance between them.
• The elements are mapped to a point cloud (PCD).
• The PCD is completed by building a "continuous" shape, a complex, on top of it.
[Figure: three panels with points labelled A, B and C] [13]
Simplicial Complexes
• There are many ways to build simplicial complexes from a topological space.
• To be useful, a simplicial complex must have a homology that approximates that of the space we want to study. [18]
Čech and Vietoris–Rips Complexes
• Reconstruction Theorem
• Nerve Theorem
• $C_\alpha(X) \subset V_{2\alpha}(X) \subset C_{2\alpha}(X)$
Čech complex: $C_\alpha(X) := \{ [p_1, \dots, p_k] : \{p_1, \dots, p_k\} \subset X,\ \bigcap_i B_\alpha(p_i) \neq \emptyset \}$
The Čech complex is difficult to compute, but it is quite small and accurate.
Vietoris–Rips complex: $V_\alpha(X) := \{ \sigma = [p_1, \dots, p_k] : \{p_1, \dots, p_k\} \subset X,\ \max_{p_i, p_j \in \sigma} d(p_i, p_j) \leq \alpha \}$
The Vietoris–Rips complex is easy to compute, but it is usually very large and less accurate. [19]
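As a minimal sketch of the Vietoris–Rips definition, the function below lists every simplex whose vertices are pairwise within distance α. The name `vietoris_rips`, the Euclidean metric and the `max_dim` cutoff are assumptions made for illustration; they are not taken from the thesis code.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, alpha, max_dim=2):
    """Return the simplices (as sorted index tuples) of V_alpha(points).

    A simplex is included when every pair of its vertices is at
    distance <= alpha.
    """
    n = len(points)
    # Pairs of vertices that are close enough to span an edge.
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist(points[i], points[j]) <= alpha
    }
    simplices = [(i,) for i in range(n)]
    for k in range(2, max_dim + 2):  # k = number of vertices
        for candidate in combinations(range(n), k):
            if all(pair in close for pair in combinations(candidate, 2)):
                simplices.append(candidate)
    return simplices


# Four corners of a unit square (illustrative data).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
complex_small = vietoris_rips(square, alpha=1.0)  # sides only, no diagonals
complex_large = vietoris_rips(square, alpha=1.5)  # diagonals and triangles too
```

Enumerating all vertex subsets is exponential in the number of points, which is one way to see why the Vietoris–Rips complex is "usually very large".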
Homology
• Homotopy-equivalent spaces share the same homology.
• Homology groups are a more computable alternative to homotopy groups.
Simplex: $\sigma = [v_0, \dots, v_k]$
Boundary operator: $\partial_k \sigma = \sum_{i=0}^{k} (-1)^i [v_0, \dots, \hat{v}_i, \dots, v_k]$, where $\hat{v}_i$ means that the vertex $v_i$ is omitted.
Example: $\partial [v_0, v_1, v_2, v_3] = [v_1, v_2, v_3] - [v_0, v_2, v_3] + [v_0, v_1, v_3] - [v_0, v_1, v_2]$ [8]
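The boundary formula can be checked on the tetrahedron example with a short sketch. Representing a chain as a dictionary from simplices to integer coefficients is an assumption chosen for brevity, as is the function name `boundary`.

```python
from collections import defaultdict


def boundary(chain):
    """Apply the boundary operator to a chain.

    A chain is a dict mapping simplices (tuples of vertices) to integer
    coefficients. The i-th face drops vertex i and gets sign (-1)**i.
    """
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # the boundary of a vertex is zero
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            result[face] += (-1) ** i * coeff
    # Drop faces whose coefficients cancelled out.
    return {face: c for face, c in result.items() if c != 0}


tetrahedron = {("v0", "v1", "v2", "v3"): 1}
faces = boundary(tetrahedron)
boundary_of_boundary = boundary(faces)
```

Here `faces` holds the four signed triangles of the example, and `boundary_of_boundary` is empty, which illustrates the identity ∂∘∂ = 0 that makes the quotient of cycles by boundaries well defined.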
Homology Groups
Cycles: $Z_k = \mathrm{Ker}(\partial_k)$
Boundaries: $B_k = \mathrm{Im}(\partial_{k+1})$
k-th Homology Group: $H_k = Z_k / B_k$
• The goal of homology is to discard cycles that are also boundaries, so we quotient $Z_k$ by the following equivalence relation: $\forall z_1, z_2 \in Z_k,\ z_1 \sim z_2 \Leftrightarrow z_1 - z_2 \in B_k$
• The rank of $H_k$ is the Betti number $\beta_k$.
$\beta_0$ = #Components, $\beta_1$ = #Loops, $\beta_2$ = #Voids, $\beta_n$ = # n-dim Holes
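A small sketch of how Betti numbers follow from these definitions: since dim Z_k = dim C_k − rank ∂_k and dim B_k = rank ∂_{k+1}, we get β_k = dim C_k − rank ∂_k − rank ∂_{k+1}. The code works with coefficients in Z/2, which is an assumption (the slides do not fix the coefficients), and the helper names are hypothetical.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over Z/2 of a matrix whose rows are given as integer bitmasks."""
    pivots = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                break
            row ^= pivots[high]
    return len(pivots)


def betti_numbers(simplices):
    """Betti numbers (over Z/2) of a simplicial complex.

    `simplices` must list every simplex of the complex as a sorted tuple.
    """
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(s)
    top = max(by_dim)
    # rank[k] is the rank of the boundary map from k-chains to (k-1)-chains.
    rank = {0: 0, top + 1: 0}
    for k in range(1, top + 1):
        index = {s: i for i, s in enumerate(by_dim[k - 1])}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for face in combinations(s, k):  # the (k-1)-dimensional faces
                mask |= 1 << index[face]
            rows.append(mask)
        rank[k] = rank_gf2(rows)
    return [len(by_dim[k]) - rank[k] - rank[k + 1] for k in range(top + 1)]


hollow = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
filled = hollow + [(0, 1, 2)]
betti_hollow = betti_numbers(hollow)  # one component, one loop
betti_filled = betti_numbers(filled)  # the loop is now a boundary
```

Filling in the triangle turns the loop into a boundary, so β_1 drops from 1 to 0 while β_0 stays 1.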
Filtrations
The sequence of simplicial complexes, together with its inclusion maps, is a filtration. [20]
Which scale $d$ should we choose?
PH detects the most persistent features, which are expected to represent true characteristics of the underlying space.
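To make the notion of a filtration concrete, the sketch below assigns to each simplex the scale at which it enters the Vietoris–Rips complex and returns the complex at any chosen scale d. The function names, the Euclidean metric and the sample triangle are assumptions for illustration only.

```python
from itertools import combinations
from math import dist


def rips_filtration(points, max_dim=2):
    """List (value, simplex) pairs sorted by the scale at which each appears.

    In the Vietoris-Rips filtration a simplex appears at the largest
    pairwise distance among its vertices (0 for a single vertex).
    """
    n = len(points)
    entries = []
    for k in range(1, max_dim + 2):  # k = number of vertices
        for simplex in combinations(range(n), k):
            value = max(
                (dist(points[i], points[j]) for i, j in combinations(simplex, 2)),
                default=0.0,
            )
            entries.append((value, simplex))
    # Sort by value, then by dimension so faces come before cofaces.
    entries.sort(key=lambda e: (e[0], len(e[1])))
    return entries


def complex_at(filtration, d):
    """Simplices of the filtration that are present at scale d."""
    return [simplex for value, simplex in filtration if value <= d]


# A 3-4-5 right triangle (illustrative data).
triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
filtration = rips_filtration(triangle)
```

Because `complex_at(filtration, d)` only grows as d increases, the complexes at increasing scales are nested, which is exactly the inclusion structure of a filtration.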
Persistent Homology
• With PH we study the homology of a filtration as a single algebraic entity.
• Its features can then be analyzed through its barcode representation; this is formally justified by the Structure Theorem.
Definition
For $0 \leq i \leq j \leq n$, the inclusion $x_i^j : X_i \hookrightarrow X_j$ induces a homomorphism $H(x_i^j) : H(X_i) \to H(X_j)$, whose images are the persistent homology groups.
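In degree 0 the persistent homology of a Vietoris–Rips filtration can be computed with a union-find structure, which gives a compact illustration of births and deaths. This is a sketch restricted to connected components, not the general algorithm; the function name `h0_barcode` and the sample cloud are assumptions, and the input is assumed to be non-empty.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Birth-death intervals of connected components in the Rips filtration.

    Every point is born at scale 0. Processing edges by increasing length,
    an edge that joins two different components kills one of them.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            bars.append((0.0, length))
    bars.append((0.0, inf))  # one component never dies
    return bars


# Two tight pairs of points that are far from each other.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
bars = h0_barcode(cloud)
```

The two short bars (death at 1) come from merging points inside each pair, while the long finite bar (death at 9) records that the two clusters stay separate over a wide range of scales.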
Persistence Barcode
[21]
Persistence Diagram
• Given two persistence diagrams $D_1$ and $D_2$, the bottleneck distance between them is $d_B(D_1, D_2) = \inf_\gamma \sup_{x \in D_1} \| x - \gamma(x) \|_\infty$, where $\gamma$ ranges over all multi-bijections $D_1 \to D_2$. [23]
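For very small diagrams the bottleneck distance can be evaluated by brute force, which makes the inf-sup in the definition explicit. The sketch assumes the usual convention that a point may also be matched to the diagonal at cost (death − birth)/2, and that all deaths are finite. It runs in factorial time, so it is only meant for illustration.

```python
from itertools import permutations


def bottleneck_distance(d1, d2):
    """Brute-force bottleneck distance between two small persistence diagrams.

    Diagrams are lists of (birth, death) pairs with finite death. Each
    diagram is padded with diagonal slots so that unmatched points can be
    sent to the diagonal at cost (death - birth) / 2.
    """
    left = list(d1) + [None] * len(d2)   # None stands for the diagonal
    right = list(d2) + [None] * len(d1)

    def cost(p, q):
        if p is None and q is None:
            return 0.0
        if p is None:
            return (q[1] - q[0]) / 2
        if q is None:
            return (p[1] - p[0]) / 2
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))  # sup-norm distance

    best = float("inf")
    for perm in permutations(range(len(right))):
        # Cost of a matching is its worst pair; keep the best matching.
        worst = max(
            (cost(left[i], right[j]) for i, j in enumerate(perm)),
            default=0.0,
        )
        best = min(best, worst)
    return best


diagram_a = [(0.0, 4.0), (1.0, 1.5)]
diagram_b = [(0.0, 5.0)]
distance = bottleneck_distance(diagram_a, diagram_b)
```

In this example the best matching pairs (0, 4) with (0, 5) and sends the short-lived point (1, 1.5) to the diagonal, so the distance is 1.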
Stability Results
Theorem
Let $X$ and $Y$ be two compact metric spaces and let $\mathrm{Filt}(X)$ and $\mathrm{Filt}(Y)$ be the Vietoris–Rips filtrations built on top of them. Then
$d_B(D(\mathrm{Filt}(X)), D(\mathrm{Filt}(Y))) \leq 2\, d_{GH}(X, Y)$
Moreover, if $X$ and $Y$ are embedded in the same space, then
$d_B(D(\mathrm{Filt}(X)), D(\mathrm{Filt}(Y))) \leq 2\, d_H(X, Y)$ [24]
Here $d_{GH}$ is the Gromov–Hausdorff distance and $d_H$ is the Hausdorff distance.
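The right-hand side of the second inequality is easy to evaluate for finite point clouds. The sketch below computes the Hausdorff distance between a point cloud and a perturbed copy; the data and the function name are illustrative assumptions, and both sets are assumed to be non-empty.

```python
from math import dist


def hausdorff_distance(xs, ys):
    """Hausdorff distance between two finite, non-empty point sets."""
    def directed(a, b):
        # Largest distance from a point of `a` to its nearest point in `b`.
        return max(min(dist(p, q) for q in b) for p in a)

    return max(directed(xs, ys), directed(ys, xs))


clean = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
noisy = [(0.0, 0.1), (1.0, 0.0), (0.0, 1.0)]  # one point moved by 0.1
d_h = hausdorff_distance(clean, noisy)
stability_bound = 2 * d_h  # upper bound on the bottleneck distance
```

Moving a single point by 0.1 changes the Hausdorff distance by 0.1, so by the theorem the two persistence diagrams are at bottleneck distance at most 0.2.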
Statistical Results
The most persistent features can be detected and separated from topological noise using statistical methods such as the bootstrap.
Given a persistence diagram $X$ with an estimator $\hat{X}$, we look for $\delta_\alpha$ such that
$P(d_B(X, \hat{X}) \geq \delta_\alpha) \leq \alpha, \quad \alpha \in (0, 1)$
The confidence set is then $\{X : d_B(X, \hat{X}) \leq \delta_\alpha\}$. [25]
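The mechanics of estimating δ_α by resampling can be sketched as follows. This is not the procedure used in the thesis: as a stand-in assumption, the distance between diagrams is replaced by the Hausdorff distance between the sample and each resample, and the function names, the number of resamples and the seed are all hypothetical.

```python
import random
from math import cos, dist, pi, sin


def hausdorff_distance(xs, ys):
    """Hausdorff distance between two finite, non-empty point sets."""
    def directed(a, b):
        return max(min(dist(p, q) for q in b) for p in a)

    return max(directed(xs, ys), directed(ys, xs))


def bootstrap_quantile(points, alpha=0.05, n_boot=200, seed=0):
    """Empirical (1 - alpha) quantile of distances to bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    distances = []
    for _ in range(n_boot):
        # Resample the point cloud with replacement, keeping its size.
        resample = [rng.choice(points) for _ in points]
        distances.append(hausdorff_distance(points, resample))
    distances.sort()
    k = min(n_boot - 1, int((1 - alpha) * n_boot))
    return distances[k]


# 30 points on the unit circle (illustrative data).
circle = [(cos(2 * pi * t / 30), sin(2 * pi * t / 30)) for t in range(30)]
delta = bootstrap_quantile(circle, alpha=0.05)
```

Features whose persistence is small compared with the estimated δ_α would then be treated as topological noise, while features well above it are candidates for true characteristics of the underlying space.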
Implementation
• To analyze the topological information of different datasets, a console application was implemented using GUDHI in Python, TDA in R, and QlikView.
• GUDHI provides an efficient tree representation for simplicial complexes, the simplex tree. [27]
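The idea behind the simplex tree can be illustrated with a small trie in which every simplex is the path of its sorted vertices from the root. This is a toy sketch written for this transcript, not GUDHI's implementation or API; the class and method names are hypothetical.

```python
class SimplexTree:
    """A minimal trie of simplices, in the spirit of the simplex tree [27].

    Each simplex is stored as the path of its sorted vertices from the root,
    and each node keeps the filtration value of the simplex it represents.
    """

    def __init__(self):
        self.children = {}      # vertex -> SimplexTree node
        self.filtration = None  # value of the simplex ending at this node

    def insert(self, simplex, filtration=0.0):
        """Insert a simplex together with all of its faces."""
        self._insert_faces(sorted(simplex), filtration)

    def _insert_faces(self, vertices, filtration):
        for i, v in enumerate(vertices):
            child = self.children.setdefault(v, SimplexTree())
            # A face never appears later than a simplex that contains it.
            if child.filtration is None or filtration < child.filtration:
                child.filtration = filtration
            child._insert_faces(vertices[i + 1:], filtration)

    def find(self, simplex):
        """Return the node of a simplex, or None if it is absent."""
        node = self
        for v in sorted(simplex):
            node = node.children.get(v)
            if node is None:
                return None
        return node

    def simplices(self, prefix=()):
        """Yield (simplex, filtration) pairs for every stored simplex."""
        for v, child in sorted(self.children.items()):
            simplex = prefix + (v,)
            yield simplex, child.filtration
            yield from child.simplices(simplex)


tree = SimplexTree()
tree.insert([0, 1, 2], filtration=2.0)
tree.insert([2, 3], filtration=1.0)
stored = dict(tree.simplices())
```

Sharing prefixes means that a simplex lookup costs one dictionary access per vertex, and inserting a simplex automatically makes its faces available.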
Data Upload
Steps (from the slide diagram): Data Upload, Consistency Checks, Dimension, Plot Data, …
Tools: pandas, matplotlib
TDA Application
Diagram elements: Data, Simplex Tree, Persistence, .pers, Persistence Diagram, Persistence Barcode, Betti Curves, Bootstrap, $\delta_\alpha$, Confidence Band, …
Tools: gudhi, TDA
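One of the outputs named in the diagram, the Betti curve, can be derived directly from a barcode: at each scale it counts the bars that are alive. The sketch below assumes half-open bars [birth, death) and uses a made-up barcode; it does not reproduce the application's own code.

```python
def betti_curve(bars, grid):
    """Number of bars alive at each scale in `grid`.

    A bar (birth, death) is counted at scale t when birth <= t < death.
    """
    return [sum(1 for birth, death in bars if birth <= t < death) for t in grid]


# Hypothetical degree-0 barcode: three finite bars and one infinite bar.
bars_h0 = [(0.0, 1.0), (0.0, 1.0), (0.0, 9.0), (0.0, float("inf"))]
curve = betti_curve(bars_h0, grid=[0.0, 0.5, 1.0, 5.0, 9.0, 20.0])
```

The resulting curve starts at 4 components, drops to 2 once the short bars die at scale 1, and settles at 1 after scale 9.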
Front End – QlikView
Results
Thank you
BIBLIOGRAPHY
[1] G. Carlsson, The Shape of Data conference, Ayasdi Energy Summit, 2014.
[2] R. Ghrist, Three examples of applied and computational homology, 2008.
[3] P. Bubenik, Statistical Topological Data Analysis using Persistence Landscapes, Journal of Machine Learning Research 16, 2015.
[4] M. Alagappan, J. Carlsson, G. Carlsson, T. Ishkanov, A. Lehman, P. Y. Lum, G. Singh and M. Vejdemo-Johansson, Extracting insights from the shape of complex data using topology, Scientific Reports v. 3, Article number: 1236, 2013.
[5] R. Carhart-Harris, P. Expert, P. J. Hellyer, D. Nutt, G. Petri, F. Turkheimer and F. Vaccarino, Homological scaffolds of brain functional networks, Journal of The Royal Society Interface, 11(101):20140873, 2014.
[6] M. Lesnick, Studying the Shape of Data Using Topology, Institute for Advanced Study, School of Mathematics, 2013.
[7] P. Chardy, V. David and B. Sautour, Fitting a predator–prey model to zooplankton time-series data in the Gironde estuary (France): Ecological significance of the parameters, Estuarine, Coastal and Shelf Science, Volume 67, Issue 4, Pages 605-617, 2006.
[8] S. Maletic, M. Rajkovic and Y. Zhao, Persistent topological features of dynamical systems, doi:10.1063/1.4949472, 2016.
[9] X. Feng, Y. Tong, G. W. Wei and K. Xia, Topological modeling of biomolecular data, Nanyang Technological University.
[10] B. Cottenceau, N. Delanoue and L. Jaulin, Guaranteeing the homotopy type of a set defined by non-linear inequalities, doi:10.1007/s11155-007-9043-8, 2006.
[11] P. Lambrechts, The Poincaré conjecture and the shape of the universe slides, Wellesley College, 2009.
[12] E. A. Coutsias, S. Martin, A. Thompson and J. P. Watson, Topology of cyclo-octane energy landscape, doi:10.1063/1.3445267, 2010.
[13] G. Carlsson, T. Ishkhanov, D. L. Ringach, F. Memoli, G. Sapiro and G. Singh, Topological analysis of population activity in visual cortex, Journal of Vision 8(8):11.1-18, 2008.
[14] A. Hatcher, Algebraic Topology, Cambridge University Press, ISBN 0-521-79540-0, 2002.
[15] E.G.P. P. Bos, M. Caroli, R. van de Weygaert, H. Edelsbrunner, B. Eldering, M. van Engelen, J. Feldbrugge, E. ten Have, W. A. Hellwing, J. Hidding, B. J. T. Jones, N. Kruithof, C. Park, P. Pranav, M. Teillaud and G. Vegter, Alpha, Betti and the Megaparsec Universe: on the Topology of the Cosmic Web, arXiv:1306.3640v1 [astro-ph.CO], 2013.
[16] J. Cisewski-Kehe, S. B. Green, D. Nagai and X. Xu, Finding cosmic voids and filament loops using topological data analysis, arXiv:1811.08450v1 [astro-ph.CO], 2018.
[17] H. Liang and H. Wang, Structure-Function Network Mapping and Its Assessment via Persistent Homology, doi:10.1371/journal.pcbi.1005325, 2017.
[18] K. G. Wang, The Basic Theory of Persistent Homology, 2012.
[19] F. Chazal and B. Michel, An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists, arXiv:1710.04019v1 [math.ST], 2017.
[20] M. Wright, Introduction to Persistent Homology, video on M. Wright's channel, https://www.youtube.com/watch?v=2PSqWBIrn90, consulted on 10/10/2018, 2016.
[21] R. Ghrist, Barcodes: The Persistent Topology of Data, Bull. Amer. Math. Soc. 45 (2008), 61-75, doi: https://doi.org/10.1090/S0273-0979-07-01191-3, 2007.
[22] G. W. Wei and K. Xia, Persistent homology analysis of protein structure, flexibility and folding, arXiv:1412.2779v1 [q-bio.BM], 2014.
[23] The NIPS 2012 workshop on Algebraic Topology and Machine Learning.
[24] K. Fukumizu, Y. Hiraoka and G. Kusano, Persistence weighted Gaussian kernel for topological data analysis, 2016.
[25] J. Cisewski-Kehe, S. B. Green, D. Nagai and X. Xu, Finding cosmic voids and filament loops using topological data analysis, arXiv:1811.08450v1 [astro-ph.CO], 2018.
[26] H. A. Harrington, M. A. Porter and B. J. Stolz, Persistent homology of time-dependent functional networks constructed from coupled time series, doi:10.1063/1.4978997, 2017.
[27] J. Boissonnat and C. Maria, The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes, [Research Report] RR-7993, pp.20. <hal-00707901v1>, 2012.
