Manifold Learning and
Dimensionality Reduction
for Data Visualization and
Feature Engineering
Berlin, July 7th, 2018
PyData Conference Berlin
Dr. Stefan Kühn
https://www.xing.com/profile/Stefan_Kuehn46
https://www.linkedin.com/in/stefan-k%C3%BChn-020a34119/
https://de.slideshare.net/StefanKhn4
What is a manifold?
Mathematical concept from Differential Geometry
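A small illustration may help here. The sketch below builds a "Swiss roll", a standard textbook example that is assumed for illustration and not taken from the slides: every point is described by two intrinsic coordinates (t, h), yet it lives in three-dimensional space. The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 500

# Two intrinsic coordinates: position along the roll (t) and height (h).
t = 1.5 * np.pi * (1.0 + 2.0 * rng.random(n_points))
h = 21.0 * rng.random(n_points)
intrinsic = np.column_stack([t, h])          # shape (n_points, 2)

# Smooth map from the 2D parameter space into 3D: the rolled-up sheet.
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])  # shape (n_points, 3)

print(intrinsic.shape, X.shape)
```

The data has three columns, but two numbers per point are enough to locate it on the surface. Manifold learning methods try to recover such low-dimensional coordinates from X alone.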
Manifold Learning and Data Visualization
What are properties of a manifold?
Important Properties – Topology and more
• Number of Connected Components
• Holes
• Curvature
• Smoothness
• Dimensionality
• …you_name_it…
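As one example of checking such a property on sampled data, the sketch below estimates the number of connected components from a symmetric k-nearest-neighbor graph. This is a heuristic assumed for illustration, not a method from the slides; the helper name knn_graph_components and the two-circle data set are hypothetical.

```python
import numpy as np


def knn_graph_components(X, n_neighbors=5):
    """Count connected components of the symmetric k-NN graph of X."""
    n_samples = X.shape[0]

    # Squared Euclidean distances; the diagonal is masked so that a point
    # is never its own neighbor.
    sq_norms = np.sum(X ** 2, axis=1)
    dist2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(dist2, np.inf)
    neighbors = np.argsort(dist2, axis=1)[:, :n_neighbors]

    # Symmetric adjacency matrix: i ~ j if either is a k-NN of the other.
    adjacency = np.zeros((n_samples, n_samples), dtype=bool)
    rows = np.repeat(np.arange(n_samples), n_neighbors)
    adjacency[rows, neighbors.ravel()] = True
    adjacency |= adjacency.T

    # Depth-first search that labels every point with its component index.
    labels = np.full(n_samples, -1)
    n_found = 0
    for start in range(n_samples):
        if labels[start] != -1:
            continue
        labels[start] = n_found
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(adjacency[node]):
                if labels[nb] == -1:
                    labels[nb] = n_found
                    stack.append(nb)
        n_found += 1
    return n_found, labels


# Two well-separated circles, each sampled at 100 evenly spaced angles.
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])
X = np.vstack([circle, circle + [10.0, 0.0]])

n_components, labels = knn_graph_components(X, n_neighbors=5)
print(n_components)
```

The result depends on n_neighbors: too few neighbors can split one component, too many can merge separate ones.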
What are properties of a good visualization?
Preserve important properties
• Number of connected components?
• Holes?
• Curvature?
• Smoothness?
• Dimensionality?
• Distances between points?
• Angles, orientations?
• Local versus global properties?
You cannot have it all!
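The trade-off can be made measurable. The sketch below is an assumed setup, not the demo from the talk: it scores two embeddings of an S-curve with a local criterion (scikit-learn's trustworthiness, which checks whether neighbors in the embedding were also neighbors in the original space) and a global one (correlation between original and embedded pairwise distances). The helpers flat_distances and quality are hypothetical names.

```python
import numpy as np
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, trustworthiness

X, _ = make_s_curve(n_samples=300, random_state=0)


def flat_distances(A):
    """Return all pairwise Euclidean distances of A as a flat vector."""
    diff = A[:, None, :] - A[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    return dist[np.triu_indices(len(A), k=1)]


def quality(X, Y, n_neighbors=10):
    """Local score (trustworthiness) and global score (distance correlation)."""
    local = trustworthiness(X, Y, n_neighbors=n_neighbors)
    global_corr = np.corrcoef(flat_distances(X), flat_distances(Y))[0, 1]
    return local, global_corr


embeddings = {
    "PCA": PCA(n_components=2).fit_transform(X),
    "Isomap": Isomap(n_neighbors=10, n_components=2).fit_transform(X),
}
scores = {name: quality(X, Y) for name, Y in embeddings.items()}

for name, (local, global_corr) in scores.items():
    print(f"{name}: trustworthiness={local:.3f}, distance correlation={global_corr:.3f}")
```

Neither score alone describes a good visualization. Which one matters depends on the properties you want to preserve.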
Manifold Learning Methods in sklearn
• Locally Linear Embedding (LLE)
  • Neighborhood-preserving
• Isomap
  • Quasi-isometric (preserves geodesic distances along the neighborhood graph)
• Multi-Dimensional Scaling (MDS)
  • Quasi-isometric (preserves pairwise distances)
• Spectral Embedding
  • Spectral clustering based on similarity
• t-distributed Stochastic Neighbor Embedding (t-SNE)
  • Preserves neighbor probabilities
• Local Tangent Space Alignment (LTSA)
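The methods listed above share the same estimator interface in sklearn.manifold. The following is a minimal sketch, assuming an S-curve test data set and parameter values chosen only for illustration (they are not taken from the talk). Note that scikit-learn exposes LTSA through LocallyLinearEmbedding with method="ltsa".

```python
import numpy as np
from sklearn.datasets import make_s_curve
from sklearn.manifold import (
    MDS,
    TSNE,
    Isomap,
    LocallyLinearEmbedding,
    SpectralEmbedding,
)

# Small sample so that every method finishes quickly.
X, color = make_s_curve(n_samples=200, random_state=0)

n_neighbors = 10  # assumed value, worth tuning per data set
methods = {
    "LLE": LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=2,
        method="standard", eigen_solver="dense",
    ),
    "Isomap": Isomap(n_neighbors=n_neighbors, n_components=2),
    "MDS": MDS(n_components=2, n_init=1, max_iter=100, random_state=0),
    "Spectral Embedding": SpectralEmbedding(
        n_components=2, n_neighbors=n_neighbors, random_state=0,
    ),
    "t-SNE": TSNE(n_components=2, perplexity=30, init="pca", random_state=0),
    "LTSA": LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=2,
        method="ltsa", eigen_solver="dense",
    ),
}

# Every estimator maps the 3D points to 2D coordinates with fit_transform.
embeddings = {name: est.fit_transform(X) for name, est in methods.items()}

for name, Y in embeddings.items():
    print(name, Y.shape)
```

Each entry of embeddings can be passed to a scatter plot, colored by the array color, to compare how the methods unroll the curve.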
Local Tangent Space Alignment
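The idea behind LTSA can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the scikit-learn implementation and not code from the talk: neighborhoods include the point itself, distances are computed densely, and the helper name ltsa_embed is hypothetical. Each neighborhood is approximated by its tangent space, and the local coordinates are then aligned into one global embedding.

```python
import numpy as np


def ltsa_embed(X, n_neighbors=10, n_components=2):
    """Basic LTSA sketch for X of shape (n_samples, n_features)."""
    n_samples = X.shape[0]

    # Squared Euclidean distances between all pairs of points.
    sq_norms = np.sum(X ** 2, axis=1)
    dist2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T

    # For every point, the indices of its n_neighbors closest points
    # (the point itself is among them, at distance zero).
    neighborhoods = np.argsort(dist2, axis=1)[:, :n_neighbors]

    alignment = np.zeros((n_samples, n_samples))
    ones = np.full((n_neighbors, 1), 1.0 / np.sqrt(n_neighbors))
    for idx in neighborhoods:
        # Center the neighborhood; its leading left singular vectors give
        # orthonormal local tangent-space coordinates.
        centered = X[idx] - X[idx].mean(axis=0)
        U, _, _ = np.linalg.svd(centered, full_matrices=False)
        G = np.hstack([ones, U[:, :n_components]])
        # Accumulate the local alignment error operator I - G G^T.
        alignment[np.ix_(idx, idx)] += np.eye(n_neighbors) - G @ G.T

    # Eigenvectors for the smallest eigenvalues. The first one is expected
    # to be the constant vector and is skipped.
    _, eigvecs = np.linalg.eigh(alignment)
    return eigvecs[:, 1:n_components + 1]


# S-curve-like test surface (assumed example data).
rng = np.random.default_rng(0)
t = 3.0 * np.pi * (rng.random(300) - 0.5)
X = np.column_stack(
    [np.sin(t), 2.0 * rng.random(300), np.sign(t) * (np.cos(t) - 1.0)]
)

Y = ltsa_embed(X, n_neighbors=12, n_components=2)
print(Y.shape)
```

For real use, LocallyLinearEmbedding(method="ltsa") in scikit-learn is the safer choice, since it handles neighbor search and the eigenproblem more carefully.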
Demo Time
Sometimes, words are insufficient…
But then God gave us code!
Resources
Scikit-learn documentation
http://scikit-learn.org/stable/modules/manifold.html
http://scikit-learn.org/stable/auto_examples/manifold/plot_compare_methods.html
http://scikit-learn.org/stable/auto_examples/manifold/plot_manifold_sphere.html
http://scikit-learn.org/stable/modules/random_projection.html
http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
Github repo with worked examples
https://github.com/cc-skuehn/Manifold_Learning
Jupyter Lab
https://jupyterlab.readthedocs.io/en/stable/index.html
Citation
Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.
http://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html
Thank you!

Talk at PyData Berlin about Manifold Learning and Applications