Bridging the gap between 2D and 3D
with Deep Learning
Evgeny Burnaev (PhD) <e.burnaev@skoltech.ru>
assoc. prof. Skoltech
Alexandr Notchenko <a.notchenko@skoltech.ru>
PhD student
[1]
ImageNet top-5 error over the years
(Chart legend: deep learning based methods; feature based methods; human performance)
Supervised Deep Learning data
Type | Supervision
2D image classification, detection, segmentation | class label, object detection box, segmentation contours
Pose estimation | structure of “skeleton” on image
But the world is in 3D
3D deep learning is gaining popularity
Workshops:
● Deep Learning for Robotic Vision Workshop, CVPR 2017
● Geometry Meets Deep Learning, ECCV 2016
● 3D Deep Learning Workshop @ NIPS 2016
● Large Scale 3D Data: Acquisition, Modelling and Analysis, CVPR 2016
● 3D from a Single Image, CVPR 2015
A Google Scholar search for "3D" "Deep Learning" returns:
year # articles
2012 410
2013 627
2014 1210
2015 2570
2016 5440
Representation of 3D data for Deep Learning
Method | Pros (+) | Cons (-)
Many 2D projections | preserve surface texture; many 2D DL methods exist | redundant representation; vulnerable to optical illusions
Voxels | simple; can be sparse; have volumetric properties | lose surface properties
Point cloud | can be sparse | loses surface properties and volumetric properties
2.5D images | cheap measurement devices; sense depth | self-occlusion of bodies in a scene; a lot of noise in measurements
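To make the relation between two of these representations concrete, here is a minimal sketch of turning a point cloud into a binary voxel grid. It is not taken from the talk: the function name `voxelize`, the uniform scaling scheme, and the synthetic sphere input are all assumptions chosen for illustration.

```python
import numpy as np

def voxelize(points, resolution=30):
    """Convert an (N, 3) point cloud into a binary occupancy grid.

    Assumption: the cloud is scaled uniformly (aspect ratio preserved)
    so that its longest side spans the whole grid.
    """
    points = np.asarray(points, dtype=np.float64)
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    if extent == 0:
        extent = 1.0  # degenerate cloud: avoid division by zero
    # Map coordinates into integer voxel indices in [0, resolution - 1].
    idx = np.floor((points - mins) / extent * (resolution - 1)).astype(int)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Hypothetical input: points sampled on the surface of a unit sphere.
rng = np.random.default_rng(0)
sphere = rng.normal(size=(5000, 3))
sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)

grid = voxelize(sphere, resolution=30)
print("occupied voxels:", int(grid.sum()), "of", grid.size)
```

The grid keeps only occupancy, which illustrates the "lose surface properties" entry in the table: normals and texture of the original surface are not stored.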
[6]
[2]
3D shape as dense Point Cloud
Learning Rich Features from RGB-D Images for
Object Detection and Segmentation
[10]
Latest developments in the
SLAM family of methods
LSD-SLAM (Large-Scale Direct Monocular Simultaneous Localization and Mapping)
[5]
LSD-SLAM - direct (feature-less) monocular SLAM
ElasticFusion
ElasticFusion - dense SLAM without a pose graph
[7]
DynamicFusion
The technique won the prestigious CVPR 2015 best paper award.
[9]
Problems of SLAM algorithms
● They don’t represent objects (they only know surfaces)
● Mostly dense representations (require a lot of data)
● The whole scene is one big surface, i.e. different objects that are close to each other cannot be separated.
3D Shape Retrieval
3D Design Phase
• There are massive repositories of 3D CAD models, e.g. GrabCAD
(Examples shown: chairs, mechanical parts)
3D Design Phase
• Designers spend about 60% of their time searching for the right information
• Massive and complex CAD models are usually archived in a disorderly way in enterprises, which makes design reuse a difficult task
3D model retrieval can significantly shorten product lifecycles
3D Shape-based Model Retrieval
• 3D models are complex = no clear search rules
• Text-based search has its limitations: e.g. 3D models are often poorly annotated
• There is some commercial software for 3D CAD modeling, e.g.
➢ Exalead OnePart by Dassault Systèmes,
➢ Geolus Search by Siemens PLM, and others
• However, the methods used
➢ are time-consuming,
➢ are often based on hand-crafted descriptors,
➢ can be limited to a specific class of shapes,
➢ are not robust to scaling, rotations, etc.
Sparse 3D Convolutional Neural Networks for
Large-Scale Shape Retrieval
Alexandr Notchenko, Ermek Kapushev, Evgeny Burnaev
Presented at 3D Deep Learning Workshop at NIPS 2016
Sparsity of voxel representation
30^3 voxels are already enough to understand a simple shape
But with texture information it would be even easier
Across all classes of the ModelNet40 training set at voxel resolution 40, the fraction of occupied voxels is only 5.5%
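As a rough illustration of why such low occupancy favours sparse storage, the sketch below measures the occupied fraction of a voxel grid and keeps only the coordinates of occupied cells. The hollow-box shape and the helper names `occupancy` and `to_sparse` are assumptions for illustration and are not taken from ModelNet40 or from the authors' code.

```python
import numpy as np

def occupancy(grid):
    """Fraction of voxels that are occupied."""
    return np.count_nonzero(grid) / grid.size

def to_sparse(grid):
    """Coordinate-list form: one (x, y, z) row per occupied voxel."""
    return np.argwhere(grid).astype(np.int16)

# Hypothetical shape: a one-voxel-thick hollow box at resolution 40.
res = 40
grid = np.zeros((res, res, res), dtype=bool)
grid[5:35, 5:35, 5:35] = True    # solid 30^3 block
grid[6:34, 6:34, 6:34] = False   # carve out the interior

coords = to_sparse(grid)
print(f"occupied fraction: {occupancy(grid):.3%}")
print(f"dense cells: {grid.size}, stored coordinates: {len(coords)}")
```

A sparse convolutional network can then restrict computation to the stored coordinates instead of visiting every cell of the dense grid.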
Shape Retrieval
(Diagram)
Query → Sparse3DCNN → Vplane - feature vector of the plane
Precomputed feature vectors of the dataset: (Vcar, Vperson, ...)
Cosine distance → Retrieved items
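The retrieval step in this diagram can be sketched as a nearest-neighbour search under cosine distance. This is a minimal illustration, not the authors' pipeline: the random vectors stand in for Sparse3DCNN features, and the names `cosine_distances` and `retrieve`, the feature size 64, and the database size 100 are assumptions.

```python
import numpy as np

def cosine_distances(query, database):
    """Cosine distance (1 - cosine similarity) from query to each row."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    return 1.0 - db @ q

def retrieve(query, database, k=3):
    """Indices of the k database items closest to the query."""
    return np.argsort(cosine_distances(query, database))[:k]

# Stand-in for precomputed feature vectors of the dataset.
rng = np.random.default_rng(0)
database = rng.normal(size=(100, 64))

# Stand-in for the feature vector of a query shape: a slightly
# perturbed copy of item 42, so item 42 should rank first.
query = database[42] + 0.01 * rng.normal(size=64)

top = retrieve(query, database, k=3)
print("retrieved indices:", top)
```

Because the dataset vectors are precomputed, only the query needs a forward pass through the network at search time.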
Triplet loss
The representation can be efficiently learned by minimizing the triplet loss.
A triplet is a set (a, p, n), where
● a - anchor object
● p - positive object that is similar to the anchor object
● n - negative object that is not similar to the anchor object
The loss is
$$\lambda(\delta_+, \delta_-) = \max(0, \mu + \delta_+ - \delta_-),$$
where $\mu$ is a margin parameter, and $\delta_+$ and $\delta_-$ are the distances between p and a and between n and a.
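A minimal numeric sketch of the triplet loss, assuming the common hinge form max(0, margin + d(a, p) - d(a, n)). The Euclidean distance, the margin value of 1.0, and the toy 2D embeddings are assumptions for illustration; the slide does not state which distance is used during training.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on embedding vectors.

    Assumption: Euclidean distance between embeddings.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=-1)  # distance p <-> a
    d_neg = np.linalg.norm(anchor - negative, axis=-1)  # distance n <-> a
    return np.maximum(0.0, margin + d_pos - d_neg)

# Toy 2D embeddings (hypothetical values).
a = np.array([0.0, 0.0])
p = np.array([0.0, 1.0])       # distance 1.0 from the anchor
n_far = np.array([5.0, 0.0])   # distance 5.0 from the anchor
n_near = np.array([0.0, 1.5])  # distance 1.5 from the anchor

loss_far = triplet_loss(a, p, n_far)    # margin already satisfied
loss_near = triplet_loss(a, p, n_near)  # margin violated
print(loss_far, loss_near)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triplets contribute gradient during training.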
Our approach
● Use very large resolutions and sparse representations.
● Use triplet learning for 3D shapes.
● Use the large-scale shape datasets ModelNet and ShapeNet.
Represent voxel shape as vector
Obligatory t-SNE
Conclusions
● Voxels can work for small shape datasets or sparse 3D tensors.
● Voxels don’t scale to hundreds of “classes” and lose texture information.
● They cannot encode complicated object domains.
Problems for the next 5 years
Autonomous Vehicles
Burnaev and Notchenko. Skoltech. Bridging gap between 2D and 3D with Deep Learning
Augmented (Mixed) Reality
Robotics in human
environments
Robotic Control in Human Environments
Commodity sensors to create 2.5D images
Intel RealSense Series
Asus Xtion Pro
Microsoft Kinect v2
Structure Sensor
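A 2.5D image stores one depth value per pixel, and it can be lifted to a 3D point cloud with a pinhole camera model. The sketch below is an illustration only: the function name `depth_to_points` is hypothetical, and the intrinsics (`fx`, `fy`, `cx`, `cy`) are made-up values, not parameters of any sensor listed above.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (metres) into an (N, 3) point cloud.

    Assumes a pinhole camera; pixels with depth 0 are treated as
    missing measurements and dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Hypothetical 4x6 depth image of a flat wall 2 m away, one missing pixel.
depth = np.full((4, 6), 2.0)
depth[0, 0] = 0.0

pts = depth_to_points(depth, fx=500.0, fy=500.0, cx=2.5, cy=1.5)
print(pts.shape)
```

Only surfaces visible from the camera produce points, which is the self-occlusion limitation of 2.5D data.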
What do they have in common?
They require understanding the whole scene
Problem of “Holistic” Scene understanding
Lin, D., S. Fidler, and R. Urtasun. "Holistic scene understanding for 3D object detection with RGBD cameras." Proceedings of the IEEE International Conference on Computer Vision, pp. 1417-1424. 2013.
● Human environments are often designed by humans
● Most of the objects are created by humans
● Context provides information through joint probability functions
● Textures are caused by materials and can therefore explain the function and structure of an object
Connecting 3 families of CV algorithms is inevitable
● Learnable Computer Vision Systems (Deep Learning)
● Geometric Computer Vision (SLAMs)
● Probabilistic Computer Vision (Bayesian methods)
→ Probabilistic Inverse Graphics
Probabilistic Inverse Graphics
● Takes into account setting information (shop: shelves and products | street: buildings, cars, pedestrians)
● Makes maximum likelihood estimates from data and model (or gives directions on how best to reduce uncertainty)
● Learns the structure of objects (materials and textures / 3D shape / intrinsic dynamics)
Thank you.
Alexandr Notchenko Ermek Kapushev Evgeny Burnaev
Citations and Links
1. Hinton, Geoff, Yoshua Bengio, and Yann LeCun. "Deep Learning." NIPS 2015 Tutorial.
2. Wu, Z., S. Song, A. Khosla, F. Yu, L. Zhang, X. Tang, and J. Xiao. "3D ShapeNets: A Deep Representation for Volumetric Shapes." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912-1920. 2015.
3. Nash, C., and C. Williams. "Generative Models of Part-Structured 3D Objects."
4. Qin, Fei-wei, et al. "A deep learning approach to the classification of 3D CAD models." Journal of Zhejiang University SCIENCE C 15.2 (2014): 91-106.
5. Engel, Jakob, Thomas Schöps, and Daniel Cremers. "LSD-SLAM: Large-scale direct monocular SLAM." European Conference on Computer Vision. Springer International Publishing, 2014.
6. Su, Hang, et al. "Multi-view convolutional neural networks for 3D shape recognition." Proceedings of the IEEE International Conference on Computer Vision. 2015.
7. Whelan, Thomas, et al. "ElasticFusion: Dense SLAM Without A Pose Graph." Robotics: Science and Systems. Vol. 11. 2015.
8. Notchenko, Alexandr, Ermek Kapushev, and Evgeny Burnaev. "Sparse 3D Convolutional Neural Networks for Large-Scale Shape Retrieval." arXiv preprint arXiv:1611.09159 (2016).
9. Newcombe, Richard A., Dieter Fox, and Steven M. Seitz. "DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
10. Gupta, Saurabh, et al. "Learning rich features from RGB-D images for object detection and segmentation." European Conference on Computer Vision. Springer International Publishing, 2014.

