Summary of survey papers on deep learning method to 3D data

2019/09/12
Survey of 3D Deep Learning (Robo+R3 Study Group)
Arithmer R3 Div. Takashi Nakano

Self-Introduction
• Takashi Nakano
• Graduate School
• Kyoto University
• Laboratory : Nuclear Theory Group
• Research : Theoretical Physics, Ph.D. (Science)
• Phase structure of the universe
• Theoretical properties of Lattice QCD
• Phase structure of graphene
• Former Job
• KOZO KEIKAKU ENGINEERING Inc.
• Contract analysis, technical support, and adoption support using fluid dynamics and powder engineering software
• Current Job
• Application of machine learning / deep learning to fluid dynamics
• e.g. https://arithmer.co.jp/2019-12-29-1/
• Application of machine learning / deep learning to 3D data

Purpose
• Purpose of this material
• Overview of 3D deep learning
• Comparison between the methods of 3D deep learning
• Main papers (this material summarizes the following papers and the papers cited therein)
• E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations",
2018
• M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016

Application
• Application of 3D Deep Learning
• Classification
• Segmentation
  • Per-point classification
• Correspondence
  • A label at each vertex
  • Same #vertices in each model
• Retrieval
  • Comparison of global features
• Others
  • 3D data restoration from 2D images, pose estimation, etc.
(Figures from [1] and [2])
[1] C. R. Qi et al., "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation", 2016
[2] J. Masci et al., "Geodesic convolutional neural networks on Riemannian manifolds", 2015

Agenda
• Methods of 3D Deep Learning
• Euclidean vs Non-Euclidean
• Euclidean Method
• Projections / Multi-View
• Voxel
• Non-Euclidean Method
• Point Cloud / Mesh / Graph
• Accuracy
• Dataset / Material
• Appendix
• Mesh Generation
• Laplacian on Graph
• Correspondence

3D Data
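• 3D Data
Type | Vertex | Face | Edge | Sets | Data
Point Cloud | 〇 | - | - | $\mathcal{V}$ | $[(x_0, y_0, z_0), \dots, (x_N, y_N, z_N)]$
Mesh | 〇 | 〇 | - | $\mathcal{V}, \mathcal{F}$ | $[(V_{00}, V_{01}, V_{02}), \dots, (V_{F0}, V_{F1}, V_{F2})]$
Graph | 〇 | - | 〇 | $\mathcal{V}, \mathcal{E}$ | $[(V_{00}, V_{01}), (V_{01}, V_{02}), \dots, (V_{E2}, V_{E1})]$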

Representation
• Representation of 3D data
[1] E. Ahmed et al, "A survey on Deep Learning Advances on Different 3D Data Representations", 2018
[1]

Representation
Aspect | Euclidean | Non-Euclidean
Structure | Grid (translation invariant) | Non-grid (not translation invariant)
Feature | Global / extrinsic | Local / intrinsic
Point of view | 3D | 2D surface
Deformation | Rigid (small deformation) | Non-rigid (large deformation)
Network | 2D CNN / 3D CNN (〇) | Non-trivial CNN (×)
[1]

Euclidean vs Non-Euclidean
• Euclidean
[1]

• Euclidean (detail of feature)
• Descriptors
  • Feature: extraction of 3D topological features (SHOT, PFH, etc.)
  • Merit: the shape can be converted to a feature; can be used for each problem
  • Demerit: the geometric properties of the shape are lost
• Projections
  • Feature: projection of 3D to 2D
  • Merit: -
  • Demerit: the geometric properties of the shape are lost
• RGB-D
  • Feature: RGB + depth map
  • Merit: data from RGB-D sensors (Kinect / RealSense) can be used as input
  • Demerit: a depth map is needed; only some of the 3D properties can be inferred from the depth
• Volumetric
  • Feature: voxelization
  • Merit: extension of 2D CNN (grid information)
  • Demerit: large memory is needed; high resolution is needed for detailed shapes (e.g. segmentation)
• Multi-View
  • Feature: 2D images from multiple angles
  • Merit: highest accuracy among the Euclidean methods
  • Demerit: multi-view images are needed

• Non-Euclidean
• Point Cloud ($\mathcal{V}$)
  • Unordered point cloud: $[(x_0, y_0, z_0), (x_1, y_1, z_1)]$ and $[(x_1, y_1, z_1), (x_0, y_0, z_0)]$ are the same data
  • No connectivity information between points
  • Depends on the noise and density of the point cloud
• Mesh ($\mathcal{V}, \mathcal{F}$)
  • Connectivity information between points
  • Need to convert from point cloud to mesh
• Graph ($\mathcal{V}, \mathcal{E}$)
  • Graph (vertex, edge)
  • Need to create a graph-type CNN
[1] R. Hanocka et al., "MeshCNN: A Network with an Edge", 2018

• Non-Euclidean (detail of feature)
• Point Cloud
  • Feature: treats point clouds; translational and rotational invariance must be kept; treats unordered point clouds; no connectivity information between points
  • Merit: the original data is often a point cloud, e.g. scanned data (no CAD data, terrain data); civil engineering, architecture, medical care, fashion
  • Demerit: noise must be handled; depends on the density of the point cloud; the space between points must be complemented; nearby points cannot be distinguished
• Mesh
  • Feature: treats mesh data; connectivity information between points; mesh data must be converted to a structure to which a CNN can be applied
  • Merit: CAD data, e.g. design in manufacturing; geometry can be kept with few mesh elements
  • Demerit: a point cloud must be converted to mesh data
• Graph
  • Feature: treats a mesh as a graph; vertex (node); edge (connectivity information between points)
  • Merit: same as Mesh
  • Demerit: a graph-type CNN must be created (non-trivial)

Euclidean
[1]

Euclidean
• Each Euclidean Method (Projections / RGB-D / Volumetric / Multi-View)
Method | Application | Link
DeepPano | Classification | Paper
Two-stream CNNs on RGB-D | Classification | Paper
VoxNet | Classification | Paper, GitHub (Keras)
MVCNN | Classification, Retrieval | Paper, GitHub (PyTorch / TensorFlow etc.)

Euclidean
• DeepPano [1]
• Projection to a panoramic image
• Row-wise max-pooling for rotational invariance
[1] B. Shi et al. "DeepPano: Deep Panoramic Representation for 3D Shape Recognition", 2017
[1]
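The effect of row-wise max-pooling can be illustrated with a small sketch. It assumes that rotating the shape about the panorama axis shifts the panoramic feature map circularly along its columns; the function names and the toy feature map are hypothetical and are not taken from the DeepPano code.

```python
def row_wise_max(feature_map):
    """Take the maximum over the columns of each row (one value per row)."""
    return [max(row) for row in feature_map]


def rotate_columns(feature_map, shift):
    """Circularly shift every row; models a rotation about the panorama axis."""
    return [row[-shift:] + row[:-shift] for row in feature_map]


# Toy 3 x 4 panoramic feature map (hypothetical values).
panorama = [
    [0.1, 0.7, 0.3, 0.2],
    [0.9, 0.4, 0.5, 0.0],
    [0.2, 0.2, 0.8, 0.6],
]

pooled = row_wise_max(panorama)
pooled_rotated = row_wise_max(rotate_columns(panorama, 1))
```

Because the maximum of a row does not depend on the order of its entries, `pooled` and `pooled_rotated` are identical, which is the invariance the slide refers to.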

Euclidean
• Two-stream CNNs on RGB-D [1]
• Concatenate the features of the RGB CNN stream and the depth-map CNN stream
[1] A. Eitel et al., "Multimodal Deep Learning for Robust RGB-D Object Recognition", 2015

Euclidean
• VoxNet [1]
• Voxelization of a 3D point cloud into a voxel grid (Point Cloud → Voxel)
• Not robust to data loss
[1] D. Maturana et al. "VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition", 2015
[1]
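Voxelization itself can be sketched in a few lines. The sketch below builds a binary occupancy grid, which is only the simplest possible variant and an assumption made here for illustration; `voxelize`, the grid size, and the bounds are hypothetical names and values.

```python
def voxelize(points, grid_size, lower, upper):
    """Return a grid_size^3 nested list with 1 in every cell that holds a point."""
    grid = [[[0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)]
    scale = grid_size / (upper - lower)
    for point in points:
        index = []
        for coord in point:
            cell = int((coord - lower) * scale)
            # Clamp so that points on or beyond the bounds land in a border cell.
            index.append(min(max(cell, 0), grid_size - 1))
        grid[index[0]][index[1]][index[2]] = 1
    return grid


# Two nearby points share one voxel; the third point is far away.
points = [(0.1, 0.1, 0.1), (0.12, 0.11, 0.1), (0.9, 0.9, 0.9)]
grid = voxelize(points, grid_size=4, lower=0.0, upper=1.0)
occupied = sum(v for plane in grid for row in plane for v in row)
```

The two nearby points collapse into a single cell, which shows how detail below the voxel size is lost and why high resolution (and therefore memory) is needed for detailed shapes.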

Euclidean
• MVCNN [1]
• Merge the CNN features of the individual view images
[1] H. Su et al. "Multi-view Convolutional Neural Networks for 3D Shape Recognition", 2015
[1]
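One way to merge per-view features is an element-wise maximum over the views (view pooling). The sketch below assumes each view has already been encoded into a feature vector of the same length; the values are hypothetical.

```python
def view_pool(view_features):
    """Element-wise maximum over the feature vectors of all views."""
    return [max(values) for values in zip(*view_features)]


# Three views, each encoded as a 3-dimensional feature vector (toy values).
view_features = [
    [0.1, 0.9, 0.3],
    [0.4, 0.2, 0.8],
    [0.0, 0.5, 0.6],
]
shape_descriptor = view_pool(view_features)
```

The pooled descriptor does not depend on the order of the views, so the views can be rendered in any order.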

Non-Euclidean (Point Clouds)
[1]

• Each Non-Euclidean Method (Point Cloud)
Method | Application | Link
PointNet | Classification, Segmentation, Retrieval, Correspondence | Paper, GitHub (TensorFlow)
PointNet++ | Classification, Segmentation, Retrieval, Correspondence | Paper, GitHub (TensorFlow), PyTorch-geometric (PointConv)
Dynamic Graph CNN (DGCNN) | Classification, Segmentation | Paper, GitHub (PyTorch / TensorFlow), PyTorch-geometric (DynamicEdgeConv)
PointCNN | Classification, Segmentation | Paper, GitHub (TensorFlow), PyTorch-geometric (XConv)
※ Some equations on the following pages are taken from the PyTorch-geometric documentation.
(https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html)
PyTorch-geometric is explained on a later page.

• PointNet [1]
• Treats unordered point clouds by max-pooling
• Compared with PointNet++
  • Detailed information is lost
  • Cannot treat point clouds of different density
• Symmetry function: $f(x_1, \dots, x_n) = g(h(x_1), \dots, h(x_n))$
  • $x_i$: input feature, $h$: MLP, $g$: max-pooling
• Affine transformation
  • Predicts an affine transformation (translational, rotational invariance)
  • Similar to Spatial Transformer Networks in 2D
• Classification: uses the global feature
• Part segmentation (per-point classification): uses the global + local features
• Preprocessing: randomly rotating the object along the up-axis, normalization in unit square
[1]
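The symmetry function $f(x_1, \dots, x_n) = g(h(x_1), \dots, h(x_n))$ can be sketched with a toy per-point map and a max over points. The fixed weight matrix below is a hypothetical stand-in for the learned shared MLP $h$; only the structure (shared per-point function followed by max-pooling) follows the slide.

```python
def h(point, weights):
    """Toy per-point feature: ReLU of a linear map (stand-in for the shared MLP)."""
    return [max(0.0, sum(w * c for w, c in zip(row, point))) for row in weights]


def global_feature(points, weights):
    """g = channel-wise max-pooling over the per-point features."""
    per_point = [h(p, weights) for p in points]
    return [max(channel) for channel in zip(*per_point)]


# Hypothetical weights and a 3-point cloud.
weights = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.5], [0.5, 0.5, 0.5]]
cloud = [(0.0, 1.0, 2.0), (1.0, 0.0, -1.0), (-2.0, 0.5, 0.0)]

feature = global_feature(cloud, weights)
feature_reordered = global_feature(list(reversed(cloud)), weights)
```

Reordering the points leaves the global feature unchanged, which is how max-pooling handles the unordered nature of a point cloud.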

• PointNet
• T-Net [1]
• Similar to Spatial Transformer Networks in 2D
• Spatial Transformer Networks
  • Alignment of an image (translation, rotation, distortion, etc.) by a spatial transformation
  • Learns the affine transformation from the input data (special data is not necessarily required)
  • Can be inserted at any point between network layers
Reference | Contents
Paper | Original paper
Sample (PyTorch) | Dataset: MNIST
[1] M. Jaderberg et al., "Spatial transformer networks", 2015

• PointNet
• Spatial Transformer Networks
• Localization net : output parameters 𝜃 to transform for input feature map 𝑈
• Combination of Conv, MaxPool, ReLU, FC
• Output : 2 × 3
• Grid generator : create sampling grid by using the parameters
• Sampler : Output transformed feature map 𝑉
• pixel
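Spatial Transformer Networks (2D)
• Grid generator: maps each coordinate of the output (transformed) feature map to a coordinate of the input feature map
$$\begin{pmatrix} x_i^s \\ y_i^s \end{pmatrix} = \mathcal{T}_\theta(G_i) = A_\theta \begin{pmatrix} x_i^t \\ y_i^t \\ 1 \end{pmatrix}, \qquad A_\theta \in \mathbb{R}^{2 \times 3}$$
• $(x_i^s, y_i^s)$: source coordinates in the input feature map, $(x_i^t, y_i^t)$: target coordinates in the output feature map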
[1] M. Jaderberg et al. "Spatial transformer networks",2015
[1][1]
[1]
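The grid generator can be sketched as follows, assuming normalized coordinates in $[-1, 1]$ and a $2 \times 3$ matrix $A_\theta$ given as nested lists. `affine_grid` here is a hypothetical helper written for this illustration, not a library call, and it needs at least two rows and two columns.

```python
def affine_grid(theta, height, width):
    """For each target pixel (x_t, y_t), compute the source coordinate (x_s, y_s)."""
    grid = []
    for r in range(height):
        y_t = -1.0 + 2.0 * r / (height - 1)
        row = []
        for c in range(width):
            x_t = -1.0 + 2.0 * c / (width - 1)
            # (x_s, y_s)^T = A_theta (x_t, y_t, 1)^T
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            row.append((x_s, y_s))
        grid.append(row)
    return grid


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
shift_x = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0]]

grid_identity = affine_grid(identity, 3, 3)
grid_shifted = affine_grid(shift_x, 3, 3)
```

With the identity parameters every target pixel samples from its own location; with `shift_x` the center pixel samples from a point displaced by 0.5 along x. The sampler would then interpolate the input feature map at these source coordinates.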

• PointNet
• T-Net
• 3D version of Spatial Transformer Networks in 2D
• No sampling grid is needed (a 3D point cloud has no grid structure)
• The transformation is applied directly to each point
• Output parameters
  • 3 × 3 in the first T-Net (input feature: 3)
  • 64 × 64 in the second T-Net (input feature: 64)
[1]

• PointNet++ [1]
• Compared with PointNet
  • Detailed information is kept
  • Can treat point clouds of different density
• Concatenation of multi-resolution features
[1] C. R. Qi et al., "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017

• PointNet++
• Set abstraction
• Grouping in one scale + feature extraction
• Sampling Layer : Extraction of sampling points by farthest point sampling (FPS)
• Grouping Layer : Grouping points around sampling points
• PointNet Layer : Applying PointNet
(Figure: sampling layer and grouping layer; points within radius $r$ of each sampling point are grouped)
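The sampling and grouping layers can be sketched with farthest point sampling followed by a radius (ball) query. This is a minimal pure-Python sketch on a toy point set; the function names and the starting index are assumptions, and the PointNet layer that would follow is omitted.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices, each farthest from the points chosen so far."""
    chosen = [start]
    dist = [sq_dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: dist[i])
        chosen.append(nxt)
        # Keep, for every point, the distance to its nearest chosen point.
        dist = [min(d, sq_dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen


def ball_query(points, center, radius):
    """Indices of all points within `radius` of `center` (grouping layer)."""
    return [i for i, p in enumerate(points) if sq_dist(p, center) <= radius ** 2]


points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (1.0, 0.0, 0.0),
          (1.0, 1.0, 0.0), (0.0, 1.0, 0.1)]
samples = farthest_point_sampling(points, k=3)
group = ball_query(points, points[samples[0]], radius=0.2)
```

Farthest point sampling spreads the sampling points over the shape, and each group collects the local neighborhood to which PointNet is then applied.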

• PointNet++
• Point Feature Propagation for segmentation
• Interpolation: interpolation from the $k$ nearest neighbor points ($k = 3$)
$$f^{(j)}(x) = \frac{\sum_{i=1}^{k} w_i(x)\, f_i^{(j)}}{\sum_{i=1}^{k} w_i(x)}, \qquad w_i(x) = \frac{1}{d(x, x_i)^2}$$
  • $f_i^{(j)}$: feature, $w_i(x)$: weight (inverse of the distance), $d(x, x_i)$: distance between $x$ and $x_i$
• Concatenation
[1] C. R. Qi et al., "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017
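The interpolation step can be written directly from the formula with $k = 3$ and inverse squared distance weights. The small `eps` that guards against division by zero and the toy data are assumptions of this sketch.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def interpolate_feature(query, known_points, known_feats, k=3, eps=1e-12):
    """Inverse-squared-distance weighted average over the k nearest known points."""
    order = sorted(range(len(known_points)),
                   key=lambda i: sq_dist(query, known_points[i]))[:k]
    weights = [1.0 / (sq_dist(query, known_points[i]) + eps) for i in order]
    total = sum(weights)
    dim = len(known_feats[0])
    return [sum(w * known_feats[i][c] for w, i in zip(weights, order)) / total
            for c in range(dim)]


known_points = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (10.0, 10.0, 10.0)]
known_feats = [[1.0], [3.0], [5.0], [100.0]]
value = interpolate_feature((0.0, 0.0, 0.0), known_points, known_feats)
```

The far-away fourth point is not among the three nearest neighbors, so it does not influence the interpolated feature; the nearer a known point is, the larger its weight.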

• PointNet++
• Single scale grouping
• Multi scale / resolution grouping
  • Combination of features from different scales
  • Robust to non-uniform sampling density
  • Modifies the architecture at the set abstraction level
• Multi-scale grouping (MSG)
  • High computational cost
• Multi-resolution grouping (MRG)
  • Concatenation of multi-resolution information (features from the $L_{i-1}$ level and from the original points give the $L_i$ level)
  • Recommendation
[1] C. R. Qi et al., "Pointnet++: Deep hierarchical feature learning on point sets in a metric space", 2017

• PointNet++
• Detail of architecture
• Architecture for classification and part segmentation of ModelNet using single scale grouping
• Note: #vertex is fixed (#input vertex: 1024)
• Set abstraction (same in cls. and seg.)
  $SA(512, 0.2, [64, 64, 128]) \to SA(128, 0.2, [64, 64, 128]) \to SA([256, 512, 1024])$
  • #vertex: 512 = 1024/2, 128 = 512/4
• For classification
  $\to FC(512, 0.5) \to FC(256, 0.5) \to FC(K)$ (class)
• For part segmentation
  $\to FP(256, 256) \to FP(256, 128) \to FP(128, 128, 128, 128, K)$ (per point segmentation)
• Notation
  • $SA(K, r, [\ell_1, \dots, \ell_d])$: set abstraction level with #vertex $K$, radius $r$, and a PointNet with $d$ fully connected layers (#FC: $d$)
  • $SA([\ell_1, \dots, \ell_d])$: global set abstraction level (#FC: $d$); converts the set to a single vector by max-pooling
  • $FC(\ell, dp)$: fully connected layer with $\ell$ channels and dropout ratio $dp$
  • $FP(\ell_1, \dots, \ell_d)$: feature propagation level (#FC: $d$)

• PointNet++
• Detail of architecture
• Architecture for classification of ModelNet using multi-resolution grouping (MRG)
• Note: #vertex is fixed (#input vertex: 1024)
• Branch 1: $SA(512, 0.2, [64, 64, 128]) \to SA(64, 0.4, [128, 128, 256])$
• Branch 2: $SA(512, 0.4, [64, 128, 256])$
• Concat. of branches 1 and 2 $\to SA([256, 512, 1024])$
• Branch 3: $SA([64, 128, 256, 512])$
• Concat. $\to FC(512, 0.5) \to FC(256, 0.5) \to FC(K)$ (class); same as single scale grouping

• Dynamic Graph CNN (DGCNN) [1]
• PointNet with EdgeConv
• EdgeConv
  • Creates the local edge structure dynamically (not fixed in each layer)
  • Searches neighbors in feature space by kNN
$$\mathbf{x}_i' = \sum_{j \in N(i)} h_{\boldsymbol{\Theta}}(\mathbf{x}_i, \mathbf{x}_j - \mathbf{x}_i)$$
  • $\mathbf{x}_i$: global, $\mathbf{x}_j - \mathbf{x}_i$: local
[1] Y. Wang et al., "Dynamic Graph CNN for Learning on Point Clouds", 2018
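An EdgeConv step can be sketched as a kNN search in feature space followed by an aggregation of edge features built from $(\mathbf{x}_i, \mathbf{x}_j - \mathbf{x}_i)$. In this sketch the learned $h_{\boldsymbol{\Theta}}$ is replaced by a plain concatenation (`h_toy`) and the aggregation is a channel-wise max; both are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn(features, i, k):
    """Indices of the k nearest neighbors of point i in feature space."""
    others = (j for j in range(len(features)) if j != i)
    return sorted(others, key=lambda j: sq_dist(features[i], features[j]))[:k]


def h_toy(x_i, diff):
    """Stand-in for the learned MLP: concatenate global and local parts."""
    return list(x_i) + list(diff)


def edge_conv(features, k, h):
    """Aggregate h(x_i, x_j - x_i) over the kNN graph with a channel-wise max."""
    out = []
    for i, x_i in enumerate(features):
        messages = [h(x_i, [b - a for a, b in zip(x_i, features[j])])
                    for j in knn(features, i, k)]
        out.append([max(channel) for channel in zip(*messages)])
    return out


features = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
new_features = edge_conv(features, k=2, h=h_toy)
```

Because the neighbors are recomputed from the current features, applying `edge_conv` again on `new_features` would build a different graph, which is the "dynamic" part of the method.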

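• PointCNN [1]
• Downsamples information from neighborhoods into fewer representative points
• Χ-Conv.
  • Lower resolution, deeper channels (decreasing #representative points, deeper channels)
$$\mathbf{x}_i' = \mathrm{Conv}\left(\mathbf{K},\ \gamma_{\Theta}(\mathbf{P}_i - \mathbf{p}_i) \times \left(h_{\Theta}(\mathbf{P}_i - \mathbf{p}_i),\ \mathbf{x}_i\right)\right)$$
  • $\mathbf{K}$: kernel
  • $h_{\Theta}$: MLP applied individually on each point, like PointNet
  • $\mathbf{x}_i$: input feature
  • $(\cdot, \cdot)$: concatenation
[1] Y. Li et al., "PointCNN: Convolution On X-Transformed Points", 2018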

Non-Euclidean (Mesh)
[1]

• Each Non-Euclidean Method (Mesh)
Method | Application | Link
MeshCNN | Classification, Segmentation | Paper, GitHub (PyTorch)
MeshNet | Classification | Paper, GitHub (PyTorch)

• MeshCNN [1]
• Edge collapse by pooling
• Can be applied only to manifold meshes
• Input feature (per edge): angle, length
• Pooling / Unpooling (unpooling is used in segmentation)
[1] R. Hanocka et al., "MeshCNN: A Network with an Edge", 2018

• MeshNet [1]
• Input feature (per face)
  • Center, corner, normal, neighbor index (information on the neighborhood of the face)
• Mesh Conv. (Combination + Aggregation)
[1] Y. Feng et al., "MeshNet: Mesh Neural Network for 3D Shape Representation", 2018

Non-Euclidean (Graph)
[1]

• Each Non-Euclidean Method (Graph)
• Spectral / Spatial Method
• Spectral: generalization of the Fourier basis
  • Non-Euclidean (manifold): $\Delta \phi_i = \lambda_i \phi_i$
  • Euclidean (1D): $\phi_i = e^{i \omega x}$, $\lambda_i = \omega^2$
• Spatial: local coordinate
  • Patch operator: $D_j(x) f = \sum_{y \in N(x)} \omega_j(\mathbf{u}(x, y))\, f(y)$, with pseudo-coordinate $\mathbf{u}(x, y)$
  • Convolution: $(f * g)(x) = \sum_{j=1}^{J} g_j\, D_j(x) f$
(Figures from [1], [2], [3])
[1] M. M. Bronstein et al., "Geometric deep learning: going beyond Euclidean data", 2016
[3] M. Fey et al., "SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels", 2017

• The spatial method is more useful than the spectral method.
• Spectral
  • Structure: Fourier basis on the manifold; Laplacian eigenvalues / eigenvectors
  • Feature: spectral filter coefficients are basis-dependent in some methods; no locality in some methods; high computational cost
• Spatial
  • Structure: creates a local coordinate; patch operator + Conv.
  • Feature: locality; efficient computational cost
※ Some equations on the following pages are taken from the PyTorch-geometric documentation.
(https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html)
PyTorch-geometric is explained on a later page.

• Spectral, Spectral free
Method | Type | Application | Link
Spectral CNN | Spectral | Graph | Paper
Chebyshev Spectral CNN (ChebNet) | Spectral free | Graph | Paper, GitHub (TensorFlow), PyTorch-geometric (ChebConv)
Graph Convolutional Network (GCN) | | | PyTorch-geometric (GCNConv)
Graph Neural Network (GNN) | | |

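• Spectral CNN [2]
• Cannot be used across different shapes (different shape → different basis → different result)
• Spectral filter coefficients are basis-dependent
• High computational cost
• No locality
• Laplacian: $(\Delta f)_i \propto \sum_{(i,j) \in \mathcal{E}} \omega_{ij} (f_i - f_j)$
• Spectral convolution layer:
$$\mathbf{f}_{\ell}^{out} = \xi\left( \sum_{\ell'=1}^{p} \boldsymbol{\Phi}_k \mathbf{G}_{\ell,\ell'} \boldsymbol{\Phi}_k^{T} \mathbf{f}_{\ell'}^{in} \right)$$
  • $\boldsymbol{\Phi}_k$: Laplacian eigenvectors, $\xi$: ReLU
[2] J. Bruna et al., "Spectral Networks and Locally Connected Networks on Graphs", 2013
[1]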

• Chebyshev Spectral CNN (ChebNet) [1]
  • Does not calculate the Laplacian eigenvectors directly
  • Locality (K hops)
  • Approximates the filter as a polynomial
• Graph Convolutional Network (GCN) [2]
  • Special version of ChebNet ($K = 2$)
$$X' = \sum_{k=0}^{K-1} Z^{(k)} \cdot \Theta^{(k)}$$
$$Z^{(0)} = X, \qquad Z^{(1)} = L \cdot X, \qquad Z^{(k)} = 2 \cdot L \cdot Z^{(k-1)} - Z^{(k-2)}$$
  • $L$: scaled and normalized Laplacian
[1] M. Defferrard et al., "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering", 2016
[2] T. N. Kipf et al., "Semi-Supervised Classification with Graph Convolutional Networks", 2016
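The Chebyshev recurrence can be checked on a tiny graph with plain Python lists. The sketch assumes the scaled and normalized Laplacian is already given (here for a 2-node graph with one edge, where the normalized Laplacian has largest eigenvalue 2, so scaling gives $L = [[0, -1], [-1, 0]]$); the filter weights are hypothetical numbers, not learned values.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cheb_conv(l_scaled, x, thetas):
    """X' = sum_k Z(k) Theta(k), Z(0)=X, Z(1)=L X, Z(k)=2 L Z(k-1) - Z(k-2)."""
    z_prev, z_curr = None, x
    out = matmul(x, thetas[0])
    for k in range(1, len(thetas)):
        if k == 1:
            z_next = matmul(l_scaled, x)
        else:
            lz = matmul(l_scaled, z_curr)
            z_next = [[2.0 * a - b for a, b in zip(r1, r2)]
                      for r1, r2 in zip(lz, z_prev)]
        z_prev, z_curr = z_curr, z_next
        term = matmul(z_next, thetas[k])
        out = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(out, term)]
    return out


l_scaled = [[0.0, -1.0], [-1.0, 0.0]]   # scaled Laplacian of a 2-node graph
x = [[1.0], [2.0]]                      # one scalar feature per node
thetas = [[[0.5]], [[0.25]], [[0.1]]]   # K = 3 hypothetical 1x1 filter weights
x_out = cheb_conv(l_scaled, x, thetas)
```

Only matrix products with $L$ are needed, so no eigendecomposition is computed, and each additional term reaches one hop further in the graph.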

• Charting
Method | Application | Link
Geodesic CNN | Mesh; shape retrieval / correspondence | Paper
Anisotropic CNN | Mesh / point cloud; shape correspondence | Paper
MoNet | Graph / mesh / point cloud | Paper, PyTorch-geometric (GMMConv)
SplineCNN | Graph / mesh; classification | Paper, GitHub (PyTorch), PyTorch-geometric (SplineConv)
FeaStNet | Graph / mesh; segmentation | Paper, PyTorch-geometric (FeaStConv)

• Geodesic CNN (GCNN)[1] ⊂ Anisotropic CNN (ACNN)[2] ⊂ MoNet [3]
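• Patch operator with pseudo-coordinate $\mathbf{u}(x, y)$: $D_j(x) f = \sum_{y \in N(x)} \omega_j(\mathbf{u}(x, y))\, f(y)$
• Convolution: $(f * g)(x) = \sum_{j=1}^{J} g_j\, D_j(x) f$
• Weight function $\omega_j(\mathbf{u})$ of each method
  • MoNet (learning parameters: $\boldsymbol{\mu}_j$, $\boldsymbol{\Sigma}_j$)
$$\omega_j(\mathbf{u}) = \exp\left( -\frac{1}{2} (\mathbf{u} - \boldsymbol{\mu}_j)^{T} \boldsymbol{\Sigma}_j^{-1} (\mathbf{u} - \boldsymbol{\mu}_j) \right)$$
  • ACNN ($\mathbf{R}_{\theta_j}$: rotation of $\theta$ to the maximum curvature direction, $\alpha$: the degree of anisotropy)
$$\omega_j(\mathbf{u}) = \exp\left( -\frac{1}{2} \mathbf{u}^{T} \mathbf{R}_{\theta_j} \begin{pmatrix} \alpha & 0 \\ 0 & 1 \end{pmatrix} \mathbf{R}_{\theta_j}^{T} \mathbf{u} \right)$$
  • GCNN (covariance in the radius and angle directions)
$$\omega_j(\mathbf{u}) = \exp\left( -\frac{1}{2} (\mathbf{u} - \mathbf{u}_j)^{T} \begin{pmatrix} \sigma_\rho^2 & 0 \\ 0 & \sigma_\theta^2 \end{pmatrix}^{-1} (\mathbf{u} - \mathbf{u}_j) \right)$$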
[2] D Boscaini et al. "Learning shape correspondence with anisotropic convolutional neural networks", 2016
[3] F. Monti et al. "Geometric deep learning on graphs and manifolds using mixture model CNNs", 2016
[1]
[4]
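The Gaussian weight function and the patch operator can be sketched for a diagonal covariance. The restriction to a diagonal $\boldsymbol{\Sigma}_j$, the scalar features, and all numbers below are assumptions of this illustration; in MoNet the mean and covariance would be learned.

```python
import math


def gaussian_weight(u, mu, sigma2):
    """exp(-1/2 (u - mu)^T Sigma^{-1} (u - mu)) with Sigma = diag(sigma2)."""
    quad = sum((a - m) ** 2 / s for a, m, s in zip(u, mu, sigma2))
    return math.exp(-0.5 * quad)


def patch_operator(pseudo_coords, neighbor_feats, mu, sigma2):
    """D_j(x) f = sum over neighbors y of w_j(u(x, y)) * f(y), scalar f."""
    return sum(gaussian_weight(u, mu, sigma2) * f
               for u, f in zip(pseudo_coords, neighbor_feats))


# Pseudo-coordinates u(x, y) of three neighbors and their scalar features.
pseudo_coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
neighbor_feats = [2.0, 4.0, 6.0]
mu, sigma2 = (0.0, 0.0), (1.0, 4.0)

patch_value = patch_operator(pseudo_coords, neighbor_feats, mu, sigma2)
```

A neighbor whose pseudo-coordinate coincides with the mean gets weight 1, and the weight decays with the covariance-scaled distance, so each kernel $j$ focuses on one region of the local patch.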

• Geodesic CNN (GCNN)
  • Creates a local coordinate
  • A meaningful chart is not guaranteed (a chart with a small radius is needed)
• Anisotropic CNN (ACNN)
  • The Fourier basis is based on the anisotropic heat diffusion equation
• MoNet
  • Learns the filter as a parametric kernel
  • Generalization of geodesic CNN and anisotropic CNN

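• SplineCNN [1]
• Filter based on B-spline functions
• Efficient computational cost
$$\mathbf{x}_i' = \frac{1}{|N(i)|} \sum_{j \in N(i)} \mathbf{x}_j \cdot h_{\boldsymbol{\Theta}}(\mathbf{e}_{i,j})$$
  • $h_{\boldsymbol{\Theta}}$: weighted B-spline basis
[1]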

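• FeaStNet [1]
• Dynamically determines the relation between the filter weights and the local graph of a node
• Euclidean CNN: each filter weight is tied to a pixel position (e.g. $M = 3 \times 3 = 9$)
$$\mathbf{x}_i' = \sum_{m=1}^{M} \mathbf{W}_m \mathbf{x}_{n(m,i)}$$
• FeaStNet: each neighbor (#neighbor e.g. $N = 6$) is softly assigned to the $M$ filter weights
$$\mathbf{x}_i' = \frac{1}{|N(i)|} \sum_{j \in N(i)} \sum_{m=1}^{M} q_m(\mathbf{x}_i, \mathbf{x}_j)\, \mathbf{W}_m \mathbf{x}_j$$
$$q_m(\mathbf{x}_i, \mathbf{x}_j) = \mathrm{softmax}_j\left( \mathbf{u}_m^{T} (\mathbf{x}_i - \mathbf{x}_j) + c_m \right)$$
  • $\mathbf{W}_m$: filter ($D$ input features, $E$ output features), $q_m$: weight
[1] N. Verma et al., "FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis", 2017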

• PyTorch-geometric
• https://github.com/rusty1s/pytorch_geometric
• Library based on PyTorch
• For point clouds and meshes (not only graphs)
• Includes code for point cloud and graph-type approaches
  • PointNet++, DGCNN, PointCNN
  • ChebNet, GCN, MoNet, SplineCNN, FeaStNet
• Easy to obtain well-known sample datasets and convert them to a common data format
  • ModelNet, ShapeNet, etc.
• Many examples and benchmarks

Accuracy
• Accuracy (Classification)
• Around 90% for every method (except VoxNet)
Method | ModelNet40: Overall Acc. [%] / Mean class Acc. [%] | SHREC: Overall Acc. [%]
VoxNet | 85.9 / 83.0 | −
MVCNN | − / 90.1 | 96.09*
PointNet | 89.2 / 86.0 | −
PointNet++ | 90.7 / − | −
DGCNN | 92.9* / 90.2* | −
PointCNN | 92.2 / 88.1 | −
MeshNet | − / 91.9 | −
MeshCNN | − / − | 91.0
* Value emphasized in bold on the original slide
※ Please refer to each paper (cited on each page) for details.

Accuracy
• Accuracy (Segmentation)
• Columns
  • ShapeNet: part segmentation, mIoU (mean per-class part-averaged IoU) [%]
  • ScanNet: part segmentation, Acc. [%]
  • COSEG: part segmentation, Acc. [%]
  • S3DIS: scene segmentation, Acc. [%] / mIoU [%]
  • Human body: human body segmentation (including SCAPE, FAUST, etc.), Acc. [%]
Method | ShapeNet | ScanNet | COSEG | S3DIS | Human body
PointNet | 80.4 (57.9 − 95.3) | 73.9 | 54.4 − 91.5 | 78.6 / − | 90.77
PointNet++ | 81.9 (58.7 − 95.3) | 84.5 | 79.1 − 98.9 | − / − | −
DGCNN | 82.3 (63.5 − 95.7)* | − | − | 84.1* / 56.1 | −
PointCNN | 84.6* | 85.1* | − | − / 65.39* | −
MeshCNN | − | − | 97.56 − 99.63* | − / − | 92.30*
FeaStNet | 81.5 | − | − | − / − | −
* Value emphasized in bold on the original slide
※ Please refer to each paper (cited on each page) for details.

Dataset
• 3D Dataset
Dataset | Contents | Data Format | Purpose | PyTorch-geometric
ModelNet10/40 | 3D CAD model (10 or 40 classes) | Mesh (.OFF) | Classification | ModelNet
ShapeNet | 3D shape | Point cloud (.pts) | Segmentation | ShapeNet
ScanNet | Indoor scan data | Mesh (.ply) | Segmentation | -
S3DIS (original, .h5) | Indoor scan data | Point cloud | Segmentation | S3DIS
ScanNet: registration required
S3DIS: registration required (for original)

Dataset
• 3D Dataset
Dataset | Contents | Data Format | Purpose | PyTorch-geometric
SHREC | Many types, one for each contest | - | Retrieval | -
SHREC2016 | Animal, human (part data) | Mesh (.OFF) | Correspondence | SHREC2016
TOSCA | Animal, human | Mesh (same #vertices in each category; separate files for vertices and triangles) | Correspondence | TOSCA
PCPNet | 3D shape | Point cloud (.xyz), including normal and curvature files | Estimation of local shape (normal, curvature) | PCPNet
FAUST | Human body | Mesh | Correspondence | FAUST
FAUST: registration required

Material
• Material of 3D deep learning (3D / point cloud)
• A survey on Deep Learning Advances on Different 3D Data Representations
  • Review of 3D deep learning
  • Easy to read
  • Written from the viewpoint of Euclidean and non-Euclidean methods
• Paperwithcode
  • Papers with code about 3D
• Point Cloud Deep Learning Survey Ver. 2
  • Deep learning for point clouds
  • Survey of many papers

Material
• Material of 3D deep learning (graph)
• Geometric deep learning: going beyond Euclidean data
  • Review of geometric deep learning
• Geometric Deep Learning
  • Summary of papers and code about geometric deep learning
• Geometric Deep Learning on Graphs and Manifolds (NIPS2017)
  • Presentation (YouTube) about geometric deep learning

Summary
• There are many methods of 3D deep learning.
• Two main categories
  • Euclidean vs Non-Euclidean
• Euclidean Method
  • Projections / Multi-View / Voxel
• Non-Euclidean Method
  • Point Cloud / Mesh / Graph
• Each method has merits and demerits.
• We need to choose the most suitable method for each data type and application.
• Research on 3D deep learning is growing.

Appendix
• Appendix
• Mesh Generation
• Laplacian on Graph
• Correspondence

Appendix : Mesh Generation
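• Mesh Generation
• This appendix summarizes the following materials.
Link | Contents
Surface reconstruction from point clouds, Japan Society for Precision Engineering (点群面張り（精密工学会）) | Surface reconstruction
Mesh processing, Japan Society for Precision Engineering (メッシュ処理（精密工学会）) | Mesh processing
Kanto CV study group presentation: survey on point cloud reconstruction (CV勉強会＠関東発表資料点群再構成に関するサーベイ) | Survey of point cloud reconstruction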

• Difficulty of Mesh Generation
Processing | Difficulty
Pre-processing | Reduction of noise / missing data / abnormal values / density differences of vertices
Post-processing | Mesh smoothing / hole filling
(Figure: ground truth; noise / abnormal values; missing data / density difference of vertices; mesh smoothing / hole filling)

• Kinds of Mesh Generation
Kind | Feature | Classification of the method
Direct Triangulation | Direct mesh generation from the point cloud | Explicit method
Surface Smoothness | Smooth surface mesh from the point cloud | Implicit method

• Classification of the method
• In general, the implicit method is easier to use, since point clouds contain noise.
• Explicit method
  • Information used: vertices
  • Influence of noise and density of vertices: large (error of vertices = error of meshes)
  • Guarantee of accuracy: ◎
• Implicit method
  • Information used: meshes based on the isosurface of a function field calculated from the vertices
  • Influence of noise and density of vertices: small
  • Guarantee of accuracy: 〇

• Kinds of Mesh Generation (Detail)
• Direct Triangulation (examples of built-in functions in MeshLab)
Method | Feature
Voronoi-Based Surface Reconstruction | Creates a Delaunay diagram after adding the vertices obtained from the Voronoi diagram
Ball-Pivoting Algorithm | Rolls a ball over the point cloud and generates a mesh from the points located within a certain distance

• Voronoi-Based Surface Reconstruction
• Voronoi diagram
  • Regions divided by the perpendicular bisectors between vertices (in 2D)
• Delaunay triangulation
  • Triangulation by connecting vertices
• Example in 2D: vertices (S), Voronoi vertices (V), Delaunay triangulation of S and V (black + red lines), surface (black line)
[1] N. Amenta et al., "A New Voronoi-Based Surface Reconstruction Algorithm", 1998

• Ball-Pivoting Algorithm
  • Ideal data: the mesh is created
  • Sparse data: the mesh is not created
  • Close point cloud: the mesh is not created
[1] F. Bernardini et al., "The Ball-Pivoting Algorithm for Surface Reconstruction", 1999

• Kinds of Mesh Generation (Detail)
• Surface Smoothness (examples of built-in functions in MeshLab)
Method | Feature
Signed distance function + Marching Cubes | Creates a signed distance function from the distance between the vertices and the surface, then generates the mesh with Marching Cubes
Screened Poisson surface reconstruction (Poisson surface reconstruction) | Distinguishes the inside and outside of the surface by using the Poisson equation

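• Signed distance function + Marching Cubes
• Steps: oriented tangent planes → estimated signed distance → output of modified marching cubes
• Signed distance of a point $\mathbf{p}$ to the tangent plane with center $\mathbf{o}$ and normal $\mathbf{n}$:
$$f(\mathbf{p}) = (\mathbf{p} - \mathbf{o}) \cdot \mathbf{n}$$
  • $f(\mathbf{p}) > 0$ → outside
  • $f(\mathbf{p}) < 0$ → inside
  • $f(\mathbf{p}) = 0$ → surface
[1] H. Hoppe et al., "Surface Reconstruction from Unorganized Points", 1992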
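The signed distance $f(\mathbf{p}) = (\mathbf{p} - \mathbf{o}) \cdot \mathbf{n}$ and the inside / outside test can be sketched as follows. The sketch assumes a unit normal and uses a small tolerance to decide "on the surface"; both the tolerance and the example plane are hypothetical.

```python
def signed_distance(p, o, n):
    """f(p) = (p - o) . n for a tangent plane with center o and unit normal n."""
    return sum((pc - oc) * nc for pc, oc, nc in zip(p, o, n))


def classify(p, o, n, tol=1e-9):
    """Label a point as outside, inside, or on the surface of the plane."""
    d = signed_distance(p, o, n)
    if d > tol:
        return "outside"
    if d < -tol:
        return "inside"
    return "surface"


# Tangent plane z = 1 with the normal pointing in +z.
o, n = (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)
labels = [classify(p, o, n)
          for p in [(0.3, 0.2, 1.5), (0.0, 0.0, 0.2), (1.0, 1.0, 1.0)]]
```

Marching Cubes then extracts the isosurface where this function changes sign.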

• Screened Poisson surface reconstruction
• Obtains the indicator function $\chi$ by solving the Poisson equation
$$\Delta \chi \equiv \nabla \cdot \nabla \chi = \nabla \cdot \mathbf{V}$$
• Poisson surface reconstruction [1]
• Screened Poisson surface reconstruction [2]
• Complements the space between points of the point cloud
[1] M. Kazhdan et al., "Poisson Surface Reconstruction", 2006
[2] M. Kazhdan et al., "Screened Poisson Surface Reconstruction", 2013

Appendix : Laplacian on Graph
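• Laplacian on Graph [1]
• Graph (undirected): $(\mathcal{V}, \mathcal{E})$, $\mathcal{V} = \{1, \dots, n\}$, $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$
• Weights: $a_i$ (vertex), $\omega_{ij}$ (edge)
• Functions: $f: \mathcal{V} \to \mathbb{R}$, $F: \mathcal{E} \to \mathbb{R}$
• Grad.: $(\nabla f)_{ij} = f_i - f_j$
• Div.: $(\mathrm{div}\, F)_i = \frac{1}{a_i} \sum_{j:(i,j) \in \mathcal{E}} \omega_{ij} F_{ij}$
• Laplacian: $(\Delta f)_i = \frac{1}{a_i} \sum_{(i,j) \in \mathcal{E}} \omega_{ij} (f_i - f_j)$, $\Delta \equiv -\mathrm{div}\, \nabla$ → Laplacian eigenvalues $\lambda \ge 0$
• Mesh
$$\omega_{ij} = \frac{-\ell_{ij}^2 + \ell_{jk}^2 + \ell_{ki}^2}{8 a_{ijk}} + \frac{-\ell_{ij}^2 + \ell_{jh}^2 + \ell_{hi}^2}{8 a_{ijh}} = \frac{1}{2} (\cot \alpha_{ij} + \cot \beta_{ij})$$
$$a_i = \frac{1}{3} \sum_{jk:(i,j,k) \in \mathcal{F}} a_{ijk}$$
$$a_{ijk} = \left( s_{ijk} (s_{ijk} - \ell_{ij}) (s_{ijk} - \ell_{jk}) (s_{ijk} - \ell_{ki}) \right)^{1/2}, \qquad s_{ijk} = \frac{1}{2} (\ell_{ij} + \ell_{jk} + \ell_{ki})$$
[1]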

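• Laplacian (as matrix)
$$\Delta \mathbf{f} = \mathbf{A}^{-1} (\mathbf{D} - \mathbf{W}) \mathbf{f}$$
  • $\mathbf{f} = (f_1, \dots, f_n)^{T}$
  • $\mathbf{W} = (\omega_{ij})$
  • $\mathbf{A} = \mathrm{diag}(a_1, \dots, a_n)$
  • $\mathbf{D} = \mathrm{diag}\left( \sum_{j: j \neq i} \omega_{ij} \right)$
Laplacian | $\Delta$ | Condition
Unnormalized graph Laplacian | $\Delta = \mathbf{D} - \mathbf{W}$ | $\mathbf{A} = \mathbf{I}$
Normalized symmetric Laplacian | $\Delta = \mathbf{I} - \mathbf{D}^{-1/2} \mathbf{W} \mathbf{D}^{-1/2}$ | $\mathbf{A} = \mathbf{D}$ + normalization
Random walk Laplacian | $\Delta = \mathbf{I} - \mathbf{D}^{-1} \mathbf{W}$ | $\mathbf{A} = \mathbf{D}$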
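The three matrix Laplacians can be computed for a small graph as a check. The sketch assumes a symmetric weight matrix without self-loops and with no isolated vertex; the 3-node path graph is a toy example.

```python
import math


def laplacians(w):
    """Return (D - W, I - D^{-1/2} W D^{-1/2}, I - D^{-1} W) for weights w."""
    n = len(w)
    deg = [sum(row) for row in w]  # D = diag(sum_j w_ij), no self-loops assumed
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    unnormalized = [[deg[i] * eye[i][j] - w[i][j] for j in range(n)]
                    for i in range(n)]
    symmetric = [[eye[i][j] - w[i][j] / math.sqrt(deg[i] * deg[j])
                  for j in range(n)] for i in range(n)]
    random_walk = [[eye[i][j] - w[i][j] / deg[i] for j in range(n)]
                   for i in range(n)]
    return unnormalized, symmetric, random_walk


# Path graph 0 - 1 - 2 with unit edge weights.
w = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
l_unnorm, l_sym, l_rw = laplacians(w)
```

Every row of the unnormalized and random walk Laplacians sums to zero, so a constant function is mapped to zero, and the symmetric variant stays symmetric, which keeps its eigenvectors orthogonal.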

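• Convolution
$$(f * g)(x) = \sum_{i \ge 0} \hat{f}_i\, \hat{g}_i\, \phi_i(x)$$
  • $\hat{f}_i$, $\hat{g}_i$: coefficients of $f$ and $g$ in the Laplacian eigenbasis $\phi_i$
• Convolution (as matrix)
$$\mathbf{f} * \mathbf{g} = \boldsymbol{\Phi}\, \mathrm{diag}(\hat{\mathbf{g}})\, \boldsymbol{\Phi}^{T} \mathbf{f}$$
  • $\mathbf{f} = (f_1, \dots, f_n)^{T}$
  • $\hat{\mathbf{g}} = (\hat{g}_1, \dots, \hat{g}_n)$
  • $\boldsymbol{\Phi} = (\phi_1, \dots, \phi_n)$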

Appendix :Correspondence
• Correspondence [1]
• Input: query vertex $x_i$ and reference vertices $(y_1, \dots, y_N)$
• Label: each query vertex takes one of the reference vertices as its label
• Output (probability): $(p_1, p_2, \dots, p_N)$
• Correct label: one-hot vector $(1, 0, \dots, 0)$
• Loss: computed between the output probability and the correct label
[1]
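Treating correspondence as an $N$-class labeling problem can be sketched with a softmax over the reference vertices. The slide does not name the loss, so the cross-entropy used below is an assumption; the logits are hypothetical network outputs for one query vertex.

```python
import math


def softmax(logits):
    """Numerically stable softmax over the reference vertices."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def correspondence_loss(logits, label):
    """Cross-entropy between the softmax output and a one-hot label."""
    return -math.log(softmax(logits)[label])


# One query vertex scored against N = 3 reference vertices.
logits = [2.0, 0.0, 0.0]
probabilities = softmax(logits)
loss_correct = correspondence_loss(logits, label=0)
loss_wrong = correspondence_loss(logits, label=1)
```

The loss is small when the probability mass sits on the true reference vertex and grows when it sits elsewhere.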
