How to Apply Deep Learning to 3D Objects

The Goal
▸ First part (general strategy): learn how to approach building a deep learning product
▸ Second part (specific case): learn how deep learning is applied to 3D objects

Table of Contents
Strategy
1. Find the right problem
2. Find the right method
a. Getting Information
3. Keep rechallenging
a. Keep tries small
b. A lot of challenges
4. Focus
a. Focus on the right problem
b. Raise priority of the right method
Reference: https://www.iconfinder.com/

Table of Contents
Specific Case
1. Deep Learning Model: VoxNet
a. 3D CNN
b. 3D Max Pool
c. Fully connected
d. Output
e. Train
2. Improve technique
a. Accuracy
b. Speed
3. Results

HELLO!
I am Masaya Ohgushi
I work at Kabuku Inc.
I am an image processing
developer.
Twitter: @SnowGushiGit

Kabuku Inc.
▸ On-demand manufacturing service
▹ Receives 3D data and manufactures it using 3D printers and other equipment
https://www.kabuku.co.jp/

Strategy
Find the Right problem

“Would you like to use deep learning to solve your problem?”

Find the right problem
▸ Where does deep learning work best?
▹ It works in most cases for
▹ Image processing
▹ Speech recognition
▹ It works in some cases for
▹ Natural language processing
▹ Time series analysis

Find the right problem
▸ Where is deep learning a poor fit?
▹ There is not enough data
▹ No pre-trained model is available
▹ 100% accuracy is required

Strategy
Find the right method

“How can you find the best way to solve your problem using deep learning?”

Find the right method
▸ How to research the best deep learning solution (in my case)
▹ Google Scholar
▹ A search engine for papers
▹ You can find the following
▹ Good methods
▹ Good keywords
▹ Which university laboratories work on this problem
▹ University laboratory sites
▹ You may be able to get data
▹ You may be able to get code
▹ GitXiv
▹ You can find papers together with their code
▹ Twitter users to follow
▹ You can get the latest information
▹ Books
▹ You can get well-structured knowledge
▹ arXiv
▹ You can find the latest methods
▹ GitHub
▹ You can find code
▹ Google
▹ Useful if you already know a good keyword

Strategy
Keep rechallenging

“You gathered a lot of training data!! Now let’s train using the full data set.”

Keep rechallenging
▸ Keep tries small
▹ Even if you have a lot of data, do the following first
▹ Prepare a small data set
▹ Check that each module works correctly
▹ Prepare a training data set that is easy to verify
▹ Most models can be trained on data such as MNIST, so check that yours works on it
Reference: https://www.tensorflow.org/get_started/mnist/beginners
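The "small data set first" step can be sketched as follows, assuming the data is already held in NumPy arrays. The function name `small_subset` and its `per_class` parameter are illustrative and do not come from the talk.

```python
import numpy as np


def small_subset(x, y, per_class, seed=0):
    """Pick up to `per_class` samples of every label for a quick smoke test."""
    rng = np.random.default_rng(seed)
    picked = []
    for label in np.unique(y):
        candidates = np.flatnonzero(y == label)
        take = min(per_class, candidates.size)
        picked.append(rng.choice(candidates, size=take, replace=False))
    idx = np.concatenate(picked)
    rng.shuffle(idx)  # avoid feeding the classes in sorted order
    return x[idx], y[idx]


# Toy data: 100 samples, 5 classes with 20 samples each
x = np.arange(1000).reshape(100, 10)
y = np.repeat(np.arange(5), 20)
x_small, y_small = small_subset(x, y, per_class=3)
```

Training on `x_small` for a few epochs is a cheap way to confirm that the pipeline runs end to end before committing to the full data set.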

Keep rechallenging
▸ A lot of challenges
▹ There is no obvious method for improving accuracy
▹ You have to check the results
▹ If the training and validation accuracies stop improving, stop the run
▹ Check the results on a visual dashboard such as TensorBoard
▹ Increase the number of attempts by improving computation speed
▹ Use a GPU
▹ Optimize for the CPU
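One possible way to automate the "stop when accuracy no longer improves" rule is a patience check. This is a sketch only: `should_stop`, `patience` and `min_delta` are hypothetical names, and it watches validation accuracy alone. Keras also ships an `EarlyStopping` callback that serves the same purpose.

```python
def should_stop(val_accuracies, patience=3, min_delta=0.0):
    """Return True if the last `patience` epochs did not beat the earlier best."""
    if len(val_accuracies) <= patience:
        return False  # not enough history yet
    best_before = max(val_accuracies[:-patience])
    best_recent = max(val_accuracies[-patience:])
    return best_recent <= best_before + min_delta


still_improving = should_stop([0.5, 0.6, 0.7, 0.8])
plateaued = should_stop([0.5, 0.8, 0.79, 0.8, 0.78])
```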

“Deep learning has a lot of methods to improve accuracy”

- Model
- How deep it is
- How it is structured
- Adjusting the hyperparameters
- Data preprocessing
- Data augmentation (if using image data)
- Optimizer
- SGD, Adam, etc.

Focus
▸ Focus on the right problem
▹ It depends on your situation
▹ Enough computation resources and enough data
▹ Try a deep and complex model
▹ Enough computation resources, but not enough data
▹ Find a good pre-trained model
▹ Focus on preprocessing such as data augmentation

Focus
▸ Focus on the right problem
▹ It depends on your situation
▹ Not enough computation resources or data
▹ Consider other ways to solve your problem
▹ Logistic regression, SVM, random forest
▹ Deep learning probably isn’t the best choice

Specific Case
Deep Learning applied to 3D objects

Specific Case
Deep Learning Model: VoxNet

“There are a lot of deep learning models… How do you choose one?”

▸ In my case, I considered 3 things
▹ Resources
▹ Computation resources
▹ Human resources
▹ Performance
▹ Accuracy
▹ Speed
▹ Speed of development

Reference: http://ri.cmu.edu/pub_files/2015/9/voxnet_maturana_scherer_iros15.pdf

▸ VoxNet advantages
▹ Resources
▹ Computation resources
▹ Good
▹ Memory: 32 GB (in my environment)
▹ GPU: GeForce GTX 1080 (in my environment)
▹ Performance
▹ Accuracy
▹ 83% accuracy (the top model reaches 95%)
▹ Speed
▹ Open source, simple code

▸ Voxelize
▹ Maps 3D data onto a 32 x 32 x 32 voxel grid
▹ Reduces the data size
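As a rough illustration of voxelization, the sketch below turns a point cloud into a binary 32 x 32 x 32 occupancy grid. It assumes the 3D data is available as an (N, 3) array of points. The function name `voxelize` is hypothetical, and the occupancy model in the VoxNet paper may differ from this simple binary one.

```python
import numpy as np


def voxelize(points, grid_size=32):
    """Map an (N, 3) point cloud onto a binary occupancy grid."""
    points = np.asarray(points, dtype=np.float64)
    mins = points.min(axis=0)
    extent = (points.max(axis=0) - mins).max()
    if extent == 0:
        extent = 1.0  # all points coincide; avoid division by zero
    # Scale uniformly so the longest side spans the grid (keeps the aspect ratio)
    scaled = (points - mins) / extent * (grid_size - 1)
    idx = np.clip(np.rint(scaled).astype(int), 0, grid_size - 1)
    grid = np.zeros((grid_size, grid_size, grid_size), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid


rng = np.random.default_rng(0)
cloud = rng.normal(size=(10000, 3))  # stand-in for real 3D data
voxels = voxelize(cloud)
```

However many points the input has, the result is always 32,768 cells, which is where the reduction in data size comes from.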

Deep Learning Model: VoxNet (3D CNN 3D objects)
▸ Convolution 3D
Reference: https://www.youtube.com/watch?v=ecbeIRVqD7g

▸ Convolution 2D
▹ Kernel: 2x2, stride: 2
Input Image (4x4)
1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4
Kernel (2x2)
5 1
3 2
Calculation
1*5+1*1+5*3+6*2 = 33    2*5+4*1+7*3+8*2 = 51
3*5+2*1+1*3+2*2 = 24    1*5+0*1+3*3+4*2 = 22
Convolved Image (2x2)
33 51
24 22
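The 2D example can be checked with a few lines of NumPy. This is a sketch using the 4x4 input and 2x2 kernel from this slide, with the image layout read off the per-window products. `conv2d_strided` is an illustrative helper, not a library function.

```python
import numpy as np


def conv2d_strided(image, kernel, stride):
    """Slide the kernel over the image without padding and sum the products."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh,
                          j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out


image = np.array([[1, 1, 2, 4],
                  [5, 6, 7, 8],
                  [3, 2, 1, 0],
                  [1, 2, 3, 4]])
kernel = np.array([[5, 1],
                   [3, 2]])
result = conv2d_strided(image, kernel, stride=2)
# result is [[33, 51], [24, 22]]
```

A 3D convolution applies the same idea with one more loop axis and a cubic kernel.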

7 times
▸ Convolution 3D

▸ Convolution 3D
▹ 3D CNN with 32 filters
Conv3D(filters=32,
       input_shape=(32, 32, 32, 1),
       kernel_size=(5, 5, 5),
       strides=(2, 2, 2),
       data_format='channels_last')

Deep Learning Model: VoxNet (Max Pool3D)
▸ Max Pool 2D
▹ Pooling size: 2x2, stride: 2
Convolution Feature (4x4)
1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4
Max pool (2x2)
6 8
3 4

▸ Max Pool 3D
▹ The pool window slides over the convolution feature

▸ Max Pool 3D
MaxPooling3D(pool_size=(2, 2, 2),
             data_format='channels_last')
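To make the 3D pooling step concrete, here is a NumPy sketch of non-overlapping max pooling on a single-channel volume. It assumes the stride equals the pool size, which is the Keras default when `strides` is not given. `max_pool3d` is an illustrative name.

```python
import numpy as np


def max_pool3d(volume, pool=2):
    """Non-overlapping max pooling over a (depth, height, width) volume."""
    d, h, w = volume.shape
    # Drop trailing cells that do not fill a complete pool window
    v = volume[:d - d % pool, :h - h % pool, :w - w % pool]
    v = v.reshape(d // pool, pool, h // pool, pool, w // pool, pool)
    return v.max(axis=(1, 3, 5))


volume = np.arange(64).reshape(4, 4, 4)
pooled = max_pool3d(volume, pool=2)
```

Each output cell holds the maximum of one 2 x 2 x 2 block, so a 4 x 4 x 4 volume shrinks to 2 x 2 x 2.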

▸ Fully Connected and Output
▹ 3D CNN & 3D Max Pool → Fully connected → Dense (128) → Dense (number of classes) → softmax

Deep Learning Model: VoxNet (Flatten, Dense)
▹ Softmax function
▹ It maps output as a probability distribution
▹ It is easy to differentiate
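A small NumPy sketch shows how softmax turns raw scores into a probability distribution. Subtracting the maximum before exponentiating is a common numerical-stability trick and is an addition here, not something stated on the slide.

```python
import numpy as np


def softmax(logits):
    """Map raw scores to probabilities that sum to 1 along the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # avoid overflow
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


probs = softmax([2.0, 1.0, 0.1])
```

The largest score receives the largest probability, and the outputs always sum to 1.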

Deep Learning Model: VoxNet (Output)
model.add(Flatten())
model.add(Dense(128, activation='linear'))
# Dense takes the layer width as its first argument (`units`)
model.add(Dense(number_class, activation='linear'))
model.add(Activation('softmax'))

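▸ Model
from keras.models import Sequential
from keras.layers import Conv3D, MaxPooling3D, Flatten, Dense, Activation

# number_class: the number of target classes, defined elsewhere
model = Sequential()
# First 3D convolution: 32 filters, 5x5x5 kernel, stride 2
model.add(Conv3D(32, input_shape=(32, 32, 32, 1),
                 kernel_size=(5, 5, 5), strides=(2, 2, 2),
                 data_format='channels_last'))
# Second 3D convolution (32 filters assumed here, matching the first layer)
model.add(Conv3D(32, kernel_size=(3, 3, 3), strides=(1, 1, 1),
                 data_format='channels_last'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), data_format='channels_last'))
model.add(Flatten())
model.add(Dense(128, activation='linear'))
model.add(Dense(number_class, activation='linear'))
model.add(Activation('softmax'))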
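It can help to trace how the 32 x 32 x 32 input shrinks through the stack. The sketch below assumes unpadded ("valid") convolutions, which is the Keras default, and 32 filters in both convolution layers. Only the first layer's filter count appears on the slides, so the second is an assumption.

```python
def valid_output_size(size, kernel, stride):
    """Output length along one axis for an unpadded convolution or pooling."""
    return (size - kernel) // stride + 1


after_conv1 = valid_output_size(32, kernel=5, stride=2)           # 14
after_conv2 = valid_output_size(after_conv1, kernel=3, stride=1)  # 12
after_pool = valid_output_size(after_conv2, kernel=2, stride=2)   # 6

filters = 32  # assumed for the last convolution layer
flattened = after_pool ** 3 * filters  # 6 * 6 * 6 * 32 = 6912
```

Under these assumptions the Flatten layer would feed 6,912 values into the 128-unit Dense layer.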

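▸ Train
# An optimizer must be chosen; 'adam' is an assumption, not stated on the slide
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# y_class_label must be one-hot encoded for categorical_crossentropy
model.fit(x_voxel_data, y_class_label)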

Specific Case
Improve technique accuracy

Improve technique
▸ There are 2 approaches to improving accuracy
▹ Model
▹ Advantage
▹ There are a variety of ways to improve accuracy
▹ Disadvantages
▹ A deep model takes a lot of resources
▹ It is not obvious which model is better
▹ Data
▹ Advantage
▹ The effect of a change is obvious
▹ Disadvantage
▹ The approaches are limited

Improve accuracy
▸ Improving accuracy on validation data (in my case)
▹ Model
▹ Random dropout
▹ Leaky ReLU
▹ Data
▹ Data augmentation (3D data)
▹ Data increase
▹ Class weights for unbalanced category data
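One common way to derive class weights for unbalanced data is inverse-frequency weighting, sketched below. The talk does not say which scheme was used, so treat the formula and the name `balanced_class_weights` as assumptions.

```python
import numpy as np


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}


# Toy label set: 80 / 15 / 5 samples across three classes
labels = np.array([0] * 80 + [1] * 15 + [2] * 5)
weights = balanced_class_weights(labels)
```

A dictionary like this can be passed to Keras through the `class_weight` argument of `model.fit`, so that mistakes on rare classes cost more during training.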

Improve technique
▸ Data approach
▹ Data augmentation has advantages over other methods
▹ The effects are obvious
▹ Unlike adding layers to the model, it does not increase calculation time

Improve technique
▸ Data Augmentation
▹ Rotation
▹ Shift
▹ Shear
▹ etc...

Improve technique
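▸ Data Augmentation 3D
▹ Only the augmentation_matrix changes between transformations
import numpy as np
import scipy.ndimage as ndi

# Apply the same affine transform to every channel of the voxel data
channel_images = [ndi.affine_transform(x, augmentation_matrix)
                  for x in x_voxel]
x = np.stack(channel_images, axis=0)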

Improve technique
▸ Data Augmentation 3D Rotation
rotation_matrix_y = np.array([[ np.cos(theta), 0, np.sin(theta), 0],
                              [ 0,             1, 0,             0],
                              [-np.sin(theta), 0, np.cos(theta), 0],
                              [ 0,             0, 0,             1]])
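A rotation matrix like this rotates about the array origin, which is a corner of the voxel grid. A common adjustment, shown here as a sketch rather than as the author's code, is to translate the grid center to the origin, rotate, and translate back. `translation` and `rotation_about_center_y` are hypothetical helper names.

```python
import numpy as np


def translation(offset):
    """4x4 homogeneous translation matrix."""
    m = np.eye(4)
    m[:3, 3] = offset
    return m


def rotation_about_center_y(theta, grid_size=32):
    """Rotate around the y axis through the center of a cubic voxel grid."""
    center = (grid_size - 1) / 2.0
    rotation = np.array([[ np.cos(theta), 0, np.sin(theta), 0],
                         [ 0,             1, 0,             0],
                         [-np.sin(theta), 0, np.cos(theta), 0],
                         [ 0,             0, 0,             1]])
    return translation(center) @ rotation @ translation(-center)


matrix = rotation_about_center_y(np.pi / 2)
center_point = np.array([15.5, 15.5, 15.5, 1.0])
mapped_center = matrix @ center_point  # the grid center stays in place
```

When a matrix is handed to `scipy.ndimage.affine_transform`, note that SciPy interprets it as a map from output coordinates to input coordinates, so it acts as the inverse of the intuitive "forward" transform.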

Improve technique
▸ Data Augmentation 3D Shift
shift_matrix = np.array([[1, 0, 0, shift_x],
[0, 1, 0, shift_y],
[0, 0, 1, shift_z],
[0, 0, 0, 1 ]])

Improve technique
▸ Data Augmentation 3D Shear
shear_matrix = np.array([[1 , shear_x, shear_x, 0],
[shear_y, 1 , shear_y, 0],
[shear_z, shear_z, 1 , 0],
[0 , 0 , 0 , 1]
])

Improve accuracy
▸ Data increase
▹ Add the augmented data to the training data
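The "data increase" step can be sketched as follows: shifted copies of each voxel grid are created and appended to the original training set. Integer shifts with zero fill are a simplifying assumption, and `shift_voxels` is an illustrative helper rather than the author's implementation.

```python
import numpy as np


def shift_voxels(grid, shift):
    """Translate a voxel grid by integer offsets, filling vacated cells with 0."""
    out = np.zeros_like(grid)
    src = tuple(slice(max(0, -s), grid.shape[a] - max(0, s))
                for a, s in enumerate(shift))
    dst = tuple(slice(max(0, s), grid.shape[a] - max(0, -s))
                for a, s in enumerate(shift))
    out[dst] = grid[src]
    return out


# Toy training set: 4 grids of 8x8x8 with one occupied voxel each
x_train = np.zeros((4, 8, 8, 8), dtype=np.float32)
x_train[:, 3, 3, 3] = 1.0
y_train = np.array([0, 1, 0, 1])

# Shift by +2 along x and -1 along y, then append to the originals
x_shifted = np.stack([shift_voxels(v, (2, -1, 0)) for v in x_train])
x_all = np.concatenate([x_train, x_shifted], axis=0)
y_all = np.concatenate([y_train, y_train], axis=0)
```

The labels are duplicated unchanged because shifting an object does not change its class.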

Specific Case
Improve technique speed

Improve calculation speed
▸ Deep learning has a lot of ways to improve calculation speed (in my case)
▹ Use a GPU (GeForce GTX 1080, 8 GB memory)
▹ CPU optimization
▹ Multithreading
▹ Prepare the feature set in advance

Improve calculation speed
▸ CPU optimize (TensorFlow build option)
bazel build -c opt --copt=-mavx --copt=-mavx2 --copt=-mfma

Results
▸ Validation Accuracy

Result
Method | Explanation | Training accuracy | Validation accuracy
BaseLine | Baseline | 90% | 79%
Shift_x_Shift_y | Data augmentation (x-shift, y-shift) | 80% | 80%
Shift_x_Shift_y_class_weight | Data augmentation (x-shift, y-shift) + class weight | 80% | 83%
Add_Shift_x_Shift_y_class_weight | Data augmentation (x-shift, y-shift) + class weight + added data (x-shift, y-shift) | 85% | 85%

Conclusion
Summary of this presentation

Conclusion
▸ Strategy
▹ Right Problem → Right Method → Rechallenge → Focus
Ref: https://www.iconfinder.com/, https://github.com/, https://arxiv.org/, http://www.gitxiv.com/, https://scholar.google.co.jp/

Conclusion
▸ Our case
▹ Right Problem: 3D object recognition (on-demand manufacturing service)
▹ Right Method: VoxNet
▹ Rechallenge and Focus: data augmentation, customize model

Deep Learning
For 3D objects
Implementing deep learning for 3D objects is still rare

We are hiring!!
- Deep Learning for 3D objects
- Working in Japan
https://www.wantedly.com/projects/111707
contact@kabuku.co.jp

THANKS!
Any questions?
You can find me at @SnowGushiGit &
masaya.ohgushi@kabuku.co.jp

References
▸ MNIST datasets
▹ https://www.tensorflow.org/get_started/mnist/beginners
▸ Flaticon
▹ www.flaticon.com
▸ 3D CNN-Action Recognition Part-1
▹ https://www.youtube.com/watch?v=ecbeIRVqD7g&t=82s
▸ Bengio, Yoshua, et al. "Curriculum learning." Proceedings of the 26th annual
international conference on machine learning. ACM, 2009.
▸ He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition. 2016.
▸ Yann, Margot Lisa-Jing, and Yichuan Tang. "Learning Deep Convolutional Neural
Networks for X-Ray Protein Crystallization Image Analysis." AAAI. 2016.
▸ Maturana, Daniel, and Sebastian Scherer. "Voxnet: A 3d convolutional neural network
for real-time object recognition." Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ
International Conference on. IEEE, 2015.

References
▸ Deep Learning Book
▹ http://www.deeplearningbook.org/
▸ IconFinder
▹ https://www.iconfinder.com/
▸ GitHub
▹ https://github.com/
▸ arXiv
▹ https://arxiv.org/
▸ GitXiv
▹ http://www.gitxiv.com/
▸ Google Scholar
▹ https://scholar.google.co.jp/
