B-DCGAN Slides for ICONIP2019

Shouno Lab@UEC
B-DCGAN: Evaluation of Binarized DCGAN for FPGA
Hideo Terada
Chief Technical Officer
Open Stream, Inc.
https://www.opst.co.jp
Hayaru Shouno
Graduate School of Informatics and Engineering
University of Electro-Communications
https://www.uec.ac.jp
ICONIP2019

Outline
● Motivation
○ FPGA for Real-world AI/IoT
● Research Questions
○ Is a GAN on an FPGA possible?
○ Is binarization of a GAN possible?
● Methods
○ Binarization of DCGAN
○ Learn on CPU/GPU, Run on FPGA
○ Experiment
● Results
○ Almost all layers can be binarized
○ Full binarization is not acceptable

Motivation
1) Evaluating FPGAs as low-energy edge devices for real-world IoT/AI (DNN)
2) Compaction of deep neural network models
photo: arrow.com

FPGA overview
● Field Programmable Gate Array
● User programmable hardware circuits
● Low energy per operation
Fig. Connections configured by user

FPGA vs Other Processors
                       FPGA   GPU    CPU
Energy consumption     ★★★    ★      ★
Floating-point units   ★      ★★★    ★★
Memory bandwidth       ★      ★★★    ★★
Latency                ★★★    ★      ★★
Flexibility            ★★★    ★      ★
Ease of development    ★      ★★     ★★★
Table. Comparison of FPGA, GPU, and CPU

Research Question
1) How capable are FPGAs of running DNNs?
2) How well can model compaction work for DNNs?
We tried to binarize the DCGAN.
Reasons:
A. Binary operations are suitable for FPGAs
B. DCGAN is a somewhat complicated model
C. There is not much previous research

The Model: DCGAN (with condition)
Fig. Deep Convolutional & Conditional GAN
CGAN[Mirza+ 2014]
DCGAN[Radford+ 2015]

System structure
HLS: High-Level Synthesis

The Network of the Generator

Forward pass, Backward pass
The standard NN training process includes a forward pass and a backward pass.

How to binarize: basic idea
1) Binarize in the forward pass
The binarization function is not differentiable, so the backward pass cannot use it directly.
2) Binary representation
a) 1-bit representation mapping:
bit value 0: represents logical value '-1'
bit value 1: represents logical value '+1'
3) Multiplication as XNOR (see the tables)
4) Binarize from real values
a) Using the 'sign' function

Table. Real-value (logical) multiplication
A     B     A✕B
-1    -1    +1
-1    +1    -1
+1    -1    -1
+1    +1    +1

Table. Binary-value (on FPGA) XNOR
A     B     A XNOR B
0     0     1
0     1     0
1     0     0
1     1     1

BNN[Courbariaux+ 2016]
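The bit mapping and the XNOR table above can be checked with a small Python sketch. The function names (`binarize`, `to_bit`, `to_value`, `xnor`, `binary_dot`) are hypothetical and are not taken from the slides. Mapping an input of exactly 0 to +1 is an assumption, and the XNOR-plus-popcount dot product is the usual BNN formulation rather than a confirmed detail of this implementation.

```python
def binarize(x):
    """Sign function: map a real value to -1 or +1 (0 maps to +1 by assumption)."""
    return 1 if x >= 0 else -1


def to_bit(v):
    """Encode a logical value as one bit: -1 -> 0, +1 -> 1."""
    return (v + 1) // 2


def to_value(b):
    """Decode one bit back to its logical value: 0 -> -1, 1 -> +1."""
    return 2 * b - 1


def xnor(a, b):
    """XNOR of two single bits."""
    return 1 - (a ^ b)


def binary_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors stored as bits.

    Each XNOR result of 1 contributes +1 and each 0 contributes -1,
    so the sum equals 2 * popcount - n.
    """
    n = len(a_bits)
    popcount = sum(xnor(a, b) for a, b in zip(a_bits, b_bits))
    return 2 * popcount - n


# Example: binarize real-valued inputs and weights, then multiply-accumulate.
xs = [binarize(x) for x in [0.7, -1.2, 0.0, -0.1]]
ws = [binarize(w) for w in [-0.5, -2.0, 3.0, 0.4]]
result = binary_dot([to_bit(x) for x in xs], [to_bit(w) for w in ws])
```

Because the multiply-accumulate reduces to XNOR and a bit count, no floating-point units are needed for these layers, which is the property that makes binary operations suitable for FPGAs.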

B-FC: Binary Full Connection Layer

B-Deconv: Binary Deconvolution Layer
Binarized in the same manner as B-FC

Deconvolution on FPGA
IN (2x2), OUT (4x4), Kernel
The kernel is represented as a transposed circular matrix C of size (Input Size) x (Output Size).
Figures are from the Theano web site.
C is too large to store, so each element is computed on demand:
C[i,j] = FPGA_func(i, j, stride, padding)
# uses no memory for matrix C
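Computing C on demand can be illustrated with a minimal Python sketch. The slides do not show the body of `FPGA_func`, so `c_elem` and `deconv` below are hypothetical names. The index arithmetic is the standard transposed-convolution relation (output position = input position x stride + kernel offset - padding), assumed here for square inputs, outputs, and kernels.

```python
def c_elem(i, j, kernel, in_w, out_w, stride=1, padding=0):
    """Return element C[i][j] without materializing the matrix C.

    i indexes the flattened input (in_w x in_w),
    j indexes the flattened output (out_w x out_w).
    """
    k = len(kernel)
    iy, ix = divmod(i, in_w)
    oy, ox = divmod(j, out_w)
    # Kernel offset that links this input pixel to this output pixel.
    ky = oy - iy * stride + padding
    kx = ox - ix * stride + padding
    if 0 <= ky < k and 0 <= kx < k:
        return kernel[ky][kx]
    return 0  # this input pixel does not contribute to this output pixel


def deconv(x, kernel, in_w, stride=1, padding=0):
    """Transposed convolution of a flattened square input, element by element."""
    k = len(kernel)
    out_w = (in_w - 1) * stride + k - 2 * padding
    return [
        sum(
            x[i] * c_elem(i, j, kernel, in_w, out_w, stride, padding)
            for i in range(in_w * in_w)
        )
        for j in range(out_w * out_w)
    ]


# A 2x2 input with a 3x3 kernel, stride 1, no padding gives a 4x4 output.
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
out = deconv([1, 0, 0, 0], kernel, in_w=2)
```

With the 2x2 input and 3x3 kernel used here, the full matrix would have 4 x 16 entries. The point of the on-demand form is that only the kernel itself has to be stored, whatever the input and output sizes are.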

B-BNA: Binary Batch Normalization + Activation Layer
FINN [Umuroglu+ 2016]
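One way to read the B-BNA layer, following the thresholding idea described in FINN, is that batch normalization followed by a sign activation collapses into a single comparison against a precomputed threshold. The sketch below is an assumption about how that folding can be done, not the authors' confirmed implementation. All names (`bn_sign`, `fold_threshold`, `threshold_sign`) and the parameter values are hypothetical, and the case gamma = 0 is not handled.

```python
import math


def bn_sign(x, gamma, beta, mean, var, eps=1e-5):
    """Reference: batch normalization followed by the sign activation."""
    y = gamma * (x - mean) / math.sqrt(var + eps) + beta
    return 1 if y >= 0 else -1


def fold_threshold(gamma, beta, mean, var, eps=1e-5):
    """Precompute the threshold tau and the comparison direction.

    gamma * (x - mean) / s + beta >= 0 is equivalent to
    x >= tau when gamma > 0, and x <= tau when gamma < 0,
    with tau = mean - beta * s / gamma and s = sqrt(var + eps).
    """
    tau = mean - beta * math.sqrt(var + eps) / gamma
    return tau, gamma > 0


def threshold_sign(x, tau, positive):
    """Batch normalization + sign, reduced to one comparison."""
    if positive:
        return 1 if x >= tau else -1
    return 1 if x <= tau else -1


# Illustrative parameters only; they do not come from the slides.
tau_pos, dir_pos = fold_threshold(gamma=0.5, beta=0.3, mean=1.25, var=4.0)
tau_neg, dir_neg = fold_threshold(gamma=-2.0, beta=0.3, mean=1.25, var=4.0)
```

The threshold is computed once, off the FPGA, from the trained parameters, so at inference time the layer needs only a comparator per channel.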

Experiment
● Purpose
○ Find out what range of layers can be binarized.
○ That is, which 'scenario' is the best? (Scenarios: S0, S1-1, S1-2, ...)
Fig. Scenarios of binarization (no binarization, partial binarization, full binarization). Red layers are binarized.

Experiment: Criteria
○ Image quality: visual assessment on a PC to find the peak quality (see the figure below)
○ Capacity: check whether the generator fits in the FPGA
Fig. Generated images over training steps

Result
The Best Scenario: S3-1
● The most binarized scenario that still has acceptable quality
Fig. Network of S3-1. Red boxes are binarized layers.

Result
● Most binarized and acceptable quality
Fig. The peak-quality images of the scenarios.
(a) The peak of S3-1: acceptable quality
(b) The peak of S3-2 (all layers binarized): unacceptable quality

Result
Scenario                 BRAM   DSP   FF   LUT
S0, S1-1, S1-2, S2-1     Build error at HLS (capacity overflow)
S2-2                     91     24    10   37
S3-1                     87     22    8    34
S3-2                     85     20    8    34
Table. Element utilization (%) of the FPGA generator for each scenario.

Conclusion
1) How capable are FPGAs of running DNNs?
a) Inference: Possible
b) Training: Difficult/Not Suitable
c) Development: Relatively Difficult
2) How well can model compaction work for DNNs?
a) Forward pass: Possible
b) Backward pass: Difficult

Future direction
● Recent Model-Compaction Research (FYI):
○ QGAN[Wang+ 2019]
■ Multi-precision quantization
○ GAN-MC[Liu+ 2019]
■ GAN assisted model compression
● Productivity improvement
○ Easy FPGA development for AI software developers
○ TensorFlow XLA
■ Offloading model into FPGA

The End. Thank you!

Previous Research
● BNN (Binarized Neural Networks)
● FINN (Fast, Scalable Binarized Neural Network Inference)
● CGAN
● DCGAN
