Shouno Lab@UEC
B-DCGAN: Evaluation of Binarized DCGAN for FPGA
Hideo Terada
Chief Technical Officer
Open Stream, Inc.
https://www.opst.co.jp
Hayaru Shouno
Graduate School of Informatics and Engineering
University of Electro-Communications
https://www.uec.ac.jp
ICONIP2019
Outline
● Motivation
○ FPGA for Real-world AI/IoT
● Research Questions
○ Is a GAN on an FPGA possible?
○ Is binarization of a GAN possible?
● Methods
○ Binarization of DCGAN
○ Learn on CPU/GPU, Run on FPGA
○ Experiment
● Results
○ Almost all layers can be binarized
○ Full binarization does not work (unacceptable image quality)
Motivation
1) Evaluate the FPGA as a low-energy edge device for real-world IoT/AI (DNN)
2) Compact the deep neural network model
photo: arrow.com
FPGA overview
● Field Programmable Gate Array
● User programmable hardware circuits
● Low energy per operation
Fig. Connections configured by user
FPGA vs Other Processors
                       FPGA   GPU    CPU
Energy consumption     ★★★    ★      ★
Floating-point units   ★      ★★★    ★★
Memory bandwidth       ★      ★★★    ★★
Latency                ★★★    ★      ★★
Flexibility            ★★★    ★      ★
Ease of development    ★      ★★     ★★★
Table. Comparison of FPGA, GPU, and CPU
Research Questions
1) How capable is an FPGA of running a DNN?
2) How well does model compaction work for a DNN?
Approach: we tried to binarize the DCGAN.
Reasons:
A. Binary operations are well suited to FPGAs
B. DCGAN is a reasonably complex model
C. There is little previous research
The Model: DCGAN (with condition)
Fig. Deep Convolutional & Conditional GAN
CGAN [Mirza+ 2014]
DCGAN [Radford+ 2015]
System structure
HLS: High-Level Synthesis
The Network of the Generator
Forward pass, Backward pass
The standard NN training process includes a forward pass and a backward pass.
How to binarize: basic idea
1) Binarize in the forward pass
The binarization function is not differentiable, so it cannot be used directly in the backward pass.
2) Binary representation
a) 1-bit representation mapping:
bit value 0: represents logical value '-1'
bit value 1: represents logical value '+1'
3) Multiplication as XNOR (see the tables below)
4) Binarize from real values
a) Using the 'sign' function

Real-value (logical)
A    B    A×B
-1   -1   +1
-1   +1   -1
+1   -1   -1
+1   +1   +1

Binary-value (on FPGA)
A    B    A XNOR B
0    0    1
0    1    0
1    0    0
1    1    1
BNN[Courbariaux+ 2016]
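
The two truth tables above can be checked mechanically. The following is a minimal sketch, not the authors' FPGA code: the helper names (`binarize`, `to_bit`, `to_logical`, `xnor`) are hypothetical, and mapping an input of exactly 0 to +1 is an assumed convention for the 'sign' function.

```python
def binarize(x: float) -> int:
    """Sign binarization of a real value to the logical value -1 or +1.

    Assumption: x == 0 is mapped to +1.
    """
    return 1 if x >= 0 else -1


def to_bit(v: int) -> int:
    """Logical value -1/+1 -> stored bit 0/1 (the 1-bit mapping from the slide)."""
    return 1 if v == 1 else 0


def to_logical(b: int) -> int:
    """Stored bit 0/1 -> logical value -1/+1."""
    return 1 if b == 1 else -1


def xnor(a: int, b: int) -> int:
    """XNOR of two single bits: 1 when the bits are equal, 0 otherwise."""
    return 1 - (a ^ b)


# Reproduce both tables side by side: logical product vs. XNOR on the bits.
for a in (-1, 1):
    for b in (-1, 1):
        bit_result = xnor(to_bit(a), to_bit(b))
        print(a, b, a * b, "|", to_bit(a), to_bit(b), bit_result)
```

Each printed row shows that decoding the XNOR result with the 0 → -1, 1 → +1 mapping gives the same value as the real-valued product.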
B-FC: Binary Fully Connected Layer
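
The figure for this slide is not preserved in the transcript. As a rough illustration of how a binary fully connected layer is commonly evaluated in the BNN literature (an assumption, not necessarily the authors' circuit), the dot product of two ±1 vectors of length n equals 2 × popcount(XNOR) − n. The function names `pack_bits`, `binary_dot`, and `binary_fc` are hypothetical.

```python
def pack_bits(values):
    """Pack logical values (-1/+1) into an int; bit k holds element k (+1 -> 1, -1 -> 0)."""
    word = 0
    for k, v in enumerate(values):
        if v == 1:
            word |= 1 << k
    return word


def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two packed +/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")  # positions where bits agree
    return 2 * matches - n  # agreements contribute +1, disagreements -1


def binary_fc(x, weights):
    """Pre-activation outputs of a binary fully connected layer (one value per weight row)."""
    n = len(x)
    x_bits = pack_bits(x)
    return [binary_dot(x_bits, pack_bits(row), n) for row in weights]


x = [1, -1, 1, 1]
weights = [[1, 1, -1, 1], [-1, -1, -1, -1]]
print(binary_fc(x, weights))
```

On hardware, the XNOR and the population count map to simple logic, which is the usual argument for why binary operations suit FPGAs.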
B-Deconv: Binary Deconvolution Layer
Built in the same manner as B-FC.
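Deconvolution on FPGA
Fig. A kernel maps IN (2x2) to OUT (4x4). Figures are from the Theano web site.
The deconvolution can be represented as a transposed circular matrix C
of size (Input Size) x (Output Size), which is too large to store.
Instead, each element is computed on demand:
C[i, j] = FPGA_func(i, j, stride, padding)  # no memory is used for matrix C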
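
To make the on-demand idea concrete, here is a minimal software sketch in which an entry of C is derived from the indices alone, so the matrix is never materialized. It is an illustration under assumptions, not the authors' `FPGA_func`: the names `c_entry` and `deconv` are hypothetical, the kernel is square, and the transposed-convolution convention used is out[iy*stride + ky - padding] += in[iy] * kernel[ky] (no kernel flip).

```python
def c_entry(i, j, kernel, in_w, out_w, stride, padding):
    """Entry C[i][j] linking flat input index i to flat output index j."""
    k = len(kernel)
    iy, ix = divmod(i, in_w)    # input pixel coordinates
    oy, ox = divmod(j, out_w)   # output pixel coordinates
    ky = oy + padding - iy * stride
    kx = ox + padding - ix * stride
    if 0 <= ky < k and 0 <= kx < k:
        return kernel[ky][kx]
    return 0                    # this input pixel does not reach this output pixel


def deconv(x, kernel, in_h, in_w, stride, padding):
    """Transposed convolution of a flat input x, computing C entries on the fly."""
    k = len(kernel)
    out_h = (in_h - 1) * stride - 2 * padding + k
    out_w = (in_w - 1) * stride - 2 * padding + k
    out = [
        sum(x[i] * c_entry(i, j, kernel, in_w, out_w, stride, padding)
            for i in range(in_h * in_w))
        for j in range(out_h * out_w)
    ]
    return out, out_h, out_w


kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
out, h, w = deconv([1, 0, 0, 0], kernel, in_h=2, in_w=2, stride=1, padding=0)
for row in range(h):
    print(out[row * w:(row + 1) * w])
```

With a 2x2 input, a 3x3 kernel, stride 1, and no padding, the output is 4x4, matching the IN (2x2) to OUT (4x4) case on the slide.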
B-BNA: Binary Batch Normalization + Activation Layer
FINN [Umuroglu+ 2016]
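
The figure for this slide is also missing. One well-known way to merge batch normalization with a sign activation, described in the FINN work cited here, is to collapse both into a single threshold comparison. The sketch below illustrates that general idea under assumptions (nonzero `gamma`, an output of +1 when the normalized value is exactly 0); the function names are hypothetical and this is not claimed to be the authors' exact layer.

```python
import math


def bn_sign(x, gamma, beta, mean, var, eps=1e-5):
    """Reference path: batch normalization followed by sign activation."""
    y = gamma * (x - mean) / math.sqrt(var + eps) + beta
    return 1 if y >= 0 else -1


def threshold(gamma, beta, mean, var, eps=1e-5):
    """Input value at which the normalized output crosses zero (gamma must be nonzero)."""
    return mean - beta * math.sqrt(var + eps) / gamma


def bn_sign_threshold(x, tau, gamma):
    """Same result as bn_sign using one comparison; the direction flips when gamma < 0."""
    if gamma > 0:
        return 1 if x >= tau else -1
    return 1 if x <= tau else -1


params = dict(gamma=0.7, beta=-0.3, mean=1.5, var=4.0)
tau = threshold(**params)
for x in (-3, 0, 2, 3, 10):
    print(x, bn_sign(x, **params), bn_sign_threshold(x, tau, params["gamma"]))
```

Because the threshold can be computed once after training, inference needs only an integer comparison per neuron rather than floating-point normalization.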
Experiment
● Purpose
Find out which range of layers can be binarized,
that is, which 'scenario' is best (scenarios: S0, S1-1, S1-2, ...).
Fig. Scenarios of binarization. Red layers are binarized. Panel labels: No Binarize, Partial Binarize, Full Binarize.
Experiment: Criteria
○ Image quality: visual assessment on a PC to find the peak quality (see the figure below)
○ Capacity: check whether the design fits in the FPGA
Fig. Generated images over training steps
Result
The Best Scenario: S3-1
● The most heavily binarized scenario that still gives acceptable quality
Fig. Network of S3-1. Red boxes are binarized layers.
Fig. The peak-quality images of the scenarios.
(a) The peak of S3-1: acceptable quality
(b) The peak of S3-2 (all layers binarized): unacceptable quality
Scenario   BRAM   DSP   FF   LUT
S0         Build error at HLS (capacity overflow)
S1-1       Build error at HLS (capacity overflow)
S1-2       Build error at HLS (capacity overflow)
S2-1       Build error at HLS (capacity overflow)
S2-2       91     24    10   37
S3-1       87     22    8    34
S3-2       85     20    8    34
Table. Utilization (%) of FPGA elements by the generator for each scenario.
Conclusion
1) How capable is an FPGA of running a DNN?
a) Inference: Possible
b) Training: Difficult / not suitable
c) Development: Relatively difficult
2) How well does model compaction work for a DNN?
a) Forward pass: Possible
b) Backward pass: Difficult
Future direction
● Recent model-compaction research (FYI):
○ QGAN[Wang+ 2019]
■ Multi-precision quantization
○ GAN-MC[Liu+ 2019]
■ GAN assisted model compression
● Productivity improvement
○ Easier FPGA development for AI software developers
○ TensorFlow XLA
■ Offloading the model to an FPGA
The End. Thank you!
Previous Research
● BNN (Binarized Neural Networks) [Courbariaux+ 2016]
● FINN (A Framework for Fast, Scalable Binarized Neural Network Inference) [Umuroglu+ 2016]
● CGAN [Mirza+ 2014]
● DCGAN [Radford+ 2015]
