PRMU201902 Presentation document

Masayuki Tanaka
PRMU Workshop, 2019/02
An introduction to my (personal) work on deep learning

Overview
1. mgq (Minimal Gram task Queue): a (handy) simple task queue
   https://github.com/likesilkto/mgqueue
2. train1000 (Train with small samples): small-sample training (for practice)
   http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/train1000/
3. WiG (Weighted Sigmoid Gate Unit): a new activation function
   http://www.ok.sc.e.titech.ac.jp/~mtanaka/proj/WiG/

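Task Queues and Deep Learning
[Figure: timeline (time axis, 18:00 to 10:00 the next day) showing Task0 to Task3 scheduled on GPU0 and GPU1]
I want to make efficient use of (scarce) GPU resources.
I want the next job to start as soon as the previous one finishes!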
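The scheduling idea on this slide can be sketched in a few lines of Python. This is only an illustration of the concept, not how mgq is implemented: the function name `run_queue` is hypothetical, and the sketch assumes each job is a shell command that selects its GPU through the `CUDA_VISIBLE_DEVICES` environment variable.

```python
import os
import queue
import subprocess
import threading


def run_queue(commands, gpu_ids):
    """Run shell commands back to back on each GPU.

    Returns a list of (gpu_id, command, returncode) tuples in completion order.
    """
    tasks = queue.Queue()
    for command in commands:
        tasks.put(command)

    results = []
    lock = threading.Lock()

    def worker(gpu_id):
        # One worker per GPU: it takes the next task as soon as it is free.
        while True:
            try:
                command = tasks.get_nowait()
            except queue.Empty:
                return
            # Assumption: the job honors CUDA_VISIBLE_DEVICES to pick its GPU.
            env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
            completed = subprocess.run(command, shell=True, env=env)
            with lock:
                results.append((gpu_id, command, completed.returncode))

    workers = [threading.Thread(target=worker, args=(g,)) for g in gpu_ids]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results
```

For example, `run_queue(["python train1.py", "python train2.py", "python train3.py"], gpu_ids=[0, 1])` would start two jobs at once and launch the third on whichever GPU becomes free first (the script names other than train1.py are placeholders).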

mgq (Minimal Gram task Queue)
Install (Python application):
% pip install git+https://github.com/likesilkto/mgqueue
Add a task:
% mgq queue_name ad 'python train1.py'
Start:
% mgq queue_name start
% mgq queue_name start –gmail [Gmail account name, the part before the @]

Train1000 project
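CIFAR-10, CIFAR-100
Training data: 50,000 images
Test data: 10,000 images
Training takes several hours even with a GPU.
Trying out different ideas takes a long time, which is tough for beginners who want to practice.
Can we learn from a small amount of data?
How much performance can we get from just 1,000 samples?
#train1000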

Train1000 project
#train1000
mnist
100 samples x 10 classes = 1,000 samples Test Acc.: 0.9786
fashion_mnist
cifar-10
cifar-100
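A class-balanced 1,000-sample subset (100 samples x 10 classes) can be drawn as sketched below. This is a minimal illustration, not the project's own sampling code: `sample_per_class` is a hypothetical helper, the seed is arbitrary, and the label list is a toy stand-in for a real dataset's labels.

```python
import random
from collections import defaultdict


def sample_per_class(labels, per_class, seed=0):
    """Return sorted indices picking `per_class` examples from every class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < per_class:
            raise ValueError(f"class {label} has only {len(indices)} examples")
        chosen.extend(rng.sample(indices, per_class))
    return sorted(chosen)


# Toy stand-in for a 10-class label array (5,000 labels, 500 per class).
labels = [i % 10 for i in range(5000)]
subset = sample_per_class(labels, per_class=100)
print(len(subset))  # 1000
```

The returned indices can then be used to slice the training images and labels before training as usual.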

Overview

Activation Functions for DNNs
[Figure: Input x -> Weight -> Activation function -> Output y]
[Figure: Input x -> Conv. -> Activation function -> Output y]
Activation functions: Sigmoid, tanh, ReLU
Sigmoid: $\sigma(x) = \frac{1}{1 + e^{-x}}$
ReLU: $\max(x, 0)$

Advanced Activation Functions
ReLU: $\max(x, 0)$
Leaky ReLU, Parametric ReLU: $\begin{cases} x & (x \ge 0) \\ \alpha x & (x < 0) \end{cases}$
swish, SiL: $x \, \sigma(wx + b)$
Existing activation functions are element-wise functions.
Dying ReLU: dead ReLU units always return zero.
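The activation functions listed on this slide can be written as scalar functions and applied to each element independently, which is what "element-wise" means here. The sketch below uses plain Python; the default values `alpha=0.01`, `w=1.0`, and `b=0.0` are illustrative assumptions, not values from the slides.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    return max(x, 0.0)


def leaky_relu(x, alpha=0.01):
    # Parametric ReLU has the same form, but alpha is learned during training.
    return x if x >= 0 else alpha * x


def swish(x, w=1.0, b=0.0):
    return x * sigmoid(w * x + b)


def apply_elementwise(f, xs):
    """Apply a scalar activation to every element independently."""
    return [f(x) for x in xs]


xs = [-2.0, -0.5, 0.0, 0.5, 2.0]
for name, f in [("relu", relu), ("leaky_relu", leaky_relu), ("swish", swish)]:
    print(name, apply_elementwise(f, xs))
```

Because each output depends only on the matching input element, none of these functions can let one unit influence another.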

WiG: Weighted Sigmoid Gate (Proposed)
Existing activation functions are element-wise functions.
A sigmoid-gated network can be used as an activation function.
[Figure: conventional layer (Weight -> Activation function) versus proposed layer (Weight -> Activation unit)]
Proposed WiG (Weighted Sigmoid Gate unit)
[Figure: WiG activation unit. One branch applies W; the other applies Wg followed by a sigmoid; the two branches are multiplied (×).]
It is compatible with existing activation functions.
It includes the ReLU.
My recommendation:
You can improve network performance just by replacing the ReLU with the proposed WiG.

WiG: Three-state
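$\boldsymbol{y} = \sigma(\boldsymbol{W}_g \boldsymbol{x} + \boldsymbol{b}_g) \otimes (\boldsymbol{W} \boldsymbol{x} + \boldsymbol{b})$
Sigmoid gate $\sigma(\boldsymbol{W}_g \boldsymbol{x} + \boldsymbol{b}_g)$: threshold control
Linear term $\boldsymbol{W} \boldsymbol{x} + \boldsymbol{b}$: (signed) response magnitude
Human retinal cells:
ON-center receptive field: the brighter the center, the larger the output
OFF-center receptive field: the darker the center, the larger the output
No response
(Signed) response-magnitude control and threshold control can be tuned independently (perhaps).
Uchida, Coupled convolution layer for convolutional neural network, 2018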
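The WiG formula, a sigmoid gate multiplied element-wise with an affine term, can be sketched in plain Python as follows. This is a toy forward pass, not the author's reference code: the helper names `affine` and `wig` and all parameter values are made up for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(W, b, x):
    """Return W x + b for a matrix W (list of rows), bias b, and vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def wig(x, W, b, Wg, bg):
    """WiG unit: y = sigmoid(Wg x + bg) * (W x + b), multiplied element-wise."""
    gate = [sigmoid(g) for g in affine(Wg, bg, x)]
    value = affine(W, b, x)
    return [g * v for g, v in zip(gate, value)]


# Toy 2-in, 2-out example with arbitrary illustrative parameters.
x = [1.0, -2.0]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
Wg = [[0.0, 0.0], [0.0, 0.0]]  # zero gate weights: every gate is sigmoid(0) = 0.5
bg = [0.0, 0.0]
print(wig(x, W, b, Wg, bg))  # [0.5, -1.0]

# A strongly negative gate bias closes the gate: the "no response" state.
print(wig(x, W, b, Wg, [-50.0, -50.0]))
```

The first call shows a positive and a negative response from the same unit type; the second shows the gate switching both off without touching W or b, which is the independence the slide hints at.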

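WiG: Lateral Inhibition
Lateral inhibition in the brain
WiG can realize lateral inhibition!
Lateral inhibition: in the spatial arrangement of neurons, the responses of neurons surrounding a strongly responding neuron are suppressed.
Response magnitude $\otimes$ threshold control
Element-wise activation functions: lateral inhibition is impossible.
A Wg that realizes lateral inhibition is easy to design.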
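One way to picture such a Wg is a center-surround pattern on a 1-D row of units: a positive weight on the unit itself and negative weights on its two neighbours. The sketch below is a hypothetical design chosen for illustration, with W fixed to the identity and all biases zero; the slides do not specify the actual construction, and the weight values 4.0 are arbitrary.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def lateral_inhibition_gate_weights(n, center=4.0, surround=4.0):
    """Build an n x n Wg: +center on each unit, -surround on its two neighbours."""
    Wg = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Wg[i][i] = center
        if i > 0:
            Wg[i][i - 1] = -surround
        if i < n - 1:
            Wg[i][i + 1] = -surround
    return Wg


def wig_identity(x, Wg):
    """WiG with W = I and zero biases: y_i = sigmoid((Wg x)_i) * x_i."""
    gate_in = [sum(w * xi for w, xi in zip(row, x)) for row in Wg]
    return [sigmoid(g) * xi for g, xi in zip(gate_in, x)]


Wg = lateral_inhibition_gate_weights(5)
isolated = wig_identity([0.0, 0.0, 1.0, 0.0, 0.0], Wg)
crowded = wig_identity([0.0, 3.0, 1.0, 0.0, 0.0], Wg)

# Unit 2 receives the same input (1.0) in both cases, but its output drops
# sharply once its neighbour (unit 1) responds strongly.
print(isolated[2], crowded[2], crowded[1])
```

An element-wise activation would return the same value for unit 2 in both cases, since it never sees the neighbouring inputs.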

WiG with sparseness constraint
Sparseness: y has few non-zero elements
Sparseness constraint: $\| \sigma(\boldsymbol{W}_g \boldsymbol{x} + \boldsymbol{b}_g) \|_1$
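Assuming the constraint is the L1 norm of the gate vector, it could be added to a training loss as a penalty term, as sketched below. The weighting factor `lam` and the placeholder `task_loss` value are hypothetical and only show where the term would enter; they are not results or settings from the slides.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gate_l1_penalty(x, Wg, bg):
    """L1 norm of sigmoid(Wg x + bg); a plain sum, since every gate is positive."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(Wg, bg)]
    return sum(sigmoid(p) for p in pre)


x = [1.0, -2.0]
Wg = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.5]]
open_gates = gate_l1_penalty(x, Wg, bg=[0.0, 0.0, 0.0])
closed_gates = gate_l1_penalty(x, Wg, bg=[-5.0, -5.0, -5.0])
print(open_gates, closed_gates)  # the penalty shrinks as the gates close

lam = 1e-3        # hypothetical regularization weight
task_loss = 0.42  # placeholder standing in for the usual training loss
total_loss = task_loss + lam * open_gates
```

Minimizing the penalty pushes gates toward zero, and a closed gate zeroes the matching element of y, which is how the constraint encourages sparse outputs.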

WiG  ReLU
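WiG (scalar case): $y = \sigma(\alpha x) \times x$
Gate alone: $y = \sigma(\alpha x)$; as $\alpha \to \infty$,
$y = \begin{cases} 0 & (x < 0) \\ 1 & (x > 0) \end{cases}$ (at $x = 0$ the gate equals $1/2$ for every $\alpha$)
Gate times input: $y = \sigma(\alpha x) \times x$; as $\alpha \to \infty$,
$y = \max(0, x)$
WiG can reproduce the ReLU!
Replacing the ReLU in an existing network with WiG can improve performance! (Maybe.)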
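The limit can be checked numerically. The sketch below evaluates the scalar WiG (W = 1, Wg = alpha, zero biases) on a few sample points and reports how far it is from the ReLU as alpha grows; the sample points and alpha values are arbitrary choices for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def wig_scalar(x, alpha):
    """Scalar WiG with W = 1, Wg = alpha, zero biases: y = sigmoid(alpha * x) * x."""
    return sigmoid(alpha * x) * x


def relu(x):
    return max(0.0, x)


xs = [-3.0, -1.0, -0.1, 0.0, 0.1, 1.0, 3.0]
gaps = {}
for alpha in [1.0, 10.0, 100.0, 1000.0]:
    # Largest absolute difference between WiG and ReLU over the sample points.
    gaps[alpha] = max(abs(wig_scalar(x, alpha) - relu(x)) for x in xs)
    print(f"alpha={alpha:g}  max |WiG - ReLU| = {gaps[alpha]:.2e}")
```

The gap shrinks as alpha increases, so a network whose ReLUs are replaced by WiG units can start from (or fall back to) ReLU-like behaviour.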

Experimental Validations
Object recognition
Average accuracy
Image denoising
The reproduction code is available

Summary
