Recurrent Neural Networks
Sang Jun Lee
Ph.D. candidate, POSTECH
Email: lsj4u0208@postech.ac.kr
EECE695J Special Topics in Electronic and Electrical Engineering J (Deep learning fundamentals and applications to steel processes) – LECTURE 7 (2017. 11. 10)
2
▣ Lecture 6: Convolutional Neural Network
1-page Review
Convolution layer: convolve a 5x5x3 filter over all spatial locations of a 32x32x3 image to produce activation maps
Pooling layer: max pool each depth slice with 2x2 filters and stride 2
“Parameters are shared on the spatial domain”
3
Introduction to recurrent neural network
Vanilla neural network
A naive idea for handling sequential data
 We usually want to predict a vector y_t at each time step for time-domain data x_t
 x: concatenated data of x_1, x_2, x_3, ⋯ fed to a single network y = h(x; W), with W = [W_1; W_2; W_3; ⋯]
[Figure: a vanilla feed-forward network mapping the concatenated input x through a hidden layer h to the output y]
4
Introduction to recurrent neural network
Recurrent neural network (RNN)
 Assume that the relation between x_{t-1} and x_t is similar to the relation between x_t and x_{t+1}
→ Parameter sharing for W_hh
 Identical feature extraction from inputs
→ Parameter sharing for W_xh
[Figure: the same network unrolled over time; every step maps x_t and h_{t-1} to h_t and y_t using the shared weights W_xh, W_hh, W_hy]
5
Introduction to recurrent neural network
Recurrent neural network (RNN)
 Multiple copies of the same network (same function and same parameters)
 h_t: a hidden state that consists of a vector
h_t = f(h_{t-1}, x_t)
h_t = tanh(W_hh ⋅ h_{t-1} + W_xh ⋅ x_t)
y_t = W_hy ⋅ h_t
 h_0 is usually set to 0
[Figure: input layer → RNN cell → output layer (RNN feature) → fully-connected layer]
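To make the recurrence concrete, the following is a minimal NumPy sketch of the forward pass above (the array shapes, random initialization, and sequence length are illustrative assumptions, not code from the lecture):

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, W_hy):
    # h_t = tanh(W_hh . h_{t-1} + W_xh . x_t),  y_t = W_hy . h_t
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)
    y_t = W_hy @ h_t
    return h_t, y_t

# toy dimensions: 3-dim input, 5-dim hidden state, 2-dim output
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(5, 3))
W_hh = rng.normal(size=(5, 5))
W_hy = rng.normal(size=(2, 5))

h = np.zeros(5)                               # h_0 is usually set to 0
xs = [rng.normal(size=3) for _ in range(4)]   # a length-4 input sequence
for x_t in xs:                                # the same weights are reused at every step
    h, y = rnn_step(x_t, h, W_xh, W_hh, W_hy)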
6
Introduction to recurrent neural network
Various architectures of RNN
 Flexibility for handling various types of data
Vanilla neural network
7
Introduction to recurrent neural network
Various architectures of RNN
 Flexibility for handling various types of data
e.g. machine translation
(sequence of words
→ sequence of words)
8
Introduction to recurrent neural network
Limitations of vanilla RNN
 A vanilla RNN works well when only a few time steps separate the relevant input from the output
 However, the sensitivity to earlier input values decays over time in a standard RNN, so long-range dependencies are hard to learn
“the clouds are in the sky”
“I grew up in France
…
I speak fluent French.”
9
LSTM (long short-term memory)
 A standard RNN contains a single layer in the repeating module
10
LSTM (long short-term memory)
 A special kind of RNN for learning long-term dependencies
 Introduced by Hochreiter & Schmidhuber (1997)
11
LSTM (long short-term memory)
The key idea of LSTMs: the cell state
 The cell state is kind of like a conveyor belt
12
LSTM (long short-term memory)
Forget gate
 LSTMs have the ability to remove or add information to the cell state, carefully regulated by structures called gates
 The decision about which information to throw away from the cell state is made by a sigmoid layer called the forget gate layer
13
LSTM (long short-term memory)
Input gate layer
 Decide what new information we’re going to store in the cell state
 First, an input gate layer decides which values we’ll update
 Next, a tanh layer creates a vector of new candidate values
 Finally, the two are combined to create an update to the state
14
LSTM (long short-term memory)
Update
 Forget previous information and add new information to the cell state
Output
 The output is based on the (filtered) cell state
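For reference, the standard LSTM gate equations behind the forget, input, update, and output steps above (the notation follows the usual formulation; the bias terms b are assumptions since the slides omit them):

\begin{aligned}
f_t &= \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) && \text{forget gate} \\
i_t &= \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) && \text{input gate} \\
\tilde{C}_t &= \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) && \text{candidate values} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell-state update} \\
o_t &= \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) && \text{output gate} \\
h_t &= o_t \odot \tanh(C_t) && \text{hidden state (output)}
\end{aligned}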
15
LSTM (long short-term memory)
16
Variants of RNN
Gated Recurrent Unit (GRU)
 Combine the forget and input gates into a single update gate
 Merge the cell state and hidden state
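For comparison, a minimal statement of the GRU update under the standard formulation (z_t is the merged update gate, r_t the reset gate; biases are omitted as in the slides):

\begin{aligned}
z_t &= \sigma(W_z \cdot [h_{t-1}, x_t]) && \text{update gate (merged forget/input)} \\
r_t &= \sigma(W_r \cdot [h_{t-1}, x_t]) && \text{reset gate} \\
\tilde{h}_t &= \tanh(W \cdot [r_t \odot h_{t-1}, x_t]) && \text{candidate state} \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{merged cell/hidden state}
\end{aligned}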
17
Implementation of RNN
Manipulation of time series data
Split raw data into train, validation, and test dataset
def split_data(data, val_size=0.2, test_size=0.2):
    # hold out the last test_size fraction for test, then the last val_size
    # fraction of the remainder for validation; the rest is training data
    ntest = int(round(len(data) * (1 - test_size)))
    nval = int(round(len(data.iloc[:ntest]) * (1 - val_size)))
    df_train, df_val, df_test = data.iloc[:nval], data.iloc[nval:ntest], data.iloc[ntest:]
    return df_train, df_val, df_test

train, val, test = split_data(raw_data, val_size=0.2, test_size=0.2)
Raw data (100%) → Test (last 20%); the remaining 80% → Train (80%) / Validation (20%)
18
Implementation of RNN
Manipulation of time series data
Generate sequence pair (x, y)
import numpy as np

def rnn_data(data, time_steps, labels=False):
    """
    Creates a new data frame based on previous observations.
    * example:
        l = [1, 2, 3, 4, 5]
        time_steps = 2
        -> labels == False: [[1, 2], [2, 3], [3, 4]]
        -> labels == True:  [3, 4, 5]
    """
    rnn_df = []
    for i in range(len(data) - time_steps):
        if labels:
            # the label is the observation right after the window
            try:
                rnn_df.append(data.iloc[i + time_steps].as_matrix())
            except AttributeError:
                rnn_df.append(data.iloc[i + time_steps])
        else:
            # the input is a window of time_steps consecutive observations
            data_ = data.iloc[i: i + time_steps].as_matrix()
            rnn_df.append(data_ if len(data_.shape) > 1 else [[v] for v in data_])
    return np.array(rnn_df)
19
Implementation of RNN
Manipulation of time series data
Generate sequence pair (x, y)
time_steps = 10
train_x = rnn_data(df_train, time_steps, labels=False)
train_y = rnn_data(df_train, time_steps, labels=True)
Training data [1:10000] → sliding windows:
train_x                              train_y
x #01:   [1, 2, 3, …, 10]            y #01:   11
x #02:   [2, 3, 4, …, 11]            y #02:   12
…                                    …
x #9990: [9990, 9991, …, 9999]       y #9990: 10000
20
Implementation of RNN
Manipulation of time series data
Split each sample data
time_steps = 10
x_split = tf.unpack(x_data, time_steps, 1)   # split the placeholder along the time axis

[Figure: tf.unpack turns the placeholder x #01 = [1, 2, 3, …, 10] into a list of per-time-step tensors x_1, x_2, x_3, …, x_10]
21
Implementation of RNN
Choose a RNN cell

import tensorflow as tf
num_units = 100
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(num_units)    # vanilla RNN
rnn_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)   # LSTM
rnn_cell = tf.nn.rnn_cell.GRUCell(num_units)         # GRU

Connect input and recurrent layer

rnn_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
output, state = tf.nn.rnn(rnn_cell, x_split, dtype=tf.float32)   # dtype is required when no initial_state is given
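To complete the picture, here is a hedged sketch of how the last RNN output can be connected to a fully-connected layer and a regression loss for the time-series task above (the placeholder shapes, variable names, and the choice of mean squared error are illustrative assumptions, not the lecture's exact code):

# x_data: input windows, y_data: next-value labels (assumed shapes)
x_data = tf.placeholder(tf.float32, [None, time_steps, 1])
y_data = tf.placeholder(tf.float32, [None, 1])

x_split = tf.unpack(x_data, time_steps, 1)            # list of time_steps tensors of shape [batch, 1]
rnn_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
output, state = tf.nn.rnn(rnn_cell, x_split, dtype=tf.float32)

# regression on the RNN feature of the last time step
W_out = tf.Variable(tf.random_normal([num_units, 1]))
b_out = tf.Variable(tf.zeros([1]))
prediction = tf.matmul(output[-1], W_out) + b_out
loss = tf.reduce_mean(tf.square(prediction - y_data))          # mean squared error
train_op = tf.train.AdamOptimizer(0.001).minimize(loss)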
22
Case study 1: MNIST classification
Hyper parameters for implementing a RNN
 Learning rate, training iteration, batch size, etc.
 Time step, the number of RNN neurons
Placeholder and variable tensor preparation
 Labels are one-hot encoded
“Sequential processing of non-sequence data”
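A minimal sketch of the placeholder and variable preparation this slide describes, treating each 28x28 image as 28 sequential rows of dimension 28 with 10 one-hot classes (the hyperparameter values and variable names are illustrative assumptions):

import tensorflow as tf

# hyper parameters (illustrative values)
learning_rate = 0.001
training_iters = 10000
batch_size = 128
time_steps = 28      # 28 rows of a 28x28 image, processed sequentially
n_input = 28         # each row is a 28-dimensional vector
num_units = 128      # number of RNN neurons
n_classes = 10       # one-hot encoded labels

x = tf.placeholder(tf.float32, [None, time_steps, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

# output-layer variables
W_out = tf.Variable(tf.random_normal([num_units, n_classes]))
b_out = tf.Variable(tf.random_normal([n_classes]))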
23
Case study 1: MNIST classification
Constructing the RNN cell
 Split each 28x28 sample into 28 28-dimensional vectors
 Vanilla RNN: rnn.rnn_cell.BasicRNNCell
Constructing the output layer
 Input dimension: the number of neurons in the RNN cell
 Output: the estimated probability of belonging to each category
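A hedged sketch of these two steps, continuing the tensors defined above (the pre-1.0 TensorFlow API used throughout the lecture is assumed):

# split each 28x28 sample into 28 vectors of dimension 28 (one per time step)
x_split = tf.unpack(x, time_steps, 1)

# vanilla RNN cell; BasicLSTMCell or GRUCell could be swapped in
rnn_cell = tf.nn.rnn_cell.BasicRNNCell(num_units)
outputs, states = tf.nn.rnn(rnn_cell, x_split, dtype=tf.float32)

# output layer: map the last RNN feature (num_units) to class scores (n_classes)
logits = tf.matmul(outputs[-1], W_out) + b_out
probabilities = tf.nn.softmax(logits)   # estimated probability of each category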
24
Case study 1: MNIST classification
Define loss and training operation
Open a tf.Session() and run train_op!
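A minimal sketch of the loss, training operation, and session loop, assuming the softmax classifier above and the MNIST input pipeline shipped with TensorFlow at the time (the data path and iteration count are illustrative):

from tensorflow.examples.tutorials.mnist import input_data

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
train_op = tf.train.AdamOptimizer(learning_rate).minimize(loss)

mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(training_iters):
        batch_x, batch_y = mnist.train.next_batch(batch_size)
        batch_x = batch_x.reshape(batch_size, time_steps, n_input)   # 28 rows per image
        sess.run(train_op, feed_dict={x: batch_x, y: batch_y})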
25
Case study 2 (forecasting the 2017 summer peak electricity demand, 대한전기학회)
 Forecasting strategy
• Forecast the summer peak electricity demand by predicting the daily peak demand
 Algorithm overview
• Build a representative daily temperature series for Korea as a weighted sum of the average temperatures of the major cities, weighted by their shares of electricity demand
• Predict the daily peak electricity demand and temperature with a combined RNN/CNN model trained on historical electricity/temperature data
• Develop a deep learning algorithm that captures the weekly and seasonal periodicity characteristic of electricity demand data
26
Case study 2 (forecasting the 2017 summer peak electricity demand, 대한전기학회)
 Training data construction for the RNN model
• Predict the next 28 days of electricity/temperature from the previous 28 days of electricity/temperature data
 Vanilla RNN model
[Figure: the electricity (E) and temperature (T) series are split into training and test data; each training sample is a 28-day window (time step) and its label is the following 28 days (output dimension). Architecture: input layer → RNN cell → output layer (→ RNN feature) → fully-connected layer]
27
Case study 2 (forecasting the 2017 summer peak electricity demand, 대한전기학회)
 Seasonal data
• Data constructed so that seasonality can be reflected in training
 CNN model for capturing seasonality
[Figure: for the first sample of the training data, windows of the electricity (E) and temperature (T) series offset by one, two, and three seasonal periods (T, 2T, 3T) are stacked, giving inputs of size k x ts and 2k x ts (ts = time step); a convolution layer with a 2 x ts x 1 x CNN-depth kernel produces a k x CNN-depth map, and a fully-connected layer turns it into the CNN feature]
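A hedged sketch of the seasonal convolution described above, assuming the stacked seasonal windows form a 2k x ts single-channel image and a vertical stride of 2 so that the 2 x ts kernel yields a k x CNN-depth map (k, ts, CNN depth, and the output size are illustrative assumptions):

import tensorflow as tf

k, ts, cnn_depth = 2, 28, 200
seasonal_x = tf.placeholder(tf.float32, [None, 2 * k, ts, 1])          # stacked seasonal windows
conv_w = tf.Variable(tf.random_normal([2, ts, 1, cnn_depth]))          # 2 x ts x 1 x CNN depth kernel
conv = tf.nn.conv2d(seasonal_x, conv_w, strides=[1, 2, 1, 1], padding='VALID')
# conv has shape [batch, k, 1, cnn_depth] -> a k x CNN-depth feature map
flat = tf.reshape(conv, [-1, k * cnn_depth])
W_fc = tf.Variable(tf.random_normal([k * cnn_depth, 50]))
cnn_feature = tf.matmul(flat, W_fc)                                    # fully-connected layer -> CNN feature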
28
Case study 2 (forecasting the 2017 summer peak electricity demand, 대한전기학회)
 Combined RNN and CNN model
 Training
• Total loss = Loss_electricity + Loss_temperature
• Backpropagation via the Adam optimizer
[Figure: the electricity & temperature window feeds an RNN cell (200 units) that produces the RNN feature (200); the seasonal data feeds two convolution layers (2x28x1x200 and 5x1x200x50) that produce the CNN feature (50); the concatenated features pass through fully-connected layers (100) and two output layers giving the predicted electricity and the predicted temperature, each with its own loss]
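A hedged sketch of the training objective described above, assuming mean-squared-error losses on the two prediction heads (the tensor names pred_E, pred_T, true_E, true_T are illustrative, not the authors' code):

loss_E = tf.reduce_mean(tf.square(pred_E - true_E))     # electricity loss
loss_T = tf.reduce_mean(tf.square(pred_T - true_T))     # temperature loss
total_loss = loss_E + loss_T                             # total loss = Loss_E + Loss_T
train_op = tf.train.AdamOptimizer(0.001).minimize(total_loss)   # backpropagation via Adam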
29
Case study 2 (forecasting the 2017 summer peak electricity demand, 대한전기학회)
 Forecast of the 2017 summer peak electricity demand: 86,477 MW
(actual 2017 summer peak demand: 86,298 MW, error: 0.21%)
 Back testing
• Trained on data up to 2016.5.31 and tested on data from 2016.6.1 onward
• Average error rate: 2.37% / 2.81% (28-day / 56-day prediction)
30
Summary
Introduction to recurrent neural network
- Properties of RNN: parameter sharing
- Various architectures
- Limitations
LSTM (long short-term memory)
- Components of LSTM
- Forget gate, input gate, update, output
Implementation of RNN
Case studies
- MNIST classification
- Forecasting the 2017 summer peak electricity demand