Lab 1
TensorFlow Basics
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
Call for comments
Please feel free to add comments directly on these slides
Other slides: https://goo.gl/jPtWNt
Picture from http://www.tssablog.org/archives/3280
TensorFlow
https://twitter.com/fchollet/status/830499993450450944/
● TensorFlow™ is an open source software library for numerical computation using data flow graphs.
● Python!
https://www.tensorflow.org/
What is a Data Flow Graph?
● Nodes in the graph represent mathematical operations.
● Edges represent the multidimensional data arrays (tensors) communicated between them.
https://www.tensorflow.org/
Installing TensorFlow
● Linux, Mac OS X, Windows
• (sudo -H) pip install --upgrade tensorflow
• (sudo -H) pip install --upgrade tensorflow-gpu
● From source
• bazel ...
• https://www.tensorflow.org/install/install_sources
● Google search/Community help
• https://www.facebook.com/groups/TensorFlowKR/
https://www.tensorflow.org/install/
Check installation and version
Sungs-MacBook-Pro:hunkim$ python3
Python 3.6.0 (v3.6.0:41df79263a11, Dec 22 2016, 17:23:13)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more
information.
>>> import tensorflow as tf
>>> tf.__version__
'1.0.0'
>>>
https://github.com/hunkim/DeepLearningZeroToAll/
TensorFlow Hello World!
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
b'String': the 'b' indicates a bytes literal. http://stackoverflow.com/questions/6269765/
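The hello-world cell itself survived only as a screenshot in the deck; a minimal sketch of what the notebook runs (TF 1.x API):
import tensorflow as tf

# Create a constant op; it is added as a node to the default graph
hello = tf.constant("Hello, TensorFlow!")

# Start a session and run the op
sess = tf.Session()
print(sess.run(hello))  # b'Hello, TensorFlow!' -- note the bytes literal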
Computational Graph
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
TensorFlow Mechanics
(1) Build graph using TensorFlow operations
(2) Feed data and run graph (operation): sess.run(op)
(3) Update variables in the graph (and return values)
Computational Graph
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
(1) Build graph (tensors) using TensorFlow operations
(2) Feed data and run graph (operation): sess.run(op)
(3) Update variables in the graph (and return values)
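The graph-building cell was an image in the deck; a minimal sketch of steps (1)-(3), assuming the node1/node2/node3 example from the lab-01 notebook:
import tensorflow as tf

# (1) Build graph (tensors) using TensorFlow operations
node1 = tf.constant(3.0, tf.float32)
node2 = tf.constant(4.0)  # also tf.float32, implicitly
node3 = tf.add(node1, node2)

# Printing a tensor shows the graph node, not its value
print("node1:", node1, "node2:", node2, "node3:", node3)

# (2) Run the graph in a session; (3) values are returned
sess = tf.Session()
print(sess.run([node1, node2]))  # [3.0, 4.0]
print(sess.run(node3))           # 7.0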
Placeholder
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
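The placeholder cell was likewise an image; a minimal sketch of the adder example from the same notebook:
import tensorflow as tf

# Placeholders are graph inputs whose values are supplied at run time
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b  # shortcut for tf.add(a, b)

sess = tf.Session()
print(sess.run(adder_node, feed_dict={a: 3, b: 4.5}))          # 7.5
print(sess.run(adder_node, feed_dict={a: [1, 3], b: [2, 4]}))  # [ 3.  7.]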
TensorFlow Mechanics
(1) Build graph using TensorFlow operations
(2) Feed data and run graph (operation): sess.run(op, feed_dict={x: x_data})
(3) Update variables in the graph (and return values)
Everything is Tensor
t = tf.constant([1., 2., 3.])
Tensor Ranks, Shapes, and Types
https://www.tensorflow.org/programmers_guide/dims_types
https://www.quora.com/When-should-I-use-tf-float32-vs-tf-float64-in-TensorFlow
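The rank/shape/type tables were images in the deck; a small sketch of the convention they describe:
# Rank = number of dimensions; shape = size along each dimension
t0 = 483                               # rank 0, scalar,   shape []
t1 = [1., 2., 3.]                      # rank 1, vector,   shape [3]
t2 = [[1., 2., 3.], [4., 5., 6.]]      # rank 2, matrix,   shape [2, 3]
t3 = [[[1., 2.], [3., 4.]]]            # rank 3, 3-tensor, shape [1, 2, 2]
# dtypes: tf.float32 is the usual choice; tf.float64 is rarely needed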
Lab 2
Linear Regression
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Variables
https://www.tensorflow.org/programmers_guide/variables
Hypothesis and cost function
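The formula image did not survive extraction; restating the linear-regression hypothesis and cost this lab implements:
H(x) = Wx + b, \qquad
\mathrm{cost}(W, b) = \frac{1}{m} \sum_{i=1}^{m} \left( H(x^{(i)}) - y^{(i)} \right)^2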
Build graph using TF operations
# X and Y data
x_train = [1, 2, 3]
y_train = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Our hypothesis XW+b
hypothesis = x_train * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
Build graph using TF operations
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
GradientDescent
https://www.tensorflow.org/api_docs/python/tf/reduce_mean
Run/update graph and get results
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line
for step in range(2001):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W), sess.run(b))
Full code (less than 20 lines)
import tensorflow as tf
# X and Y data
x_train = [1, 2, 3]
y_train = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Our hypothesis XW+b
hypothesis = x_train * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line
for step in range(2001):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W), sess.run(b))
'''
0 2.82329 [ 2.12867713] [-0.85235667]
20 0.190351 [ 1.53392804] [-1.05059612]
40 0.151357 [ 1.45725465] [-1.02391243]
...
1920 1.77484e-05 [ 1.00489295] [-0.01112291]
1940 1.61197e-05 [ 1.00466311] [-0.01060018]
1960 1.46397e-05 [ 1.004444] [-0.01010205]
1980 1.32962e-05 [ 1.00423515] [-0.00962736]
2000 1.20761e-05 [ 1.00403607] [-0.00917497]
'''
Placeholders
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
# X and Y data
x_train = [1, 2, 3]
y_train = [1, 2, 3]
# Now we can use X and Y in place of x_data and y_data
# placeholders for a tensor that will be always fed using feed_dict
# See http://stackoverflow.com/questions/36693740/
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
...
# Fit the line
for step in range(2001):
    cost_val, W_val, b_val, _ = \
        sess.run([cost, W, b, train],
                 feed_dict={X: [1, 2, 3], Y: [1, 2, 3]})
    if step % 20 == 0:
        print(step, cost_val, W_val, b_val)
Full code with placeholders
import tensorflow as tf
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])
# Our hypothesis XW+b
hypothesis = X * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line
for step in range(2001):
    cost_val, W_val, b_val, _ = sess.run([cost, W, b, train],
                                         feed_dict={X: [1, 2, 3], Y: [1, 2, 3]})
    if step % 20 == 0:
        print(step, cost_val, W_val, b_val)
...
1980 1.32962e-05 [ 1.00423515] [-0.00962736]
2000 1.20761e-05 [ 1.00403607] [-0.00917497]
# Testing our model
print(sess.run(hypothesis, feed_dict={X: [5]}))
print(sess.run(hypothesis, feed_dict={X: [2.5]}))
print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]}))
[ 5.0110054]
[ 2.50091505]
[ 1.49687922 3.50495124]
Full code with placeholders
import tensorflow as tf
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])
# Our hypothesis XW+b
hypothesis = X * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line with new training data
for step in range(2001):
    cost_val, W_val, b_val, _ = sess.run([cost, W, b, train],
                                         feed_dict={X: [1, 2, 3, 4, 5],
                                                    Y: [2.1, 3.1, 4.1, 5.1, 6.1]})
    if step % 20 == 0:
        print(step, cost_val, W_val, b_val)
…
1960 3.32396e-07 [ 1.00037301] [ 1.09865296]
1980 2.90429e-07 [ 1.00034881] [ 1.09874094]
2000 2.5373e-07 [ 1.00032604] [ 1.09882331]
# Testing our model
print(sess.run(hypothesis, feed_dict={X: [5]}))
print(sess.run(hypothesis, feed_dict={X: [2.5]}))
print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]}))
[ 6.10045338]
[ 3.59963846]
[ 2.59931231 4.59996414]
TensorFlow Mechanics
(1) Build graph using TensorFlow operations
(2) Feed data and run graph (operation):
    sess.run(op, feed_dict={X: [1, 2, 3, 4, 5], Y: [2.1, 3.1, 4.1, 5.1, 6.1]})
(3) Update variables in the graph (and return values)
Lab 3
Minimizing Cost
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Simplified hypothesis
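The formula image is missing here too; the simplified form drops the bias so cost is a function of W alone:
H(x) = Wx, \qquad
\mathrm{cost}(W) = \frac{1}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right)^2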
import tensorflow as tf
import matplotlib.pyplot as plt
X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.placeholder(tf.float32)
# Our hypothesis for linear model X * W
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Variables for plotting cost function
W_val = []
cost_val = []
for i in range(-30, 50):
    feed_W = i * 0.1
    curr_cost, curr_W = sess.run([cost, W], feed_dict={W: feed_W})
    W_val.append(curr_W)
    cost_val.append(curr_cost)
# Show the cost function
plt.plot(W_val, cost_val)
plt.show()
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-1-minimizing_cost_show_graph.py
http://matplotlib.org/users/installing.html
[Plot: cost(W) as a function of W]
Gradient descent
[Plot: gradient descent steps on the cost(W) curve]
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
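Why gradient = tf.reduce_mean((W * X - Y) * X) is the derivative: differentiating the mean-squared cost gives, up to the constant 2 (which only rescales learning_rate),
\frac{\partial}{\partial W} \frac{1}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right)^2
= \frac{2}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right) x^{(i)}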
import tensorflow as tf
x_data = [1, 2, 3]
y_data = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name='weight')
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
# Our hypothesis for linear model X * W
hypothesis = X * W
# cost/loss function
cost = tf.reduce_sum(tf.square(hypothesis - Y))
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(21):
    sess.run(update, feed_dict={X: x_data, Y: y_data})
    print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py
0 5.81756 [ 1.64462376]
1 1.65477 [ 1.34379935]
2 0.470691 [ 1.18335962]
3 0.133885 [ 1.09779179]
4 0.0380829 [ 1.05215561]
5 0.0108324 [ 1.0278163]
6 0.00308123 [ 1.01483536]
7 0.000876432 [ 1.00791216]
8 0.00024929 [ 1.00421977]
9 7.09082e-05 [ 1.00225055]
10 2.01716e-05 [ 1.00120032]
11 5.73716e-06 [ 1.00064015]
12 1.6319e-06 [ 1.00034142]
13 4.63772e-07 [ 1.00018203]
14 1.31825e-07 [ 1.00009704]
15 3.74738e-08 [ 1.00005174]
16 1.05966e-08 [ 1.00002754]
17 2.99947e-09 [ 1.00001466]
18 8.66635e-10 [ 1.00000787]
19 2.40746e-10 [ 1.00000417]
20 7.02158e-11 [ 1.00000226]
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
Output when W=5
import tensorflow as tf
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(5.0)
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(100):
    print(step, sess.run(W))
    sess.run(train)
0 5.0
1 1.26667
2 1.01778
3 1.00119
4 1.00008
5 1.00001
6 1.0
7 1.0
8 1.0
9 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
Output when W=-3
0 -3.0
1 0.733334
2 0.982222
3 0.998815
4 0.999921
5 0.999995
6 1.0
7 1.0
8 1.0
9 1.0
import tensorflow as tf
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(-3.0)
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(100):
    print(step, sess.run(W))
    sess.run(train)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
Optional: compute_gradients and apply_gradients
import tensorflow as tf
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(5.)
# Linear model
hypothesis = X * W
# Manual gradient
gradient = tf.reduce_mean((W * X - Y) * X) * 2
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
# Get gradients
gvs = optimizer.compute_gradients(cost, [W])
# Apply gradients
apply_gradients = optimizer.apply_gradients(gvs)
# Launch the graph in a session.
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(100):
    print(step, sess.run([gradient, W, gvs]))
    sess.run(apply_gradients)
0 [37.333332, 5.0, [(37.333336, 5.0)]]
1 [33.848888, 4.6266665, [(33.848888, 4.6266665)]]
2 [30.689657, 4.2881775, [(30.689657, 4.2881775)]]
3 [27.825287, 3.9812808, [(27.825287, 3.9812808)]]
4 [25.228262, 3.703028, [(25.228264, 3.703028)]]
...
96 [0.0030694802, 1.0003289, [(0.0030694804, 1.0003289)]]
97 [0.0027837753, 1.0002983, [(0.0027837753, 1.0002983)]]
98 [0.0025234222, 1.0002704, [(0.0025234222, 1.0002704)]]
99 [0.0022875469, 1.0002451, [(0.0022875469, 1.0002451)]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-X-minimizing_cost_tf_gradient.py
Lab 4-1
Multi-variable linear regression
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Hypothesis using matrix
Test Scores for General Psychology
x1  x2  x3  | Y
73  80  75  | 152
93  88  93  | 185
89  91  90  | 180
96  98  100 | 196
73  66  70  | 142
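The hypothesis image is missing; with three input features the model the slides build is, elementwise and in matrix form,
H(x_1, x_2, x_3) = x_1 w_1 + x_2 w_2 + x_3 w_3 + b, \qquad H(X) = XW + b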
x1_data = [73., 93., 89., 96., 73.]
x2_data = [80., 88., 91., 98., 66.]
x3_data = [75., 93., 90., 100., 70.]
y_data = [152., 185., 180., 196., 142.]
# placeholders for a tensor that will be always fed.
x1 = tf.placeholder(tf.float32)
x2 = tf.placeholder(tf.float32)
x3 = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
w1 = tf.Variable(tf.random_normal([1]), name='weight1')
w2 = tf.Variable(tf.random_normal([1]), name='weight2')
w3 = tf.Variable(tf.random_normal([1]), name='weight3')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
import tensorflow as tf
x1_data = [73., 93., 89., 96., 73.]
x2_data = [80., 88., 91., 98., 66.]
x3_data = [75., 93., 90., 100., 70.]
y_data = [152., 185., 180., 196., 142.]
# placeholders for a tensor that will be always fed.
x1 = tf.placeholder(tf.float32)
x2 = tf.placeholder(tf.float32)
x3 = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
w1 = tf.Variable(tf.random_normal([1]), name='weight1')
w2 = tf.Variable(tf.random_normal([1]), name='weight2')
w3 = tf.Variable(tf.random_normal([1]), name='weight3')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize. Need a very small learning rate for this data set
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run([cost, hypothesis, train],
                                   feed_dict={x1: x1_data, x2: x2_data, x3: x3_data, Y: y_data})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
0 Cost: 19614.8
Prediction:
[ 21.69748688
39.10213089 31.82624626
35.14236832
32.55316544]
10 Cost: 14.0682
Prediction:
[ 145.56100464
187.94958496
178.50236511
194.86721802
146.08096313]
...
1990 Cost: 4.9197
Prediction:
[ 148.15084839
186.88632202
179.6293335
195.81796265
144.46044922]
2000 Cost: 4.89449
Prediction:
[ 148.15931702
186.8805542
179.63194275
195.81971741
144.45298767]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
Matrix
x_data = [[73., 80., 75.], [93., 88., 93.],
[89., 91., 90.], [96., 98., 100.], [73., 66., 70.]]
y_data = [[152.], [185.], [180.], [196.], [142.]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py
Matrix
import tensorflow as tf
x_data = [[73., 80., 75.], [93., 88., 93.],
[89., 91., 90.], [96., 98., 100.], [73., 66., 70.]]
y_data = [[152.], [185.], [180.], [196.], [142.]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
# Simplified cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
0 Cost: 7105.46
Prediction:
[[ 80.82241058]
[ 92.26364136]
[ 93.70250702]
[ 98.09217834]
[ 72.51759338]]
10 Cost: 5.89726
Prediction:
[[ 155.35159302]
[ 181.85691833]
[ 181.97254944]
[ 194.21760559]
[ 140.85707092]]
...
1990 Cost: 3.18588
Prediction:
[[ 154.36352539]
[ 182.94833374]
[ 181.85189819]
[ 194.35585022]
[ 142.03240967]]
2000 Cost: 3.1781
Prediction:
[[ 154.35881042]
[ 182.95147705]
[ 181.85035706]
[ 194.35533142]
[ 142.036026 ]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py
Lab 4-2
Loading Data from File
With TF 1.0!
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
https://github.com/hunkim/DeepLearningZeroToAll/
Loading data from file
data-01-test-score.csv
# EXAM1,EXAM2,EXAM3,FINAL
73,80,75,152
93,88,93,185
89,91,90,180
96,98,100,196
73,66,70,142
53,46,55,101
import numpy as np
xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# Make sure the shape and data are OK
print(x_data.shape, x_data, len(x_data))
print(y_data.shape, y_data)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
Slicing
http://cs231n.github.io/python-numpy-tutorial/
http://slides.com/wigging
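The slicing slide reproduced the numpy tutorial's cheat sheet as an image; a minimal sketch of the two idioms used above:
import numpy as np

a = np.array([[73., 80., 75., 152.],
              [93., 88., 93., 185.]])
a[:, 0:-1]   # every row, all columns but the last -> shape (2, 3)
a[:, [-1]]   # every row, last column, kept 2-D    -> shape (2, 1)
a[:, -1]     # same column but flattened           -> shape (2,)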
import tensorflow as tf
import numpy as np
tf.set_random_seed(777) # for reproducibility
xy = np.loadtxt('data-01-test-score.csv', delimiter=',',
dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# Make sure the shape and data are OK
print(x_data.shape, x_data, len(x_data))
print(y_data.shape, y_data)
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
# Simplified cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Set up feed_dict variables inside the loop.
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train],
        feed_dict={X: x_data, Y: y_data})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val,
              "\nPrediction:\n", hy_val)
# Ask my score
print("Your score will be ", sess.run(hypothesis,
      feed_dict={X: [[100, 70, 101]]}))
print("Other scores will be ", sess.run(hypothesis,
      feed_dict={X: [[60, 70, 110], [90, 100, 80]]}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
Output
Your score will be [[ 181.73277283]]
Other scores will be [[ 145.86265564] [ 187.23129272]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
Queue Runners
https://www.tensorflow.org/programmers_guide/reading_data
filename_queue = tf.train.string_input_producer(
    ['data-01-test-score.csv', 'data-02-test-score.csv', ...],
    shuffle=False, name='filename_queue')
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
record_defaults = [[0.], [0.], [0.], [0.]]
xy = tf.decode_csv(value, record_defaults=record_defaults)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
tf.train.batch
# collect batches of csv in
train_x_batch, train_y_batch = \
    tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10)
sess = tf.Session()
...
# Start populating the filename queue.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
for step in range(2001):
    x_batch, y_batch = sess.run([train_x_batch, train_y_batch])
    ...
coord.request_stop()
coord.join(threads)
https://www.tensorflow.org/programmers_guide/reading_data
import tensorflow as tf
filename_queue = tf.train.string_input_producer(
['data-01-test-score.csv'], shuffle=False, name='filename_queue')
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[0.], [0.], [0.], [0.]]
xy = tf.decode_csv(value, record_defaults=record_defaults)
# collect batches of csv in
train_x_batch, train_y_batch = \
    tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10)
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
# Simplified cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Start populating the filename queue.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
for step in range(2001):
    x_batch, y_batch = sess.run([train_x_batch, train_y_batch])
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train],
        feed_dict={X: x_batch, Y: y_batch})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val,
              "\nPrediction:\n", hy_val)
coord.request_stop()
coord.join(threads)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
shuffle_batch
# min_after_dequeue defines how big a buffer we will randomly sample
# from -- bigger means better shuffling but slower start up and more
# memory used.
# capacity must be larger than min_after_dequeue and the amount larger
# determines the maximum we will prefetch. Recommendation:
# min_after_dequeue + (num_threads + a small safety margin) * batch_size
min_after_dequeue = 10000
capacity = min_after_dequeue + 3 * batch_size
example_batch, label_batch = tf.train.shuffle_batch(
[example, label], batch_size=batch_size, capacity=capacity,
min_after_dequeue=min_after_dequeue)
https://www.tensorflow.org/programmers_guide/reading_data
Lab 5
Logistic (regression) classifier
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Logistic Regression
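The sigmoid and cost formulas were an image; restating what the code below implements:
H(X) = \frac{1}{1 + e^{-W^{\top} X}}, \qquad
\mathrm{cost}(W) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log H(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - H(x^{(i)})\right) \right]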
Training Data
x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [[0], [0], [0], [1], [1], [1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W) + b))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) *
                       tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
Train the model
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, cost_val)
    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [[0], [0], [0], [1], [1], [1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, cost_val)
    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)
# step, cost
0 1.73078
200 0.571512
400 0.507414
...
9600 0.154132
9800 0.151778
10000 0.149496
Hypothesis:
[[ 0.03074029]
[ 0.15884677]
[ 0.30486736]
[ 0.78138196]
[ 0.93957496]
[ 0.98016882]]
Correct (Y):
[[ 0.]
[ 0.]
[ 0.]
[ 1.]
[ 1.]
[ 1.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
Classifying diabetes
xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 8])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([8, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {X: x_data, Y: y_data}
    for step in range(10001):
        sess.run(train, feed_dict=feed)
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict=feed))
    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict=feed)
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)
0 0.82794
200 0.755181
400 0.726355
600 0.705179
800 0.686631
...
9600 0.492056
9800 0.491396
10000 0.490767
[ 0.7461012 ]
[ 0.79919308]
[ 0.72995949]
[ 0.88297188]]
[ 1.]
[ 1.]
[ 1.]]
Accuracy:
0.762846
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
Exercise
● CSV reading using tf.decode_csv
● Try other classification data from Kaggle
○ https://www.kaggle.com
Lab 6-1
Softmax Classifier
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Softmax function
tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
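The softmax formula from the Udacity slide, restated since the image did not survive:
S(y_i) = \frac{e^{y_i}}{\sum_{j} e^{y_j}}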
Cost function: cross entropy
# Cross entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
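And the cross-entropy distance between the softmax output S and the one-hot label L, which the code above averages over the batch:
D(S, L) = -\sum_{i} L_i \log S_i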
x_data = [[1, 2, 1, 1], [2, 1, 3, 2], [3, 1, 3, 4], [4, 1, 5, 5], [1, 7, 5, 5],
[1, 2, 5, 6], [1, 6, 6, 6], [1, 7, 7, 7]]
y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]
X = tf.placeholder("float", [None, 4])
Y = tf.placeholder("float", [None, 3])
nb_classes = 3
W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')
# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
# Cross entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2001):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
Test & one-hot encoding
# Testing & One-hot encoding
a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]})
print(a, sess.run(tf.arg_max(a, 1)))
hypothesis = tf.nn.softmax(tf.matmul(X,W)+b)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
[[ 1.38904958e-03 9.98601854e-01 9.06129117e-06]] [1]
Test & one-hot encoding
all = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9],
                                          [1, 3, 4, 3],
                                          [1, 1, 0, 1]]})
print(all, sess.run(tf.arg_max(all, 1)))
[[ 1.38904958e-03 9.98601854e-01 9.06129117e-06]
[ 9.31192040e-01 6.29020557e-02 5.90589503e-03]
[ 1.27327668e-08 3.34112905e-04 9.99665856e-01]]
[1 0 2]
hypothesis = tf.nn.softmax(tf.matmul(X,W)+b)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
Lab 6-2
Fancy Softmax Classifier
cross_entropy, one_hot, reshape
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
softmax_cross_entropy_with_logits
logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)
# Cross entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                 labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
Animal classification with softmax_cross_entropy_with_logits
https://kr.pinterest.com/explore/animal-classification-activity/
# Predicting animal type based on various features
xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
tf.one_hot and reshape
Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6, shape=(?, 1)
Y_one_hot = tf.one_hot(Y, nb_classes) # one hot shape=(?, 1, 7)
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes]) # shape=(?, 7)
If the input indices is rank N, the output will have rank N+1. The new axis is created at dimension axis (default: the new axis is appended at the end).
https://www.tensorflow.org/api_docs/python/tf/one_hot
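A small sketch of the shape dance, using hypothetical toy labels (classes 0 and 3) rather than the zoo data:
import numpy as np
import tensorflow as tf

y = np.array([[0], [3]])             # shape (2, 1), rank 2
one_hot = tf.one_hot(y, depth=7)     # rank goes up by one: shape (2, 1, 7)
flat = tf.reshape(one_hot, [-1, 7])  # back to rank 2: shape (2, 7)

with tf.Session() as sess:
    print(sess.run(flat))
    # [[ 1.  0.  0.  0.  0.  0.  0.]
    #  [ 0.  0.  0.  1.  0.  0.  0.]]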
# Predicting animal type based on various features
xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
nb_classes = 7 # 0 ~ 6
X = tf.placeholder(tf.float32, [None, 16])
Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6
Y_one_hot = tf.one_hot(Y, nb_classes) # one hot
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])
W = tf.Variable(tf.random_normal([16, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')
# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)
# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                 labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
cost = tf.reduce_mean(cost_i)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
prediction = tf.argmax(hypothesis, 1)
correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2000):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 100 == 0:
            loss, acc = sess.run([cost, accuracy],
                                 feed_dict={X: x_data, Y: y_data})
            print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format(
                step, loss, acc))
    # Let's see if we can predict
    pred = sess.run(prediction, feed_dict={X: x_data})
    # y_data: (N, 1) = flatten => (N, ) matches pred.shape
    for p, y in zip(pred, y_data.flatten()):
        print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y)))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
Step: 1100 Loss: 0.101 Acc: 99.01%
Step: 1200 Loss: 0.092 Acc: 100.00%
Step: 1300 Loss: 0.084 Acc: 100.00%
...
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 3 True Y: 3
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 3 True Y: 3
[True] Prediction: 3 True Y: 3
[True] Prediction: 0 True Y: 0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
Lab 7-1
Learning rate, Evaluation
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Training and Test datasets
x_data = [[1, 2, 1], [1, 3, 2], [1, 3, 4], [1, 5, 5], [1, 7, 5], [1, 2, 5], [1, 6, 6], [1, 7, 7]]
y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]
# Evaluation our model using this test dataset
x_test = [[2, 1, 1], [3, 1, 2], [3, 3, 4]]
y_test = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))
hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Correct prediction Test model
prediction = tf.arg_max(hypothesis, 1)
is_correct = tf.equal(prediction, tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(201):
        cost_val, W_val, _ = sess.run([cost, W, optimizer],
                                      feed_dict={X: x_data, Y: y_data})
        print(step, cost_val, W_val)
    # predict
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))
199 0.672261 [[-1.15377033  0.28146935  1.13632679]
 [ 0.37484586  0.18958236  0.33544877]
 [-0.35609841 -0.43973011 -1.25604188]]
200 0.670909 [[-1.15885413  0.28058422  1.14229572]
 [ 0.37609792  0.19073224  0.33304682]
 [-0.35536593 -0.44033223 -1.2561723 ]]
Prediction: [2 2 2]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
Learning rate: NaN!
http://sebastianraschka.com/Articles/2015_singlelayer_neurons.html
Big learning rate
2 27.2798 [[ 0.44451016  0.85699677 -1.03748143]
 [ 0.48429942  0.98872018 -0.57314301]
 [ 1.52989244  1.16229868 -4.74406147]]
3 8.668 [[ 0.12396193  0.61504567 -0.47498202]
 [ 0.22003263 -0.2470119   0.9268558 ]
 [ 0.96035379  0.41933775 -3.43156195]]
4 5.77111 [[-0.9524312   1.13037777  0.08607888]
 [-3.78651619  2.26245379  2.42393875]
 [-3.07170963  3.14037919 -2.12054014]]
5 inf [[ nan  nan  nan]
 [ nan  nan  nan]
 [ nan  nan  nan]]
6 nan [[ nan  nan  nan]
 [ nan  nan  nan]
 [ nan  nan  nan]]
...
Prediction: [0 0 0]
Accuracy: 0.0
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))
hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.5).minimize(cost)
# Correct prediction Test model
prediction = tf.arg_max(hypothesis, 1)
is_correct = tf.equal(prediction, tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(201):
        cost_val, W_val, _ = sess.run([cost, W, optimizer],
                                      feed_dict={X: x_data, Y: y_data})
        print(step, cost_val, W_val)
    # predict
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
Small learning rate
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))
hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-10).minimize(cost)
# Correct prediction Test model
prediction = tf.arg_max(hypothesis, 1)
is_correct = tf.equal(prediction, tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(201):
        cost_val, W_val, _ = sess.run([cost, W, optimizer],
                                      feed_dict={X: x_data, Y: y_data})
        print(step, cost_val, W_val)
    # predict
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
0 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
1 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
...
198 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
199 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
200 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
Prediction: [0 0 0]
Accuracy: 0.0
Non-normalized inputs
xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973],
[823.02002, 828.070007, 1828100, 821.655029, 828.070007],
[819.929993, 824.400024, 1438100, 818.97998, 824.159973],
[816, 820.958984, 1008100, 815.48999, 819.23999],
[819.359985, 823, 1188100, 818.469971, 818.97998],
[819, 823, 1198100, 816, 820.450012],
[811.700012, 815.25, 1098100, 809.780029, 813.669983],
[809.51001, 816.659973, 1398100, 804.539978, 809.559998]])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
Non-normalized inputs
xy=...
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 4])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([4, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = tf.matmul(X, W) + b
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})
    print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
5 Cost: inf
Prediction:
[[ inf]
[ inf]
[ inf]
...
6 Cost: nan
Prediction:
[[ nan]
[ nan]
[ nan]
...
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
Normalized inputs (min-max scale)
xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973],
[823.02002, 828.070007, 1828100, 821.655029, 828.070007],
[819.929993, 824.400024, 1438100, 818.97998, 824.159973],
[816, 820.958984, 1008100, 815.48999, 819.23999],
[819.359985, 823, 1188100, 818.469971, 818.97998],
[819, 823, 1198100, 816, 820.450012],
[811.700012, 815.25, 1098100, 809.780029, 813.669983],
[809.51001, 816.659973, 1398100, 804.539978, 809.559998]])
[[ 0.99999999 0.99999999 0. 1. 1. ]
[ 0.70548491 0.70439552 1. 0.71881782 0.83755791]
[ 0.54412549 0.50274824 0.57608696 0.606468 0.6606331 ]
[ 0.33890353 0.31368023 0.10869565 0.45989134 0.43800918]
[ 0.51436 0.42582389 0.30434783 0.58504805 0.42624401]
[ 0.49556179 0.42582389 0.31521739 0.48131134 0.49276137]
[ 0.11436064 0. 0.20652174 0.22007776 0.18597238]
[ 0. 0.07747099 0.5326087 0. 0. ]]
xy = MinMaxScaler(xy)
print(xy)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
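MinMaxScaler here is a small helper defined in the repo's code, not scikit-learn's class; a sketch consistent with the scaled output above:
import numpy as np

def MinMaxScaler(data):
    # Column-wise min-max scaling to [0, 1]; the epsilon avoids
    # division by zero when a column is constant.
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    return numerator / (denominator + 1e-7)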
Normalized inputs
xy=...
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 4])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([4, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = tf.matmul(X, W) + b
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})
    print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
Prediction:
[[ 1.63450289]
[ 0.06628087]
[ 0.35014752]
[ 0.67070574]
[ 0.61131608]
[ 0.61466062]
[ 0.23175186]
[-0.13716528]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
Lab 7-2
MNIST data
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
MNIST Dataset
http://yann.lecun.com/exdb/mnist/
28x28x1 image
http://derindelimavi.blogspot.hk/2015/04/mnist-el-yazs-rakam-veri-seti.html
# MNIST data image of shape 28 * 28 = 784
X = tf.placeholder(tf.float32, [None, 784])
# 0 - 9 digits recognition = 10 classes
Y = tf.placeholder(tf.float32, [None, nb_classes])
MNIST Dataset
from tensorflow.examples.tutorials.mnist import input_data
# Check out https://www.tensorflow.org/get_started/mnist/beginners for
# more information about the mnist dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
...
batch_xs, batch_ys = mnist.train.next_batch(100)
...
print("Accuracy: ", accuracy.eval(session=sess,
      feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Reading data and set variables
from tensorflow.examples.tutorials.mnist import input_data
# Check out https://www.tensorflow.org/get_started/mnist/beginners for
# more information about the mnist dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
nb_classes = 10
# MNIST data image of shape 28 * 28 = 784
X = tf.placeholder(tf.float32, [None, 784])
# 0 - 9 digits recognition = 10 classes
Y = tf.placeholder(tf.float32, [None, nb_classes])
W = tf.Variable(tf.random_normal([784, nb_classes]))
b = tf.Variable(tf.random_normal([nb_classes]))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Softmax!
# Hypothesis (using softmax)
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Test model
is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1))
# Calculate accuracy
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Training epoch/batch
# parameters
training_epochs = 15
batch_size = 100
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
# Training cycle
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys})
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Training epoch/batch
In the neural network terminology:
● one epoch = one forward pass and one backward pass of all the training examples
● batch size = the number of training examples in one forward/backward pass. The higher
the batch size, the more memory space you'll need.
● number of iterations = number of passes, each pass using [batch size] number of
examples. To be clear, one pass = one forward pass + one backward pass (we do not count the
forward pass and backward pass as two different passes).
Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to
complete 1 epoch.
http://stackoverflow.com/questions/4752626/epoch-vs-iteration-when-training-neural-networks
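The same arithmetic as code (numbers taken from the example above):
num_examples, batch_size = 1000, 500
iterations_per_epoch = num_examples // batch_size  # 2 iterations = 1 epoch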
Report results on test dataset
# Test the model using test sets
print("Accuracy: ", accuracy.eval(session=sess,
feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# parameters
training_epochs = 15
batch_size = 100
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
# Training cycle
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
c, _ = sess.run([cost, optimizer],
feed_dict={X: batch_xs, Y: batch_ys})
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1),
'cost =', '{:.9f}'.format(avg_cost))
Epoch: 0001 cost = 2.868104637
Epoch: 0002 cost = 1.134684615
Epoch: 0003 cost = 0.908220728
Epoch: 0004 cost = 0.794199896
Epoch: 0005 cost = 0.721815854
Epoch: 0006 cost = 0.670184430
Epoch: 0007 cost = 0.630576546
Epoch: 0008 cost = 0.598888191
Epoch: 0009 cost = 0.573027079
Epoch: 0010 cost = 0.550497213
Epoch: 0011 cost = 0.532001859
Epoch: 0012 cost = 0.515517795
Epoch: 0013 cost = 0.501175288
Epoch: 0014 cost = 0.488425370
Epoch: 0015 cost = 0.476968593
Learning finished
Accuracy: 0.888
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Sample image show and prediction
import matplotlib.pyplot as plt
import random
# Get one and predict
r = random.randint(0, mnist.test.num_examples - 1)
print("Label:", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1)))
print("Prediction:", sess.run(tf.argmax(hypothesis, 1),
feed_dict={X: mnist.test.images[r:r + 1]}))
plt.imshow(mnist.test.images[r:r + 1].reshape(28, 28), cmap='Greys', interpolation='nearest')
plt.show()
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Lab 8
Tensor Manipulation
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Simple 1D array and slicing
Image from http://www.frosteye.net/1233
t = np.array([0., 1., 2., 3., 4., 5., 6.])
Simple 1D array and slicing
Image from http://www.frosteye.net/1233
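The notebook cells behind these slides are screenshots; a small sketch of the slicing being shown, using the array above:
import numpy as np
t = np.array([0., 1., 2., 3., 4., 5., 6.])
print(t.ndim)             # rank: 1
print(t.shape)            # shape: (7,)
print(t[0], t[1], t[-1])  # 0.0 1.0 6.0
print(t[2:5], t[4:-1])    # [ 2. 3. 4.] [ 4. 5.]
print(t[:2], t[3:])       # [ 0. 1.] [ 3. 4. 5. 6.]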
2D Array
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Shape, Rank, Axis
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Shape, Rank, Axis
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
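The screenshots are not in this export; a sketch of the idea (a 2D array and a rank-3 tensor, with axis 0 outermost):
import numpy as np
import tensorflow as tf
m = np.array([[1., 2.], [3., 4.]])  # 2D array: rank 2, shape (2, 2)
print(m.ndim, m.shape)
t = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
sess = tf.Session()
print(sess.run(tf.rank(t)))   # 3
print(sess.run(tf.shape(t)))  # [2 2 2]; axis 0 is outermost, axis -1 innermost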
Matmul VS multiply
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Matmul VS multiply
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
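The point of this slide: tf.matmul is matrix multiplication, while * (tf.multiply) is element-wise and broadcasts. A sketch:
import tensorflow as tf
sess = tf.Session()
m1 = tf.constant([[1., 2.], [3., 4.]])
m2 = tf.constant([[1.], [2.]])
print(sess.run(tf.matmul(m1, m2)))  # matrix product: [[ 5.] [11.]]
print(sess.run(m1 * m2))            # element-wise with broadcasting: [[1. 2.] [6. 8.]]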
Broadcasting
https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Broadcasting
https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
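A sketch of broadcasting, where shapes are stretched to match (use with care; silent shape mismatches are a common bug):
import tensorflow as tf
sess = tf.Session()
a = tf.constant([[1., 2.]])    # shape (1, 2)
b = tf.constant([[3.], [4.]])  # shape (2, 1)
print(sess.run(a + b))   # broadcast to (2, 2): [[4. 5.] [5. 6.]]
print(sess.run(a + 3.))  # scalar broadcast: [[4. 5.]]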
Reduce mean
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Reduce sum
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Argmax
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
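A sketch of the reduce ops and argmax, showing how the axis argument selects the dimension that disappears:
import tensorflow as tf
sess = tf.Session()
x = tf.constant([[1., 2.], [3., 4.]])
print(sess.run(tf.reduce_mean(x)))          # 2.5, over all elements
print(sess.run(tf.reduce_mean(x, axis=0)))  # [2. 3.]
print(sess.run(tf.reduce_sum(x, axis=1)))   # [3. 7.]
print(sess.run(tf.argmax(x, axis=1)))       # [1 1], index of the max per row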
Reshape**
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Reshape (squeeze, expand)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
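A sketch of reshape, squeeze, and expand_dims (the -1 in reshape means "infer this dimension"):
import tensorflow as tf
sess = tf.Session()
t = tf.constant([[[0, 1, 2], [3, 4, 5]]])      # shape (1, 2, 3)
print(sess.run(tf.reshape(t, [-1, 3])))        # shape (2, 3)
print(sess.run(tf.squeeze(t)))                 # drops size-1 axes: shape (2, 3)
print(sess.run(tf.expand_dims([0, 1, 2], 1)))  # (3,) -> (3, 1): [[0] [1] [2]]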
One hot
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Casting
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
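A sketch of one_hot (note it adds an axis, hence the reshape) and cast:
import tensorflow as tf
sess = tf.Session()
oh = tf.one_hot([[0], [1], [2]], depth=3)            # shape (3, 1, 3)
print(sess.run(tf.reshape(oh, [-1, 3])))             # back to (3, 3)
print(sess.run(tf.cast([1.8, 2.2, 3.3], tf.int32)))  # [1 2 3]
print(sess.run(tf.cast([True, False], tf.int32)))    # [1 0]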
Stack
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Ones and Zeros like
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Zip
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
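A sketch of stack, ones_like/zeros_like, and plain Python zip:
import tensorflow as tf
sess = tf.Session()
x, y, z = [1, 4], [2, 5], [3, 6]
print(sess.run(tf.stack([x, y, z])))          # [[1 4] [2 5] [3 6]]
print(sess.run(tf.stack([x, y, z], axis=1)))  # [[1 2 3] [4 5 6]]
print(sess.run(tf.ones_like(x)))              # [1 1]
print(sess.run(tf.zeros_like(x)))             # [0 0]
for a, b in zip([1, 2, 3], [4, 5, 6]):        # plain Python zip
    print(a, b)                               # 1 4 / 2 5 / 3 6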
Lab 9-1
NN for XOR
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
XOR data set
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
http://templ25.mandaringardencity.com/xor-gate-truth-table-2/
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
for step in range(10001):
sess.run(train, feed_dict={X: x_data, Y: y_data})
if step % 100 == 0:
print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))
# Accuracy report
h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data})
print("nHypothesis: ", h, "nCorrect: ", c, "nAccuracy: ", a)
XOR with logistic regression? But it doesn't work!
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
Hypothesis:
[[ 0.5]
[ 0.5]
[ 0.5]
[ 0.5]]
Correct:
[[ 0.]
[ 0.]
[ 0.]
[ 0.]]
Accuracy: 0.5
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
Neural Net
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
NN for XOR
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
for step in range(10001):
sess.run(train, feed_dict={X: x_data, Y: y_data})
if step % 100 == 0:
print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run([W1, W2]))
# Accuracy report
h, c, a = sess.run([hypothesis, predicted, accuracy],
feed_dict={X: x_data, Y: y_data})
print("nHypothesis: ", h, "nCorrect: ", c, "nAccuracy: ", a)
Hypothesis:
[[ 0.01338218]
[ 0.98166394]
[ 0.98809403]
[ 0.01135799]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
Wide NN for XOR
W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1')
b1 = tf.Variable(tf.random_normal([10]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([10, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
[2,10], [10,1]
Hypothesis:
[[ 0.00358802]
[ 0.99366933]
[ 0.99204296]
[ 0.0095663 ]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
[2,2], [2,1]
Hypothesis:
[[ 0.01338218]
[ 0.98166394]
[ 0.98809403]
[ 0.01135799]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
Deep NN for XOR
W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1')
b1 = tf.Variable(tf.random_normal([10]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([10, 10]), name='weight2')
b2 = tf.Variable(tf.random_normal([10]), name='bias2')
layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2)
W3 = tf.Variable(tf.random_normal([10, 10]), name='weight3')
b3 = tf.Variable(tf.random_normal([10]), name='bias3')
layer3 = tf.sigmoid(tf.matmul(layer2, W3) + b3)
W4 = tf.Variable(tf.random_normal([10, 1]), name='weight4')
b4 = tf.Variable(tf.random_normal([1]), name='bias4')
hypothesis = tf.sigmoid(tf.matmul(layer3, W4) + b4)
4 layers
Hypothesis:
[[ 7.80e-04]
[ 9.99e-01]
[ 9.98e-01]
[ 1.55e-03]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
2 layers
Hypothesis:
[[ 0.01338218]
[ 0.98166394]
[ 0.98809403]
[ 0.01135799]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
Exercise
● Wide and Deep NN for MNIST
Lab 9-2
Tensorboard for XOR NN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
TensorBoard: TF logging/debugging tool
●Visualize your TF graph
●Plot quantitative metrics
●Show additional data
https://www.tensorflow.org/get_started/summaries_and_tensorboard
Old-fashioned way: print, print, print
9400 0.0151413 [array([[ 6.21692038, 6.05913448],
[-6.33773184, -5.75189114]], dtype=float32), array([[ 9.93581772],
[-9.43034935]], dtype=float32)]
9500 0.014909 [array([[ 6.22498751, 6.07049847],
[-6.34637976, -5.76352596]], dtype=float32), array([[ 9.96414757],
[-9.45942593]], dtype=float32)]
9600 0.0146836 [array([[ 6.23292685, 6.08166742],
[-6.35489035, -5.77496052]], dtype=float32), array([[ 9.99207973],
[-9.48807526]], dtype=float32)]
9700 0.0144647 [array([[ 6.24074268, 6.09264851],
[-6.36326933, -5.78619957]], dtype=float32), array([[ 10.01962471],
[ -9.51631165]], dtype=float32)]
9800 0.0142521 [array([[ 6.24843407, 6.10344648],
[-6.37151814, -5.79724932]], dtype=float32), array([[ 10.04679298],
[ -9.54414845]], dtype=float32)]
9900 0.0140456 [array([[ 6.25601053, 6.11406422],
[-6.3796401 , -5.80811596]], dtype=float32), array([[ 10.07359505],
[ -9.57159519]], dtype=float32)]
10000 0.0138448 [array([[ 6.26347113, 6.12451124],
[-6.38764334, -5.81880617]], dtype=float32), array([[ 10.10004139],
[ -9.59866238]], dtype=float32)]
New way!
5 steps of using TensorBoard
From TF graph, decide which tensors you want to log
w2_hist = tf.summary.histogram("weights2", W2)
cost_summ = tf.summary.scalar("cost", cost)
Merge all summaries
summary = tf.summary.merge_all()
Create writer and add graph
# Create summary writer
writer = tf.summary.FileWriter('./logs')
writer.add_graph(sess.graph)
Run summary merge and add_summary
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
Launch TensorBoard
tensorboard --logdir=./logs
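The five steps stitched into one minimal runnable sketch (the logged scalar is a stand-in, not the XOR model itself):
import tensorflow as tf
x = tf.placeholder(tf.float32)
cost = tf.square(x)                           # (1) decide which tensors to log
cost_summ = tf.summary.scalar("cost", cost)
summary = tf.summary.merge_all()              # (2) merge all summaries
sess = tf.Session()
writer = tf.summary.FileWriter('./logs')      # (3) create writer, add graph
writer.add_graph(sess.graph)
for global_step in range(3):                  # (4) run merge, add summary
    s = sess.run(summary, feed_dict={x: float(global_step)})
    writer.add_summary(s, global_step=global_step)
# (5) launch: tensorboard --logdir=./logs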
Scalar tensors
cost_summ = tf.summary.scalar("cost", cost)
Histogram (multi-dimensional tensors)
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
w2_hist = tf.summary.histogram("weights2", W2)
b2_hist = tf.summary.histogram("biases2", b2)
hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Add scope for better graph hierarchy
with tf.name_scope("layer1") as scope:
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
w1_hist = tf.summary.histogram("weights1", W1)
b1_hist = tf.summary.histogram("biases1", b1)
layer1_hist = tf.summary.histogram("layer1", layer1)
with tf.name_scope("layer2") as scope:
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
w2_hist = tf.summary.histogram("weights2", W2)
b2_hist = tf.summary.histogram("biases2", b2)
hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Merge summaries and create writer after creating session
# Summary
summary = tf.summary.merge_all()
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Create summary writer
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph) # Add graph in the tensorboard
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Run merged summary and write (add summary)
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
global_step += 1
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Launch tensorboard (local)
writer = tf.summary.FileWriter("./logs/xor_logs")
$ tensorboard --logdir=./logs/xor_logs
Starting TensorBoard b'41' on port 6006
(You can navigate to http://127.0.0.1:6006)
Launch tensorboard (remote server)
ssh -L local_port:127.0.0.1:remote_port username@server.com
local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com
server> $ tensorboard --logdir=./logs/xor_logs
(You can navigate to http://127.0.0.1:7007)
Multiple runs learning_rate=0.1 VS learning_rate=0.01
Multiple runs
tensorboard --logdir=./logs/xor_logs
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
...
writer = tf.summary.FileWriter("./logs/xor_logs")
tensorboard --logdir=./logs/xor_logs_r0_01
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
...
writer = tf.summary.FileWriter("./logs/xor_logs_r0_01")
tensorboard --logdir=./logs
Multiple runs
Exercise
● Wide and Deep NN for MNIST
● Add tensorboard
Lab 9-2-E
Tensorboard for MNIST
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
Visualizing your Deep learning using TensorBoard (TensorFlow)
Sung Kim <hunkim+ml@gmail.com>
TensorBoard: TF logging/debugging tool
●Visualize your TF graph
●Plot quantitative metrics
●Show additional data
https://www.tensorflow.org/get_started/summaries_and_tensorboard
Old-fashioned way: print, print, print
New way!
5 steps of using TensorBoard
From TF graph, decide which tensors you want to log
with tf.variable_scope('layer1') as scope:
tf.summary.image('input', x_image, 3)
tf.summary.histogram("layer", L1)
tf.summary.scalar("loss", cost)
Merge all summaries
summary = tf.summary.merge_all()
Create writer and add graph
# Create summary writer
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
Run summary merge and add_summary
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
Launch TensorBoard
tensorboard --logdir=/tmp/mnist_logs
Image Input
# Image input
x_image = tf.reshape(X, [-1, 28, 28, 1])
tf.summary.image('input', x_image, 3)
Histogram (multi-dimensional tensors)
with tf.variable_scope('layer1') as scope:
W1 = tf.get_variable("W", shape=[784, 512])
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
tf.summary.histogram("X", X)
tf.summary.histogram("weights", W1)
tf.summary.histogram("bias", b1)
tf.summary.histogram("layer", L1)
Scalar tensors
tf.summary.scalar("loss", cost)
Add scope for better hierarchy
with tf.variable_scope('layer1') as scope:
W1 = tf.get_variable("W", shape=[784, 512],...
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
tf.summary.histogram("X", X)
tf.summary.histogram("weights", W1)
tf.summary.histogram("bias", b1)
tf.summary.histogram("layer", L1)
with tf.variable_scope('layer2') as scope:
...
with tf.variable_scope('layer3') as scope:
...
with tf.variable_scope('layer4') as scope:
...
with tf.variable_scope('layer5') as scope:
...
Merge summaries and create writer after creating session
# Summary
summary = tf.summary.merge_all()
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Create summary writer
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
Run merged summary and write (add summary)
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
global_step += 1
Launch tensorboard (local)
writer = tf.summary.FileWriter("/tmp/mnist_logs")
$ tensorboard --logdir=/tmp/mnist_logs
Starting TensorBoard b'41' on port 6006
(You can navigate to http://127.0.0.1:6006)
Launch tensorboard (remote server)
ssh -L local_port:127.0.0.1:remote_port username@server.com
local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com
server> $ tensorboard --logdir=/tmp/mnist_logs
(You can navigate to http://127.0.0.1:7007)
Multiple runs
tensorboard --logdir=/tmp/mnist_logs/run1
writer = tf.summary.FileWriter("/tmp/mnist_logs/run1")
tensorboard --logdir=/tmp/mnist_logs/run2
writer = tf.summary.FileWriter("/tmp/mnist_logs/run2")
tensorboard --logdir=/tmp/mnist_logs
Lab 9-3 (optional)
NN Backpropagation
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
How to train?
Gradient descent algorithm
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
Tensorflow
“Yes you should understand backprop”
https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b
• “If you try to ignore how it works under the hood because TensorFlow
automagically makes my networks learn”
- “You will not be ready to wrestle with the dangers it presents”
- “You will be much less effective at building and debugging neural networks.”
• “The good news is that backpropagation is not that difficult to understand”
- “if presented properly.”
Back propagation (chain rule)
http://cs231n.stanford.edu/
Logistic Regression Network
[Diagram: a0 → * (w) → + (b) → sigmoid → loss]
Network forward
(1) o = a0 * w   (2) l = o + b   (3) a1 = sigmoid(l)   (4) E = loss(a1, t)
Forward pass, OK? Just follow (1), (2), (3) and (4).
Let's do back propagation!
∂E/∂a1 will be given. What would be the derivatives further back, e.g. ∂E/∂w and ∂E/∂b? We can use the chain rule.
backward prop
In the same manner, we can get the back prop for (4), (3), (2) and (1)!
Gate derivatives
These derivatives for each gate will be given. We can just use them in the chain rule.
Derivatives (chain rule), Gate derivatives
Given from the pre-computed derivative: just apply them one by one and solve each derivative one by one!
Matrix
For Matrix: http://cs231n.github.io/optimization-2/#staged
Network update (learning rate, alpha)
[Same diagram; w and b are now updated using the derivatives, scaled by the learning rate alpha.]
Done! Let's update our network using the derivatives!
Derivatives (chain rule): backward prop
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1) # sigma prime
d_l = d_a1 * d_sigma # (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
tf.assign(W, W - learning_rate * d_W),
tf.assign(b, b - learning_rate * tf.reduce_sum(d_b))]
Derivatives (chain rule): backward prop, averaging over the batch
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1) # sigma prime
d_l = d_a1 * d_sigma # (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
tf.assign(W, W - learning_rate * d_W / N), # sample size
tf.assign(b, b - learning_rate * tf.reduce_mean(d_b))]
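Putting the pieces together: a minimal end-to-end sketch of this manual backprop for a single sigmoid layer (the XOR data and shapes are illustrative assumptions; the lab-09 backprop files in the repo are the full versions):
import numpy as np
import tensorflow as tf
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
N = x_data.shape[0]
a0 = tf.placeholder(tf.float32, shape=[None, 2])
t = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([2, 1]))
b = tf.Variable(tf.random_normal([1]))
# forward: (1) o = a0*w  (2) l = o + b  (3) a1 = sigmoid(l)
l = tf.matmul(a0, W) + b
a1 = tf.sigmoid(l)
# backward prop (chain rule), as on the slides
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1)  # sigma prime
d_l = d_a1 * d_sigma     # simplifies to (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
learning_rate = 0.01
train_step = [
    tf.assign(W, W - learning_rate * d_W / N),  # sample size
    tf.assign(b, b - learning_rate * tf.reduce_mean(d_b))]
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(1000):
    sess.run(train_step, feed_dict={a0: x_data, t: y_data})
print(sess.run(a1, feed_dict={a0: x_data}))  # stays near 0.5: one layer cannot fit XOR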
Exercise
● See more backprop code samples at
https://github.com/hunkim/DeepLearningZeroToAll
● https://github.com/hunkim/DeepLearningZeroToAll/blob/mast
er/lab-09-7-sigmoid_back_prop.py
● Solve XOR using NN backprop
Lab 10
NN, ReLu, Xavier, Dropout, and Adam
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Softmax classifier for MNIST
# weights & bias for nn layers
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(X, W) + b
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# train my model
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
feed_dict = {X: batch_xs, Y: batch_ys}
c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
Epoch: 0001 cost = 5.888845987
Epoch: 0002 cost = 1.860620173
Epoch: 0003 cost = 1.159035648
Epoch: 0004 cost = 0.892340870
Epoch: 0005 cost = 0.751155428
Epoch: 0006 cost = 0.662484806
Epoch: 0007 cost = 0.601544010
Epoch: 0008 cost = 0.556526115
Epoch: 0009 cost = 0.521186961
Epoch: 0010 cost = 0.493068354
Epoch: 0011 cost = 0.469686249
Epoch: 0012 cost = 0.449967254
Epoch: 0013 cost = 0.433519321
Epoch: 0014 cost = 0.419000337
Epoch: 0015 cost = 0.406490815
Learning Finished!
Accuracy: 0.9035
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-1-mnist_softmax.py
NN for MNIST
# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])
# weights & bias for nn layers
W1 = tf.Variable(tf.random_normal([784, 256]))
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([256, 256]))
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
W3 = tf.Variable(tf.random_normal([256, 10]))
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Epoch: 0001 cost = 141.207671860
Epoch: 0002 cost = 38.788445864
Epoch: 0003 cost = 23.977515479
Epoch: 0004 cost = 16.315132428
Epoch: 0005 cost = 11.702554882
Epoch: 0006 cost = 8.573139748
Epoch: 0007 cost = 6.370995680
Epoch: 0008 cost = 4.537178684
Epoch: 0009 cost = 3.216900532
Epoch: 0010 cost = 2.329708954
Epoch: 0011 cost = 1.715552875
Epoch: 0012 cost = 1.189857912
Epoch: 0013 cost = 0.820965160
Epoch: 0014 cost = 0.624131458
Epoch: 0015 cost = 0.454633765
Learning Finished!
Accuracy: 0.9455
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-2-mnist_nn.py
http://stackoverflow.com/questions/33640581/how-to-do-xavier-initialization-on-tensorflow
Xavier for MNIST
# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])
# weights & bias for nn layers
# http://stackoverflow.com/questions/33640581
W1 = tf.get_variable("W1", shape=[784, 256],
initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
W2 = tf.get_variable("W2", shape=[256, 256],
initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
W3 = tf.get_variable("W3", shape=[256, 10],
initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Epoch: 0001 cost = 0.301498963
Epoch: 0002 cost = 0.107252513
Epoch: 0003 cost = 0.064888892
Epoch: 0004 cost = 0.044463030
Epoch: 0005 cost = 0.029951642
Epoch: 0006 cost = 0.020663404
Epoch: 0007 cost = 0.015853033
Epoch: 0008 cost = 0.011764387
Epoch: 0009 cost = 0.008598264
Epoch: 0010 cost = 0.007383116
Epoch: 0011 cost = 0.006839140
Epoch: 0012 cost = 0.004672963
Epoch: 0013 cost = 0.003979437
Epoch: 0014 cost = 0.002714260
Epoch: 0015 cost = 0.004707661
Learning Finished!
Accuracy: 0.9783 (xavier)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-3-mnist_nn_xavier.py
Epoch: 0001 cost = 141.207671860
Epoch: 0002 cost = 38.788445864
Epoch: 0003 cost = 23.977515479
Epoch: 0004 cost = 16.315132428
Epoch: 0005 cost = 11.702554882
Epoch: 0006 cost = 8.573139748
Epoch: 0007 cost = 6.370995680
Epoch: 0008 cost = 4.537178684
Epoch: 0009 cost = 3.216900532
Epoch: 0010 cost = 2.329708954
Epoch: 0011 cost = 1.715552875
Epoch: 0012 cost = 1.189857912
Epoch: 0013 cost = 0.820965160
Epoch: 0014 cost = 0.624131458
Epoch: 0015 cost = 0.454633765
Learning Finished!
Accuracy: 0.9455 (normal dist)
Deep NN for MNIST
W1 = tf.get_variable("W1", shape=[784, 512],
initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
W2 = tf.get_variable("W2", shape=[512, 512],
initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
W3 = tf.get_variable("W3", shape=[512, 512],
initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([512]))
L3 = tf.nn.relu(tf.matmul(L2, W3) + b3)
W4 = tf.get_variable("W4", shape=[512, 512],
initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([512]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
W5 = tf.get_variable("W5", shape=[512, 10],
initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Epoch: 0001 cost = 0.266061549
Epoch: 0002 cost = 0.080796588
Epoch: 0003 cost = 0.049075800
Epoch: 0004 cost = 0.034772298
Epoch: 0005 cost = 0.024780529
Epoch: 0006 cost = 0.017072763
Epoch: 0007 cost = 0.014031383
Epoch: 0008 cost = 0.013763446
Epoch: 0009 cost = 0.009164047
Epoch: 0010 cost = 0.008291388
Epoch: 0011 cost = 0.007319742
Epoch: 0012 cost = 0.006434021
Epoch: 0013 cost = 0.005684378
Epoch: 0014 cost = 0.004781207
Epoch: 0015 cost = 0.004342310
Learning Finished!
Accuracy: 0.9742
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-4-mnist_nn_deep.py
Dropout for MNIST
# dropout (keep_prob) rate 0.7 on training, but should be 1 for testing
keep_prob = tf.placeholder(tf.float32)
W1 = tf.get_variable("W1", shape=[784, 512])
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
W2 = tf.get_variable("W2", shape=[512, 512])
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)
…
# train my model
for epoch in range(training_epochs):
...
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
feed_dict = {X: batch_xs, Y: batch_ys, keep_prob: 0.7}
c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
avg_cost += c / total_batch
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))
Epoch: 0001 cost = 0.447322626
Epoch: 0002 cost = 0.157285590
Epoch: 0003 cost = 0.121884535
Epoch: 0004 cost = 0.098128681
Epoch: 0005 cost = 0.082901778
Epoch: 0006 cost = 0.075337573
Epoch: 0007 cost = 0.069752543
Epoch: 0008 cost = 0.060884363
Epoch: 0009 cost = 0.055276413
Epoch: 0010 cost = 0.054631256
Epoch: 0011 cost = 0.049675195
Epoch: 0012 cost = 0.049125314
Epoch: 0013 cost = 0.047231930
Epoch: 0014 cost = 0.041290121
Epoch: 0015 cost = 0.043621063
Learning Finished!
Accuracy: 0.9804!!
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-5-mnist_nn_dropout.py
Optimizers
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
https://www.tensorflow.org/api_guides/python/train
Optimizers
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
● tf.train.AdadeltaOptimizer
● tf.train.AdagradOptimizer
● tf.train.AdagradDAOptimizer
● tf.train.MomentumOptimizer
● tf.train.AdamOptimizer
● tf.train.FtrlOptimizer
● tf.train.ProximalGradientDescentOptimizer
● tf.train.ProximalAdagradOptimizer
● tf.train.RMSPropOptimizer
https://www.tensorflow.org/api_guides/python/train
http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html
ADAM: a method for stochastic optimization
[Kingma et al. 2015]
Use Adam Optimizer
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Summary
●Softmax VS Neural Nets for MNIST, 90% and 94.5%
●Xavier initialization: 97.8%
●Deep Neural Nets with Dropout: 98%
●Adam and other optimizers
●Exercise: Batch Normalization
- https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-6-mnist_nn_batchnorm.ipynb
Lecture and Lab 11
CNN
Sung Kim <hunkim+ml@gmail.com>
http://hunkim.github.io/ml/
Lab 11-1
CNN Basics
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
CNN
http://parse.ele.tue.nl/cluster/2/CNNArchitecture.jpg
CNN for CT images
Asan Medical Center & Microsoft Medical Bigdata Contest Winner by GeunYoung Lee and Alex Kim
https://www.slideshare.net/GYLee3/ss-72966495
Convolution layer and max pooling
Simple convolution layer
Toy image: 3x3x1, filter: 2x2x1, stride: 1x1
Simple convolution layer
Image: 1,3,3,1 image, Filter: 2,2,1,1, Stride: 1x1, Padding: VALID
[Toy image pixel values 1..9; filter weights all 1:
[[[[1.]],[[1.]]],
[[[1.]],[[1.]]]]
shape=(2,2,1,1)]
Image: 1,3,3,1 image, Filter: 2,2,1,1, Stride: 1x1, Padding: VALID
[Sliding the all-ones 2x2 filter over the toy image gives a 2x2 output: [[12, 16], [24, 28]].]
Simple convolution layer
Image: 1,3,3,1 image, Filter: 2,2,1,1, Stride: 1x1, Padding: SAME
[With SAME padding, zeros are padded around the image so the output keeps the 3x3 input size.]
Max Pooling
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
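A runnable sketch of the toy example above (3x3 image with values 1..9, 2x2 all-ones filter, NHWC layout), plus max pooling:
import numpy as np
import tensorflow as tf
sess = tf.Session()
image = np.array([[[[1], [2], [3]],
                   [[4], [5], [6]],
                   [[7], [8], [9]]]], dtype=np.float32)  # shape (1, 3, 3, 1)
weight = tf.constant([[[[1.]], [[1.]]],
                      [[[1.]], [[1.]]]])                 # shape (2, 2, 1, 1)
conv_valid = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='VALID')
print(sess.run(conv_valid).reshape(2, 2))  # [[12. 16.] [24. 28.]]
conv_same = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='SAME')
print(sess.run(conv_same).shape)           # (1, 3, 3, 1): output keeps the input size
pool = tf.nn.max_pool(image, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='VALID')
print(sess.run(pool).reshape(2, 2))        # [[5. 6.] [8. 9.]]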
MNIST image loading
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
MNIST Convolution layer
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
MNIST Max pooling
Lab 11-2
CNN MNIST: 99%!
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
CNN
http://parse.ele.tue.nl/cluster/2/CNNArchitecture.jpg
Simple CNN
# input placeholders
X = tf.placeholder(tf.float32, [None, 784])
X_img = tf.reshape(X, [-1, 28, 28, 1]) # img 28x28x1 (black/white)
Y = tf.placeholder(tf.float32, [None, 10])
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
'''
Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
'''
Conv layer 1
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
'''
Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
'''
# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
# Conv ->(?, 14, 14, 64)
# Pool ->(?, 7, 7, 64)
L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME')
L2 = tf.nn.relu(L2)
L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L2 = tf.reshape(L2, [-1, 7 * 7 * 64])
'''
Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32)
Conv layer 2
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
'''
Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32)
'''
L2 = tf.reshape(L2, [-1, 7 * 7 * 64])
# Final FC 7x7x64 inputs -> 10 outputs
W3 = tf.get_variable("W3", shape=[7 * 7 * 64, 10],
initializer=tf.contrib.layers.xavier_initializer())
b = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Fully Connected (FC, Dense) layer
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
Training and Evaluation
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# train my model
print('Learning started. It takes some time.')
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
feed_dict = {X: batch_xs, Y: batch_ys}
c, _, = sess.run([cost, optimizer], feed_dict=feed_dict)
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
Epoch: 0001 cost = 0.340291267
Epoch: 0002 cost = 0.090731326
Epoch: 0003 cost = 0.064477619
Epoch: 0004 cost = 0.050683064
...
Epoch: 0011 cost = 0.017758641
Epoch: 0012 cost = 0.014156652
Epoch: 0013 cost = 0.012397016
Epoch: 0014 cost = 0.010693789
Epoch: 0015 cost = 0.009469977
Learning Finished!
Accuracy: 0.9885
Deep CNN
Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
# L3 ImgIn shape=(?, 7, 7, 64)
W3 = tf.Variable(tf.random_normal([3, 3, 64, 128], stddev=0.01))
# Conv ->(?, 7, 7, 128)
# Pool ->(?, 4, 4, 128)
# Reshape ->(?, 4 * 4 * 128) # Flatten them for FC
L3 = tf.nn.conv2d(L2, W3, strides=[1, 1, 1, 1], padding='SAME')
L3 = tf.nn.relu(L3)
L3 = tf.nn.max_pool(L3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
padding='SAME')
L3 = tf.nn.dropout(L3, keep_prob=keep_prob)
L3 = tf.reshape(L3, [-1, 128 * 4 * 4])
'''Tensor("Conv2D_2:0", shape=(?, 7, 7, 128), dtype=float32)
Tensor("Relu_2:0", shape=(?, 7, 7, 128), dtype=float32)
Tensor("MaxPool_2:0", shape=(?, 4, 4, 128), dtype=float32)
Tensor("dropout_2/mul:0", shape=(?, 4, 4, 128), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 2048), dtype=float32)'''
# L4 FC 4x4x128 inputs -> 625 outputs
W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625],
initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([625]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
L4 = tf.nn.dropout(L4, keep_prob=keep_prob)
'''Tensor("Relu_3:0", shape=(?, 625), dtype=float32)
Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)'''
# L5 Final FC 625 inputs -> 10 outputs
W5 = tf.get_variable("W5", shape=[625, 10],
initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
'''Tensor("add_1:0", shape=(?, 10), dtype=float32)'''
Deep CNN
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
'''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)'''
# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
# Conv ->(?, 14, 14, 64)
# Pool ->(?, 7, 7, 64)
L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME')
L2 = tf.nn.relu(L2)
L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)
'''Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("dropout_1/mul:0", shape=(?, 7, 7, 64), dtype=float32)'''
Deep CNN
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
    X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))
Epoch: 0013 cost = 0.027188021
Epoch: 0014 cost = 0.023604777
Epoch: 0015 cost = 0.024607201
Learning Finished!
Accuracy: 0.9938
Lab 11-3
Class, Layers, Ensemble
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
CNN
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
'''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)'''
...
...
# L4 FC 4x4x128 inputs -> 625 outputs
W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625],
initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([625]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
L4 = tf.nn.dropout(L4, keep_prob=keep_prob)
'''Tensor("Relu_3:0", shape=(?, 625), dtype=float32)
Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)'''
# L5 Final FC 625 inputs -> 10 outputs
W5 = tf.get_variable("W5", shape=[625, 10],
initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
'''Tensor("add_1:0", shape=(?, 10), dtype=float32)'''
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1),
tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy,
feed_dict={X: mnist.test.images,
Y: mnist.test.labels, keep_prob: 1}))
Epoch: 0013 cost = 0.027188021
Epoch: 0014 cost = 0.023604777
Epoch: 0015 cost = 0.024607201
Learning Finished!
Accuracy: 0.9938
Python Class
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-3-mnist_cnn_class.py
class Model:
def __init__(self, sess, name):
self.sess = sess
self.name = name
self._build_net()
def _build_net(self):
with tf.variable_scope(self.name):
# input placeholders
self.X = tf.placeholder(tf.float32, [None, 784])
# img 28x28x1 (black/white)
X_img = tf.reshape(self.X, [-1, 28, 28, 1])
self.Y = tf.placeholder(tf.float32, [None, 10])
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32],
stddev=0.01))
...
def predict(self, x_test, keep_prop=1.0):
return self.sess.run(self.logits,
feed_dict={self.X: x_test, self.keep_prob: keep_prop})
def get_accuracy(self, x_test, y_test, keep_prop=1.0):
return self.sess.run(self.accuracy,
feed_dict={self.X: x_test, self.Y: y_test, self.keep_prob: keep_prop})
def train(self, x_data, y_data, keep_prop=0.7):
return self.sess.run([self.cost, self.optimizer], feed_dict={
self.X: x_data, self.Y: y_data, self.keep_prob: keep_prop})
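The slide elides the rest of _build_net; a minimal sketch of a plausible tail, inferred from the predict/get_accuracy/train methods above (the stand-in dense layer and the 0.001 learning rate are assumptions, not the repo's code):
# (sketch) stand-in tail for the elided layers inside _build_net
self.keep_prob = tf.placeholder(tf.float32)
flat = tf.reshape(X_img, [-1, 28 * 28]) # stand-in for the conv stack
self.logits = tf.layers.dense(flat, 10) # 10 MNIST classes
self.cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=self.logits, labels=self.Y))
self.optimizer = tf.train.AdamOptimizer(0.001).minimize(self.cost)
correct = tf.equal(tf.argmax(self.logits, 1), tf.argmax(self.Y, 1))
self.accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))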
# initialize
sess = tf.Session()
m1 = Model(sess, "m1")
sess.run(tf.global_variables_initializer())
print('Learning Started!')
# train my model
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
c, _ = m1.train(batch_xs, batch_ys)
avg_cost += c / total_batch
tf.layers
https://www.tensorflow.org/api_docs/python/tf/layers
tf.layers
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-4-mnist_cnn_layers.py
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L1 = tf.nn.dropout(L1, keep_prob=self.keep_prob)
…
# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
# Convolutional Layer #1
conv1 = tf.layers.conv2d(inputs=X_img, filters=32, kernel_size=[3, 3],
                         padding="SAME", activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2],
                                padding="SAME", strides=2)
# note: tf.layers.dropout's rate is the drop probability (1 - keep_prob)
dropout1 = tf.layers.dropout(inputs=pool1, rate=0.7, training=self.training)
# Convolutional Layer #2
conv2 = tf.layers.conv2d(inputs=dropout1, filters=64, kernel_size=[3, 3],
                         padding="SAME", activation=tf.nn.relu)
…
flat = tf.reshape(dropout3, [-1, 128 * 4 * 4])
dense4 = tf.layers.dense(inputs=flat, units=625, activation=tf.nn.relu)
dropout4 = tf.layers.dropout(inputs=dense4, rate=0.5, training=self.training)
...
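The classifier head is elided on the slide; with tf.layers it would plausibly end like this (the logits name is an assumption):
# (sketch) final FC layer mapping 625 features to the 10 MNIST classes
self.logits = tf.layers.dense(inputs=dropout4, units=10)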
Ensemble
http://rasbt.github.io/mlxtend/user_guide/classifier/StackingClassifier/
models = []
num_models = 7
for m in range(num_models):
models.append(Model(sess, "model" + str(m)))
sess.run(tf.global_variables_initializer())
print('Learning Started!')
# train my model
for epoch in range(training_epochs):
avg_cost_list = np.zeros(len(models))
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
# train each model
for m_idx, m in enumerate(models):
c, _ = m.train(batch_xs, batch_ys)
avg_cost_list[m_idx] += c / total_batch
print('Epoch:','%04d'%(epoch + 1),'cost =', avg_cost_list)
print('Learning Finished!')
class Model:
def __init__(self, sess, name):
self.sess = sess
self.name = name
self._build_net()
def _build_net(self):
with tf.variable_scope(self.name):
...
Ensemble training
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
Ensemble prediction
Ensemble prediction
Per-model softmax outputs over classes 0–9 (first four classes shown):

          0     1     2     3    ...
model 1:  0.1   0.01  0.02  0.8  ...
model 2:  0.01  0.5   0.02  0.4  ...
model 3:  0.01  0.01  0.1   0.7  ...
   ...
Sum:      0.12  0.52  0.14  1.9  ...

argmax over the summed row gives the ensemble prediction (class 3 here).
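In code, the sum-then-argmax step is just elementwise addition of the per-model probability vectors; a toy numpy illustration of the table above:
import numpy as np
p1 = np.array([0.10, 0.01, 0.02, 0.80]) # model 1 (classes 0..3 shown)
p2 = np.array([0.01, 0.50, 0.02, 0.40]) # model 2
p3 = np.array([0.01, 0.01, 0.10, 0.70]) # model 3
summed = p1 + p2 + p3                   # [0.12, 0.52, 0.14, 1.90]
print(np.argmax(summed))                # 3 -> the ensemble predicts class 3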
# Test model and check accuracy
test_size = len(mnist.test.labels)
predictions = np.zeros(test_size * 10).reshape(test_size, 10)
for m_idx, m in enumerate(models):
print(m_idx, 'Accuracy:', m.get_accuracy(mnist.test.images, mnist.test.labels))
p = m.predict(mnist.test.images)
predictions += p
ensemble_correct_prediction = tf.equal(
tf.argmax(predictions, 1), tf.argmax(mnist.test.labels, 1))
ensemble_accuracy = tf.reduce_mean(
tf.cast(ensemble_correct_prediction, tf.float32))
print('Ensemble accuracy:', sess.run(ensemble_accuracy))
Ensemble prediction
0 Accuracy: 0.9933
1 Accuracy: 0.9946
2 Accuracy: 0.9934
3 Accuracy: 0.9935
4 Accuracy: 0.9935
5 Accuracy: 0.9949
6 Accuracy: 0.9941
Ensemble accuracy: 0.9952
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
Exercise
● Deep & Wide?
● CIFAR 10
● ImageNet
Lab 12
RNN
Sung Kim <hunkim+ml@gmail.com>
http://hunkim.github.io/ml/
Lab 12
RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
Call for comments
Please feel free to add comments directly on these slides
Other slides: https://goo.gl/jPtWNt
Picture from http://www.tssablog.org/archives/3280
Lab 12-1
RNN Basics
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
RNN in TensorFlow
cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size)
...
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
RNN in TensorFlow
cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size)
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size)
...
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
One node: 4 (input-dim) in 2 (hidden_size)
One node: 4 (input-dim) in 2 (hidden_size)
# One cell RNN input_dim (4) -> output_dim (2)
hidden_size = 2
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size)
x_data = np.array([[[1,0,0,0]]], dtype=np.float32)
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
array([[[-0.42409304, 0.64651132]]])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
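(The exact output values vary from run to run, since the cell's weights are randomly initialized.)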
One node: 4 (input-dim) in 2 (hidden_size)
Unfolding to n sequences
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
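The h, e, l, o used in x_data below are one-hot rows defined earlier in the notebook; a minimal sketch consistent with the printed arrays:
# one-hot rows for the 4-symbol vocabulary (h, e, l, o)
h = [1, 0, 0, 0]
e = [0, 1, 0, 0]
l = [0, 0, 1, 0]
o = [0, 0, 0, 1]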
Unfolding to n sequences
# One cell RNN input_dim (4) -> output_dim (2). sequence: 5
hidden_size = 2
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size)
x_data = np.array([[h, e, l, l, o]], dtype=np.float32)
print(x_data.shape)
pp.pprint(x_data)
outputs, states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
X_data = array
([[[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]]], dtype=float32)
Outputs = array
([[[ 0.19709368, 0.24918222],
[-0.11721198, 0.1784237 ],
[-0.35297349, -0.66278851],
[-0.70915914, -0.58334434],
[-0.38886023, 0.47304463]]], dtype=float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
Batching input
Hidden_size=2
sequence_length=5
batch_size=3
Batching input
# One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3
# 3 batches 'hello', 'eolll', 'lleel'
x_data = np.array([[h, e, l, l, o],
[e, o, l, l, l],
[l, l, e, e, l]], dtype=np.float32)
pp.pprint(x_data)
cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, x_data,
dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
array([[[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]],
[[ 0., 1., 0., 0.],
[ 0., 0., 0., 1.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.]],
[[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 1., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.]]],
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
batch_size=3
Batching input
# One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3
# 3 batches 'hello', 'eolll', 'lleel'
x_data = np.array([[h, e, l, l, o],
[e, o, l, l, l],
[l, l, e, e, l]], dtype=np.float32)
pp.pprint(x_data)
cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, x_data,
dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
array([[[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]],
[[ 0., 1., 0., 0.],
[ 0., 0., 0., 1.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.]],
[[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 1., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.]]],
array([[[-0.0173022 , -0.12929453],
[-0.14995177, -0.23189341],
[ 0.03294011, 0.01962204],
[ 0.12852104, 0.12375218],
[ 0.13597946, 0.31746736]],
[[-0.15243632, -0.14177315],
[ 0.04586344, 0.12249056],
[ 0.14292534, 0.15872268],
[ 0.18998367, 0.21004884],
[ 0.21788891, 0.24151592]],
[[ 0.10713603, 0.11001928],
[ 0.17076059, 0.1799853 ],
[-0.03531617, 0.08993293],
[-0.1881337 , -0.08296411],
[-0.00404597, 0.07156041]]],
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
batch_size=3
Lab 12-2
Hi Hello RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Teach RNN ‘hihello’
[Figure: RNN unrolled over 'hihello' — each step reads a character of "hihell" and predicts the next character of "ihello"]
One-hot encoding
[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 0, 1], # o 4
● text: ‘hihello’
● unique chars (vocabulary, voc):
h, i, e, l, o
● voc index:
h:0, i:1, e:2, l:3, o:4
[Figure: the same unrolled RNN with one-hot inputs below each step]
[1, 0, 0, 0, 0] [0, 1, 0, 0, 0] [1, 0, 0, 0, 0] [0, 0, 1, 0, 0] [0, 0, 0, 1, 0] [0, 0, 0, 1, 0] # input: h i h e l l
[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 0, 1], # o 4
Teach RNN ‘hihello’
[0, 1, 0, 0, 0] [1, 0, 0, 0, 0] [0, 0, 1, 0, 0] [0, 0, 0, 1, 0] [0, 0, 0, 1, 0] [0, 0, 0, 0, 1] # target: i h e l l o
[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 0, 1], # o 4
Teach RNN ‘hihello’
Creating rnn cell
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
rnn_cell = rnn_cell.BasicLSTMCell(rnn_size)
rnn_cell = rnn_cell.GRUCell(rnn_size)
Creating rnn cell
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
rnn_cell = rnn_cell.BasicLSTMCell(rnn_size)
rnn_cell = rnn_cell.GRUCell(rnn_size)
Execute RNN
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
outputs, _states = tf.nn.dynamic_rnn(
rnn_cell,
X,
initial_state=initial_state,
dtype=tf.float32)
(rnn_size is the hidden size of the RNN)
RNN parameters
hidden_size = 5 # output from the LSTM
input_dim = 5 # one-hot size
batch_size = 1 # one sentence
sequence_length = 6 # |ihello| == 6
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Data creation
idx2char = ['h', 'i', 'e', 'l', 'o'] # h=0, i=1, e=2, l=3, o=4
x_data = [[0, 1, 0, 2, 3, 3]] # hihell
x_one_hot = [[[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[1, 0, 0, 0, 0], # h 0
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 1, 0]]] # l 3
y_data = [[1, 0, 2, 3, 3, 4]] # ihello
X = tf.placeholder(tf.float32,
[None, sequence_length, input_dim]) # X one-hot
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Feed to RNN
X = tf.placeholder(
tf.float32, [None, sequence_length, hidden_size]) # X one-hot
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size,
state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
cell, X, initial_state=initial_state, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
x_one_hot = [[[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[1, 0, 0, 0, 0], # h 0
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 1, 0]]] # l 3
y_data = [[1, 0, 2, 3, 3, 4]] # ihello
Cost: sequence_loss
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
# [batch_size, sequence_length]
y_data = tf.constant([[1, 1, 1]])
# [batch_size, sequence_length, emb_dim ]
prediction1 = tf.constant([[[0.3, 0.7], [0.3, 0.7], [0.3, 0.7]]],
dtype=tf.float32)
prediction2 = tf.constant([[[0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]],
dtype=tf.float32)
# [batch_size, sequence_length]
weights = tf.constant([[1, 1, 1]], dtype=tf.float32)
sequence_loss1 = tf.contrib.seq2seq.sequence_loss(prediction1, y_data,
weights)
sequence_loss2 = tf.contrib.seq2seq.sequence_loss(prediction2, y_data,
weights)
sess.run(tf.global_variables_initializer())
print("Loss1: ", sequence_loss1.eval(),
"Loss2: ", sequence_loss2.eval())
Cost: sequence_loss
Loss1: 0.513015 Loss2: 0.371101
(prediction2 puts more probability on the true class 1, so its loss is lower)
Cost: sequence_loss
outputs, _states = tf.nn.dynamic_rnn(
cell, X, initial_state=initial_state, dtype=tf.float32)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.AdamOptimizer(learning_rate=0.1).minimize(loss)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Training
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(2000):
l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
result = sess.run(prediction, feed_dict={X: x_one_hot})
print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)
# print char using dic
result_str = [idx2char[c] for c in np.squeeze(result)]
print("tPrediction str: ", ''.join(result_str))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Results
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(2000):
l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
result = sess.run(prediction, feed_dict={X: x_one_hot})
print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)
# print char using dic
result_str = [idx2char[c] for c in np.squeeze(result)]
print("tPrediction str: ", ''.join(result_str))
0 loss: 1.55474 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
1 loss: 1.55081 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
2 loss: 1.54704 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
3 loss: 1.54342 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
...
1998 loss: 0.75305 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
1999 loss: 0.752973 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Lab 12-3
RNN with long sequences
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Manual data creation
idx2char = ['h', 'i', 'e', 'l', 'o']
x_data = [[0, 1, 0, 2, 3, 3]] # hihell
x_one_hot = [[[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[1, 0, 0, 0, 0], # h 0
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 1, 0]]] # l 3
y_data = [[1, 0, 2, 3, 3, 4]] # ihello
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Better data creation
sample = " if you want you"
idx2char = list(set(sample)) # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx
sample_idx = [char2idx[c] for c in sample] # char to index
x_data = [sample_idx[:-1]] # X data sample (0 ~ n-1) hello: hell
y_data = [sample_idx[1:]] # Y label sample (1 ~ n) hello: ello
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Hyper parameters
sample = " if you want you"
idx2char = list(set(sample)) # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx
# hyper parameters
dic_size = len(char2idx) # RNN input size (one hot size)
rnn_hidden_size = len(char2idx) # RNN output size
num_classes = len(char2idx) # final output size (RNN or softmax, etc.)
batch_size = 1 # one sample data, one batch
sequence_length = len(sample) - 1 # number of lstm unfolding (unit #)
LSTM and Loss
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
cell = tf.contrib.rnn.BasicLSTMCell(num_units=rnn_hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
cell, X_one_hot, initial_state=initial_state, dtype=tf.float32)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)
prediction = tf.argmax(outputs, axis=2)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Training and Results
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(3000):
l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data})
result = sess.run(prediction, feed_dict={X: x_data})
# print char using dic
result_str = [idx2char[c] for c in np.squeeze(result)]
print(i, "loss:", l, "Prediction:", ''.join(result_str))
0 loss: 2.29895 Prediction: nnuffuunnuuuyuy
1 loss: 2.29675 Prediction: nnuffuunnuuuyuy
...
1418 loss: 1.37351 Prediction: if you want you
1419 loss: 1.37331 Prediction: if you want you
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Really long sentence?
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Really long sentence?
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Making dataset
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
dataX = []
dataY = []
for i in range(0, len(sentence) - seq_length):
x_str = sentence[i:i + seq_length]
y_str = sentence[i + 1: i + seq_length + 1]
print(i, x_str, '->', y_str)
x = [char_dic[c] for c in x_str] # x str to index
y = [char_dic[c] for c in y_str] # y str to index
dataX.append(x)
dataY.append(y)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
RNN parameters
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10 # Any arbitrary number
batch_size = len(dataX)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
LSTM and Loss
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
cell = tf.contrib.rnn.BasicLSTMCell(num_units=rnn_hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
cell, X_one_hot, initial_state=initial_state, dtype=tf.float32)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)
prediction = tf.argmax(outputs, axis=2)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Exercise
● Run long sequence RNN
● Why doesn't it work?
Lab 12-4
RNN with long sequences: Stacked
RNN + Softmax layer
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Really long sentence?
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Making dataset
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
dataX = []
dataY = []
for i in range(0, len(sentence) - seq_length):
x_str = sentence[i:i + seq_length]
y_str = sentence[i + 1: i + seq_length + 1]
print(i, x_str, '->', y_str)
x = [char_dic[c] for c in x_str] # x str to index
y = [char_dic[c] for c in y_str] # y str to index
dataX.append(x)
dataY.append(y)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
RNN parameters
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10 # Any arbitrary number
batch_size = len(dataX)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Wide & Deep
https://www.tensorflow.org/versions/r0.11/tutorials/wide_and_deep/index.html
Stacked RNN
X = tf.placeholder(tf.int32, [None, seq_length])
Y = tf.placeholder(tf.int32, [None, seq_length])
# One-hot encoding
X_one_hot = tf.one_hot(X, num_classes)
print(X_one_hot) # check out the shape
# Make a lstm cell with hidden_size (each unit output vector size)
cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)
cell = rnn.MultiRNNCell([cell] * 2, state_is_tuple=True)
# outputs: unfolding size x hidden size, state = hidden size
outputs, _states = tf.nn.dynamic_rnn(cell, X_one_hot, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
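Note: MultiRNNCell([cell] * 2) stacks two LSTM layers. Reusing the same cell object this way worked in TF 1.0, but later TensorFlow versions require a separate cell instance per layer.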
Softmax (FC) in Deep CNN
Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
Softmax
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
X_for_softmax = tf.reshape(outputs,
[-1, hidden_size])
outputs = tf.reshape(outputs,
[batch_size, seq_length, num_classes])
Softmax
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Softmax
# (optional) softmax layer
X_for_softmax = tf.reshape(outputs, [-1, hidden_size])
softmax_w = tf.get_variable("softmax_w",
                            [hidden_size, num_classes])
softmax_b = tf.get_variable("softmax_b", [num_classes])
outputs = tf.matmul(X_for_softmax, softmax_w) + softmax_b
outputs = tf.reshape(outputs,
[batch_size, seq_length, num_classes])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
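Reshaping to [-1, hidden_size] stacks every (batch, time) step into rows so one shared softmax weight applies to all time steps; the second reshape restores [batch_size, seq_length, num_classes] for sequence_loss.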
Loss
# reshape out for sequence_loss
outputs = tf.reshape(outputs,
[batch_size, seq_length, num_classes])
# All weights are 1 (equal weights)
weights = tf.ones([batch_size, seq_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
logits=outputs, targets=Y, weights=weights)
mean_loss = tf.reduce_mean(sequence_loss)
train_op = tf.train.AdamOptimizer(learning_rate=0.1).minimize(mean_loss)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Training and print results
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(500):
_, l, results = sess.run(
[train_op, mean_loss, outputs],
feed_dict={X: dataX, Y: dataY})
for j, result in enumerate(results):
index = np.argmax(result, axis=1)
print(i, j, ''.join([char_set[t] for t in index]), l)
0 167 tttttttttt 3.23111
0 168 tttttttttt 3.23111
0 169 tttttttttt 3.23111
…
499 167 oof the se 0.229306
499 168 tf the sea 0.229306
499 169 n the sea. 0.229306
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
# Let's print the last char of each result to check it works
results = sess.run(outputs, feed_dict={X: dataX})
for j, result in enumerate(results):
index = np.argmax(result, axis=1)
if j == 0: # print all for the first result to make a sentence
print(''.join([char_set[t] for t in index]), end='')
else:
print(char_set[index[-1]], end='')
Training and print results
g you want to build a ship, don't drum up people together to collect wood and don't
assign them tasks and work, but rather teach them to long for the endless immensity
of the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
char-rnn
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
char/word rnn (char/word level n to n model)
https://github.com/sherjilozair/char-rnn-tensorflow
https://github.com/hunkim/word-rnn-tensorflow
Lab 12-5
Dynamic RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Different sequence length
h e l l o
h i
w h y
...
Different sequence length
h e l l o
h i <pad> <pad> <pad>
w h y <pad> <pad>
...
Different sequence length
h e l l o
h i
w h y
...
sequence_length=[5,2,3]
Dynamic RNN
# 3 batches 'hello', 'eolll', 'lleel'
x_data = np.array([[[...]]], dtype=np.float32)
hidden_size = 2
cell = rnn.BasicLSTMCell(num_units=hidden_size,
state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(
cell, x_data, sequence_length=[5,3,4],
dtype=tf.float32)
sess.run(tf.global_variables_initializer())
print(outputs.eval())
array([[[-0.17904168, -0.08053244],
[-0.01294809, 0.01660814],
[-0.05754048, -0.1368292 ],
[-0.08655578, -0.20553185],
[ 0.07297077, -0.21743253]],
[[ 0.10272847, 0.06519825],
[ 0.20188759, -0.05027055],
[ 0.09514933, -0.16452041],
[ 0. , 0. ],
[ 0. , 0. ]],
[[-0.04893036, -0.14655617],
[-0.07947272, -0.20996611],
[ 0.06466491, -0.02576563],
[ 0.15087658, 0.05166111],
[ 0. , 0. ]]],
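(dynamic_rnn zeroes the outputs past each sequence's given length, which is why the trailing rows above are [0., 0.])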
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Lab 12-6
RNN with time series data (stock)
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Time series data
Time series data
Open High Low Volume Close
828.659973 833.450012 828.349976 1247700 831.659973
823.02002 828.070007 821.655029 1597800 828.070007
819.929993 824.400024 818.97998 1281700 824.159973
819.359985 823 818.469971 1304000 818.97998
819 823 816 1053600 820.450012
816 820.958984 815.48999 1198100 819.23999
811.700012 815.25 809.780029 1129100 813.669983
809.51001 810.659973 804.539978 989700 809.559998
807 811.840027 803.190002 1155300 808.380005
'data-02-stock_daily.csv'
Many to one
[Figure: many-to-one RNN — seven input steps (1–7) predict the value at step 8]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
Open High Low Volume Close
828.659973 833.450012 828.349976 1247700 831.659973
823.02002 828.070007 821.655029 1597800 828.070007
819.929993 824.400024 818.97998 1281700 824.159973
819.359985 823 818.469971 1304000 818.97998
819 823 816 1053600 820.450012
816 820.958984 815.48999 1198100 819.23999
811.700012 815.25 809.780029 1129100 813.669983
809.51001 810.659973 804.539978 989700 ?
807 811.840027 803.190002 1155300 ?
Reading data
timesteps = seq_length = 7
data_dim = 5
output_dim = 1
# Open, High, Low, Volume, Close
xy = np.loadtxt('data-02-stock_daily.csv', delimiter=',')
xy = xy[::-1] # reverse order (chronologically ordered)
xy = MinMaxScaler(xy)
x = xy
y = xy[:, [-1]] # Close as label
dataX = []
dataY = []
for i in range(0, len(y) - seq_length):
_x = x[i:i + seq_length]
_y = y[i + seq_length] # Next close price
print(_x, "->", _y)
dataX.append(_x)
dataY.append(_y)
[[ 0.18667876  0.20948057  0.20878184  0.          0.21744815]
 [ 0.30697388  0.31463414  0.21899367  0.01247647  0.21698189]
 [ 0.21914211  0.26390721  0.2246864   0.45632338  0.22496747]
 [ 0.23312993  0.23641916  0.16268272  0.57017119  0.14744274]
 [ 0.13431201  0.15175877  0.11617252  0.39380658  0.13289962]
 [ 0.13973232  0.17060429  0.15860382  0.28173344  0.18171679]
 [ 0.18933069  0.20057799  0.19187983  0.29783096  0.2086465 ]]
-> [ 0.14106001]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
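MinMaxScaler here is a small helper defined in the lab file, not sklearn's class; a minimal sketch consistent with this usage (column-wise scaling to [0, 1]):
import numpy as np

def MinMaxScaler(data):
    # scale each column to [0, 1]; the tiny epsilon avoids division by zero
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    return numerator / (denominator + 1e-7)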
Training and test datasets
# split to train and testing
train_size = int(len(dataY) * 0.7)
test_size = len(dataY) - train_size
trainX, testX = (np.array(dataX[0:train_size]),
                 np.array(dataX[train_size:len(dataX)]))
trainY, testY = (np.array(dataY[0:train_size]),
                 np.array(dataY[train_size:len(dataY)]))
# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
LSTM and Loss
# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_dim, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
Y_pred = tf.contrib.layers.fully_connected(
outputs[:, -1], output_dim, activation_fn=None)
# We use the last cell's output
# cost/loss
loss = tf.reduce_sum(tf.square(Y_pred - Y)) # sum of the squares
# optimizer
optimizer = tf.train.AdamOptimizer(0.01)
train = optimizer.minimize(loss)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
Training and Results
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(1000):
_, l = sess.run([train, loss],
feed_dict={X: trainX, Y: trainY})
print(i, l)
testPredict = sess.run(Y_pred, feed_dict={X: testX})
import matplotlib.pyplot as plt
plt.plot(testY)
plt.plot(testPredict)
plt.show()
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
Exercise
● Implement stock prediction using linear regression only
● Improve results using more features such as keywords and/or
sentiments in top news
Other RNN applications
● Language Modeling
● Speech Recognition
● Machine Translation
● Conversation Modeling/Question Answering
● Image/Video Captioning
● Image/Music/Dance Generation
http://jiwonkim.org/awesome-rnn/
Google Cloud ML
Examples
Sung Kim <hunkim+ml@gmail.com>
https://github.com/hunkim/GoogleCloudMLExamples
Local TensorFlow Tasks
[Diagram: a TensorFlow task reading and writing the local disk]
Cloud ML TensorFlow Tasks
[Diagram: the same TensorFlow task running inside Google Cloud ML]
Setup your environment
https://cloud.google.com/ml/docs/how-tos/getting-set-up
Google Cloud Console
https://console.cloud.google.com/
https://cloud.google.com/ml/docs/how-tos/getting-set-up
Google Cloud commands
• gcloud: command-line interface to Google Cloud Platform
- Google Cloud ML jobs (`gcloud beta ml`)
- Google Compute Engine virtual machine instances and other resources
- Google Cloud Dataproc clusters and jobs
- Google Cloud Deployment manager deployments
- …
• gsutil: command-line interface to Google Cloud Storage
https://cloud.google.com/sdk/gcloud/
https://cloud.google.com/storage/docs/gsutil
Example
Example git repository
git clone https://github.com/hunkim/GoogleCloudMLExamples.git
Simple Multiplication
Run locally
Run on Cloud ML
Machine Learning Console
Jobs
Jobs/Task
Jobs/task7/logs
Input Example
https://www.tensorflow.org/versions/r0.11/how_tos/reading_data/index.html
CSV File
Reading
Run locally
Cloud ML TensorFlow Tasks
[Diagram: the TensorFlow task running inside Google Cloud ML]
Setting and file copy
JOB_NAME="task9"
PROJECT_ID=`gcloud config list project --format "value(core.project)"`
STAGING_BUCKET=gs://${PROJECT_ID}-ml
INPUT_PATH=${STAGING_BUCKET}/input
gsutil cp input/input.csv $INPUT_PATH/input.csv
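Submitting the packaged trainer used the era's beta command; a sketch only — the package path, module name, and region are assumptions, and the exact flags may differ across SDK versions:
gcloud beta ml jobs submit training ${JOB_NAME} \
  --package-path=trainer \
  --module-name=trainer.task \
  --staging-bucket=${STAGING_BUCKET} \
  --region=us-central1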
Google Storage
Run on Cloud ML
Jobs
Logs
Output Example
TensorFlow Saver
Local Run
Configuration
Create/Check the output folder
Run on Cloud ML
Job completed
Generated checkpoint files
With Great Power Comes Great Responsibility
Check your bills!
Next
• Cloud ML deploy
• Hyper-parameter tuning
• Distributed training tasks
Tensor flow description of ML Lab. document

  • 1. Lab 1 TensorFlow Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 2. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 3. Lab 1 TensorFlow Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 5. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 6. Lab 1 TensorFlow Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 10. TensorFlow ● TensorFlow™ is an open source software library for numerical computation using data flow graphs. ● Python! https://www.tensorflow.org/
  • 11. What is a Data Flow Graph? ● Nodes in the graph represent mathematical operations ● Edges represent the multidimensional data arrays (tensors) communicated between them. https://www.tensorflow.org/
  • 12. Installing TensorFlow ● Linux, Max OSX, Windows • (sudo -H) pip install --upgrade tensorflow • (sudo -H) pip install --upgrade tensorflow-gpu ● From source • bazel ... • https://www.tensorflow.org/install/install_sources ● Google search/Community help • https://www.facebook.com/groups/TensorFlowKR/ https://www.tensorflow.org/install/
  • 13. Check installation and version Sungs-MacBook-Pro:hunkim$ python3 Python 3.6.0 (v3.6.0:41df79263a11, Dec 22 2016, 17:23:13) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import tensorflow as tf >>> tf.__version__ '1.0.0' >>>
  • 15. TensorFlow Hello World! https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb b’String’ ‘b’ indicates Bytes literals. http://stackoverflow.com/questions/6269765/
  • 17. TensorFlow Mechanics feed data and run graph (operation) sess.run (op) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 18. Computational Graph https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb (1) Build graph (tensors) using TensorFlow operations (2) feed data and run graph (operation) sess.run (op) (3) update variables in the graph (and return values)
  • 20. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 21. Everything is Tensor t = tf.Constant([1., 2., 3.])
  • 22. Tensor Ranks, Shapes, and Types https://www.tensorflow.org/programmers_guide/dims_types
  • 23. Tensor Ranks, Shapes, and Types https://www.tensorflow.org/programmers_guide/dims_types
  • 24. Tensor Ranks, Shapes, and Types https://www.quora.com/When-should-I-use-tf-float32-vs-tf-float64-in-TensorFlow ...
  • 25. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 26. Lab 2 Linear Regression Sung Kim <hunkim+ml@gmail.com>
  • 29. Lab 2 Linear Regression Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 30. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 31. Lab 2 Linear Regression Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 34. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 35. Build graph using TF operations # X and Y data x_train = [1, 2, 3] y_train = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Our hypothesis XW+b hypothesis = x_train * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - y_train))
  • 36. Build graph using TF operations # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - y_train)) GradientDescent https://www.tensorflow.org/api_docs/python/tf/reduce_mean
  • 37. Run/update graph and get results # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line for step in range(2001): sess.run(train) if step % 20 == 0: print(step, sess.run(cost), sess.run(W), sess.run(b))
  • 38. import tensorflow as tf # X and Y data x_train = [1, 2, 3] y_train = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Our hypothesis XW+b hypothesis = x_train * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - y_train)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line for step in range(2001): sess.run(train) if step % 20 == 0: print(step, sess.run(cost), sess.run(W), sess.run(b)) ''' 0 2.82329 [ 2.12867713] [-0.85235667] 20 0.190351 [ 1.53392804] [-1.05059612] 40 0.151357 [ 1.45725465] [-1.02391243] ... 1920 1.77484e-05 [ 1.00489295] [-0.01112291] 1940 1.61197e-05 [ 1.00466311] [-0.01060018] 1960 1.46397e-05 [ 1.004444] [-0.01010205] 1980 1.32962e-05 [ 1.00423515] [-0.00962736] 2000 1.20761e-05 [ 1.00403607] [-0.00917497] ''' Full code (less than 20 lines)
  • 40. Placeholders # X and Y data x_train = [1, 2, 3] y_train = [1, 2, 3] # Now we can use X and Y in place of x_data and y_data # # placeholders for a tensor that will be always fed using feed_dict # See http://stackoverflow.com/questions/36693740/ X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) ... # Fit the line # Fit the line for step in range(2001): cost_val, W_val, b_val, _ = sess.run([cost, W, b, train], feed_dict={X: [1, 2, 3], Y: [1, 2, 3]}) if step % 20 == 0: print(step, cost_val, W_val, b_val)
  • 41. import tensorflow as tf W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') X = tf.placeholder(tf.float32, shape=[None]) Y = tf.placeholder(tf.float32, shape=[None]) # Our hypothesis XW+b hypothesis = X * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line for step in range(2001): cost_val, W_val, b_val, _ = sess.run([cost, W, b, train], feed_dict={X: [1, 2, 3], Y: [1, 2, 3]}) if step % 20 == 0: print(step, cost_val, W_val, b_val) ... 1980 1.32962e-05 [ 1.00423515] [-0.00962736] 2000 1.20761e-05 [ 1.00403607] [-0.00917497] # Testing our model print(sess.run(hypothesis, feed_dict={X: [5]})) print(sess.run(hypothesis, feed_dict={X: [2.5]})) print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]})) [ 5.0110054] [ 2.50091505] [ 1.49687922 3.50495124] Full code with placeholders
  • 42. import tensorflow as tf W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') X = tf.placeholder(tf.float32, shape=[None]) Y = tf.placeholder(tf.float32, shape=[None]) # Our hypothesis XW+b hypothesis = X * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line with new training data for step in range(2001): cost_val, W_val, b_val, _ = sess.run([cost, W, b, train], feed_dict={X: [1, 2, 3, 4, 5], Y: [2.1, 3.1, 4.1, 5.1, 6.1]}) if step % 20 == 0: print(step, cost_val, W_val, b_val) … 1960 3.32396e-07 [ 1.00037301] [ 1.09865296] 1980 2.90429e-07 [ 1.00034881] [ 1.09874094] 2000 2.5373e-07 [ 1.00032604] [ 1.09882331] # Testing our model print(sess.run(hypothesis, feed_dict={X: [5]})) print(sess.run(hypothesis, feed_dict={X: [2.5]})) print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]})) [ 6.10045338] [ 3.59963846] [ 2.59931231 4.59996414] Full code with placeholders
  • 43. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations feed_dict={X: [1, 2, 3, 4, 5], Y: [2.1, 3.1, 4.1, 5.1, 6.1]})
  • 44. Lab 3 Minimizing Cost Sung Kim <hunkim+ml@gmail.com> With TF 1.0!
  • 45. Lab 3 Minimizing Cost With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 46. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 47. Lab 3 Minimizing Cost With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 50. import tensorflow as tf import matplotlib.pyplot as plt X = [1, 2, 3] Y = [1, 2, 3] W = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Variables for plotting cost function W_val = [] cost_val = [] for i in range(-30, 50): feed_W = i * 0.1 curr_cost, curr_W = sess.run([cost, W], feed_dict={W: feed_W}) W_val.append(curr_W) cost_val.append(curr_cost) # Show the cost function plt.plot(W_val, cost_val) plt.show() https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-1-minimizing_cost_show_graph.py http://matplotlib.org/users/installing.html
  • 51. W cost (W) import tensorflow as tf import matplotlib.pyplot as plt X = [1, 2, 3] Y = [1, 2, 3] W = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Variables for plotting cost function W_val = [] cost_val = [] for i in range(-30, 50): feed_W = i * 0.1 curr_cost, curr_W = sess.run([cost, W], feed_dict={W: feed_W}) W_val.append(curr_W) cost_val.append(curr_cost) # Show the cost function plt.plot(W_val, cost_val) plt.show() https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-1-minimizing_cost_show_graph.py
  • 53. W cost (W) Gradient descent # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent)
  • 54. import tensorflow as tf x_data = [1, 2, 3] y_data = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_sum(tf.square(hypothesis - Y)) # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(21): sess.run(update, feed_dict={X: x_data, Y: y_data}) print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py
  • 55. import tensorflow as tf x_data = [1, 2, 3] y_data = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_sum(tf.square(hypothesis - Y)) # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(21): sess.run(update, feed_dict={X: x_data, Y: y_data}) print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py 0 5.81756 [ 1.64462376] 1 1.65477 [ 1.34379935] 2 0.470691 [ 1.18335962] 3 0.133885 [ 1.09779179] 4 0.0380829 [ 1.05215561] 5 0.0108324 [ 1.0278163] 6 0.00308123 [ 1.01483536] 7 0.000876432 [ 1.00791216] 8 0.00024929 [ 1.00421977] 9 7.09082e-05 [ 1.00225055] 10 2.01716e-05 [ 1.00120032] 11 5.73716e-06 [ 1.00064015] 12 1.6319e-06 [ 1.00034142] 13 4.63772e-07 [ 1.00018203] 14 1.31825e-07 [ 1.00009704] 15 3.74738e-08 [ 1.00005174] 16 1.05966e-08 [ 1.00002754] 17 2.99947e-09 [ 1.00001466] 18 8.66635e-10 [ 1.00000787] 19 2.40746e-10 [ 1.00000417] 20 7.02158e-11 [ 1.00000226]
  • 56. # Minimize: Gradient Descent Magic optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) train = optimizer.minimize(cost) import tensorflow as tf x_data = [1, 2, 3] y_data = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_sum(tf.square(hypothesis - Y)) # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(21): sess.run(update, feed_dict={X: x_data, Y: y_data}) print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py
  • 57. Output when W=5 import tensorflow as tf # tf Graph Input X = [1, 2, 3] Y = [1, 2, 3] # Set wrong model weights W = tf.Variable(5.0) # Linear model hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize: Gradient Descent Magic optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(100): print(step, sess.run(W)) sess.run(train) 0 5.0 1 1.26667 2 1.01778 3 1.00119 4 1.00008 5 1.00001 6 1.0 7 1.0 8 1.0 9 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
  • 58. Output when W=-3 0 -3.0 1 0.733334 2 0.982222 3 0.998815 4 0.999921 5 0.999995 6 1.0 7 1.0 8 1.0 9 1.0 import tensorflow as tf # tf Graph Input X = [1, 2, 3] Y = [1, 2, 3] # Set wrong model weights W = tf.Variable(-3.0) # Linear model hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize: Gradient Descent Magic optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(100): print(step, sess.run(W)) sess.run(train) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
  • 59. import tensorflow as tf X = [1, 2, 3] Y = [1, 2, 3] # Set wrong model weights W = tf.Variable(5.) # Linear model hypothesis = X * W # Manual gradient gradient = tf.reduce_mean((W * X - Y) * X) * 2 # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) # Get gradients gvs = optimizer.compute_gradients(cost, [W]) # Apply gradients apply_gradients = optimizer.apply_gradients(gvs) # Launch the graph in a session. sess = tf.Session() sess.run(tf.global_variables_initializer()) for step in range(100): print(step, sess.run([gradient, W, gvs])) sess.run(apply_gradients) Optional: compute_gradient and apply_gradient 0 [37.333332, 5.0, [(37.333336, 5.0)]] 1 [33.848888, 4.6266665, [(33.848888, 4.6266665)]] 2 [30.689657, 4.2881775, [(30.689657, 4.2881775)]] 3 [27.825287, 3.9812808, [(27.825287, 3.9812808)]] 4 [25.228262, 3.703028, [(25.228264, 3.703028)]] ... 96 [0.0030694802, 1.0003289, [(0.0030694804, 1.0003289)]] 97 [0.0027837753, 1.0002983, [(0.0027837753, 1.0002983)]] 98 [0.0025234222, 1.0002704, [(0.0025234222, 1.0002704)]] 99 [0.0022875469, 1.0002451, [(0.0022875469, 1.0002451)]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-X-minimizing_cost_tf_gradient.py
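A common reason to split compute_gradients and apply_gradients, rather than calling minimize, is to transform the gradients before they are applied. A minimal sketch, assuming the same TF 1.x setup as above, that clips each gradient into [-1, 1] (the clipping range is illustrative, not part of the lab code):

import tensorflow as tf

X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.Variable(5.)
hypothesis = X * W
cost = tf.reduce_mean(tf.square(hypothesis - Y))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
gvs = optimizer.compute_gradients(cost, [W])
# Clip every (gradient, variable) pair before applying it
capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
apply_gradients = optimizer.apply_gradients(capped_gvs)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(100):
    sess.run(apply_gradients)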
  • 60. Lab 4 Multi-variable linear regression Sung Kim <hunkim+ml@gmail.com> With TF 1.0!
  • 61. Lab 4 Multi-variable linear regression With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 62. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 63. Lab 4-1 Multi-variable linear regression With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
• 65. Hypothesis using matrix (Test Scores for General Psychology)
x1  x2  x3  |  Y
73  80  75  | 152
93  88  93  | 185
89  91  90  | 180
96  98  100 | 196
73  66  70  | 142
  • 66. Hypothesis using matrix x1 x2 x3 Y 73 80 75 152 93 88 93 185 89 91 90 180 96 98 100 196 73 66 70 142 Test Scores for General Psychology x1_data = [73., 93., 89., 96., 73.] x2_data = [80., 88., 91., 98., 66.] x3_data = [75., 93., 90., 100., 70.] y_data = [152., 185., 180., 196., 142.] # placeholders for a tensor that will be always fed. x1 = tf.placeholder(tf.float32) x2 = tf.placeholder(tf.float32) x3 = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) w1 = tf.Variable(tf.random_normal([1]), name='weight1') w2 = tf.Variable(tf.random_normal([1]), name='weight2') w3 = tf.Variable(tf.random_normal([1]), name='weight3') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
• 67. import tensorflow as tf x1_data = [73., 93., 89., 96., 73.] x2_data = [80., 88., 91., 98., 66.] x3_data = [75., 93., 90., 100., 70.] y_data = [152., 185., 180., 196., 142.] # placeholders for a tensor that will be always fed. x1 = tf.placeholder(tf.float32) x2 = tf.placeholder(tf.float32) x3 = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) w1 = tf.Variable(tf.random_normal([1]), name='weight1') w2 = tf.Variable(tf.random_normal([1]), name='weight2') w3 = tf.Variable(tf.random_normal([1]), name='weight3') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize. Need a very small learning rate for this data set optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run([cost, hypothesis, train], feed_dict={x1: x1_data, x2: x2_data, x3: x3_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) 0 Cost: 19614.8 Prediction: [ 21.69748688 39.10213089 31.82624626 35.14236832 32.55316544] 10 Cost: 14.0682 Prediction: [ 145.56100464 187.94958496 178.50236511 194.86721802 146.08096313] ... 1990 Cost: 4.9197 Prediction: [ 148.15084839 186.88632202 179.6293335 195.81796265 144.46044922] 2000 Cost: 4.89449 Prediction: [ 148.15931702 186.8805542 179.63194275 195.81971741 144.45298767] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
  • 69. x_data = [[73., 80., 75.], [93., 88., 93.], [89., 91., 90.], [96., 98., 100.], [73., 66., 70.]] y_data = [[152.], [185.], [180.], [196.], [142.]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py Matrix
• 70. import tensorflow as tf x_data = [[73., 80., 75.], [93., 88., 93.], [89., 91., 90.], [96., 98., 100.], [73., 66., 70.]] y_data = [[152.], [185.], [180.], [196.], [142.]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b # Simplified cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) 0 Cost: 7105.46 Prediction: [[ 80.82241058] [ 92.26364136] [ 93.70250702] [ 98.09217834] [ 72.51759338]] 10 Cost: 5.89726 Prediction: [[ 155.35159302] [ 181.85691833] [ 181.97254944] [ 194.21760559] [ 140.85707092]] ... 1990 Cost: 3.18588 Prediction: [[ 154.36352539] [ 182.94833374] [ 181.85189819] [ 194.35585022] [ 142.03240967]] 2000 Cost: 3.1781 Prediction: [[ 154.35881042] [ 182.95147705] [ 181.85035706] [ 194.35533142] [ 142.036026 ]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py
  • 71. Lab 4-2 Loading Data from File With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 73. Loading data from file data-01-test-score.csv # EXAM1,EXAM2,EXAM3,FINAL 73,80,75,152 93,88,93,185 89,91,90,180 96,98,100,196 73,66,70,142 53,46,55,101 import numpy as np xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # Make sure the shape and data are OK print(x_data.shape, x_data, len(x_data)) print(y_data.shape, y_data) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
• 77. import tensorflow as tf import numpy as np tf.set_random_seed(777) # for reproducibility xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # Make sure the shape and data are OK print(x_data.shape, x_data, len(x_data)) print(y_data.shape, y_data) # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b # Simplified cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Set up feed_dict variables inside the loop. for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) # Ask my score print("Your score will be ", sess.run(hypothesis, feed_dict={X: [[100, 70, 101]]})) print("Other scores will be ", sess.run(hypothesis, feed_dict={X: [[60, 70, 110], [90, 100, 80]]})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
• 78. Your score will be [[ 181.73277283]] Other scores will be [[ 145.86265564] [ 187.23129272]] Output https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Set up feed_dict variables inside the loop. for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) # Ask my score print("Your score will be ", sess.run(hypothesis, feed_dict={X: [[100, 70, 101]]})) print("Other scores will be ", sess.run(hypothesis, feed_dict={X: [[60, 70, 110], [90, 100, 80]]}))
  • 80. filename_queue = tf.train.string_input_producer( ['data-01-test-score.csv', 'data-02-test-score.csv', ... ], shuffle=False, name='filename_queue') reader = tf.TextLineReader() key, value = reader.read(filename_queue) record_defaults = [[0.], [0.], [0.], [0.]] xy = tf.decode_csv(value, record_defaults=record_defaults) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
  • 81. tf.train.batch # collect batches of csv in train_x_batch, train_y_batch = tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10) sess = tf.Session() ... # Start populating the filename queue. coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(sess=sess, coord=coord) for step in range(2001): x_batch, y_batch = sess.run([train_x_batch, train_y_batch]) ... coord.request_stop() coord.join(threads) https://www.tensorflow.org/programmers_guide/reading_data
• 82. import tensorflow as tf filename_queue = tf.train.string_input_producer( ['data-01-test-score.csv'], shuffle=False, name='filename_queue') reader = tf.TextLineReader() key, value = reader.read(filename_queue) # Default values, in case of empty columns. Also specifies the type of the # decoded result. record_defaults = [[0.], [0.], [0.], [0.]] xy = tf.decode_csv(value, record_defaults=record_defaults) # collect batches of csv in train_x_batch, train_y_batch = tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10) # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b # Simplified cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Start populating the filename queue. coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(sess=sess, coord=coord) for step in range(2001): x_batch, y_batch = sess.run([train_x_batch, train_y_batch]) cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_batch, Y: y_batch}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) coord.request_stop() coord.join(threads) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
  • 83. shuffle_batch # min_after_dequeue defines how big a buffer we will randomly sample # from -- bigger means better shuffling but slower start up and more # memory used. # capacity must be larger than min_after_dequeue and the amount larger # determines the maximum we will prefetch. Recommendation: # min_after_dequeue + (num_threads + a small safety margin) * batch_size min_after_dequeue = 10000 capacity = min_after_dequeue + 3 * batch_size example_batch, label_batch = tf.train.shuffle_batch( [example, label], batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue) https://www.tensorflow.org/programmers_guide/reading_data
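The shuffle_batch snippet above drops into the same reader pipeline from slide 80 in place of tf.train.batch. A sketch of the wiring, with illustrative buffer sizes (the lab files themselves use tf.train.batch):

import tensorflow as tf

filename_queue = tf.train.string_input_producer(
    ['data-01-test-score.csv'], shuffle=False, name='filename_queue')
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
xy = tf.decode_csv(value, record_defaults=[[0.], [0.], [0.], [0.]])

batch_size = 10
min_after_dequeue = 100  # illustrative; bigger buffers shuffle better
capacity = min_after_dequeue + 3 * batch_size
# Same interface as tf.train.batch, but examples are dequeued in shuffled order
train_x_batch, train_y_batch = tf.train.shuffle_batch(
    [xy[0:-1], xy[-1:]], batch_size=batch_size,
    capacity=capacity, min_after_dequeue=min_after_dequeue)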
  • 84. Lab 5 Logistic (regression) classifier Sung Kim <hunkim+ml@gmail.com>
  • 85. Lab 5 Logistic (regression) classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 86. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 87. Lab 5 Logistic (regression) classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 90. Training Data x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]] y_data = [[0], [0], [0], [1], [1], [1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 2]) Y = tf.placeholder(tf.float32, shape=[None, 1]) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
  • 91. X = tf.placeholder(tf.float32, shape=[None, 2]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W) + b)) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
• 92. Train the model # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data}) if step % 200 == 0: print(step, cost_val) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
• 93. x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]] y_data = [[0], [0], [0], [1], [1], [1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 2]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data}) if step % 200 == 0: print(step, cost_val) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a) # step, cost 0 1.73078 200 0.571512 400 0.507414 ... 9600 0.154132 9800 0.151778 10000 0.149496 Hypothesis: [[ 0.03074029] [ 0.15884677] [ 0.30486736] [ 0.78138196] [ 0.93957496] [ 0.98016882]] Correct (Y): [[ 0.] [ 0.] [ 0.] [ 1.] [ 1.] [ 1.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
  • 94. Classifying diabetes xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
• 95. xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 8]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([8, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) feed = {X: x_data, Y: y_data} for step in range(10001): sess.run(train, feed_dict=feed) if step % 200 == 0: print(step, sess.run(cost, feed_dict=feed)) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict=feed) print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a) 0 0.82794 200 0.755181 400 0.726355 600 0.705179 800 0.686631 ... 9600 0.492056 9800 0.491396 10000 0.490767 [ 0.7461012 ] [ 0.79919308] [ 0.72995949] [ 0.88297188]] [ 1.] [ 1.] [ 1.]] Accuracy: 0.762846 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
  • 96. Exercise ● CSV reading using tf.decode_csv ● Try other classification data from Kaggle ○ https://www.kaggle.com
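As a starting point for the tf.decode_csv exercise, a sketch for a hypothetical two-feature Kaggle-style CSV (the file name, header, and column layout are assumptions, not a real dataset):

import tensorflow as tf

# Hypothetical CSV columns: feature1, feature2, label
filename_queue = tf.train.string_input_producer(
    ['my-kaggle-data.csv'], shuffle=False, name='filename_queue')
reader = tf.TextLineReader(skip_header_lines=1)  # skip the header row
key, value = reader.read(filename_queue)
record_defaults = [[0.], [0.], [0.]]
f1, f2, label = tf.decode_csv(value, record_defaults=record_defaults)
# Batch features and labels exactly as in the lab reader pipeline
x_batch, y_batch = tf.train.batch([[f1, f2], [label]], batch_size=10)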
  • 97. Lab 6 Softmax classifier Sung Kim <hunkim+ml@gmail.com>
  • 98. Lab 6 Softmax Classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 99. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 100. Lab 6-1 Softmax Classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 104. Cost function: cross entropy # Cross entropy cost/loss cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
  • 105. x_data = [[1, 2, 1, 1], [2, 1, 3, 2], [3, 1, 3, 4], [4, 1, 5, 5], [1, 7, 5, 5], [1, 2, 5, 6], [1, 6, 6, 6], [1, 7, 7, 7]] y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]] X = tf.placeholder("float", [None, 4]) Y = tf.placeholder("float", [None, 3]) nb_classes = 3 W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight') b = tf.Variable(tf.random_normal([nb_classes]), name='bias') # tf.nn.softmax computes softmax activations # softmax = exp(logits) / reduce_sum(exp(logits), dim) hypothesis = tf.nn.softmax(tf.matmul(X, W) + b) # Cross entropy cost/loss cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for step in range(2001): sess.run(optimizer, feed_dict={X: x_data, Y: y_data}) if step % 200 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
  • 106. Test & one-hot encoding # Testing & One-hot encoding a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]}) print(a, sess.run(tf.arg_max(a, 1))) hypothesis = tf.nn.softmax(tf.matmul(X,W)+b) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py [[ 1.38904958e-03 9.98601854e-01 9.06129117e-06]] [1]
  • 107. Test & one-hot encoding all = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9], [1, 3, 4, 3], [1, 1, 0, 1]]}) print(all, sess.run(tf.arg_max(all, 1))) [[ 1.38904958e-03 9.98601854e-01 9.06129117e-06] [ 9.31192040e-01 6.29020557e-02 5.90589503e-03] [ 1.27327668e-08 3.34112905e-04 9.99665856e-01]] [1 0 2] hypothesis = tf.nn.softmax(tf.matmul(X,W)+b) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
  • 108. Lab 6-2 Fancy Softmax Classifier cross_entropy, one_hot, reshape Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 110. softmax_cross_entropy_with_logits logits = tf.matmul(X, W) + b hypothesis = tf.nn.softmax(logits) # Cross entropy cost/loss cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) # Cross entropy cost/loss cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot) cost = tf.reduce_mean(cost_i)
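The two cost formulations should agree numerically. A small sketch that checks this on made-up logits and a one-hot label (the values are illustrative):

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])
labels = tf.constant([[1.0, 0.0, 0.0]])  # one-hot

hypothesis = tf.nn.softmax(logits)
cost_manual = tf.reduce_mean(-tf.reduce_sum(labels * tf.log(hypothesis), axis=1))
cost_builtin = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))

with tf.Session() as sess:
    print(sess.run([cost_manual, cost_builtin]))  # both come out ~0.417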
  • 113. Animal classification with softmax_cross_entropy_with_logits https://kr.pinterest.com/explore/animal-classification-activity/ # Predicting animal type based on various features xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]]
  • 114. tf.one_hot and reshape Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6, shape=(?, 1) Y_one_hot = tf.one_hot(Y, nb_classes) # one hot shape=(?, 1, 7) Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes]) # shape=(?, 7) If the input indices is rank N, the output will have rank N+1. The new axis is created at dimension axis (default: the new axis is appended at the end). https://www.tensorflow.org/api_docs/python/tf/one_hot
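A tiny sketch of that rank bump and the reshape fix, on made-up labels:

import tensorflow as tf

nb_classes = 7
Y = tf.constant([[0], [3]])                          # shape (2, 1), rank 2
Y_one_hot = tf.one_hot(Y, nb_classes)                # shape (2, 1, 7), rank 3
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])  # back to shape (2, 7)

with tf.Session() as sess:
    print(sess.run(Y_one_hot))
    # [[1. 0. 0. 0. 0. 0. 0.]
    #  [0. 0. 0. 1. 0. 0. 0.]]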
  • 115. # Predicting animal type based on various features xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] nb_classes = 7 # 0 ~ 6 X = tf.placeholder(tf.float32, [None, 16]) Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6 Y_one_hot = tf.one_hot(Y, nb_classes) # one hot Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes]) W = tf.Variable(tf.random_normal([16, nb_classes]), name='weight') b = tf.Variable(tf.random_normal([nb_classes]), name='bias') # tf.nn.softmax computes softmax activations # softmax = exp(logits) / reduce_sum(exp(logits), dim) logits = tf.matmul(X, W) + b hypothesis = tf.nn.softmax(logits) # Cross entropy cost/loss cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot) cost = tf.reduce_mean(cost_i) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
• 116. cost = tf.reduce_mean(cost_i) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) prediction = tf.argmax(hypothesis, 1) correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for step in range(2000): sess.run(optimizer, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: loss, acc = sess.run([cost, accuracy], feed_dict={ X: x_data, Y: y_data}) print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format( step, loss, acc)) # Let's see if we can predict pred = sess.run(prediction, feed_dict={X: x_data}) # y_data: (N,1) = flatten => (N, ) matches pred.shape for p, y in zip(pred, y_data.flatten()): print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y))) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
• 117. cost = tf.reduce_mean(cost_i) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) prediction = tf.argmax(hypothesis, 1) correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for step in range(2000): sess.run(optimizer, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: loss, acc = sess.run([cost, accuracy], feed_dict={ X: x_data, Y: y_data}) print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format( step, loss, acc)) # Let's see if we can predict pred = sess.run(prediction, feed_dict={X: x_data}) # y_data: (N,1) = flatten => (N, ) matches pred.shape for p, y in zip(pred, y_data.flatten()): print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y))) Step: 1100 Loss: 0.101 Acc: 99.01% Step: 1200 Loss: 0.092 Acc: 100.00% Step: 1300 Loss: 0.084 Acc: 100.00% ... [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 3 True Y: 3 [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 3 True Y: 3 [True] Prediction: 3 True Y: 3 [True] Prediction: 0 True Y: 0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
  • 118. Lab 7 Learning rate, Evaluation Sung Kim <hunkim+ml@gmail.com>
  • 119. Lab 7-1 Learning rate, Evaluation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 120. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 121. Lab 7-1 Learning rate, Evaluation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
• 123. Training and Test datasets x_data = [[1, 2, 1], [1, 3, 2], [1, 3, 4], [1, 5, 5], [1, 7, 5], [1, 2, 5], [1, 6, 6], [1, 7, 7]] y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]] # Evaluate our model using this test dataset x_test = [[2, 1, 1], [3, 1, 2], [3, 3, 4]] y_test = [[0, 0, 1], [0, 0, 1], [0, 0, 1]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
  • 124. X = tf.placeholder("float", [None, 3]) Y = tf.placeholder("float", [None, 3]) W = tf.Variable(tf.random_normal([3, 3])) b = tf.Variable(tf.random_normal([3])) hypothesis = tf.nn.softmax(tf.matmul(X, W)+b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Correct prediction Test model prediction = tf.arg_max(hypothesis, 1) is_correct = tf.equal(prediction, tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(201): cost_val, W_val, _ = sess.run([cost, W, optimizer], feed_dict={X: x_data, Y: y_data}) print(step, cost_val, W_val) # predict print("Prediction:", sess.run(prediction, feed_dict={X: x_test})) # Calculate the accuracy print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test})) 199 0.672261 [[-1.15377033 0.28146935 1.13632679] [ 0.37484586 0.18958236 0.33544877] [-0.35609841 -0.43973011 -1.25604188]] 200 0.670909 [[-1.15885413 0.28058422 1.14229572] [ 0.37609792 0.19073224 0.33304682] [-0.35536593 -0.44033223 -1.2561723 ]] Prediction: [2 2 2] Accuracy: 1.0 Training and Test datasets https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
  • 126. Big learning rate 2 27.2798 [[ 0.44451016 0.85699677 -1.03748143] [ 0.48429942 0.98872018 -0.57314301] [ 1.52989244 1.16229868 -4.74406147]] 3 8.668 [[ 0.12396193 0.61504567 -0.47498202] [ 0.22003263 -0.2470119 0.9268558 ] [ 0.96035379 0.41933775 -3.43156195]] 4 5.77111 [[-0.9524312 1.13037777 0.08607888] [-3.78651619 2.26245379 2.42393875] [-3.07170963 3.14037919 -2.12054014]] 5 inf [[ nan nan nan] [ nan nan nan] [ nan nan nan]] 6 nan [[ nan nan nan] [ nan nan nan] [ nan nan nan]] ... Prediction: [0 0 0] Accuracy: 0.0 X = tf.placeholder("float", [None, 3]) Y = tf.placeholder("float", [None, 3]) W = tf.Variable(tf.random_normal([3, 3])) b = tf.Variable(tf.random_normal([3])) hypothesis = tf.nn.softmax(tf.matmul(X, W)+b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer (learning_rate=1.5).minimize(cost) # Correct prediction Test model prediction = tf.arg_max(hypothesis, 1) is_correct = tf.equal(prediction, tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(201): cost_val, W_val, _ = sess.run([cost, W, optimizer], feed_dict={X: x_data, Y: y_data}) print(step, cost_val, W_val) # predict print("Prediction:", sess.run(prediction, feed_dict={X: x_test})) # Calculate the accuracy print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
  • 127. Small learning rate X = tf.placeholder("float", [None, 3]) Y = tf.placeholder("float", [None, 3]) W = tf.Variable(tf.random_normal([3, 3])) b = tf.Variable(tf.random_normal([3])) hypothesis = tf.nn.softmax(tf.matmul(X, W)+b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer (learning_rate=1e-10).minimize(cost) # Correct prediction Test model prediction = tf.arg_max(hypothesis, 1) is_correct = tf.equal(prediction, tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(201): cost_val, W_val, _ = sess.run([cost, W, optimizer], feed_dict={X: x_data, Y: y_data}) print(step, cost_val, W_val) # predict print("Prediction:", sess.run(prediction, feed_dict={X: x_test})) # Calculate the accuracy print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py 0 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] 1 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] ... 198 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] 199 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] 200 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] Prediction: [0 0 0] Accuracy: 0.0
  • 128. Non-normalized inputs xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973], [823.02002, 828.070007, 1828100, 821.655029, 828.070007], [819.929993, 824.400024, 1438100, 818.97998, 824.159973], [816, 820.958984, 1008100, 815.48999, 819.23999], [819.359985, 823, 1188100, 818.469971, 818.97998], [819, 823, 1198100, 816, 820.450012], [811.700012, 815.25, 1098100, 809.780029, 813.669983], [809.51001, 816.659973, 1398100, 804.539978, 809.559998]]) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
• 129. Non-normalized inputs xy=... x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 4]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([4, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = tf.matmul(X, W) + b cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) sess = tf.Session() sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) 5 Cost: inf Prediction: [[ inf] [ inf] [ inf] ... 6 Cost: nan Prediction: [[ nan] [ nan] [ nan] ... https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
  • 130. Normalized inputs (min-max scale) xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973], [823.02002, 828.070007, 1828100, 821.655029, 828.070007], [819.929993, 824.400024, 1438100, 818.97998, 824.159973], [816, 820.958984, 1008100, 815.48999, 819.23999], [819.359985, 823, 1188100, 818.469971, 818.97998], [819, 823, 1198100, 816, 820.450012], [811.700012, 815.25, 1098100, 809.780029, 813.669983], [809.51001, 816.659973, 1398100, 804.539978, 809.559998]]) [[ 0.99999999 0.99999999 0. 1. 1. ] [ 0.70548491 0.70439552 1. 0.71881782 0.83755791] [ 0.54412549 0.50274824 0.57608696 0.606468 0.6606331 ] [ 0.33890353 0.31368023 0.10869565 0.45989134 0.43800918] [ 0.51436 0.42582389 0.30434783 0.58504805 0.42624401] [ 0.49556179 0.42582389 0.31521739 0.48131134 0.49276137] [ 0.11436064 0. 0.20652174 0.22007776 0.18597238] [ 0. 0.07747099 0.5326087 0. 0. ]] xy = MinMaxScaler(xy) print(xy) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
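MinMaxScaler here is a small helper defined alongside the lab code, not a library import. A sketch of one possible column-wise implementation (the epsilon guarding against zero division is an assumption):

import numpy as np

def MinMaxScaler(data):
    # Scale each column into [0, 1]
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    return numerator / (denominator + 1e-7)  # epsilon avoids division by zero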
• 131. Normalized inputs xy=... x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 4]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([4, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = tf.matmul(X, W) + b cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) sess = tf.Session() sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) Prediction: [[ 1.63450289] [ 0.06628087] [ 0.35014752] [ 0.67070574] [ 0.61131608] [ 0.61466062] [ 0.23175186] [-0.13716528]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
  • 132. Lab 7-2 MNIST data Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 135. 28x28x1 image http://derindelimavi.blogspot.hk/2015/04/mnist-el-yazs-rakam-veri-seti.html # MNIST data image of shape 28 * 28 = 784 X = tf.placeholder(tf.float32, [None, 784]) # 0 - 9 digits recognition = 10 classes Y = tf.placeholder(tf.float32, [None, nb_classes])
  • 136. MNIST Dataset from tensorflow.examples.tutorials.mnist import input_data # Check out https://www.tensorflow.org/get_started/mnist/beginners for # more information about the mnist dataset mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) … batch_xs, batch_ys = mnist.train.next_batch(100) … print("Accuracy: ", accuracy.eval(session=sess, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 137. Reading data and set variables from tensorflow.examples.tutorials.mnist import input_data # Check out https://www.tensorflow.org/get_started/mnist/beginners for # more information about the mnist dataset mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) nb_classes = 10 # MNIST data image of shape 28 * 28 = 784 X = tf.placeholder(tf.float32, [None, 784]) # 0 - 9 digits recognition = 10 classes Y = tf.placeholder(tf.float32, [None, nb_classes]) W = tf.Variable(tf.random_normal([784, nb_classes])) b = tf.Variable(tf.random_normal([nb_classes])) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 138. Softmax! # Hypothesis (using softmax) hypothesis = tf.nn.softmax(tf.matmul(X, W) + b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Test model is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1)) # Calculate accuracy accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 139. Training epoch/batch # parameters training_epochs = 15 batch_size = 100 with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) # Training cycle for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys}) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 140. Training epoch/batch In the neural network terminology: ● one epoch = one forward pass and one backward pass of all the training examples ● batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need. ● number of iterations = number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes). Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to complete 1 epoch. http://stackoverflow.com/questions/4752626/epoch-vs-iteration-when-training-neural-networks
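Those definitions map directly onto the loop bounds in the MNIST code; a sketch of the arithmetic, using the numbers from the Stack Overflow example:

num_examples = 1000
batch_size = 500
iterations_per_epoch = num_examples // batch_size      # 2 iterations = 1 epoch
training_epochs = 15
total_iterations = training_epochs * iterations_per_epoch
print(iterations_per_epoch, total_iterations)          # 2 30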
  • 141. Training epoch/batch # parameters training_epochs = 15 batch_size = 100 with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) # Training cycle for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys}) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 142. Report results on test dataset # Test the model using test sets print("Accuracy: ", accuracy.eval(session=sess, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 143. hypothesis = tf.nn.softmax(tf.matmul(X, W) + b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # parameters training_epochs = 15 batch_size = 100 with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) # Training cycle for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys}) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) Epoch: 0001 cost = 2.868104637 Epoch: 0002 cost = 1.134684615 Epoch: 0003 cost = 0.908220728 Epoch: 0004 cost = 0.794199896 Epoch: 0005 cost = 0.721815854 Epoch: 0006 cost = 0.670184430 Epoch: 0007 cost = 0.630576546 Epoch: 0008 cost = 0.598888191 Epoch: 0009 cost = 0.573027079 Epoch: 0010 cost = 0.550497213 Epoch: 0011 cost = 0.532001859 Epoch: 0012 cost = 0.515517795 Epoch: 0013 cost = 0.501175288 Epoch: 0014 cost = 0.488425370 Epoch: 0015 cost = 0.476968593 Learning finished Accuracy: 0.888 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
• 144. Sample image show and prediction import matplotlib.pyplot as plt import random # Get one and predict r = random.randint(0, mnist.test.num_examples - 1) print("Label:", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1))) print("Prediction:", sess.run(tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1]})) plt.imshow(mnist.test.images[r:r + 1].reshape(28, 28), cmap='Greys', interpolation='nearest') plt.show() https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 145. Lab 8 Tensor Manipulation Sung Kim <hunkim+ml@gmail.com>
  • 146. Lab 8 Tensor Manipulation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 147. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 148. Lab 8 Tensor Manipulation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 150. Simple 1D array and slicing Image from http://www.frosteye.net/1233 t = np.array([0., 1., 2., 3., 4., 5., 6.])
  • 151. Simple 1D array and slicing Image from http://www.frosteye.net/1233
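The figure on this slide is not reproduced here; a sketch of the indexing and slicing it illustrates, on the array from the previous slide:

import numpy as np

t = np.array([0., 1., 2., 3., 4., 5., 6.])
print(t.ndim, t.shape)    # 1 (7,)
print(t[0], t[1], t[-1])  # 0.0 1.0 6.0
print(t[2:5], t[4:-1])    # [2. 3. 4.] [4. 5.]
print(t[:2], t[3:])       # [0. 1.] [3. 4. 5. 6.]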
  • 167. Ones and Zeros like https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
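The notebook examples are not reproduced on this slide; a minimal sketch of the two ops:

import tensorflow as tf

x = tf.constant([[0, 1, 2], [2, 1, 0]])
with tf.Session() as sess:
    print(sess.run(tf.ones_like(x)))   # [[1 1 1] [1 1 1]]
    print(sess.run(tf.zeros_like(x)))  # [[0 0 0] [0 0 0]]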
  • 170. Lab 9-1 NN for XOR Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 171. Lab 9 NN for XOR Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 172. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 173. Lab 9-1 NN for XOR Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 175. XOR data set x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) http://templ25.mandaringardencity.com/xor-gate-truth-table-2/
• 176. x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): sess.run(train, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a) XOR with logistic regression? But it doesn’t work https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
• 177. XOR with logistic regression? But it doesn’t work! x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): sess.run(train, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a) Hypothesis: [[ 0.5] [ 0.5] [ 0.5] [ 0.5]] Correct: [[ 0.] [ 0.] [ 0.] [ 0.]] Accuracy: 0.5 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
  • 178. Neural Net W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
  • 179. Neural Net W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1') b1 = tf.Variable(tf.random_normal([2]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
• 180. NN for XOR x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1') b1 = tf.Variable(tf.random_normal([2]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): sess.run(train, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run([W1, W2])) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a) Hypothesis: [[ 0.01338218] [ 0.98166394] [ 0.98809403] [ 0.01135799]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
  • 181. Wide NN for XOR W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1') b1 = tf.Variable(tf.random_normal([10]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([10, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) [2,10], [10,1] Hypothesis: [[ 0.00358802] [ 0.99366933] [ 0.99204296] [ 0.0095663 ]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 [2,2], [2,1] Hypothesis: [[ 0.01338218] [ 0.98166394] [ 0.98809403] [ 0.01135799]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
  • 182. Deep NN for XOR W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1') b1 = tf.Variable(tf.random_normal([10]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([10, 10]), name='weight2') b2 = tf.Variable(tf.random_normal([10]), name='bias2') layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2) W3 = tf.Variable(tf.random_normal([10, 10]), name='weight3') b3 = tf.Variable(tf.random_normal([10]), name='bias3') layer3 = tf.sigmoid(tf.matmul(layer2, W3) + b3) W4 = tf.Variable(tf.random_normal([10, 1]), name='weight4') b4 = tf.Variable(tf.random_normal([1]), name='bias4') hypothesis = tf.sigmoid(tf.matmul(layer3, W4) + b4) 4 layers Hypothesis: [[ 7.80e-04] [ 9.99e-01] [ 9.98e-01] [ 1.55e-03]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 2 layers Hypothesis: [[ 0.01338218] [ 0.98166394] [ 0.98809403] [ 0.01135799]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
  • 183. Exercise ● Wide and Deep NN for MNIST
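One possible starting point for the exercise, a sketch of a wider and deeper softmax model for MNIST (the layer sizes and learning rate are illustrative choices, not from the lab code):

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# Two wide hidden layers (256 units each), then the 10-class output layer
W1 = tf.Variable(tf.random_normal([784, 256]), name='weight1')
b1 = tf.Variable(tf.random_normal([256]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.random_normal([256, 256]), name='weight2')
b2 = tf.Variable(tf.random_normal([256]), name='bias2')
layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2)

W3 = tf.Variable(tf.random_normal([256, 10]), name='weight3')
b3 = tf.Variable(tf.random_normal([10]), name='bias3')
logits = tf.matmul(layer2, W3) + b3

cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)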
  • 184. Lab 9-2 Tensorboard for XOR NN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 185. TensorBoard: TF logging/debugging tool ●Visualize your TF graph ●Plot quantitative metrics ●Show additional data https://www.tensorflow.org/get_started/summaries_and_tensorboard
• 186. Old fashioned: print, print, print 9400 0.0151413 [array([[ 6.21692038, 6.05913448], [-6.33773184, -5.75189114]], dtype=float32), array([[ 9.93581772], [-9.43034935]], dtype=float32)] 9500 0.014909 [array([[ 6.22498751, 6.07049847], [-6.34637976, -5.76352596]], dtype=float32), array([[ 9.96414757], [-9.45942593]], dtype=float32)] 9600 0.0146836 [array([[ 6.23292685, 6.08166742], [-6.35489035, -5.77496052]], dtype=float32), array([[ 9.99207973], [-9.48807526]], dtype=float32)] 9700 0.0144647 [array([[ 6.24074268, 6.09264851], [-6.36326933, -5.78619957]], dtype=float32), array([[ 10.01962471], [ -9.51631165]], dtype=float32)] 9800 0.0142521 [array([[ 6.24843407, 6.10344648], [-6.37151814, -5.79724932]], dtype=float32), array([[ 10.04679298], [ -9.54414845]], dtype=float32)] 9900 0.0140456 [array([[ 6.25601053, 6.11406422], [-6.3796401 , -5.80811596]], dtype=float32), array([[ 10.07359505], [ -9.57159519]], dtype=float32)] 10000 0.0138448 [array([[ 6.26347113, 6.12451124], [-6.38764334, -5.81880617]], dtype=float32), array([[ 10.10004139], [ -9.59866238]], dtype=float32)]
• 188. 5 steps of using TensorBoard From TF graph, decide which tensors you want to log w2_hist = tf.summary.histogram("weights2", W2) cost_summ = tf.summary.scalar("cost", cost) Merge all summaries summary = tf.summary.merge_all() Create writer and add graph # Create summary writer writer = tf.summary.FileWriter('./logs') writer.add_graph(sess.graph) Run summary merge and add_summary s, _ = sess.run([summary, optimizer], feed_dict=feed_dict) writer.add_summary(s, global_step=global_step) Launch TensorBoard tensorboard --logdir=./logs
  • 189. Scalar tensors cost_summ = tf.summary.scalar("cost", cost)
  • 190. Histogram (multi-dimensional tensors) W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) w2_hist = tf.summary.histogram("weights2", W2) b2_hist = tf.summary.histogram("biases2", b2) hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
  • 191. Add scope for better graph hierarchy with tf.name_scope("layer1") as scope: W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1') b1 = tf.Variable(tf.random_normal([2]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) w1_hist = tf.summary.histogram("weights1", W1) b1_hist = tf.summary.histogram("biases1", b1) layer1_hist = tf.summary.histogram("layer1", layer1) with tf.name_scope("layer2") as scope: W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) w2_hist = tf.summary.histogram("weights2", W2) b2_hist = tf.summary.histogram("biases2", b2) hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
  • 192. Merge summaries and create writer after creating session # Summary summary = tf.summary.merge_all() # initialize sess = tf.Session() sess.run(tf.global_variables_initializer()) # Create summary writer writer = tf.summary.FileWriter(TB_SUMMARY_DIR) writer.add_graph(sess.graph) # Add graph in the tensorboard https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
  • 193. Run merged summary and write (add summary) s, _ = sess.run([summary, optimizer], feed_dict=feed_dict) writer.add_summary(s, global_step=global_step) global_step += 1 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
• 194. Launch tensorboard (local) writer = tf.summary.FileWriter("./logs/xor_logs") $ tensorboard --logdir=./logs/xor_logs Starting TensorBoard b'41' on port 6006 (You can navigate to http://127.0.0.1:6006)
• 195. Launch tensorboard (remote server) ssh -L local_port:127.0.0.1:remote_port username@server.com local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com server> $ tensorboard --logdir=./logs/xor_logs (You can navigate to http://127.0.0.1:7007)
• 197. Multiple runs: learning_rate=0.1 vs. learning_rate=0.01
• 198. Multiple runs
tensorboard --logdir=./logs/xor_logs
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) ... writer = tf.summary.FileWriter("./logs/xor_logs")
tensorboard --logdir=./logs/xor_logs_r0_01
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) ... writer = tf.summary.FileWriter("./logs/xor_logs_r0_01")
tensorboard --logdir=./logs
• 200. 5 steps of using TensorBoard (recap)
(1) From the TF graph, decide which tensors you want to log:
w2_hist = tf.summary.histogram("weights2", W2)
cost_summ = tf.summary.scalar("cost", cost)
(2) Merge all summaries:
summary = tf.summary.merge_all()
(3) Create a writer and add the graph:
writer = tf.summary.FileWriter('./logs')
writer.add_graph(sess.graph)
(4) Run the merged summary and add_summary:
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
(5) Launch TensorBoard:
tensorboard --logdir=./logs
• 201. Exercise ● Wide and Deep NN for MNIST ● Add TensorBoard
  • 202. Lab 9-2-E Tensorboard for MNIST Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 203. Visualizing your Deep learning using TensorBoard (TensorFlow) Sung Kim <hunkim+ml@gmail.com>
  • 204. TensorBoard: TF logging/debugging tool ●Visualize your TF graph ●Plot quantitative metrics ●Show additional data https://www.tensorflow.org/get_started/summaries_and_tensorboard
• 205. Old-fashioned way: print, print, print
• 207. 5 steps of using TensorBoard
(1) From the TF graph, decide which tensors you want to log:
with tf.variable_scope('layer1') as scope:
    tf.summary.image('input', x_image, 3)
    tf.summary.histogram("layer", L1)
tf.summary.scalar("loss", cost)
(2) Merge all summaries:
summary = tf.summary.merge_all()
(3) Create a writer and add the graph:
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
(4) Run the merged summary and add_summary:
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
(5) Launch TensorBoard:
tensorboard --logdir=/tmp/mnist_logs
  • 208. Image Input # Image input x_image = tf.reshape(X, [-1, 28, 28, 1]) tf.summary.image('input', x_image, 3)
  • 209. Histogram (multi-dimensional tensors) with tf.variable_scope('layer1') as scope: W1 = tf.get_variable("W", shape=[784, 512]) b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) L1 = tf.nn.dropout(L1, keep_prob=keep_prob) tf.summary.histogram("X", X) tf.summary.histogram("weights", W1) tf.summary.histogram("bias", b1) tf.summary.histogram("layer", L1)
  • 211. Add scope for better hierarchy with tf.variable_scope('layer1') as scope: W1 = tf.get_variable("W", shape=[784, 512],... b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) L1 = tf.nn.dropout(L1, keep_prob=keep_prob) tf.summary.histogram("X", X) tf.summary.histogram("weights", W1) tf.summary.histogram("bias", b1) tf.summary.histogram("layer", L1) with tf.variable_scope('layer2') as scope: ... with tf.variable_scope('layer3') as scope: ... with tf.variable_scope('layer4') as scope: ... with tf.variable_scope('layer5') as scope: ...
  • 212. Merge summaries and create writer after creating session # Summary summary = tf.summary.merge_all() # initialize sess = tf.Session() sess.run(tf.global_variables_initializer()) # Create summary writer writer = tf.summary.FileWriter(TB_SUMMARY_DIR) writer.add_graph(sess.graph)
  • 213. Run merged summary and write (add summary) s, _ = sess.run([summary, optimizer], feed_dict=feed_dict) writer.add_summary(s, global_step=global_step) global_step += 1
• 214. Launch tensorboard (local) writer = tf.summary.FileWriter("/tmp/mnist_logs") $ tensorboard --logdir=/tmp/mnist_logs Starting TensorBoard b'41' on port 6006 (You can navigate to http://127.0.0.1:6006)
• 215. Launch tensorboard (remote server) ssh -L local_port:127.0.0.1:remote_port username@server.com local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com server> $ tensorboard --logdir=/tmp/mnist_logs (You can navigate to http://127.0.0.1:7007)
• 216. Multiple runs
tensorboard --logdir=/tmp/mnist_logs/run1
writer = tf.summary.FileWriter("/tmp/mnist_logs/run1")
tensorboard --logdir=/tmp/mnist_logs/run2
writer = tf.summary.FileWriter("/tmp/mnist_logs/run2")
tensorboard --logdir=/tmp/mnist_logs
• 217. 5 steps of using TensorBoard (recap)
(1) From the TF graph, decide which tensors you want to log:
with tf.variable_scope('layer1') as scope:
    tf.summary.image('input', x_image, 3)
    tf.summary.histogram("layer", L1)
tf.summary.scalar("loss", cost)
(2) Merge all summaries:
summary = tf.summary.merge_all()
(3) Create a writer and add the graph:
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
(4) Run the merged summary and add_summary:
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
(5) Launch TensorBoard:
tensorboard --logdir=/tmp/mnist_logs
  • 218. Lab 9-3 (optional) NN Backpropagation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 223. “Yes you should understand backprop” https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b • “If you try to ignore how it works under the hood because TensorFlow automagically makes my networks learn” - “You will not be ready to wrestle with the dangers it presents” - “You will be much less effective at building and debugging neural networks.” • “The good news is that backpropagation is not that difficult to understand” - “if presented properly.”
• 224. Back propagation (chain rule): gate-by-gate worked example from http://cs231n.stanford.edu/ (figure slides)
• 226. Logistic Regression Network (figure): input a0 flows through a multiply gate (* w), an add gate (+ b), a sigmoid gate, and a loss gate.
• 227. Network forward pass. Forward pass, OK? Just follow (1), (2), (3) and (4):
(1) o = a0 * w
(2) l = o + b
(3) a1 = sigmoid(l)
(4) E = loss(a1, t)
• 230. Let's do back propagation! dE/da1 will be given. What would be dE/dw and dE/db? We can use the chain rule, walking the same gates backward.
• 231. In the same manner, we can get the backward prop through gates (4), (3), (2) and (1)!
• 232. Gate derivatives: the local derivative of each gate (multiply, add, sigmoid, loss) will be given; we can just use them in the chain rule.
• 233. Derivatives (chain rule): dE/da1 is given from the pre-computed loss derivative; just apply the gate derivatives one by one and solve each upstream derivative in turn:
dE/dl = (dE/da1) * (da1/dl)
dE/db = dE/dl (the add gate passes the gradient through)
dE/do = dE/dl
dE/dw = (dE/do) * (do/dw) = (dE/do) * a0
• 236. For matrix inputs the same chain rule applies, with transposes in the right places; see http://cs231n.github.io/optimization-2/#staged
• 237. Network update (learning rate alpha): w = w - alpha * dE/dw, b = b - alpha * dE/db
• 238. Done! Let's update our network using the derivatives!
• 239. Backward prop in code, for (1) o = a0*w, (2) l = o+b, (3) a1 = sigmoid(l), (4) E = loss(a1, t):
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1)  # sigma prime
d_l = d_a1 * d_sigma  # simplifies to (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
    tf.assign(W, W - learning_rate * d_W),
    tf.assign(b, b - learning_rate * tf.reduce_sum(d_b))]
• 241. The same derivatives, with the update averaged over the batch:
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1)  # sigma prime
d_l = d_a1 * d_sigma  # simplifies to (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
    tf.assign(W, W - learning_rate * d_W / N),  # N: sample size
    tf.assign(b, b - learning_rate * tf.reduce_mean(d_b))]
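For reference, a self-contained NumPy sketch of the same forward and backward pass; the toy OR data and names are illustrative assumptions, not the lab code:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy OR data: 4 samples, 2 features, binary target t
a0 = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([[0.], [1.], [1.], [1.]])
N = a0.shape[0]

W = np.random.randn(2, 1)
b = np.zeros(1)
learning_rate = 0.1

for step in range(5000):
    # forward: (1) o = a0*W, (2) l = o + b, (3) a1 = sigmoid(l)
    o = a0.dot(W)
    l = o + b
    a1 = sigmoid(l)
    # backward (chain rule), mirroring the slide code
    d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)  # dE/da1 for cross-entropy loss
    d_l = d_a1 * (a1 * (1. - a1))              # through the sigmoid gate: (a1 - t)
    d_b = d_l
    d_o = d_l
    d_W = a0.T.dot(d_o)
    # update, averaging over the batch
    W -= learning_rate * d_W / N
    b -= learning_rate * d_b.mean(axis=0)

print(a1.round(2))  # approaches [[0], [1], [1], [1]]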
  • 242. Exercise ● See more backprop code samples at https://github.com/hunkim/DeepLearningZeroToAll ● https://github.com/hunkim/DeepLearningZeroToAll/blob/mast er/lab-09-7-sigmoid_back_prop.py ● Solve XOR using NN backprop
• 243. Lab 10 NN, ReLU, Xavier, Dropout, and Adam Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 247. Softmax classifier for MNIST # weights & bias for nn layers W = tf.Variable(tf.random_normal([784, 10])) b = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(X, W) + b # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) # initialize sess = tf.Session() sess.run(tf.global_variables_initializer()) # train my model for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) feed_dict = {X: batch_xs, Y: batch_ys} c, _ = sess.run([cost, optimizer], feed_dict=feed_dict) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) print('Learning Finished!') # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) Epoch: 0001 cost = 5.888845987 Epoch: 0002 cost = 1.860620173 Epoch: 0003 cost = 1.159035648 Epoch: 0004 cost = 0.892340870 Epoch: 0005 cost = 0.751155428 Epoch: 0006 cost = 0.662484806 Epoch: 0007 cost = 0.601544010 Epoch: 0008 cost = 0.556526115 Epoch: 0009 cost = 0.521186961 Epoch: 0010 cost = 0.493068354 Epoch: 0011 cost = 0.469686249 Epoch: 0012 cost = 0.449967254 Epoch: 0013 cost = 0.433519321 Epoch: 0014 cost = 0.419000337 Epoch: 0015 cost = 0.406490815 Learning Finished! Accuracy: 0.9035 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-1-mnist_softmax.py
  • 250. NN for MNIST # input place holders X = tf.placeholder(tf.float32, [None, 784]) Y = tf.placeholder(tf.float32, [None, 10]) # weights & bias for nn layers W1 = tf.Variable(tf.random_normal([784, 256])) b1 = tf.Variable(tf.random_normal([256])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([256, 256])) b2 = tf.Variable(tf.random_normal([256])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.Variable(tf.random_normal([256, 10])) b3 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b3 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 141.207671860 Epoch: 0002 cost = 38.788445864 Epoch: 0003 cost = 23.977515479 Epoch: 0004 cost = 16.315132428 Epoch: 0005 cost = 11.702554882 Epoch: 0006 cost = 8.573139748 Epoch: 0007 cost = 6.370995680 Epoch: 0008 cost = 4.537178684 Epoch: 0009 cost = 3.216900532 Epoch: 0010 cost = 2.329708954 Epoch: 0011 cost = 1.715552875 Epoch: 0012 cost = 1.189857912 Epoch: 0013 cost = 0.820965160 Epoch: 0014 cost = 0.624131458 Epoch: 0015 cost = 0.454633765 Learning Finished! Accuracy: 0.9455 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-2-mnist_nn.py
  • 252. Xavier for MNIST # input place holders X = tf.placeholder(tf.float32, [None, 784]) Y = tf.placeholder(tf.float32, [None, 10]) # weights & bias for nn layers # http://stackoverflow.com/questions/33640581 W1 = tf.get_variable("W1", shape=[784, 256], initializer=tf.contrib.layers.xavier_initializer()) b1 = tf.Variable(tf.random_normal([256])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.get_variable("W2", shape=[256, 256], initializer=tf.contrib.layers.xavier_initializer()) b2 = tf.Variable(tf.random_normal([256])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.get_variable("W3", shape=[256, 10], initializer=tf.contrib.layers.xavier_initializer()) b3 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b3 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 0.301498963 Epoch: 0002 cost = 0.107252513 Epoch: 0003 cost = 0.064888892 Epoch: 0004 cost = 0.044463030 Epoch: 0005 cost = 0.029951642 Epoch: 0006 cost = 0.020663404 Epoch: 0007 cost = 0.015853033 Epoch: 0008 cost = 0.011764387 Epoch: 0009 cost = 0.008598264 Epoch: 0010 cost = 0.007383116 Epoch: 0011 cost = 0.006839140 Epoch: 0012 cost = 0.004672963 Epoch: 0013 cost = 0.003979437 Epoch: 0014 cost = 0.002714260 Epoch: 0015 cost = 0.004707661 Learning Finished! Accuracy: 0.9783 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-3-mnist_nn_xavier.py
  • 253. Xavier for MNIST # input place holders X = tf.placeholder(tf.float32, [None, 784]) Y = tf.placeholder(tf.float32, [None, 10]) # weights & bias for nn layers # http://stackoverflow.com/questions/33640581 W1 = tf.get_variable("W1", shape=[784, 256], initializer=tf.contrib.layers.xavier_initializer()) b1 = tf.Variable(tf.random_normal([256])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.get_variable("W2", shape=[256, 256], initializer=tf.contrib.layers.xavier_initializer()) b2 = tf.Variable(tf.random_normal([256])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.get_variable("W3", shape=[256, 10], initializer=tf.contrib.layers.xavier_initializer()) b3 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b3 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 0.301498963 Epoch: 0002 cost = 0.107252513 Epoch: 0003 cost = 0.064888892 Epoch: 0004 cost = 0.044463030 Epoch: 0005 cost = 0.029951642 Epoch: 0006 cost = 0.020663404 Epoch: 0007 cost = 0.015853033 Epoch: 0008 cost = 0.011764387 Epoch: 0009 cost = 0.008598264 Epoch: 0010 cost = 0.007383116 Epoch: 0011 cost = 0.006839140 Epoch: 0012 cost = 0.004672963 Epoch: 0013 cost = 0.003979437 Epoch: 0014 cost = 0.002714260 Epoch: 0015 cost = 0.004707661 Learning Finished! Accuracy: 0.9783 (xavier) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-3-mnist_nn_xavier.py Epoch: 0001 cost = 141.207671860 Epoch: 0002 cost = 38.788445864 Epoch: 0003 cost = 23.977515479 Epoch: 0004 cost = 16.315132428 Epoch: 0005 cost = 11.702554882 Epoch: 0006 cost = 8.573139748 Epoch: 0007 cost = 6.370995680 Epoch: 0008 cost = 4.537178684 Epoch: 0009 cost = 3.216900532 Epoch: 0010 cost = 2.329708954 Epoch: 0011 cost = 1.715552875 Epoch: 0012 cost = 1.189857912 Epoch: 0013 cost = 0.820965160 Epoch: 0014 cost = 0.624131458 Epoch: 0015 cost = 0.454633765 Learning Finished! Accuracy: 0.9455 (normal dist)
  • 254. Deep NN for MNIST W1 = tf.get_variable("W1", shape=[784, 512], initializer=tf.contrib.layers.xavier_initializer()) b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.get_variable("W2", shape=[512, 512], initializer=tf.contrib.layers.xavier_initializer()) b2 = tf.Variable(tf.random_normal([512])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.get_variable("W3", shape=[512, 512], initializer=tf.contrib.layers.xavier_initializer()) b3 = tf.Variable(tf.random_normal([512])) L3 = tf.nn.relu(tf.matmul(L2, W3) + b3) W4 = tf.get_variable("W4", shape=[512, 512], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([512])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) W5 = tf.get_variable("W5", shape=[512, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 0.266061549 Epoch: 0002 cost = 0.080796588 Epoch: 0003 cost = 0.049075800 Epoch: 0004 cost = 0.034772298 Epoch: 0005 cost = 0.024780529 Epoch: 0006 cost = 0.017072763 Epoch: 0007 cost = 0.014031383 Epoch: 0008 cost = 0.013763446 Epoch: 0009 cost = 0.009164047 Epoch: 0010 cost = 0.008291388 Epoch: 0011 cost = 0.007319742 Epoch: 0012 cost = 0.006434021 Epoch: 0013 cost = 0.005684378 Epoch: 0014 cost = 0.004781207 Epoch: 0015 cost = 0.004342310 Learning Finished! Accuracy: 0.9742 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-4-mnist_nn_deep.py
  • 255. Dropout for MNIST # dropout (keep_prob) rate 0.7 on training, but should be 1 for testing keep_prob = tf.placeholder(tf.float32) W1 = tf.get_variable("W1", shape=[784, 512]) b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) L1 = tf.nn.dropout(L1, keep_prob=keep_prob) W2 = tf.get_variable("W2", shape=[512, 512]) b2 = tf.Variable(tf.random_normal([512])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) L2 = tf.nn.dropout(L2, keep_prob=keep_prob) … # train my model for epoch in range(training_epochs): ... for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) feed_dict = {X: batch_xs, Y: batch_ys, keep_prob: 0.7} c, _ = sess.run([cost, optimizer], feed_dict=feed_dict) avg_cost += c / total_batch # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={ X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1})) Epoch: 0001 cost = 0.447322626 Epoch: 0002 cost = 0.157285590 Epoch: 0003 cost = 0.121884535 Epoch: 0004 cost = 0.098128681 Epoch: 0005 cost = 0.082901778 Epoch: 0006 cost = 0.075337573 Epoch: 0007 cost = 0.069752543 Epoch: 0008 cost = 0.060884363 Epoch: 0009 cost = 0.055276413 Epoch: 0010 cost = 0.054631256 Epoch: 0011 cost = 0.049675195 Epoch: 0012 cost = 0.049125314 Epoch: 0013 cost = 0.047231930 Epoch: 0014 cost = 0.041290121 Epoch: 0015 cost = 0.043621063 Learning Finished! Accuracy: 0.9804!! https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-5-mnist_nn_dropout.py
  • 257. Optimizers train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) ● tf.train.AdadeltaOptimizer ● tf.train.AdagradOptimizer ● tf.train.AdagradDAOptimizer ● tf.train.MomentumOptimizer ● tf.train.AdamOptimizer ● tf.train.FtrlOptimizer ● tf.train.ProximalGradientDescentOptimizer ● tf.train.ProximalAdagradOptimizer ● tf.train.RMSPropOptimizer https://www.tensorflow.org/api_guides/python/train
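All of these optimizers expose the same minimize() interface, so switching between them is a one-line change. A toy sketch (the quadratic cost and values are illustrative, not from the labs):

import tensorflow as tf

# Toy bowl-shaped cost; any optimizer from the list above plugs in the same way
w = tf.Variable(5.0)
cost = tf.square(w - 1.0)
train = tf.train.AdamOptimizer(learning_rate=0.1).minimize(cost)
# train = tf.train.RMSPropOptimizer(learning_rate=0.01).minimize(cost)
# train = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9).minimize(cost)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(500):
    sess.run(train)
print(sess.run(w))  # approaches 1.0

As the next slides show, Adam is the default choice used throughout these labs.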
  • 259. ADAM: a method for stochastic optimization [Kingma et al. 2015]
  • 260. Use Adam Optimizer # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
• 261. Summary ●Softmax vs. Neural Nets for MNIST: 90% and 94.5% ●Xavier initialization: 97.8% ●Deep Neural Nets with Dropout: 98% ●Adam and other optimizers ●Exercise: Batch Normalization - https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-6-mnist_nn_batchnorm.ipynb
  • 262. Lecture and Lab 11 CNN Sung Kim <hunkim+ml@gmail.com> http://hunkim.github.io/ml/
  • 263. Lab 11 CNN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 265. Lab 11-1 CNN Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 268. CNN for CT images Asan Medical Center & Microsoft Medical Bigdata Contest Winner by GeunYoung Lee and Alex Kim https://www.slideshare.net/GYLee3/ss-72966495
  • 269. Convolution layer and max pooling
• 270. Simple convolution layer (figure): a 3x3x1 image convolved with a 2x2x1 filter at stride 1x1.
• 272. Simple convolution layer. Image: shape (1, 3, 3, 1) with values [[1, 2, 3], [4, 5, 6], [7, 8, 9]]. Filter: shape (2, 2, 1, 1), all ones: [[[[1.]], [[1.]]], [[[1.]], [[1.]]]]. Stride: 1x1, Padding: VALID, so the filter only visits positions fully inside the image and the output is 2x2.
• 274. Simple convolution layer with the same image and filter but Padding: SAME. Zeros are padded around the border so the output keeps the 3x3 input size.
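These padding rules are easy to check by running the slide's example through tf.nn.conv2d; a small sketch, where the printed values follow from the sliding-window sums:

import numpy as np
import tensorflow as tf

# The 3x3 image with values 1..9 and the 2x2 filter of ones from the slides
image = np.arange(1., 10., dtype=np.float32).reshape(1, 3, 3, 1)
weight = np.ones((2, 2, 1, 1), dtype=np.float32)

conv_valid = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='VALID')
conv_same = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='SAME')

sess = tf.Session()
print(sess.run(conv_valid).reshape(2, 2))  # [[12. 16.] [24. 28.]]
print(sess.run(conv_same).reshape(3, 3))   # 3x3 output; border windows include zero padding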
  • 281. Lab 11-2 CNN MNIST: 99%! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 285. # input placeholders X = tf.placeholder(tf.float32, [None, 784]) X_img = tf.reshape(X, [-1, 28, 28, 1]) # img 28x28x1 (black/white) Y = tf.placeholder(tf.float32, [None, 10]) # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') ''' Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) ''' Conv layer 1 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
  • 286. ''' Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) ''' # L2 ImgIn shape=(?, 14, 14, 32) W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01)) # Conv ->(?, 14, 14, 64) # Pool ->(?, 7, 7, 64) L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME') L2 = tf.nn.relu(L2) L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L2 = tf.reshape(L2, [-1, 7 * 7 * 64]) ''' Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32) Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32) Conv layer 2 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
  • 287. ''' Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32) Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32) ''' L2 = tf.reshape(L2, [-1, 7 * 7 * 64]) # Final FC 7x7x64 inputs -> 10 outputs W3 = tf.get_variable("W3", shape=[7 * 7 * 64, 10], initializer=tf.contrib.layers.xavier_initializer()) b = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Fully Connected (FC, Dense) layer https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
• 288. Training and Evaluation https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# train my model
print('Learning started. It takes some time.')
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch
    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
Results: Epoch: 0001 cost = 0.340291267, Epoch: 0002 cost = 0.090731326, Epoch: 0003 cost = 0.064477619, Epoch: 0004 cost = 0.050683064, ..., Epoch: 0011 cost = 0.017758641, Epoch: 0012 cost = 0.014156652, Epoch: 0013 cost = 0.012397016, Epoch: 0014 cost = 0.010693789, Epoch: 0015 cost = 0.009469977, Learning Finished! Accuracy: 0.9885
  • 290. Deep CNN Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
  • 291. # L3 ImgIn shape=(?, 7, 7, 64) W3 = tf.Variable(tf.random_normal([3, 3, 64, 128], stddev=0.01)) # Conv ->(?, 7, 7, 128) # Pool ->(?, 4, 4, 128) # Reshape ->(?, 4 * 4 * 128) # Flatten them for FC L3 = tf.nn.conv2d(L2, W3, strides=[1, 1, 1, 1], padding='SAME') L3 = tf.nn.relu(L3) L3 = tf.nn.max_pool(L3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L3 = tf.nn.dropout(L3, keep_prob=keep_prob) L3 = tf.reshape(L3, [-1, 128 * 4 * 4]) '''Tensor("Conv2D_2:0", shape=(?, 7, 7, 128), dtype=float32) Tensor("Relu_2:0", shape=(?, 7, 7, 128), dtype=float32) Tensor("MaxPool_2:0", shape=(?, 4, 4, 128), dtype=float32) Tensor("dropout_2/mul:0", shape=(?, 4, 4, 128), dtype=float32) Tensor("Reshape_1:0", shape=(?, 2048), dtype=float32)''' # L4 FC 4x4x128 inputs -> 625 outputs W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([625])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) L4 = tf.nn.dropout(L4, keep_prob=keep_prob) '''Tensor("Relu_3:0", shape=(?, 625), dtype=float32) Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)''' # L5 Final FC 625 inputs -> 10 outputs W5 = tf.get_variable("W5", shape=[625, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 '''Tensor("add_1:0", shape=(?, 10), dtype=float32)''' Deep CNN https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=keep_prob) '''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)''' # L2 ImgIn shape=(?, 14, 14, 32) W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01)) # Conv ->(?, 14, 14, 64) # Pool ->(?, 7, 7, 64) L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME') L2 = tf.nn.relu(L2) L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L2 = tf.nn.dropout(L2, keep_prob=keep_prob) '''Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32) Tensor("dropout_1/mul:0", shape=(?, 7, 7, 64), dtype=float32)'''
  • 292. Deep CNN https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=keep_prob) '''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)''' ... ... # L4 FC 4x4x128 inputs -> 625 outputs W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([625])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) L4 = tf.nn.dropout(L4, keep_prob=keep_prob) '''Tensor("Relu_3:0", shape=(?, 625), dtype=float32) Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)''' # L5 Final FC 625 inputs -> 10 outputs W5 = tf.get_variable("W5", shape=[625, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 '''Tensor("add_1:0", shape=(?, 10), dtype=float32)''' # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1})) Epoch: 0013 cost = 0.027188021 Epoch: 0014 cost = 0.023604777 Epoch: 0015 cost = 0.024607201 Learning Finished! Accuracy: 0.9938
  • 293. Lab 11-3 Class, Layers, Ensemble Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 295. CNN https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=keep_prob) '''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)''' ... ... # L4 FC 4x4x128 inputs -> 625 outputs W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([625])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) L4 = tf.nn.dropout(L4, keep_prob=keep_prob) '''Tensor("Relu_3:0", shape=(?, 625), dtype=float32) Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)''' # L5 Final FC 625 inputs -> 10 outputs W5 = tf.get_variable("W5", shape=[625, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 '''Tensor("add_1:0", shape=(?, 10), dtype=float32)''' # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1})) Epoch: 0013 cost = 0.027188021 Epoch: 0014 cost = 0.023604777 Epoch: 0015 cost = 0.024607201 Learning Finished! Accuracy: 0.9938
• 296. Python Class https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-3-mnist_cnn_class.py
class Model:
    def __init__(self, sess, name):
        self.sess = sess
        self.name = name
        self._build_net()
    def _build_net(self):
        with tf.variable_scope(self.name):
            # input place holders
            self.X = tf.placeholder(tf.float32, [None, 784])
            # img 28x28x1 (black/white)
            X_img = tf.reshape(self.X, [-1, 28, 28, 1])
            self.Y = tf.placeholder(tf.float32, [None, 10])
            # L1 ImgIn shape=(?, 28, 28, 1)
            W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
            ...
    def predict(self, x_test, keep_prop=1.0):
        return self.sess.run(self.logits, feed_dict={self.X: x_test, self.keep_prob: keep_prop})
    def get_accuracy(self, x_test, y_test, keep_prop=1.0):
        return self.sess.run(self.accuracy, feed_dict={self.X: x_test, self.Y: y_test, self.keep_prob: keep_prop})
    def train(self, x_data, y_data, keep_prop=0.7):
        return self.sess.run([self.cost, self.optimizer], feed_dict={self.X: x_data, self.Y: y_data, self.keep_prob: keep_prop})
# initialize
sess = tf.Session()
m1 = Model(sess, "m1")
sess.run(tf.global_variables_initializer())
print('Learning Started!')
# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        c, _ = m1.train(batch_xs, batch_ys)
        avg_cost += c / total_batch
  • 298. tf.layers https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-4-mnist_cnn_layers.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=self.keep_prob) … # L2 ImgIn shape=(?, 14, 14, 32) W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01)) # Convolutional Layer #1 conv1 = tf.layers.conv2d(inputs=X_img,filters=32,kernel_size=[3,3],padding="SAME",activation=tf.nn.relu) pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], padding="SAME", strides=2) dropout1 = tf.layers.dropout(inputs=pool1,rate=0.7, training=self.training) # Convolutional Layer #2 conv2 = tf.layers.conv2d(inputs=dropout1,filters=64,kernel_size=[3,3],padding="SAME",activation=tf.nn.relu) … flat = tf.reshape(dropout3, [-1, 128 * 4 * 4]) dense4 = tf.layers.dense(inputs=flat, units=625, activation=tf.nn.relu) dropout4 = tf.layers.dropout(inputs=dense4, rate=0.5, training=self.training) ...
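A self-contained sketch of the same stack written purely with tf.layers; shapes follow the slide, and the names are illustrative. One caution: tf.layers.dropout takes the drop probability, so keep_prob=0.7 from the earlier slides corresponds to rate=0.3, and the rate=0.7 shown above would drop 70% of activations:

import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 784])
X_img = tf.reshape(X, [-1, 28, 28, 1])
training = tf.placeholder(tf.bool)  # True while training, False for test

conv1 = tf.layers.conv2d(X_img, filters=32, kernel_size=[3, 3], padding="SAME", activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=[2, 2], strides=2, padding="SAME")
drop1 = tf.layers.dropout(pool1, rate=0.3, training=training)  # rate = fraction dropped

conv2 = tf.layers.conv2d(drop1, filters=64, kernel_size=[3, 3], padding="SAME", activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, pool_size=[2, 2], strides=2, padding="SAME")
drop2 = tf.layers.dropout(pool2, rate=0.3, training=training)

flat = tf.reshape(drop2, [-1, 7 * 7 * 64])  # 28 -> 14 -> 7 after two 2x2 pools
dense = tf.layers.dense(flat, units=625, activation=tf.nn.relu)
logits = tf.layers.dense(dense, units=10)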
  • 300. models = [] num_models = 7 for m in range(num_models): models.append(Model(sess, "model" + str(m))) sess.run(tf.global_variables_initializer()) print('Learning Started!') # train my model for epoch in range(training_epochs): avg_cost_list = np.zeros(len(models)) total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys =mnist.train.next_batch(batch_size) # train each model for m_idx, m in enumerate(models): c, _ = m.train(batch_xs, batch_ys) avg_cost_list[m_idx] += c / total_batch print('Epoch:','%04d'%(epoch + 1),'cost =', avg_cost_list) print('Learning Finished!') class Model: def __init__(self, sess, name): self.sess = sess self.name = name self._build_net() def _build_net(self): with tf.variable_scope(self.name): ... Ensemble training https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
• 303. Ensemble prediction: sum each model's per-class predictions over classes 0..9, then take the argmax.
Model 1: [0.1, 0.01, 0.02, 0.8, ...]
Model 2: [0.01, 0.5, 0.02, 0.4, ...]
Model 3: [0.01, 0.01, 0.1, 0.7, ...]
Sum: [0.12, 0.52, 0.14, 1.9, ...] -> argmax
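In NumPy terms, the table above boils down to a sum and an argmax; a tiny sketch using the slide's own numbers, truncated to four classes for illustration:

import numpy as np

# Per-class scores from three models; summing and argmax implements the ensemble vote
p1 = np.array([0.1, 0.01, 0.02, 0.8])
p2 = np.array([0.01, 0.5, 0.02, 0.4])
p3 = np.array([0.01, 0.01, 0.1, 0.7])
total = p1 + p2 + p3     # [0.12, 0.52, 0.14, 1.9]
print(np.argmax(total))  # 3: the class with the largest summed score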
  • 304. # Test model and check accuracy test_size = len(mnist.test.labels) predictions = np.zeros(test_size * 10).reshape(test_size, 10) for m_idx, m in enumerate(models): print(m_idx, 'Accuracy:', m.get_accuracy(mnist.test.images, mnist.test.labels)) p = m.predict(mnist.test.images) predictions += p ensemble_correct_prediction = tf.equal( tf.argmax(predictions, 1), tf.argmax(mnist.test.labels, 1)) ensemble_accuracy = tf.reduce_mean( tf.cast(ensemble_correct_prediction, tf.float32)) print('Ensemble accuracy:', sess.run(ensemble_accuracy)) Ensemble prediction 0 Accuracy: 0.9933 1 Accuracy: 0.9946 2 Accuracy: 0.9934 3 Accuracy: 0.9935 4 Accuracy: 0.9935 5 Accuracy: 0.9949 6 Accuracy: 0.9941 Ensemble accuracy: 0.9952 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
  • 305. Exercise ● Deep & Wide? ● CIFAR 10 ● ImageNet
  • 306. Lab 12 RNN Sung Kim <hunkim+ml@gmail.com> http://hunkim.github.io/ml/
  • 307. Lab 12 RNN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 309. Lab 12-1 RNN Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 311. RNN in TensorFlow cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size) ... outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
  • 312. RNN in TensorFlow cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size) cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size) ... outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
• 313. One node: 4 (input dim) in, 2 (hidden_size) out (figure)
  • 315. # One cell RNN input_dim (4) -> output_dim (2) hidden_size = 2 cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size) x_data = np.array([[[1,0,0,0]]], dtype=np.float32) outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) array([[[-0.42409304, 0.64651132]]]) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb One node: 4 (input-dim) in 2 (hidden_size)
  • 316. Unfolding to n sequences https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5
  • 317. Unfolding to n sequences # One cell RNN input_dim (4) -> output_dim (2). sequence: 5 hidden_size = 2 cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size) x_data = np.array([[h, e, l, l, o]], dtype=np.float32) print(x_data.shape) pp.pprint(x_data) outputs, states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) X_data = array ([[[ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 0., 1.]]], dtype=float32) Outputs = array ([[[ 0.19709368, 0.24918222], [-0.11721198, 0.1784237 ], [-0.35297349, -0.66278851], [-0.70915914, -0.58334434], [-0.38886023, 0.47304463]]], dtype=float32) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5
  • 319. Batching input # One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3 # 3 batches 'hello', 'eolll', 'lleel' x_data = np.array([[h, e, l, l, o], [e, o, l, l, l], [l, l, e, e, l]], dtype=np.float32) pp.pprint(x_data) cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True) outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) array([[[ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 0., 1.]], [[ 0., 1., 0., 0.], [ 0., 0., 0., 1.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.]], [[ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 1., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.]]], https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5 batch_size=3
  • 320. Batching input # One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3 # 3 batches 'hello', 'eolll', 'lleel' x_data = np.array([[h, e, l, l, o], [e, o, l, l, l], [l, l, e, e, l]], dtype=np.float32) pp.pprint(x_data) cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True) outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) array([[[ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 0., 1.]], [[ 0., 1., 0., 0.], [ 0., 0., 0., 1.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.]], [[ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 1., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.]]], array([[[-0.0173022 , -0.12929453], [-0.14995177, -0.23189341], [ 0.03294011, 0.01962204], [ 0.12852104, 0.12375218], [ 0.13597946, 0.31746736]], [[-0.15243632, -0.14177315], [ 0.04586344, 0.12249056], [ 0.14292534, 0.15872268], [ 0.18998367, 0.21004884], [ 0.21788891, 0.24151592]], [[ 0.10713603, 0.11001928], [ 0.17076059, 0.1799853 ], [-0.03531617, 0.08993293], [-0.1881337 , -0.08296411], [-0.00404597, 0.07156041]]], https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5 batch_size=3
  • 321. Lab 12-2 Hi Hello RNN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
• 323. Teach an RNN 'hihello' (figure): feed the characters of 'hihell' one step at a time and train the RNN to emit the next character at each step, producing 'ihello'.
  • 324. One-hot encoding [1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 0, 1], # o 4 ● text: ‘hihello’ ● unique chars (vocabulary, voc): h, i, e, l, o ● voc index: h:0, i:1, e:2, l:3, o:4
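One way to build these vectors in code; a small sketch, where the np.eye trick is a convenience assumption rather than the lab's code:

import numpy as np

idx2char = ['h', 'i', 'e', 'l', 'o']
char2idx = {c: i for i, c in enumerate(idx2char)}  # h:0, i:1, e:2, l:3, o:4

indices = [char2idx[c] for c in 'hihell']  # [0, 1, 0, 2, 3, 3]
one_hot = np.eye(len(idx2char))[indices]   # each index picks a row of the identity matrix
print(one_hot)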
• 325. Teach RNN 'hihello' (figure): each input character goes in one-hot, and the target is the next character.
Input 'hihell': h [1,0,0,0,0], i [0,1,0,0,0], h [1,0,0,0,0], e [0,0,1,0,0], l [0,0,0,1,0], l [0,0,0,1,0]
Target 'ihello': i [0,1,0,0,0], h [1,0,0,0,0], e [0,0,1,0,0], l [0,0,0,1,0], l [0,0,0,1,0], o [0,0,0,0,1]
  • 326. [1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 0, 1], # o 4 Teach RNN ‘hihello’
• 327. Creating an RNN cell
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
# or swap in other cell types:
rnn_cell = rnn_cell.BasicLSTMCell(rnn_size)
rnn_cell = rnn_cell.GRUCell(rnn_size)
• 329. Execute RNN
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
outputs, _states = tf.nn.dynamic_rnn(rnn_cell, X, initial_state=initial_state, dtype=tf.float32)
  • 330. RNN parameters hidden_size = 5 # output from the LSTM input_dim = 5 # one-hot size batch_size = 1 # one sentence sequence_length = 6 # |ihello| == 6 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
  • 331. Data creation idx2char = ['h', 'i', 'e', 'l', 'o'] # h=0, i=1, e=2, l=3, o=4 x_data = [[0, 1, 0, 2, 3, 3]] # hihell x_one_hot = [[[1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [1, 0, 0, 0, 0], # h 0 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 1, 0]]] # l 3 y_data = [[1, 0, 2, 3, 3, 4]] # ihello X = tf.placeholder(tf.float32, [None, sequence_length, input_dim]) # X one-hot Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
• 332. Feed to RNN
X = tf.placeholder(tf.float32, [None, sequence_length, input_dim])  # X one-hot
Y = tf.placeholder(tf.int32, [None, sequence_length])  # Y label
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(cell, X, initial_state=initial_state, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
x_one_hot = [[[1, 0, 0, 0, 0],  # h 0
[0, 1, 0, 0, 0],  # i 1
[1, 0, 0, 0, 0],  # h 0
[0, 0, 1, 0, 0],  # e 2
[0, 0, 0, 1, 0],  # l 3
[0, 0, 0, 1, 0]]]  # l 3
y_data = [[1, 0, 2, 3, 3, 4]]  # ihello
  • 334. # [batch_size, sequence_length] y_data = tf.constant([[1, 1, 1]]) # [batch_size, sequence_length, emb_dim ] prediction1 = tf.constant([[[0.3, 0.7], [0.3, 0.7], [0.3, 0.7]]], dtype=tf.float32) prediction2 = tf.constant([[[0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]], dtype=tf.float32) # [batch_size * sequence_length] weights = tf.constant([[1, 1, 1]], dtype=tf.float32) sequence_loss1 = tf.contrib.seq2seq.sequence_loss(prediction1, y_data, weights) sequence_loss2 = tf.contrib.seq2seq.sequence_loss(prediction2, y_data, weights) sess.run(tf.global_variables_initializer()) print("Loss1: ", sequence_loss1.eval(), "Loss2: ", sequence_loss2.eval()) Cost: sequence_loss Loss1: 0.513015 Loss2: 0.371101
  • 335. Cost: sequence_loss outputs, _states = tf.nn.dynamic_rnn( cell, X, initial_state=initial_state, dtype=tf.float32) weights = tf.ones([batch_size, sequence_length]) sequence_loss = tf.contrib.seq2seq.sequence_loss( logits=outputs, targets=Y, weights=weights) loss = tf.reduce_mean(sequence_loss) train = tf.train.AdamOptimizer(learning_rate=0.1).minimize(loss) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
• 336. Training
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_one_hot})
        print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)
        # print char using dic
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print("\tPrediction str: ", ''.join(result_str))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
• 337. Results (from the same training loop):
0 loss: 1.55474 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
1 loss: 1.55081 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
2 loss: 1.54704 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
3 loss: 1.54342 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
...
1998 loss: 0.75305 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
1999 loss: 0.752973 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
  • 338. Lab 12-3 RNN with long sequences Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 340. Manual data creation idx2char = ['h', 'i', 'e', 'l', 'o'] x_data = [[0, 1, 0, 2, 3, 3]] # hihell x_one_hot = [[[1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [1, 0, 0, 0, 0], # h 0 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 1, 0]]] # l 3 y_data = [[1, 0, 2, 3, 3, 4]] # ihello https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
  • 341. Better data creation sample = " if you want you" idx2char = list(set(sample)) # index -> char char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx sample_idx = [char2idx[c] for c in sample] # char to index x_data = [sample_idx[:-1]] # X data sample (0 ~ n-1) hello: hell y_data = [sample_idx[1:]] # Y label sample (1 ~ n) hello: ello X = tf.placeholder(tf.int32, [None, sequence_length]) # X data Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
  • 342. Hyper parameters sample = " if you want you" idx2char = list(set(sample)) # index -> char char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx # hyper parameters dic_size = len(char2idx) # RNN input size (one hot size) rnn_hidden_size = len(char2idx) # RNN output size num_classes = len(char2idx) # final output size (RNN or softmax, etc.) batch_size = 1 # one sample data, one batch sequence_length = len(sample) - 1 # number of lstm unfolding (unit #)
  • 343. LSTM and Loss X = tf.placeholder(tf.int32, [None, sequence_length]) # X data Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0 cell = tf.contrib.rnn.BasicLSTMCell(num_units=rnn_hidden_size, state_is_tuple=True) initial_state = cell.zero_state(batch_size, tf.float32) outputs, _states = tf.nn.dynamic_rnn( cell, X_one_hot, initial_state=initial_state, dtype=tf.float32) weights = tf.ones([batch_size, sequence_length]) sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y,weights=weights) loss = tf.reduce_mean(sequence_loss) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss) prediction = tf.argmax(outputs, axis=2) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
  • 344. Training and Results with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for i in range(3000): l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data}) result = sess.run(prediction, feed_dict={X: x_data}) # print char using dic result_str = [idx2char[c] for c in np.squeeze(result)] print(i, "loss:", l, "Prediction:", ''.join(result_str)) 0 loss: 2.29895 Prediction: nnuffuunnuuuyuy 1 loss: 2.29675 Prediction: nnuffuunnuuuyuy ... 1418 loss: 1.37351 Prediction: if you want you 1419 loss: 1.37331 Prediction: if you want you https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
  • 345. Really long sentence? sentence = ("if you want to build a ship, don't drum up people together to " "collect wood and don't assign them tasks and work, but rather " "teach them to long for the endless immensity of the sea.") https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
  • 346. Really long sentence? sentence = ("if you want to build a ship, don't drum up people together to " "collect wood and don't assign them tasks and work, but rather " "teach them to long for the endless immensity of the sea.") # training dataset 0 if you wan -> f you want 1 f you want -> you want 2 you want -> you want t 3 you want t -> ou want to … 168 of the se -> of the sea 169 of the sea -> f the sea. https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
  • 347. Making dataset char_set = list(set(sentence)) char_dic = {w: i for i, w in enumerate(char_set)} dataX = [] dataY = [] for i in range(0, len(sentence) - seq_length): x_str = sentence[i:i + seq_length] y_str = sentence[i + 1: i + seq_length + 1] print(i, x_str, '->', y_str) x = [char_dic[c] for c in x_str] # x str to index y = [char_dic[c] for c in y_str] # y str to index dataX.append(x) dataY.append(y) # training dataset 0 if you wan -> f you want 1 f you want -> you want 2 you want -> you want t 3 you want t -> ou want to … 168 of the se -> of the sea 169 of the sea -> f the sea. https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 348. RNN parameters

char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}

data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10  # any arbitrary number
batch_size = len(dataX)

# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 349. LSTM and Loss (the Lab 12-2 model reused unchanged, now fed the long sentence)

X = tf.placeholder(tf.int32, [None, seq_length])  # X data
Y = tf.placeholder(tf.int32, [None, seq_length])  # Y label
X_one_hot = tf.one_hot(X, num_classes)  # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0

cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
    cell, X_one_hot, initial_state=initial_state, dtype=tf.float32)

weights = tf.ones([batch_size, seq_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)

prediction = tf.argmax(outputs, axis=2)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
• 350. Exercise
● Run the RNN on the long sequence
● Why doesn't it work? (Hint: a single shallow cell whose raw outputs are used directly as logits has too little capacity for a long sequence — the next lab fixes this with a stacked RNN plus a softmax layer.)
• 351. Lab 12-4 RNN with long sequences: Stacked RNN + Softmax layer
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
• 353. Really long sentence?

sentence = ("if you want to build a ship, don't drum up people together to "
            "collect wood and don't assign them tasks and work, but rather "
            "teach them to long for the endless immensity of the sea.")

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 354. Making dataset

char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}

dataX = []
dataY = []
for i in range(0, len(sentence) - seq_length):
    x_str = sentence[i:i + seq_length]
    y_str = sentence[i + 1: i + seq_length + 1]
    print(i, x_str, '->', y_str)

    x = [char_dic[c] for c in x_str]  # x str to index
    y = [char_dic[c] for c in y_str]  # y str to index

    dataX.append(x)
    dataY.append(y)

# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 355. RNN parameters

char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}

data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10  # any arbitrary number
batch_size = len(dataX)

# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 357. Stacked RNN

X = tf.placeholder(tf.int32, [None, seq_length])
Y = tf.placeholder(tf.int32, [None, seq_length])

# One-hot encoding
X_one_hot = tf.one_hot(X, num_classes)
print(X_one_hot)  # check out the shape

# Make a LSTM cell with hidden_size (each unit output vector size)
cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)
cell = rnn.MultiRNNCell([cell] * 2, state_is_tuple=True)  # stack 2 layers
# (note: TF releases after 1.0 require a fresh cell instance per layer;
#  see the sketch below)

# outputs: unfolding size x hidden size, state = hidden size
outputs, _states = tf.nn.dynamic_rnn(cell, X_one_hot, dtype=tf.float32)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
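In TF releases after 1.0, reusing the same cell object for both layers raises an error, since the two layers would try to share one set of variables. A minimal sketch of the safer pattern, assuming the same hidden_size as above:

from tensorflow.contrib import rnn

hidden_size = 25  # illustrative; len(char_set) in this lab

def make_cell():
    # one fresh LSTM cell per layer, so each layer gets its own variables
    return rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)

cell = rnn.MultiRNNCell([make_cell() for _ in range(2)], state_is_tuple=True)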
• 358. Softmax (FC) in Deep CNN
Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
• 360. Softmax

# reshape in: fold batch and time into one axis before the FC layer
X_for_softmax = tf.reshape(outputs, [-1, hidden_size])
# reshape out: restore (batch, time, class) for sequence_loss
outputs = tf.reshape(outputs, [batch_size, seq_length, num_classes])

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 361. Softmax

# (optional) softmax layer
X_for_softmax = tf.reshape(outputs, [-1, hidden_size])
softmax_w = tf.get_variable("softmax_w", [hidden_size, num_classes])
softmax_b = tf.get_variable("softmax_b", [num_classes])
outputs = tf.matmul(X_for_softmax, softmax_w) + softmax_b
outputs = tf.reshape(outputs, [batch_size, seq_length, num_classes])

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
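The shape round trip is the key idea here. A runnable sketch with illustrative sizes (batch_size=170 and hidden_size=num_classes=25 roughly match this lab, but the exact dictionary size depends on the sentence):

import tensorflow as tf

batch_size, seq_length, hidden_size, num_classes = 170, 10, 25, 25  # illustrative
outputs = tf.zeros([batch_size, seq_length, hidden_size])
flat = tf.reshape(outputs, [-1, hidden_size])
print(flat.shape)    # (1700, 25): batch and time folded together for the matmul
w = tf.zeros([hidden_size, num_classes])
logits = tf.matmul(flat, w)
back = tf.reshape(logits, [batch_size, seq_length, num_classes])
print(back.shape)    # (170, 10, 25): ready for sequence_loss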
• 362. Loss

# reshape out for sequence_loss
outputs = tf.reshape(outputs, [batch_size, seq_length, num_classes])

# All weights are 1 (equal weights)
weights = tf.ones([batch_size, seq_length])

sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
mean_loss = tf.reduce_mean(sequence_loss)
train_op = tf.train.AdamOptimizer(learning_rate=0.1).minimize(mean_loss)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 363. Training and print results

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(500):
    _, l, results = sess.run(
        [train_op, mean_loss, outputs], feed_dict={X: dataX, Y: dataY})
    for j, result in enumerate(results):
        index = np.argmax(result, axis=1)
        print(i, j, ''.join([char_set[t] for t in index]), l)

0 167 tttttttttt 3.23111
0 168 tttttttttt 3.23111
0 169 tttttttttt 3.23111
…
499 167 oof the se 0.229306
499 168 tf the sea 0.229306
499 169 n the sea. 0.229306

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 364. Training and print results

# Let's print the last char of each result to check it works
results = sess.run(outputs, feed_dict={X: dataX})
for j, result in enumerate(results):
    index = np.argmax(result, axis=1)
    if j == 0:  # print all of the first result to seed the sentence
        print(''.join([char_set[t] for t in index]), end='')
    else:
        print(char_set[index[-1]], end='')

g you want to build a ship, don't drum up people together to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 367. char/word RNN (char/word level n-to-n model)
https://github.com/sherjilozair/char-rnn-tensorflow
https://github.com/hunkim/word-rnn-tensorflow
• 368. Lab 12-5 Dynamic RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
• 370. Different sequence length

h e l l o
h i
w h y
...
• 371. Different sequence length

h e l l o
h i <pad> <pad> <pad>
w h y <pad> <pad>
...
• 372. Different sequence length

h e l l o
h i
w h y
...
sequence_length=[5, 2, 3]
• 373. Dynamic RNN

# 3 batches: 'hello', 'eolll', 'lleel'
x_data = np.array([[[...]]], dtype=np.float32)

hidden_size = 2
cell = rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(
    cell, x_data, sequence_length=[5, 3, 4], dtype=tf.float32)
sess.run(tf.global_variables_initializer())
print(outputs.eval())  # steps beyond each declared length come back as zeros

array([[[-0.17904168, -0.08053244],
        [-0.01294809,  0.01660814],
        [-0.05754048, -0.1368292 ],
        [-0.08655578, -0.20553185],
        [ 0.07297077, -0.21743253]],

       [[ 0.10272847,  0.06519825],
        [ 0.20188759, -0.05027055],
        [ 0.09514933, -0.16452041],
        [ 0.        ,  0.        ],
        [ 0.        ,  0.        ]],

       [[-0.04893036, -0.14655617],
        [-0.07947272, -0.20996611],
        [ 0.06466491, -0.02576563],
        [ 0.15087658,  0.05166111],
        [ 0.        ,  0.        ]]],

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
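The slide elides the one-hot x_data. A self-contained sketch of how it might be built and fed (assuming TF 1.x; the four-character vocabulary is an illustration, not the notebook's exact array):

import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn

chars = ['h', 'e', 'l', 'o']
char2idx = {c: i for i, c in enumerate(chars)}

def one_hot(word):
    # each character becomes a length-4 one-hot row vector
    return np.eye(len(chars), dtype=np.float32)[[char2idx[c] for c in word]]

x_data = tf.constant(np.stack([one_hot(w) for w in ['hello', 'eolll', 'lleel']]))  # (3, 5, 4)
cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True)
outputs, _ = tf.nn.dynamic_rnn(cell, x_data, sequence_length=[5, 3, 4],
                               dtype=tf.float32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(outputs))  # rows past each sequence's length are zeros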
• 374. Lab 12-6 RNN with time series data (stock)
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
• 377. Time series data

Open        High        Low         Volume    Close
828.659973  833.450012  828.349976  1247700   831.659973
823.02002   828.070007  821.655029  1597800   828.070007
819.929993  824.400024  818.97998   1281700   824.159973
819.359985  823         818.469971  1304000   818.97998
819         823         816         1053600   820.450012
816         820.958984  815.48999   1198100   819.23999
811.700012  815.25      809.780029  1129100   813.669983
809.51001   810.659973  804.539978  989700    809.559998
807         811.840027  803.190002  1155300   808.380005

'data-02-stock_daily.csv'
• 378. Many to one
[Diagram: a many-to-one RNN — inputs at time steps 1–7 feed a single prediction at step 8]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 379.

Open        High        Low         Volume    Close
828.659973  833.450012  828.349976  1247700   831.659973
823.02002   828.070007  821.655029  1597800   828.070007
819.929993  824.400024  818.97998   1281700   824.159973
819.359985  823         818.469971  1304000   818.97998
819         823         816         1053600   820.450012
816         820.958984  815.48999   1198100   819.23999
811.700012  815.25      809.780029  1129100   813.669983
809.51001   810.659973  804.539978  989700    ?
807         811.840027  803.190002  1155300   ?
• 380. Reading data

timesteps = seq_length = 7
data_dim = 5
output_dim = 1

# Open, High, Low, Volume, Close
xy = np.loadtxt('data-02-stock_daily.csv', delimiter=',')
xy = xy[::-1]  # reverse order (to chronological order)
xy = MinMaxScaler(xy)
x = xy
y = xy[:, [-1]]  # Close as label

dataX = []
dataY = []
for i in range(0, len(y) - seq_length):
    _x = x[i:i + seq_length]
    _y = y[i + seq_length]  # next close price
    print(_x, "->", _y)
    dataX.append(_x)
    dataY.append(_y)

[ 0.18667876 0.20948057 0.20878184 0. 0.21744815]
[ 0.30697388 0.31463414 0.21899367 0.01247647 0.21698189]
[ 0.21914211 0.26390721 0.2246864 0.45632338 0.22496747]
[ 0.23312993 0.23641916 0.16268272 0.57017119 0.14744274]
[ 0.13431201 0.15175877 0.11617252 0.39380658 0.13289962]
[ 0.13973232 0.17060429 0.15860382 0.28173344 0.18171679]
[ 0.18933069 0.20057799 0.19187983 0.29783096 0.2086465 ]] -> [ 0.14106001]

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
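The slides never show MinMaxScaler; in the companion repo it is a small helper, likely along these lines (a sketch, not the verbatim repo code):

import numpy as np

def MinMaxScaler(data):
    # scale each column to the [0, 1] range; the small epsilon avoids
    # division by zero when a column is constant
    numerator = data - np.min(data, axis=0)
    denominator = np.max(data, axis=0) - np.min(data, axis=0)
    return numerator / (denominator + 1e-7)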
• 381. Training and test datasets

# split to train and testing
train_size = int(len(dataY) * 0.7)
test_size = len(dataY) - train_size
trainX, testX = np.array(dataX[0:train_size]), np.array(dataX[train_size:len(dataX)])
trainY, testY = np.array(dataY[0:train_size]), np.array(dataY[train_size:len(dataY)])

# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 382. LSTM and Loss

# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])

cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_dim, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
Y_pred = tf.contrib.layers.fully_connected(
    outputs[:, -1], output_dim, activation_fn=None)  # we use the last cell's output

# cost/loss
loss = tf.reduce_sum(tf.square(Y_pred - Y))  # sum of the squares
# optimizer
optimizer = tf.train.AdamOptimizer(0.01)
train = optimizer.minimize(loss)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 383. Training and Results

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(1000):
    _, l = sess.run([train, loss], feed_dict={X: trainX, Y: trainY})
    print(i, l)

testPredict = sess.run(Y_pred, feed_dict={X: testX})

import matplotlib.pyplot as plt
plt.plot(testY)
plt.plot(testPredict)
plt.show()

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 384. Exercise
● Implement stock prediction using linear regression only (see the baseline sketch below)
● Improve the results with more features, such as keywords and/or sentiment from top news
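A minimal linear-regression baseline sketch to get started — this is one possible answer, not the deck's solution; it simply flattens each 7-day window into one feature vector:

import tensorflow as tf

seq_length, data_dim = 7, 5  # same window as the RNN version
X = tf.placeholder(tf.float32, [None, seq_length * data_dim])  # flattened window
Y = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.random_normal([seq_length * data_dim, 1]))
b = tf.Variable(tf.random_normal([1]))
Y_pred = tf.matmul(X, W) + b
loss = tf.reduce_sum(tf.square(Y_pred - Y))
train = tf.train.AdamOptimizer(0.01).minimize(loss)
# feed trainX.reshape(-1, seq_length * data_dim) in place of the 3-D tensor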
• 385. Other RNN applications
● Language Modeling
● Speech Recognition
● Machine Translation
● Conversation Modeling/Question Answering
● Image/Video Captioning
● Image/Music/Dance Generation
http://jiwonkim.org/awesome-rnn/
• 386. Google Cloud ML Examples
Sung Kim <hunkim+ml@gmail.com>
https://github.com/hunkim/GoogleCloudMLExamples
• 389. Cloud ML TensorFlow Tasks
[Diagram: a TensorFlow task submitted to and run on Google Cloud ML]
• 397. Google Cloud commands
• gcloud: command-line interface to Google Cloud Platform
  - Google Cloud ML jobs (`gcloud beta ml`)
  - Google Compute Engine virtual machine instances and other resources
  - Google Cloud Dataproc clusters and jobs
  - Google Cloud Deployment Manager deployments
  - …
• gsutil: command-line interface to Google Cloud Storage
https://cloud.google.com/sdk/gcloud/
https://cloud.google.com/storage/docs/gsutil
  • 399. Example git repository git clone https://github.com/hunkim/GoogleCloudMLExamples.git
• 404. Jobs
[Screenshot: job list in the Google Cloud console]
• 410. Cloud ML TensorFlow Tasks
[Diagram: a TensorFlow task submitted to and run on Google Cloud ML]
• 411. Setting and file copy

JOB_NAME="task9"
PROJECT_ID=`gcloud config list project --format "value(core.project)"`
STAGING_BUCKET=gs://${PROJECT_ID}-ml
INPUT_PATH=${STAGING_BUCKET}/input

gsutil cp input/input.csv $INPUT_PATH/input.csv
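After the input is copied, a training job would typically be submitted with the `gcloud beta ml` CLI of that era. A sketch — the package path, module name, region, and the trailing user flag are assumptions, not taken from the example repo:

gcloud beta ml jobs submit training ${JOB_NAME} \
    --package-path=trainer \
    --module-name=trainer.task \
    --staging-bucket=${STAGING_BUCKET} \
    --region=us-central1 \
    -- \
    --input_path=${INPUT_PATH}/input.csv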
• 414. Jobs
[Screenshot: the submitted job running in the Google Cloud console]
• 415. Logs
[Screenshot: training logs for the job in the Google Cloud console]
  • 424. With Great Power Comes Great Responsibility
• 426. Next
• Cloud ML deploy
• Hyper-parameter tuning
• Distributed training tasks