Lab 1
TensorFlow Basics
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
Call for comments
Please feel free to add comments directly on these slides
Other slides: https://goo.gl/jPtWNt
Picture from http://www.tssablog.org/archives/3280
TensorFlow
https://twitter.com/fchollet/status/830499993450450944/
● TensorFlow™ is an open source software library for numerical computation using data flow graphs.
● Python!
https://www.tensorflow.org/
What is a Data Flow Graph?
● Nodes in the graph represent mathematical operations.
● Edges represent the multidimensional data arrays (tensors) communicated between them.
https://www.tensorflow.org/
Installing TensorFlow
● Linux, Mac OS X, Windows
• (sudo -H) pip install --upgrade tensorflow
• (sudo -H) pip install --upgrade tensorflow-gpu
● From source
• bazel ...
• https://www.tensorflow.org/install/install_sources
● Google search/Community help
• https://www.facebook.com/groups/TensorFlowKR/
https://www.tensorflow.org/install/
Check installation and version
Sungs-MacBook-Pro:hunkim$ python3
Python 3.6.0 (v3.6.0:41df79263a11, Dec 22 2016, 17:23:13)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more
information.
>>> import tensorflow as tf
>>> tf.__version__
'1.0.0'
>>>
https://github.com/hunkim/DeepLearningZeroToAll/
TensorFlow Hello World!
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
b'String': the 'b' indicates a bytes literal. http://stackoverflow.com/questions/6269765/
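The hello-world cell itself survived only as a screenshot in the deck; a minimal sketch of what the notebook runs (TF 1.x API):
import tensorflow as tf

# Create a constant op; it is added as a node to the default graph
hello = tf.constant("Hello, TensorFlow!")

# Start a session and run the op
sess = tf.Session()
print(sess.run(hello))  # b'Hello, TensorFlow!' -- note the bytes literal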
Computational Graph
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
TensorFlow Mechanics
(1) Build graph using TensorFlow operations
(2) Feed data and run graph (operation): sess.run(op)
(3) Update variables in the graph (and return values)
Computational Graph
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
(1) Build graph (tensors) using TensorFlow operations
(2) Feed data and run graph (operation): sess.run(op)
(3) Update variables in the graph (and return values)
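The graph-building cell was an image in the deck; a minimal sketch of steps (1)-(3), assuming the node1/node2/node3 example from the lab-01 notebook:
import tensorflow as tf

# (1) Build graph (tensors) using TensorFlow operations
node1 = tf.constant(3.0, tf.float32)
node2 = tf.constant(4.0)  # also tf.float32, implicitly
node3 = tf.add(node1, node2)

# Printing a tensor shows the graph node, not its value
print("node1:", node1, "node2:", node2, "node3:", node3)

# (2) Run the graph in a session; (3) values are returned
sess = tf.Session()
print(sess.run([node1, node2]))  # [3.0, 4.0]
print(sess.run(node3))           # 7.0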
Placeholder
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
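The placeholder cell was likewise an image; a minimal sketch of the adder example from the same notebook:
import tensorflow as tf

# Placeholders are graph inputs whose values are supplied at run time
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b  # shortcut for tf.add(a, b)

sess = tf.Session()
print(sess.run(adder_node, feed_dict={a: 3, b: 4.5}))          # 7.5
print(sess.run(adder_node, feed_dict={a: [1, 3], b: [2, 4]}))  # [ 3.  7.]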
TensorFlow Mechanics
(1) Build graph using TensorFlow operations
(2) Feed data and run graph (operation): sess.run(op, feed_dict={x: x_data})
(3) Update variables in the graph (and return values)
Everything is Tensor
t = tf.constant([1., 2., 3.])
Tensor Ranks, Shapes, and Types
https://www.tensorflow.org/programmers_guide/dims_types
https://www.quora.com/When-should-I-use-tf-float32-vs-tf-float64-in-TensorFlow
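The rank/shape/type tables were images in the deck; a small sketch of the convention they describe:
# Rank = number of dimensions; shape = size along each dimension
t0 = 483                               # rank 0, scalar,   shape []
t1 = [1., 2., 3.]                      # rank 1, vector,   shape [3]
t2 = [[1., 2., 3.], [4., 5., 6.]]      # rank 2, matrix,   shape [2, 3]
t3 = [[[1., 2.], [3., 4.]]]            # rank 3, 3-tensor, shape [1, 2, 2]
# dtypes: tf.float32 is the usual choice; tf.float64 is rarely needed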
Lab 2
Linear Regression
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Variables
https://www.tensorflow.org/programmers_guide/variables
Hypothesis and cost function
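The formula image did not survive extraction; restating the linear-regression hypothesis and cost this lab implements:
H(x) = Wx + b, \qquad
\mathrm{cost}(W, b) = \frac{1}{m} \sum_{i=1}^{m} \left( H(x^{(i)}) - y^{(i)} \right)^2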
Build graph using TF operations
# X and Y data
x_train = [1, 2, 3]
y_train = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Our hypothesis XW+b
hypothesis = x_train * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
Build graph using TF operations
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
GradientDescent
https://www.tensorflow.org/api_docs/python/tf/reduce_mean
Run/update graph and get results
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line
for step in range(2001):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W), sess.run(b))
Full code (less than 20 lines)
import tensorflow as tf
# X and Y data
x_train = [1, 2, 3]
y_train = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Our hypothesis XW+b
hypothesis = x_train * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - y_train))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line
for step in range(2001):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(cost), sess.run(W), sess.run(b))
'''
0 2.82329 [ 2.12867713] [-0.85235667]
20 0.190351 [ 1.53392804] [-1.05059612]
40 0.151357 [ 1.45725465] [-1.02391243]
...
1920 1.77484e-05 [ 1.00489295] [-0.01112291]
1940 1.61197e-05 [ 1.00466311] [-0.01060018]
1960 1.46397e-05 [ 1.004444] [-0.01010205]
1980 1.32962e-05 [ 1.00423515] [-0.00962736]
2000 1.20761e-05 [ 1.00403607] [-0.00917497]
'''
Placeholders
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb
# X and Y data
x_train = [1, 2, 3]
y_train = [1, 2, 3]
# Now we can use X and Y in place of x_data and y_data
# placeholders for a tensor that will be always fed using feed_dict
# See http://stackoverflow.com/questions/36693740/
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
...
# Fit the line
for step in range(2001):
    cost_val, W_val, b_val, _ = \
        sess.run([cost, W, b, train],
                 feed_dict={X: [1, 2, 3], Y: [1, 2, 3]})
    if step % 20 == 0:
        print(step, cost_val, W_val, b_val)
Full code with placeholders
import tensorflow as tf
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])
# Our hypothesis XW+b
hypothesis = X * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line
for step in range(2001):
    cost_val, W_val, b_val, _ = sess.run([cost, W, b, train],
                                         feed_dict={X: [1, 2, 3], Y: [1, 2, 3]})
    if step % 20 == 0:
        print(step, cost_val, W_val, b_val)
...
1980 1.32962e-05 [ 1.00423515] [-0.00962736]
2000 1.20761e-05 [ 1.00403607] [-0.00917497]
# Testing our model
print(sess.run(hypothesis, feed_dict={X: [5]}))
print(sess.run(hypothesis, feed_dict={X: [2.5]}))
print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]}))
[ 5.0110054]
[ 2.50091505]
[ 1.49687922 3.50495124]
Full code with placeholders
import tensorflow as tf
W = tf.Variable(tf.random_normal([1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
X = tf.placeholder(tf.float32, shape=[None])
Y = tf.placeholder(tf.float32, shape=[None])
# Our hypothesis XW+b
hypothesis = X * W + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Fit the line with new training data
for step in range(2001):
    cost_val, W_val, b_val, _ = sess.run([cost, W, b, train],
                                         feed_dict={X: [1, 2, 3, 4, 5],
                                                    Y: [2.1, 3.1, 4.1, 5.1, 6.1]})
    if step % 20 == 0:
        print(step, cost_val, W_val, b_val)
…
1960 3.32396e-07 [ 1.00037301] [ 1.09865296]
1980 2.90429e-07 [ 1.00034881] [ 1.09874094]
2000 2.5373e-07 [ 1.00032604] [ 1.09882331]
# Testing our model
print(sess.run(hypothesis, feed_dict={X: [5]}))
print(sess.run(hypothesis, feed_dict={X: [2.5]}))
print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]}))
[ 6.10045338]
[ 3.59963846]
[ 2.59931231 4.59996414]
TensorFlow Mechanics
(1) Build graph using TensorFlow operations
(2) Feed data and run graph (operation):
    sess.run(op, feed_dict={X: [1, 2, 3, 4, 5], Y: [2.1, 3.1, 4.1, 5.1, 6.1]})
(3) Update variables in the graph (and return values)
Lab 3
Minimizing Cost
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Simplified hypothesis
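The formula image is missing here too; the simplified form drops the bias so cost is a function of W alone:
H(x) = Wx, \qquad
\mathrm{cost}(W) = \frac{1}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right)^2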
import tensorflow as tf
import matplotlib.pyplot as plt
X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.placeholder(tf.float32)
# Our hypothesis for linear model X * W
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Variables for plotting cost function
W_val = []
cost_val = []
for i in range(-30, 50):
    feed_W = i * 0.1
    curr_cost, curr_W = sess.run([cost, W], feed_dict={W: feed_W})
    W_val.append(curr_W)
    cost_val.append(curr_cost)
# Show the cost function
plt.plot(W_val, cost_val)
plt.show()
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-1-minimizing_cost_show_graph.py
http://matplotlib.org/users/installing.html
[Plot: cost(W) as a function of W]
Gradient descent
[Plot: gradient descent steps on the cost(W) curve]
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
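Why gradient = tf.reduce_mean((W * X - Y) * X) is the derivative: differentiating the mean-squared cost gives, up to the constant 2 (which only rescales learning_rate),
\frac{\partial}{\partial W} \frac{1}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right)^2
= \frac{2}{m} \sum_{i=1}^{m} \left( W x^{(i)} - y^{(i)} \right) x^{(i)}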
import tensorflow as tf
x_data = [1, 2, 3]
y_data = [1, 2, 3]
W = tf.Variable(tf.random_normal([1]), name='weight')
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
# Our hypothesis for linear model X * W
hypothesis = X * W
# cost/loss function
cost = tf.reduce_sum(tf.square(hypothesis - Y))
# Minimize: Gradient Descent using derivative: W -= learning_rate * derivative
learning_rate = 0.1
gradient = tf.reduce_mean((W * X - Y) * X)
descent = W - learning_rate * gradient
update = W.assign(descent)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(21):
    sess.run(update, feed_dict={X: x_data, Y: y_data})
    print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py
0 5.81756 [ 1.64462376]
1 1.65477 [ 1.34379935]
2 0.470691 [ 1.18335962]
3 0.133885 [ 1.09779179]
4 0.0380829 [ 1.05215561]
5 0.0108324 [ 1.0278163]
6 0.00308123 [ 1.01483536]
7 0.000876432 [ 1.00791216]
8 0.00024929 [ 1.00421977]
9 7.09082e-05 [ 1.00225055]
10 2.01716e-05 [ 1.00120032]
11 5.73716e-06 [ 1.00064015]
12 1.6319e-06 [ 1.00034142]
13 4.63772e-07 [ 1.00018203]
14 1.31825e-07 [ 1.00009704]
15 3.74738e-08 [ 1.00005174]
16 1.05966e-08 [ 1.00002754]
17 2.99947e-09 [ 1.00001466]
18 8.66635e-10 [ 1.00000787]
19 2.40746e-10 [ 1.00000417]
20 7.02158e-11 [ 1.00000226]
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
Output when W=5
import tensorflow as tf
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(5.0)
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(100):
    print(step, sess.run(W))
    sess.run(train)
0 5.0
1 1.26667
2 1.01778
3 1.00119
4 1.00008
5 1.00001
6 1.0
7 1.0
8 1.0
9 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
Output when W=-3
0 -3.0
1 0.733334
2 0.982222
3 0.998815
4 0.999921
5 0.999995
6 1.0
7 1.0
8 1.0
9 1.0
import tensorflow as tf
# tf Graph Input
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(-3.0)
# Linear model
hypothesis = X * W
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize: Gradient Descent Magic
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(100):
    print(step, sess.run(W))
    sess.run(train)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
Optional: compute_gradients and apply_gradients
import tensorflow as tf
X = [1, 2, 3]
Y = [1, 2, 3]
# Set wrong model weights
W = tf.Variable(5.)
# Linear model
hypothesis = X * W
# Manual gradient
gradient = tf.reduce_mean((W * X - Y) * X) * 2
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
# Get gradients
gvs = optimizer.compute_gradients(cost, [W])
# Apply gradients
apply_gradients = optimizer.apply_gradients(gvs)
# Launch the graph in a session.
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(100):
    print(step, sess.run([gradient, W, gvs]))
    sess.run(apply_gradients)
0 [37.333332, 5.0, [(37.333336, 5.0)]]
1 [33.848888, 4.6266665, [(33.848888, 4.6266665)]]
2 [30.689657, 4.2881775, [(30.689657, 4.2881775)]]
3 [27.825287, 3.9812808, [(27.825287, 3.9812808)]]
4 [25.228262, 3.703028, [(25.228264, 3.703028)]]
...
96 [0.0030694802, 1.0003289, [(0.0030694804, 1.0003289)]]
97 [0.0027837753, 1.0002983, [(0.0027837753, 1.0002983)]]
98 [0.0025234222, 1.0002704, [(0.0025234222, 1.0002704)]]
99 [0.0022875469, 1.0002451, [(0.0022875469, 1.0002451)]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-X-minimizing_cost_tf_gradient.py
Lab 4-1
Multi-variable linear regression
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Hypothesis using matrix
Test Scores for General Psychology
x1  x2  x3  | Y
73  80  75  | 152
93  88  93  | 185
89  91  90  | 180
96  98  100 | 196
73  66  70  | 142
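The hypothesis image is missing; with three input features the model the slides build is, elementwise and in matrix form,
H(x_1, x_2, x_3) = x_1 w_1 + x_2 w_2 + x_3 w_3 + b, \qquad H(X) = XW + b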
x1_data = [73., 93., 89., 96., 73.]
x2_data = [80., 88., 91., 98., 66.]
x3_data = [75., 93., 90., 100., 70.]
y_data = [152., 185., 180., 196., 142.]
# placeholders for a tensor that will be always fed.
x1 = tf.placeholder(tf.float32)
x2 = tf.placeholder(tf.float32)
x3 = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
w1 = tf.Variable(tf.random_normal([1]), name='weight1')
w2 = tf.Variable(tf.random_normal([1]), name='weight2')
w3 = tf.Variable(tf.random_normal([1]), name='weight3')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
import tensorflow as tf
x1_data = [73., 93., 89., 96., 73.]
x2_data = [80., 88., 91., 98., 66.]
x3_data = [75., 93., 90., 100., 70.]
y_data = [152., 185., 180., 196., 142.]
# placeholders for a tensor that will be always fed.
x1 = tf.placeholder(tf.float32)
x2 = tf.placeholder(tf.float32)
x3 = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
w1 = tf.Variable(tf.random_normal([1]), name='weight1')
w2 = tf.Variable(tf.random_normal([1]), name='weight2')
w3 = tf.Variable(tf.random_normal([1]), name='weight3')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b
# cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize. Need a very small learning rate for this data set
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run([cost, hypothesis, train],
                                   feed_dict={x1: x1_data, x2: x2_data, x3: x3_data, Y: y_data})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
0 Cost: 19614.8
Prediction:
[ 21.69748688
39.10213089 31.82624626
35.14236832
32.55316544]
10 Cost: 14.0682
Prediction:
[ 145.56100464
187.94958496
178.50236511
194.86721802
146.08096313]
...
1990 Cost: 4.9197
Prediction:
[ 148.15084839
186.88632202
179.6293335
195.81796265
144.46044922]
2000 Cost: 4.89449
Prediction:
[ 148.15931702
186.8805542
179.63194275
195.81971741
144.45298767]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
Matrix
x_data = [[73., 80., 75.], [93., 88., 93.],
[89., 91., 90.], [96., 98., 100.], [73., 66., 70.]]
y_data = [[152.], [185.], [180.], [196.], [142.]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py
Matrix
import tensorflow as tf
x_data = [[73., 80., 75.], [93., 88., 93.],
[89., 91., 90.], [96., 98., 100.], [73., 66., 70.]]
y_data = [[152.], [185.], [180.], [196.], [142.]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
# Simplified cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
0 Cost: 7105.46
Prediction:
[[ 80.82241058]
[ 92.26364136]
[ 93.70250702]
[ 98.09217834]
[ 72.51759338]]
10 Cost: 5.89726
Prediction:
[[ 155.35159302]
[ 181.85691833]
[ 181.97254944]
[ 194.21760559]
[ 140.85707092]]
...
1990 Cost: 3.18588
Prediction:
[[ 154.36352539]
[ 182.94833374]
[ 181.85189819]
[ 194.35585022]
[ 142.03240967]]
2000 Cost: 3.1781
Prediction:
[[ 154.35881042]
[ 182.95147705]
[ 181.85035706]
[ 194.35533142]
[ 142.036026 ]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py
Lab 4-2
Loading Data from File
With TF 1.0!
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
https://github.com/hunkim/DeepLearningZeroToAll/
Loading data from file
data-01-test-score.csv
# EXAM1,EXAM2,EXAM3,FINAL
73,80,75,152
93,88,93,185
89,91,90,180
96,98,100,196
73,66,70,142
53,46,55,101
import numpy as np
xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# Make sure the shape and data are OK
print(x_data.shape, x_data, len(x_data))
print(y_data.shape, y_data)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
Slicing
http://cs231n.github.io/python-numpy-tutorial/
http://slides.com/wigging
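The slicing slide reproduced the numpy tutorial's cheat sheet as an image; a minimal sketch of the two idioms used above:
import numpy as np

a = np.array([[73., 80., 75., 152.],
              [93., 88., 93., 185.]])
a[:, 0:-1]   # every row, all columns but the last -> shape (2, 3)
a[:, [-1]]   # every row, last column, kept 2-D    -> shape (2, 1)
a[:, -1]     # same column but flattened           -> shape (2,)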
import tensorflow as tf
import numpy as np
tf.set_random_seed(777) # for reproducibility
xy = np.loadtxt('data-01-test-score.csv', delimiter=',',
dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# Make sure the shape and data are OK
print(x_data.shape, x_data, len(x_data))
print(y_data.shape, y_data)
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
# Simplified cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Set up feed_dict variables inside the loop.
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train],
        feed_dict={X: x_data, Y: y_data})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val,
              "\nPrediction:\n", hy_val)
# Ask my score
print("Your score will be ", sess.run(hypothesis,
      feed_dict={X: [[100, 70, 101]]}))
print("Other scores will be ", sess.run(hypothesis,
      feed_dict={X: [[60, 70, 110], [90, 100, 80]]}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
Output
Your score will be [[ 181.73277283]]
Other scores will be [[ 145.86265564] [ 187.23129272]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
Queue Runners
https://www.tensorflow.org/programmers_guide/reading_data
filename_queue = tf.train.string_input_producer(
    ['data-01-test-score.csv', 'data-02-test-score.csv', ...],
    shuffle=False, name='filename_queue')
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
record_defaults = [[0.], [0.], [0.], [0.]]
xy = tf.decode_csv(value, record_defaults=record_defaults)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
tf.train.batch
# collect batches of csv in
train_x_batch, train_y_batch = \
    tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10)
sess = tf.Session()
...
# Start populating the filename queue.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
for step in range(2001):
    x_batch, y_batch = sess.run([train_x_batch, train_y_batch])
    ...
coord.request_stop()
coord.join(threads)
https://www.tensorflow.org/programmers_guide/reading_data
import tensorflow as tf
filename_queue = tf.train.string_input_producer(
['data-01-test-score.csv'], shuffle=False, name='filename_queue')
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[0.], [0.], [0.], [0.]]
xy = tf.decode_csv(value, record_defaults=record_defaults)
# collect batches of csv in
train_x_batch, train_y_batch = \
    tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10)
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 3])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([3, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis
hypothesis = tf.matmul(X, W) + b
# Simplified cost/loss function
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
# Launch the graph in a session.
sess = tf.Session()
# Initializes global variables in the graph.
sess.run(tf.global_variables_initializer())
# Start populating the filename queue.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
for step in range(2001):
    x_batch, y_batch = sess.run([train_x_batch, train_y_batch])
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train],
        feed_dict={X: x_batch, Y: y_batch})
    if step % 10 == 0:
        print(step, "Cost: ", cost_val,
              "\nPrediction:\n", hy_val)
coord.request_stop()
coord.join(threads)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
shuffle_batch
# min_after_dequeue defines how big a buffer we will randomly sample
# from -- bigger means better shuffling but slower start up and more
# memory used.
# capacity must be larger than min_after_dequeue and the amount larger
# determines the maximum we will prefetch. Recommendation:
# min_after_dequeue + (num_threads + a small safety margin) * batch_size
min_after_dequeue = 10000
capacity = min_after_dequeue + 3 * batch_size
example_batch, label_batch = tf.train.shuffle_batch(
[example, label], batch_size=batch_size, capacity=capacity,
min_after_dequeue=min_after_dequeue)
https://www.tensorflow.org/programmers_guide/reading_data
Lab 5
Logistic (regression) classifier
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Logistic Regression
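The sigmoid and cost formulas were an image; restating what the code below implements:
H(X) = \frac{1}{1 + e^{-W^{\top} X}}, \qquad
\mathrm{cost}(W) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log H(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - H(x^{(i)})\right) \right]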
Training Data
x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [[0], [0], [0], [1], [1], [1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W) + b))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) *
                       tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
Train the model
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, cost_val)
    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]]
y_data = [[0], [0], [0], [1], [1], [1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 2])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(10001):
        cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, cost_val)
    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy],
                       feed_dict={X: x_data, Y: y_data})
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)
# step, cost
0 1.73078
200 0.571512
400 0.507414
...
9600 0.154132
9800 0.151778
10000 0.149496
Hypothesis:
[[ 0.03074029]
[ 0.15884677]
[ 0.30486736]
[ 0.78138196]
[ 0.93957496]
[ 0.98016882]]
Correct (Y):
[[ 0.]
[ 0.]
[ 0.]
[ 1.]
[ 1.]
[ 1.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
Classifying diabetes
xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 8])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([8, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {X: x_data, Y: y_data}
    for step in range(10001):
        sess.run(train, feed_dict=feed)
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict=feed))
    # Accuracy report
    h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict=feed)
    print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a)
0 0.82794
200 0.755181
400 0.726355
600 0.705179
800 0.686631
...
9600 0.492056
9800 0.491396
10000 0.490767
[ 0.7461012 ]
[ 0.79919308]
[ 0.72995949]
[ 0.88297188]]
[ 1.]
[ 1.]
[ 1.]]
Accuracy:
0.762846
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
Exercise
● CSV reading using tf.decode_csv
● Try other classification data from Kaggle
○ https://www.kaggle.com
Lab 6-1
Softmax Classifier
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Softmax function
tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
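The softmax formula from the Udacity slide, restated since the image did not survive:
S(y_i) = \frac{e^{y_i}}{\sum_{j} e^{y_j}}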
Cost function: cross entropy
# Cross entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
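And the cross-entropy distance between the softmax output S and the one-hot label L, which the code above averages over the batch:
D(S, L) = -\sum_{i} L_i \log S_i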
x_data = [[1, 2, 1, 1], [2, 1, 3, 2], [3, 1, 3, 4], [4, 1, 5, 5], [1, 7, 5, 5],
[1, 2, 5, 6], [1, 6, 6, 6], [1, 7, 7, 7]]
y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]
X = tf.placeholder("float", [None, 4])
Y = tf.placeholder("float", [None, 3])
nb_classes = 3
W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')
# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
# Cross entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2001):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
Test & one-hot encoding
# Testing & One-hot encoding
a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]})
print(a, sess.run(tf.arg_max(a, 1)))
hypothesis = tf.nn.softmax(tf.matmul(X,W)+b)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
[[ 1.38904958e-03 9.98601854e-01 9.06129117e-06]] [1]
Test & one-hot encoding
all = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9],
                                          [1, 3, 4, 3],
                                          [1, 1, 0, 1]]})
print(all, sess.run(tf.arg_max(all, 1)))
[[ 1.38904958e-03 9.98601854e-01 9.06129117e-06]
[ 9.31192040e-01 6.29020557e-02 5.90589503e-03]
[ 1.27327668e-08 3.34112905e-04 9.99665856e-01]]
[1 0 2]
hypothesis = tf.nn.softmax(tf.matmul(X,W)+b)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
Lab 6-2
Fancy Softmax Classifier
cross_entropy, one_hot, reshape
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
softmax_cross_entropy_with_logits
logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)
# Cross entropy cost/loss
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                 labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
Animal classification with softmax_cross_entropy_with_logits
https://kr.pinterest.com/explore/animal-classification-activity/
# Predicting animal type based on various features
xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
tf.one_hot and reshape
Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6, shape=(?, 1)
Y_one_hot = tf.one_hot(Y, nb_classes) # one hot shape=(?, 1, 7)
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes]) # shape=(?, 7)
If the input indices is rank N, the output will have rank N+1. The new axis is created at dimension axis (default: the new axis is appended at the end).
https://www.tensorflow.org/api_docs/python/tf/one_hot
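A small sketch of the shape dance, using hypothetical toy labels (classes 0 and 3) rather than the zoo data:
import numpy as np
import tensorflow as tf

y = np.array([[0], [3]])             # shape (2, 1), rank 2
one_hot = tf.one_hot(y, depth=7)     # rank goes up by one: shape (2, 1, 7)
flat = tf.reshape(one_hot, [-1, 7])  # back to rank 2: shape (2, 7)

with tf.Session() as sess:
    print(sess.run(flat))
    # [[ 1.  0.  0.  0.  0.  0.  0.]
    #  [ 0.  0.  0.  1.  0.  0.  0.]]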
# Predicting animal type based on various features
xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32)
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
nb_classes = 7 # 0 ~ 6
X = tf.placeholder(tf.float32, [None, 16])
Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6
Y_one_hot = tf.one_hot(Y, nb_classes) # one hot
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])
W = tf.Variable(tf.random_normal([16, nb_classes]), name='weight')
b = tf.Variable(tf.random_normal([nb_classes]), name='bias')
# tf.nn.softmax computes softmax activations
# softmax = exp(logits) / reduce_sum(exp(logits), dim)
logits = tf.matmul(X, W) + b
hypothesis = tf.nn.softmax(logits)
# Cross entropy cost/loss
cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                 labels=Y_one_hot)
cost = tf.reduce_mean(cost_i)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
cost = tf.reduce_mean(cost_i)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
prediction = tf.argmax(hypothesis, 1)
correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Launch graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(2000):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 100 == 0:
            loss, acc = sess.run([cost, accuracy],
                                 feed_dict={X: x_data, Y: y_data})
            print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format(
                step, loss, acc))
    # Let's see if we can predict
    pred = sess.run(prediction, feed_dict={X: x_data})
    # y_data: (N, 1) = flatten => (N, ) matches pred.shape
    for p, y in zip(pred, y_data.flatten()):
        print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y)))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
Step: 1100 Loss: 0.101 Acc: 99.01%
Step: 1200 Loss: 0.092 Acc: 100.00%
Step: 1300 Loss: 0.084 Acc: 100.00%
...
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 3 True Y: 3
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 0 True Y: 0
[True] Prediction: 3 True Y: 3
[True] Prediction: 3 True Y: 3
[True] Prediction: 0 True Y: 0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
Lab 7-1
Learning rate, Evaluation
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Training and Test datasets
x_data = [[1, 2, 1], [1, 3, 2], [1, 3, 4], [1, 5, 5], [1, 7, 5], [1, 2, 5], [1, 6, 6], [1, 7, 7]]
y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]
# Evaluation our model using this test dataset
x_test = [[2, 1, 1], [3, 1, 2], [3, 3, 4]]
y_test = [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))
hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Correct prediction Test model
prediction = tf.arg_max(hypothesis, 1)
is_correct = tf.equal(prediction, tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(201):
        cost_val, W_val, _ = sess.run([cost, W, optimizer],
                                      feed_dict={X: x_data, Y: y_data})
        print(step, cost_val, W_val)
    # predict
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))
199 0.672261 [[-1.15377033  0.28146935  1.13632679]
 [ 0.37484586  0.18958236  0.33544877]
 [-0.35609841 -0.43973011 -1.25604188]]
200 0.670909 [[-1.15885413  0.28058422  1.14229572]
 [ 0.37609792  0.19073224  0.33304682]
 [-0.35536593 -0.44033223 -1.2561723 ]]
Prediction: [2 2 2]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
Learning rate: NaN!
http://sebastianraschka.com/Articles/2015_singlelayer_neurons.html
Big learning rate
2 27.2798 [[ 0.44451016  0.85699677 -1.03748143]
 [ 0.48429942  0.98872018 -0.57314301]
 [ 1.52989244  1.16229868 -4.74406147]]
3 8.668 [[ 0.12396193  0.61504567 -0.47498202]
 [ 0.22003263 -0.2470119   0.9268558 ]
 [ 0.96035379  0.41933775 -3.43156195]]
4 5.77111 [[-0.9524312   1.13037777  0.08607888]
 [-3.78651619  2.26245379  2.42393875]
 [-3.07170963  3.14037919 -2.12054014]]
5 inf [[ nan  nan  nan]
 [ nan  nan  nan]
 [ nan  nan  nan]]
6 nan [[ nan  nan  nan]
 [ nan  nan  nan]
 [ nan  nan  nan]]
...
Prediction: [0 0 0]
Accuracy: 0.0
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))
hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.5).minimize(cost)
# Correct prediction Test model
prediction = tf.arg_max(hypothesis, 1)
is_correct = tf.equal(prediction, tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(201):
        cost_val, W_val, _ = sess.run([cost, W, optimizer],
                                      feed_dict={X: x_data, Y: y_data})
        print(step, cost_val, W_val)
    # predict
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
Small learning rate
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
W = tf.Variable(tf.random_normal([3, 3]))
b = tf.Variable(tf.random_normal([3]))
hypothesis = tf.nn.softmax(tf.matmul(X, W)+b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-10).minimize(cost)
# Correct prediction Test model
prediction = tf.arg_max(hypothesis, 1)
is_correct = tf.equal(prediction, tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# Launch graph
with tf.Session() as sess:
    # Initialize TensorFlow variables
    sess.run(tf.global_variables_initializer())
    for step in range(201):
        cost_val, W_val, _ = sess.run([cost, W, optimizer],
                                      feed_dict={X: x_data, Y: y_data})
        print(step, cost_val, W_val)
    # predict
    print("Prediction:", sess.run(prediction, feed_dict={X: x_test}))
    # Calculate the accuracy
    print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
0 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
1 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
...
198 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
199 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
200 5.73203 [[ 0.80269563 0.67861295 -1.21728313]
[-0.3051686 -0.3032113 1.50825703]
[ 0.75722361 -0.7008909 -2.10820389]]
Prediction: [0 0 0]
Accuracy: 0.0
Non-normalized inputs
xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973],
[823.02002, 828.070007, 1828100, 821.655029, 828.070007],
[819.929993, 824.400024, 1438100, 818.97998, 824.159973],
[816, 820.958984, 1008100, 815.48999, 819.23999],
[819.359985, 823, 1188100, 818.469971, 818.97998],
[819, 823, 1198100, 816, 820.450012],
[811.700012, 815.25, 1098100, 809.780029, 813.669983],
[809.51001, 816.659973, 1398100, 804.539978, 809.559998]])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
Non-normalized inputs
xy=...
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 4])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([4, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = tf.matmul(X, W) + b
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})
    print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
5 Cost: inf
Prediction:
[[ inf]
[ inf]
[ inf]
...
6 Cost: nan
Prediction:
[[ nan]
[ nan]
[ nan]
...
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
Normalized inputs (min-max scale)
xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973],
[823.02002, 828.070007, 1828100, 821.655029, 828.070007],
[819.929993, 824.400024, 1438100, 818.97998, 824.159973],
[816, 820.958984, 1008100, 815.48999, 819.23999],
[819.359985, 823, 1188100, 818.469971, 818.97998],
[819, 823, 1198100, 816, 820.450012],
[811.700012, 815.25, 1098100, 809.780029, 813.669983],
[809.51001, 816.659973, 1398100, 804.539978, 809.559998]])
[[ 0.99999999 0.99999999 0. 1. 1. ]
[ 0.70548491 0.70439552 1. 0.71881782 0.83755791]
[ 0.54412549 0.50274824 0.57608696 0.606468 0.6606331 ]
[ 0.33890353 0.31368023 0.10869565 0.45989134 0.43800918]
[ 0.51436 0.42582389 0.30434783 0.58504805 0.42624401]
[ 0.49556179 0.42582389 0.31521739 0.48131134 0.49276137]
[ 0.11436064 0. 0.20652174 0.22007776 0.18597238]
[ 0. 0.07747099 0.5326087 0. 0. ]]
xy = MinMaxScaler(xy)
print(xy)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
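MinMaxScaler here is a small helper defined in the repo's code, not scikit-learn's class; a sketch consistent with the scaled output above:
import numpy as np

def MinMaxScaler(data):
    # Column-wise min-max scaling to [0, 1]; the epsilon avoids
    # division by zero when a column is constant.
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    return numerator / (denominator + 1e-7)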
Normalized inputs
xy=...
x_data = xy[:, 0:-1]
y_data = xy[:, [-1]]
# placeholders for a tensor that will be always fed.
X = tf.placeholder(tf.float32, shape=[None, 4])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([4, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
hypothesis = tf.matmul(X, W) + b
cost = tf.reduce_mean(tf.square(hypothesis - Y))
# Minimize
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5)
train = optimizer.minimize(cost)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(2001):
    cost_val, hy_val, _ = sess.run(
        [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data})
    print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val)
Prediction:
[[ 1.63450289]
[ 0.06628087]
[ 0.35014752]
[ 0.67070574]
[ 0.61131608]
[ 0.61466062]
[ 0.23175186]
[-0.13716528]]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
Lab 7-2
MNIST data
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
MNIST Dataset
http://yann.lecun.com/exdb/mnist/
28x28x1 image
http://derindelimavi.blogspot.hk/2015/04/mnist-el-yazs-rakam-veri-seti.html
# MNIST data image of shape 28 * 28 = 784
X = tf.placeholder(tf.float32, [None, 784])
# 0 - 9 digits recognition = 10 classes
Y = tf.placeholder(tf.float32, [None, nb_classes])
MNIST Dataset
from tensorflow.examples.tutorials.mnist import input_data
# Check out https://www.tensorflow.org/get_started/mnist/beginners for
# more information about the mnist dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
...
batch_xs, batch_ys = mnist.train.next_batch(100)
...
print("Accuracy: ", accuracy.eval(session=sess,
      feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Reading data and set variables
from tensorflow.examples.tutorials.mnist import input_data
# Check out https://www.tensorflow.org/get_started/mnist/beginners for
# more information about the mnist dataset
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
nb_classes = 10
# MNIST data image of shape 28 * 28 = 784
X = tf.placeholder(tf.float32, [None, 784])
# 0 - 9 digits recognition = 10 classes
Y = tf.placeholder(tf.float32, [None, nb_classes])
W = tf.Variable(tf.random_normal([784, nb_classes]))
b = tf.Variable(tf.random_normal([nb_classes]))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Softmax!
# Hypothesis (using softmax)
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Test model
is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1))
# Calculate accuracy
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Training epoch/batch
# parameters
training_epochs = 15
batch_size = 100
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
# Training cycle
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys})
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Training epoch/batch
In the neural network terminology:
● one epoch = one forward pass and one backward pass of all the training examples
● batch size = the number of training examples in one forward/backward pass. The higher
the batch size, the more memory space you'll need.
● number of iterations = number of passes, each pass using [batch size] number of
examples. To be clear, one pass = one forward pass + one backward pass (we do not count the
forward pass and backward pass as two different passes).
Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to
complete 1 epoch.
http://stackoverflow.com/questions/4752626/epoch-vs-iteration-when-training-neural-networks
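The same arithmetic as code (numbers taken from the example above):
num_examples, batch_size = 1000, 500
iterations_per_epoch = num_examples // batch_size  # 2 iterations = 1 epoch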
Report results on test dataset
# Test the model using test sets
print("Accuracy: ", accuracy.eval(session=sess,
feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
hypothesis = tf.nn.softmax(tf.matmul(X, W) + b)
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
# parameters
training_epochs = 15
batch_size = 100
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
# Training cycle
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
c, _ = sess.run([cost, optimizer],
feed_dict={X: batch_xs, Y: batch_ys})
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1),
'cost =', '{:.9f}'.format(avg_cost))
Epoch: 0001 cost = 2.868104637
Epoch: 0002 cost = 1.134684615
Epoch: 0003 cost = 0.908220728
Epoch: 0004 cost = 0.794199896
Epoch: 0005 cost = 0.721815854
Epoch: 0006 cost = 0.670184430
Epoch: 0007 cost = 0.630576546
Epoch: 0008 cost = 0.598888191
Epoch: 0009 cost = 0.573027079
Epoch: 0010 cost = 0.550497213
Epoch: 0011 cost = 0.532001859
Epoch: 0012 cost = 0.515517795
Epoch: 0013 cost = 0.501175288
Epoch: 0014 cost = 0.488425370
Epoch: 0015 cost = 0.476968593
Learning finished
Accuracy: 0.888
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Sample image show and prediction
import matplotlib.pyplot as plt
import random
# Get one and predict
r = random.randint(0, mnist.test.num_examples - 1)
print("Label:", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1)))
print("Prediction:", sess.run(tf.argmax(hypothesis, 1),
feed_dict={X: mnist.test.images[r:r + 1]}))
plt.imshow(mnist.test.images[r:r + 1].reshape(28, 28), cmap='Greys', interpolation='nearest')
plt.show()
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
Lab 8
Tensor Manipulation
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Simple 1D array and slicing
Image from http://www.frosteye.net/1233
t = np.array([0., 1., 2., 3., 4., 5., 6.])
Simple 1D array and slicing
Image from http://www.frosteye.net/1233
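The notebook cells behind these slides are screenshots; a small sketch of the slicing being shown, using the array above:
import numpy as np
t = np.array([0., 1., 2., 3., 4., 5., 6.])
print(t.ndim)             # rank: 1
print(t.shape)            # shape: (7,)
print(t[0], t[1], t[-1])  # 0.0 1.0 6.0
print(t[2:5], t[4:-1])    # [ 2. 3. 4.] [ 4. 5.]
print(t[:2], t[3:])       # [ 0. 1.] [ 3. 4. 5. 6.]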
2D Array
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Shape, Rank, Axis
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Shape, Rank, Axis
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
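The screenshots are not in this export; a sketch of the idea (a 2D array and a rank-3 tensor, with axis 0 outermost):
import numpy as np
import tensorflow as tf
m = np.array([[1., 2.], [3., 4.]])  # 2D array: rank 2, shape (2, 2)
print(m.ndim, m.shape)
t = tf.constant([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
sess = tf.Session()
print(sess.run(tf.rank(t)))   # 3
print(sess.run(tf.shape(t)))  # [2 2 2]; axis 0 is outermost, axis -1 innermost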
Matmul VS multiply
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Matmul VS multiply
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
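The point of this slide: tf.matmul is matrix multiplication, while * (tf.multiply) is element-wise and broadcasts. A sketch:
import tensorflow as tf
sess = tf.Session()
m1 = tf.constant([[1., 2.], [3., 4.]])
m2 = tf.constant([[1.], [2.]])
print(sess.run(tf.matmul(m1, m2)))  # matrix product: [[ 5.] [11.]]
print(sess.run(m1 * m2))            # element-wise with broadcasting: [[1. 2.] [6. 8.]]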
Broadcasting
https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
Broadcasting
https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html
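A sketch of broadcasting, where shapes are stretched to match (use with care; silent shape mismatches are a common bug):
import tensorflow as tf
sess = tf.Session()
a = tf.constant([[1., 2.]])    # shape (1, 2)
b = tf.constant([[3.], [4.]])  # shape (2, 1)
print(sess.run(a + b))   # broadcast to (2, 2): [[4. 5.] [5. 6.]]
print(sess.run(a + 3.))  # scalar broadcast: [[4. 5.]]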
Reduce mean
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Reduce sum
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Argmax
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
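A sketch of the reduce ops and argmax, showing how the axis argument selects the dimension that disappears:
import tensorflow as tf
sess = tf.Session()
x = tf.constant([[1., 2.], [3., 4.]])
print(sess.run(tf.reduce_mean(x)))          # 2.5, over all elements
print(sess.run(tf.reduce_mean(x, axis=0)))  # [2. 3.]
print(sess.run(tf.reduce_sum(x, axis=1)))   # [3. 7.]
print(sess.run(tf.argmax(x, axis=1)))       # [1 1], index of the max per row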
Reshape**
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Reshape (squeeze, expand)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
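A sketch of reshape, squeeze, and expand_dims (the -1 in reshape means "infer this dimension"):
import tensorflow as tf
sess = tf.Session()
t = tf.constant([[[0, 1, 2], [3, 4, 5]]])      # shape (1, 2, 3)
print(sess.run(tf.reshape(t, [-1, 3])))        # shape (2, 3)
print(sess.run(tf.squeeze(t)))                 # drops size-1 axes: shape (2, 3)
print(sess.run(tf.expand_dims([0, 1, 2], 1)))  # (3,) -> (3, 1): [[0] [1] [2]]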
One hot
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Casting
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
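A sketch of one_hot (note it adds an axis, hence the reshape) and cast:
import tensorflow as tf
sess = tf.Session()
oh = tf.one_hot([[0], [1], [2]], depth=3)            # shape (3, 1, 3)
print(sess.run(tf.reshape(oh, [-1, 3])))             # back to (3, 3)
print(sess.run(tf.cast([1.8, 2.2, 3.3], tf.int32)))  # [1 2 3]
print(sess.run(tf.cast([True, False], tf.int32)))    # [1 0]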
Stack
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Ones and Zeros like
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
Zip
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
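A sketch of stack, ones_like/zeros_like, and plain Python zip:
import tensorflow as tf
sess = tf.Session()
x, y, z = [1, 4], [2, 5], [3, 6]
print(sess.run(tf.stack([x, y, z])))          # [[1 4] [2 5] [3 6]]
print(sess.run(tf.stack([x, y, z], axis=1)))  # [[1 2 3] [4 5 6]]
print(sess.run(tf.ones_like(x)))              # [1 1]
print(sess.run(tf.zeros_like(x)))             # [0 0]
for a, b in zip([1, 2, 3], [4, 5, 6]):        # plain Python zip
    print(a, b)                               # 1 4 / 2 5 / 3 6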
Lab 9-1
NN for XOR
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
XOR data set
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
http://templ25.mandaringardencity.com/xor-gate-truth-table-2/
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
for step in range(10001):
sess.run(train, feed_dict={X: x_data, Y: y_data})
if step % 100 == 0:
print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))
# Accuracy report
h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data})
print("nHypothesis: ", h, "nCorrect: ", c, "nAccuracy: ", a)
XOR with logistic regression? But it doesn't work!
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
Hypothesis:
[[ 0.5]
[ 0.5]
[ 0.5]
[ 0.5]]
Correct:
[[ 0.]
[ 0.]
[ 0.]
[ 0.]]
Accuracy: 0.5
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
Neural Net
W = tf.Variable(tf.random_normal([2, 1]), name='weight')
b = tf.Variable(tf.random_normal([1]), name='bias')
# Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W)))
hypothesis = tf.sigmoid(tf.matmul(X, W) + b)
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
NN for XOR
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
# cost/loss function
cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis))
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
# Accuracy computation
# True if hypothesis>0.5 else False
predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32)
accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32))
# Launch graph
with tf.Session() as sess:
# Initialize TensorFlow variables
sess.run(tf.global_variables_initializer())
for step in range(10001):
sess.run(train, feed_dict={X: x_data, Y: y_data})
if step % 100 == 0:
print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run([W1, W2]))
# Accuracy report
h, c, a = sess.run([hypothesis, predicted, accuracy],
feed_dict={X: x_data, Y: y_data})
print("nHypothesis: ", h, "nCorrect: ", c, "nAccuracy: ", a)
Hypothesis:
[[ 0.01338218]
[ 0.98166394]
[ 0.98809403]
[ 0.01135799]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
Wide NN for XOR
W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1')
b1 = tf.Variable(tf.random_normal([10]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([10, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
[2,10], [10,1]
Hypothesis:
[[ 0.00358802]
[ 0.99366933]
[ 0.99204296]
[ 0.0095663 ]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
[2,2], [2,1]
Hypothesis:
[[ 0.01338218]
[ 0.98166394]
[ 0.98809403]
[ 0.01135799]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
Deep NN for XOR
W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1')
b1 = tf.Variable(tf.random_normal([10]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([10, 10]), name='weight2')
b2 = tf.Variable(tf.random_normal([10]), name='bias2')
layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2)
W3 = tf.Variable(tf.random_normal([10, 10]), name='weight3')
b3 = tf.Variable(tf.random_normal([10]), name='bias3')
layer3 = tf.sigmoid(tf.matmul(layer2, W3) + b3)
W4 = tf.Variable(tf.random_normal([10, 1]), name='weight4')
b4 = tf.Variable(tf.random_normal([1]), name='bias4')
hypothesis = tf.sigmoid(tf.matmul(layer3, W4) + b4)
4 layers
Hypothesis:
[[ 7.80e-04]
[ 9.99e-01]
[ 9.98e-01]
[ 1.55e-03]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
2 layers
Hypothesis:
[[ 0.01338218]
[ 0.98166394]
[ 0.98809403]
[ 0.01135799]]
Correct:
[[ 0.]
[ 1.]
[ 1.]
[ 0.]]
Accuracy: 1.0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
Exercise
● Wide and Deep NN for MNIST
Lab 9-2
Tensorboard for XOR NN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
TensorBoard: TF logging/debugging tool
●Visualize your TF graph
●Plot quantitative metrics
●Show additional data
https://www.tensorflow.org/get_started/summaries_and_tensorboard
Old-fashioned way: print, print, print
9400 0.0151413 [array([[ 6.21692038, 6.05913448],
[-6.33773184, -5.75189114]], dtype=float32), array([[ 9.93581772],
[-9.43034935]], dtype=float32)]
9500 0.014909 [array([[ 6.22498751, 6.07049847],
[-6.34637976, -5.76352596]], dtype=float32), array([[ 9.96414757],
[-9.45942593]], dtype=float32)]
9600 0.0146836 [array([[ 6.23292685, 6.08166742],
[-6.35489035, -5.77496052]], dtype=float32), array([[ 9.99207973],
[-9.48807526]], dtype=float32)]
9700 0.0144647 [array([[ 6.24074268, 6.09264851],
[-6.36326933, -5.78619957]], dtype=float32), array([[ 10.01962471],
[ -9.51631165]], dtype=float32)]
9800 0.0142521 [array([[ 6.24843407, 6.10344648],
[-6.37151814, -5.79724932]], dtype=float32), array([[ 10.04679298],
[ -9.54414845]], dtype=float32)]
9900 0.0140456 [array([[ 6.25601053, 6.11406422],
[-6.3796401 , -5.80811596]], dtype=float32), array([[ 10.07359505],
[ -9.57159519]], dtype=float32)]
10000 0.0138448 [array([[ 6.26347113, 6.12451124],
[-6.38764334, -5.81880617]], dtype=float32), array([[ 10.10004139],
[ -9.59866238]], dtype=float32)]
New way!
5 steps of using TensorBoard
From TF graph, decide which tensors you want to log
w2_hist = tf.summary.histogram("weights2", W2)
cost_summ = tf.summary.scalar("cost", cost)
Merge all summaries
summary = tf.summary.merge_all()
Create writer and add graph
# Create summary writer
writer = tf.summary.FileWriter('./logs')
writer.add_graph(sess.graph)
Run summary merge and add_summary
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
Launch TensorBoard
tensorboard --logdir=./logs
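The five steps stitched into one minimal runnable sketch (the logged scalar is a stand-in, not the XOR model itself):
import tensorflow as tf
x = tf.placeholder(tf.float32)
cost = tf.square(x)                           # (1) decide which tensors to log
cost_summ = tf.summary.scalar("cost", cost)
summary = tf.summary.merge_all()              # (2) merge all summaries
sess = tf.Session()
writer = tf.summary.FileWriter('./logs')      # (3) create writer, add graph
writer.add_graph(sess.graph)
for global_step in range(3):                  # (4) run merge, add summary
    s = sess.run(summary, feed_dict={x: float(global_step)})
    writer.add_summary(s, global_step=global_step)
# (5) launch: tensorboard --logdir=./logs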
Scalar tensors
cost_summ = tf.summary.scalar("cost", cost)
Histogram (multi-dimensional tensors)
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
w2_hist = tf.summary.histogram("weights2", W2)
b2_hist = tf.summary.histogram("biases2", b2)
hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Add scope for better graph hierarchy
with tf.name_scope("layer1") as scope:
W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1')
b1 = tf.Variable(tf.random_normal([2]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)
w1_hist = tf.summary.histogram("weights1", W1)
b1_hist = tf.summary.histogram("biases1", b1)
layer1_hist = tf.summary.histogram("layer1", layer1)
with tf.name_scope("layer2") as scope:
W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2')
b2 = tf.Variable(tf.random_normal([1]), name='bias2')
hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2)
w2_hist = tf.summary.histogram("weights2", W2)
b2_hist = tf.summary.histogram("biases2", b2)
hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Merge summaries and create writer after creating session
# Summary
summary = tf.summary.merge_all()
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Create summary writer
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph) # Add graph in the tensorboard
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Run merged summary and write (add summary)
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
global_step += 1
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
Launch tensorboard (local)
writer = tf.summary.FileWriter("./logs/xor_logs")
$ tensorboard --logdir=./logs/xor_logs
Starting TensorBoard b'41' on port 6006
(You can navigate to http://127.0.0.1:6006)
Launch tensorboard (remote server)
ssh -L local_port:127.0.0.1:remote_port username@server.com
local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com
server> $ tensorboard --logdir=./logs/xor_logs
(You can navigate to http://127.0.0.1:7007)
Multiple runs learning_rate=0.1 VS learning_rate=0.01
Multiple runs
tensorboard --logdir=./logs/xor_logs
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
...
writer = tf.summary.FileWriter("./logs/xor_logs")
tensorboard --logdir=./logs/xor_logs_r0_01
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)
...
writer = tf.summary.FileWriter("./logs/xor_logs_r0_01")
tensorboard --logdir=./logs
Multiple runs
Exercise
● Wide and Deep NN for MNIST
● Add tensorboard
Lab 9-2-E
Tensorboard for MNIST
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
Visualizing your Deep learning using TensorBoard (TensorFlow)
Sung Kim <hunkim+ml@gmail.com>
TensorBoard: TF logging/debugging tool
●Visualize your TF graph
●Plot quantitative metrics
●Show additional data
https://www.tensorflow.org/get_started/summaries_and_tensorboard
Old-fashioned way: print, print, print
New way!
5 steps of using TensorBoard
From TF graph, decide which tensors you want to log
with tf.variable_scope('layer1') as scope:
tf.summary.image('input', x_image, 3)
tf.summary.histogram("layer", L1)
tf.summary.scalar("loss", cost)
Merge all summaries
summary = tf.summary.merge_all()
Create writer and add graph
# Create summary writer
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
Run summary merge and add_summary
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
Launch TensorBoard
tensorboard --logdir=/tmp/mnist_logs
Image Input
# Image input
x_image = tf.reshape(X, [-1, 28, 28, 1])
tf.summary.image('input', x_image, 3)
Histogram (multi-dimensional tensors)
with tf.variable_scope('layer1') as scope:
W1 = tf.get_variable("W", shape=[784, 512])
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
tf.summary.histogram("X", X)
tf.summary.histogram("weights", W1)
tf.summary.histogram("bias", b1)
tf.summary.histogram("layer", L1)
Scalar tensors
tf.summary.scalar("loss", cost)
Add scope for better hierarchy
with tf.variable_scope('layer1') as scope:
W1 = tf.get_variable("W", shape=[784, 512],...
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
tf.summary.histogram("X", X)
tf.summary.histogram("weights", W1)
tf.summary.histogram("bias", b1)
tf.summary.histogram("layer", L1)
with tf.variable_scope('layer2') as scope:
...
with tf.variable_scope('layer3') as scope:
...
with tf.variable_scope('layer4') as scope:
...
with tf.variable_scope('layer5') as scope:
...
Merge summaries and create writer after creating session
# Summary
summary = tf.summary.merge_all()
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# Create summary writer
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
Run merged summary and write (add summary)
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
global_step += 1
Launch tensorboard (local)
writer = tf.summary.FileWriter("/tmp/mnist_logs")
$ tensorboard --logdir=/tmp/mnist_logs
Starting TensorBoard b'41' on port 6006
(You can navigate to http://127.0.0.1:6006)
Launch tensorboard (remote server)
ssh -L local_port:127.0.0.1:remote_port username@server.com
local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com
server> $ tensorboard --logdir=/tmp/mnist_logs
(You can navigate to http://127.0.0.1:7007)
Multiple runs
tensorboard --logdir=/tmp/mnist_logs/run1
writer = tf.summary.FileWriter("/tmp/mnist_logs/run1")
tensorboard --logdir=/tmp/mnist_logs/run2
writer = tf.summary.FileWriter("/tmp/mnist_logs/run2")
tensorboard --logdir=/tmp/mnist_logs
Lab 9-3 (optional)
NN Backpropagation
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
How to train?
Gradient descent algorithm
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
Tensorflow
“Yes you should understand backprop”
https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b
• “If you try to ignore how it works under the hood because TensorFlow
automagically makes my networks learn”
- “You will not be ready to wrestle with the dangers it presents”
- “You will be much less effective at building and debugging neural networks.”
• “The good news is that backpropagation is not that difficult to understand”
- “if presented properly.”
Back propagation (chain rule)
http://cs231n.stanford.edu/
Logistic Regression Network
[Diagram: a0 → * (w) → + (b) → sigmoid → loss]
Network forward
(1) o = a0 * w   (2) l = o + b   (3) a1 = sigmoid(l)   (4) E = loss(a1, t)
Forward pass, OK? Just follow (1), (2), (3) and (4).
Let's do back propagation!
∂E/∂a1 will be given. What would be the derivatives further back, e.g. ∂E/∂w and ∂E/∂b? We can use the chain rule.
backward prop
In the same manner, we can get the back prop for (4), (3), (2) and (1)!
Gate derivatives
These derivatives for each gate will be given. We can just use them in the chain rule.
Derivatives (chain rule), Gate derivatives
Given from the pre-computed derivative: just apply them one by one and solve each derivative one by one!
Matrix
For Matrix: http://cs231n.github.io/optimization-2/#staged
Network update (learning rate, alpha)
[Same diagram; w and b are now updated using the derivatives, scaled by the learning rate alpha.]
Done! Let's update our network using the derivatives!
Derivatives (chain rule): backward prop
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1) # sigma prime
d_l = d_a1 * d_sigma # (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
tf.assign(W, W - learning_rate * d_W),
tf.assign(b, b - learning_rate * tf.reduce_sum(d_b))]
Derivatives (chain rule): backward prop, averaging over the batch
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1) # sigma prime
d_l = d_a1 * d_sigma # (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
tf.assign(W, W - learning_rate * d_W / N), # sample size
tf.assign(b, b - learning_rate * tf.reduce_mean(d_b))]
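Putting the pieces together: a minimal end-to-end sketch of this manual backprop for a single sigmoid layer (the XOR data and shapes are illustrative assumptions; the lab-09 backprop files in the repo are the full versions):
import numpy as np
import tensorflow as tf
x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
y_data = np.array([[0], [1], [1], [0]], dtype=np.float32)
N = x_data.shape[0]
a0 = tf.placeholder(tf.float32, shape=[None, 2])
t = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.random_normal([2, 1]))
b = tf.Variable(tf.random_normal([1]))
# forward: (1) o = a0*w  (2) l = o + b  (3) a1 = sigmoid(l)
l = tf.matmul(a0, W) + b
a1 = tf.sigmoid(l)
# backward prop (chain rule), as on the slides
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1)  # sigma prime
d_l = d_a1 * d_sigma     # simplifies to (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
learning_rate = 0.01
train_step = [
    tf.assign(W, W - learning_rate * d_W / N),  # sample size
    tf.assign(b, b - learning_rate * tf.reduce_mean(d_b))]
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(1000):
    sess.run(train_step, feed_dict={a0: x_data, t: y_data})
print(sess.run(a1, feed_dict={a0: x_data}))  # stays near 0.5: one layer cannot fit XOR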
Exercise
● See more backprop code samples at
https://github.com/hunkim/DeepLearningZeroToAll
● https://github.com/hunkim/DeepLearningZeroToAll/blob/mast
er/lab-09-7-sigmoid_back_prop.py
● Solve XOR using NN backprop
Lab 10
NN, ReLu, Xavier, Dropout, and Adam
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Softmax classifier for MNIST
# weights & bias for nn layers
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(X, W) + b
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# train my model
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
feed_dict = {X: batch_xs, Y: batch_ys}
c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
Epoch: 0001 cost = 5.888845987
Epoch: 0002 cost = 1.860620173
Epoch: 0003 cost = 1.159035648
Epoch: 0004 cost = 0.892340870
Epoch: 0005 cost = 0.751155428
Epoch: 0006 cost = 0.662484806
Epoch: 0007 cost = 0.601544010
Epoch: 0008 cost = 0.556526115
Epoch: 0009 cost = 0.521186961
Epoch: 0010 cost = 0.493068354
Epoch: 0011 cost = 0.469686249
Epoch: 0012 cost = 0.449967254
Epoch: 0013 cost = 0.433519321
Epoch: 0014 cost = 0.419000337
Epoch: 0015 cost = 0.406490815
Learning Finished!
Accuracy: 0.9035
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-1-mnist_softmax.py
NN for MNIST
# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])
# weights & bias for nn layers
W1 = tf.Variable(tf.random_normal([784, 256]))
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
W2 = tf.Variable(tf.random_normal([256, 256]))
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
W3 = tf.Variable(tf.random_normal([256, 10]))
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Epoch: 0001 cost = 141.207671860
Epoch: 0002 cost = 38.788445864
Epoch: 0003 cost = 23.977515479
Epoch: 0004 cost = 16.315132428
Epoch: 0005 cost = 11.702554882
Epoch: 0006 cost = 8.573139748
Epoch: 0007 cost = 6.370995680
Epoch: 0008 cost = 4.537178684
Epoch: 0009 cost = 3.216900532
Epoch: 0010 cost = 2.329708954
Epoch: 0011 cost = 1.715552875
Epoch: 0012 cost = 1.189857912
Epoch: 0013 cost = 0.820965160
Epoch: 0014 cost = 0.624131458
Epoch: 0015 cost = 0.454633765
Learning Finished!
Accuracy: 0.9455
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-2-mnist_nn.py
http://stackoverflow.com/questions/33640581/how-to-do-xavier-initialization-on-tensorflow
Xavier for MNIST
# input place holders
X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])
# weights & bias for nn layers
# http://stackoverflow.com/questions/33640581
W1 = tf.get_variable("W1", shape=[784, 256],
initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([256]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
W2 = tf.get_variable("W2", shape=[256, 256],
initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([256]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
W3 = tf.get_variable("W3", shape=[256, 10],
initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b3
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Epoch: 0001 cost = 0.301498963
Epoch: 0002 cost = 0.107252513
Epoch: 0003 cost = 0.064888892
Epoch: 0004 cost = 0.044463030
Epoch: 0005 cost = 0.029951642
Epoch: 0006 cost = 0.020663404
Epoch: 0007 cost = 0.015853033
Epoch: 0008 cost = 0.011764387
Epoch: 0009 cost = 0.008598264
Epoch: 0010 cost = 0.007383116
Epoch: 0011 cost = 0.006839140
Epoch: 0012 cost = 0.004672963
Epoch: 0013 cost = 0.003979437
Epoch: 0014 cost = 0.002714260
Epoch: 0015 cost = 0.004707661
Learning Finished!
Accuracy: 0.9783 (xavier)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-3-mnist_nn_xavier.py
Epoch: 0001 cost = 141.207671860
Epoch: 0002 cost = 38.788445864
Epoch: 0003 cost = 23.977515479
Epoch: 0004 cost = 16.315132428
Epoch: 0005 cost = 11.702554882
Epoch: 0006 cost = 8.573139748
Epoch: 0007 cost = 6.370995680
Epoch: 0008 cost = 4.537178684
Epoch: 0009 cost = 3.216900532
Epoch: 0010 cost = 2.329708954
Epoch: 0011 cost = 1.715552875
Epoch: 0012 cost = 1.189857912
Epoch: 0013 cost = 0.820965160
Epoch: 0014 cost = 0.624131458
Epoch: 0015 cost = 0.454633765
Learning Finished!
Accuracy: 0.9455 (normal dist)
Deep NN for MNIST
W1 = tf.get_variable("W1", shape=[784, 512],
initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
W2 = tf.get_variable("W2", shape=[512, 512],
initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
W3 = tf.get_variable("W3", shape=[512, 512],
initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.Variable(tf.random_normal([512]))
L3 = tf.nn.relu(tf.matmul(L2, W3) + b3)
W4 = tf.get_variable("W4", shape=[512, 512],
initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([512]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
W5 = tf.get_variable("W5", shape=[512, 10],
initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Epoch: 0001 cost = 0.266061549
Epoch: 0002 cost = 0.080796588
Epoch: 0003 cost = 0.049075800
Epoch: 0004 cost = 0.034772298
Epoch: 0005 cost = 0.024780529
Epoch: 0006 cost = 0.017072763
Epoch: 0007 cost = 0.014031383
Epoch: 0008 cost = 0.013763446
Epoch: 0009 cost = 0.009164047
Epoch: 0010 cost = 0.008291388
Epoch: 0011 cost = 0.007319742
Epoch: 0012 cost = 0.006434021
Epoch: 0013 cost = 0.005684378
Epoch: 0014 cost = 0.004781207
Epoch: 0015 cost = 0.004342310
Learning Finished!
Accuracy: 0.9742
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-4-mnist_nn_deep.py
Dropout for MNIST
# dropout (keep_prob) rate 0.7 on training, but should be 1 for testing
keep_prob = tf.placeholder(tf.float32)
W1 = tf.get_variable("W1", shape=[784, 512])
b1 = tf.Variable(tf.random_normal([512]))
L1 = tf.nn.relu(tf.matmul(X, W1) + b1)
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
W2 = tf.get_variable("W2", shape=[512, 512])
b2 = tf.Variable(tf.random_normal([512]))
L2 = tf.nn.relu(tf.matmul(L1, W2) + b2)
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)
…
# train my model
for epoch in range(training_epochs):
...
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
feed_dict = {X: batch_xs, Y: batch_ys, keep_prob: 0.7}
c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
avg_cost += c / total_batch
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))
Epoch: 0001 cost = 0.447322626
Epoch: 0002 cost = 0.157285590
Epoch: 0003 cost = 0.121884535
Epoch: 0004 cost = 0.098128681
Epoch: 0005 cost = 0.082901778
Epoch: 0006 cost = 0.075337573
Epoch: 0007 cost = 0.069752543
Epoch: 0008 cost = 0.060884363
Epoch: 0009 cost = 0.055276413
Epoch: 0010 cost = 0.054631256
Epoch: 0011 cost = 0.049675195
Epoch: 0012 cost = 0.049125314
Epoch: 0013 cost = 0.047231930
Epoch: 0014 cost = 0.041290121
Epoch: 0015 cost = 0.043621063
Learning Finished!
Accuracy: 0.9804!!
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-5-mnist_nn_dropout.py
Optimizers
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
https://www.tensorflow.org/api_guides/python/train
Optimizers
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)
● tf.train.AdadeltaOptimizer
● tf.train.AdagradOptimizer
● tf.train.AdagradDAOptimizer
● tf.train.MomentumOptimizer
● tf.train.AdamOptimizer
● tf.train.FtrlOptimizer
● tf.train.ProximalGradientDescentOptimizer
● tf.train.ProximalAdagradOptimizer
● tf.train.RMSPropOptimizer
https://www.tensorflow.org/api_guides/python/train
http://www.denizyuret.com/2015/03/alec-radfords-animations-for.html
ADAM: a method for stochastic optimization
[Kingma et al. 2015]
Use Adam Optimizer
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Summary
●Softmax VS Neural Nets for MNIST, 90% and 94.5%
●Xavier initialization: 97.8%
●Deep Neural Nets with Dropout: 98%
●Adam and other optimizers
●Exercise: Batch Normalization
- https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-6-mnist_nn_batchnorm.ipynb
Lecture and Lab 11
CNN
Sung Kim <hunkim+ml@gmail.com>
http://hunkim.github.io/ml/
Lab 11-1
CNN Basics
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
CNN
http://parse.ele.tue.nl/cluster/2/CNNArchitecture.jpg
CNN for CT images
Asan Medical Center & Microsoft Medical Bigdata Contest Winner by GeunYoung Lee and Alex Kim
https://www.slideshare.net/GYLee3/ss-72966495
Convolution layer and max pooling
Simple convolution layer
Toy image: 3x3x1, filter: 2x2x1, stride: 1x1
Simple convolution layer
Image: 1,3,3,1 image, Filter: 2,2,1,1, Stride: 1x1, Padding: VALID
[Toy image pixel values 1..9; filter weights all 1:
[[[[1.]],[[1.]]],
[[[1.]],[[1.]]]]
shape=(2,2,1,1)]
Image: 1,3,3,1 image, Filter: 2,2,1,1, Stride: 1x1, Padding: VALID
[Sliding the all-ones 2x2 filter over the toy image gives a 2x2 output: [[12, 16], [24, 28]].]
Simple convolution layer
Image: 1,3,3,1 image, Filter: 2,2,1,1, Stride: 1x1, Padding: SAME
[With SAME padding, zeros are padded around the image so the output keeps the 3x3 input size.]
Max Pooling
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
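A runnable sketch of the toy example above (3x3 image with values 1..9, 2x2 all-ones filter, NHWC layout), plus max pooling:
import numpy as np
import tensorflow as tf
sess = tf.Session()
image = np.array([[[[1], [2], [3]],
                   [[4], [5], [6]],
                   [[7], [8], [9]]]], dtype=np.float32)  # shape (1, 3, 3, 1)
weight = tf.constant([[[[1.]], [[1.]]],
                      [[[1.]], [[1.]]]])                 # shape (2, 2, 1, 1)
conv_valid = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='VALID')
print(sess.run(conv_valid).reshape(2, 2))  # [[12. 16.] [24. 28.]]
conv_same = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='SAME')
print(sess.run(conv_same).shape)           # (1, 3, 3, 1): output keeps the input size
pool = tf.nn.max_pool(image, ksize=[1, 2, 2, 1], strides=[1, 1, 1, 1], padding='VALID')
print(sess.run(pool).reshape(2, 2))        # [[5. 6.] [8. 9.]]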
MNIST image loading
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
MNIST Convolution layer
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-0-cnn_basics.ipynb
MNIST Max pooling
Lab 11-2
CNN MNIST: 99%!
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
CNN
http://parse.ele.tue.nl/cluster/2/CNNArchitecture.jpg
Simple CNN
# input placeholders
X = tf.placeholder(tf.float32, [None, 784])
X_img = tf.reshape(X, [-1, 28, 28, 1]) # img 28x28x1 (black/white)
Y = tf.placeholder(tf.float32, [None, 10])
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
'''
Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
'''
Conv layer 1
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
'''
Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
'''
# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
# Conv ->(?, 14, 14, 64)
# Pool ->(?, 7, 7, 64)
L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME')
L2 = tf.nn.relu(L2)
L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L2 = tf.reshape(L2, [-1, 7 * 7 * 64])
'''
Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32)
Conv layer 2
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
'''
Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32)
'''
L2 = tf.reshape(L2, [-1, 7 * 7 * 64])
# Final FC 7x7x64 inputs -> 10 outputs
W3 = tf.get_variable("W3", shape=[7 * 7 * 64, 10],
initializer=tf.contrib.layers.xavier_initializer())
b = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L2, W3) + b
# define cost/loss & optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
Fully Connected (FC, Dense) layer
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
Training and Evaluation
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# train my model
print('Learning started. It takes some time.')
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
feed_dict = {X: batch_xs, Y: batch_ys}
c, _, = sess.run([cost, optimizer], feed_dict=feed_dict)
avg_cost += c / total_batch
print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
Epoch: 0001 cost = 0.340291267
Epoch: 0002 cost = 0.090731326
Epoch: 0003 cost = 0.064477619
Epoch: 0004 cost = 0.050683064
...
Epoch: 0011 cost = 0.017758641
Epoch: 0012 cost = 0.014156652
Epoch: 0013 cost = 0.012397016
Epoch: 0014 cost = 0.010693789
Epoch: 0015 cost = 0.009469977
Learning Finished!
Accuracy: 0.9885
Deep CNN
Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
# L3 ImgIn shape=(?, 7, 7, 64)
W3 = tf.Variable(tf.random_normal([3, 3, 64, 128], stddev=0.01))
# Conv ->(?, 7, 7, 128)
# Pool ->(?, 4, 4, 128)
# Reshape ->(?, 4 * 4 * 128) # Flatten them for FC
L3 = tf.nn.conv2d(L2, W3, strides=[1, 1, 1, 1], padding='SAME')
L3 = tf.nn.relu(L3)
L3 = tf.nn.max_pool(L3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
padding='SAME')
L3 = tf.nn.dropout(L3, keep_prob=keep_prob)
L3 = tf.reshape(L3, [-1, 128 * 4 * 4])
'''Tensor("Conv2D_2:0", shape=(?, 7, 7, 128), dtype=float32)
Tensor("Relu_2:0", shape=(?, 7, 7, 128), dtype=float32)
Tensor("MaxPool_2:0", shape=(?, 4, 4, 128), dtype=float32)
Tensor("dropout_2/mul:0", shape=(?, 4, 4, 128), dtype=float32)
Tensor("Reshape_1:0", shape=(?, 2048), dtype=float32)'''
# L4 FC 4x4x128 inputs -> 625 outputs
W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625],
initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([625]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
L4 = tf.nn.dropout(L4, keep_prob=keep_prob)
'''Tensor("Relu_3:0", shape=(?, 625), dtype=float32)
Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)'''
# L5 Final FC 625 inputs -> 10 outputs
W5 = tf.get_variable("W5", shape=[625, 10],
initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
'''Tensor("add_1:0", shape=(?, 10), dtype=float32)'''
Deep CNN
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
'''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)'''
# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
# Conv ->(?, 14, 14, 64)
# Pool ->(?, 7, 7, 64)
L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME')
L2 = tf.nn.relu(L2)
L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
L2 = tf.nn.dropout(L2, keep_prob=keep_prob)
'''Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32)
Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32)
Tensor("dropout_1/mul:0", shape=(?, 7, 7, 64), dtype=float32)'''
Deep CNN
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={
    X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1}))
Epoch: 0013 cost = 0.027188021
Epoch: 0014 cost = 0.023604777
Epoch: 0015 cost = 0.024607201
Learning Finished!
Accuracy: 0.9938
Lab 11-3
Class, Layers, Ensemble
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
CNN
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1],
strides=[1, 2, 2, 1], padding='SAME')
L1 = tf.nn.dropout(L1, keep_prob=keep_prob)
'''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32)
Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32)
Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)'''
...
...
# L4 FC 4x4x128 inputs -> 625 outputs
W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625],
initializer=tf.contrib.layers.xavier_initializer())
b4 = tf.Variable(tf.random_normal([625]))
L4 = tf.nn.relu(tf.matmul(L3, W4) + b4)
L4 = tf.nn.dropout(L4, keep_prob=keep_prob)
'''Tensor("Relu_3:0", shape=(?, 625), dtype=float32)
Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)'''
# L5 Final FC 625 inputs -> 10 outputs
W5 = tf.get_variable("W5", shape=[625, 10],
initializer=tf.contrib.layers.xavier_initializer())
b5 = tf.Variable(tf.random_normal([10]))
hypothesis = tf.matmul(L4, W5) + b5
'''Tensor("add_1:0", shape=(?, 10), dtype=float32)'''
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1),
tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy,
feed_dict={X: mnist.test.images,
Y: mnist.test.labels, keep_prob: 1}))
Epoch: 0013 cost = 0.027188021
Epoch: 0014 cost = 0.023604777
Epoch: 0015 cost = 0.024607201
Learning Finished!
Accuracy: 0.9938
Python Class
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-3-mnist_cnn_class.py
class Model:
def __init__(self, sess, name):
self.sess = sess
self.name = name
self._build_net()
def _build_net(self):
with tf.variable_scope(self.name):
# input placeholders
self.X = tf.placeholder(tf.float32, [None, 784])
# img 28x28x1 (black/white)
X_img = tf.reshape(self.X, [-1, 28, 28, 1])
self.Y = tf.placeholder(tf.float32, [None, 10])
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32],
stddev=0.01))
...
def predict(self, x_test, keep_prop=1.0):
return self.sess.run(self.logits,
feed_dict={self.X: x_test, self.keep_prob: keep_prop})
def get_accuracy(self, x_test, y_test, keep_prop=1.0):
return self.sess.run(self.accuracy,
feed_dict={self.X: x_test, self.Y: y_test, self.keep_prob: keep_prop})
def train(self, x_data, y_data, keep_prop=0.7):
return self.sess.run([self.cost, self.optimizer], feed_dict={
self.X: x_data, self.Y: y_data, self.keep_prob: keep_prop})
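The slide elides the rest of _build_net; a minimal sketch of a plausible tail, inferred from the predict/get_accuracy/train methods above (the stand-in dense layer and the 0.001 learning rate are assumptions, not the repo's code):
# (sketch) stand-in tail for the elided layers inside _build_net
self.keep_prob = tf.placeholder(tf.float32)
flat = tf.reshape(X_img, [-1, 28 * 28]) # stand-in for the conv stack
self.logits = tf.layers.dense(flat, 10) # 10 MNIST classes
self.cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=self.logits, labels=self.Y))
self.optimizer = tf.train.AdamOptimizer(0.001).minimize(self.cost)
correct = tf.equal(tf.argmax(self.logits, 1), tf.argmax(self.Y, 1))
self.accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))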
# initialize
sess = tf.Session()
m1 = Model(sess, "m1")
sess.run(tf.global_variables_initializer())
print('Learning Started!')
# train my model
for epoch in range(training_epochs):
avg_cost = 0
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
c, _ = m1.train(batch_xs, batch_ys)
avg_cost += c / total_batch
tf.layers
https://www.tensorflow.org/api_docs/python/tf/layers
tf.layers
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-4-mnist_cnn_layers.py
# L1 ImgIn shape=(?, 28, 28, 1)
W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
# Conv -> (?, 28, 28, 32)
# Pool -> (?, 14, 14, 32)
L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME')
L1 = tf.nn.relu(L1)
L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
L1 = tf.nn.dropout(L1, keep_prob=self.keep_prob)
…
# L2 ImgIn shape=(?, 14, 14, 32)
W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
# Convolutional Layer #1
conv1 = tf.layers.conv2d(inputs=X_img, filters=32, kernel_size=[3, 3],
                         padding="SAME", activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2],
                                padding="SAME", strides=2)
# note: tf.layers.dropout's rate is the drop probability (1 - keep_prob)
dropout1 = tf.layers.dropout(inputs=pool1, rate=0.7, training=self.training)
# Convolutional Layer #2
conv2 = tf.layers.conv2d(inputs=dropout1, filters=64, kernel_size=[3, 3],
                         padding="SAME", activation=tf.nn.relu)
…
flat = tf.reshape(dropout3, [-1, 128 * 4 * 4])
dense4 = tf.layers.dense(inputs=flat, units=625, activation=tf.nn.relu)
dropout4 = tf.layers.dropout(inputs=dense4, rate=0.5, training=self.training)
...
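The classifier head is elided on the slide; with tf.layers it would plausibly end like this (the logits name is an assumption):
# (sketch) final FC layer mapping 625 features to the 10 MNIST classes
self.logits = tf.layers.dense(inputs=dropout4, units=10)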
Ensemble
http://rasbt.github.io/mlxtend/user_guide/classifier/StackingClassifier/
models = []
num_models = 7
for m in range(num_models):
models.append(Model(sess, "model" + str(m)))
sess.run(tf.global_variables_initializer())
print('Learning Started!')
# train my model
for epoch in range(training_epochs):
avg_cost_list = np.zeros(len(models))
total_batch = int(mnist.train.num_examples / batch_size)
for i in range(total_batch):
batch_xs, batch_ys = mnist.train.next_batch(batch_size)
# train each model
for m_idx, m in enumerate(models):
c, _ = m.train(batch_xs, batch_ys)
avg_cost_list[m_idx] += c / total_batch
print('Epoch:','%04d'%(epoch + 1),'cost =', avg_cost_list)
print('Learning Finished!')
class Model:
def __init__(self, sess, name):
self.sess = sess
self.name = name
self._build_net()
def _build_net(self):
with tf.variable_scope(self.name):
...
Ensemble training
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
Ensemble prediction
Ensemble prediction
Per-model softmax outputs over classes 0–9 (first four classes shown):

          0     1     2     3    ...
model 1:  0.1   0.01  0.02  0.8  ...
model 2:  0.01  0.5   0.02  0.4  ...
model 3:  0.01  0.01  0.1   0.7  ...
   ...
Sum:      0.12  0.52  0.14  1.9  ...

argmax over the summed row gives the ensemble prediction (class 3 here).
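In code, the sum-then-argmax step is just elementwise addition of the per-model probability vectors; a toy numpy illustration of the table above:
import numpy as np
p1 = np.array([0.10, 0.01, 0.02, 0.80]) # model 1 (classes 0..3 shown)
p2 = np.array([0.01, 0.50, 0.02, 0.40]) # model 2
p3 = np.array([0.01, 0.01, 0.10, 0.70]) # model 3
summed = p1 + p2 + p3                   # [0.12, 0.52, 0.14, 1.90]
print(np.argmax(summed))                # 3 -> the ensemble predicts class 3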
# Test model and check accuracy
test_size = len(mnist.test.labels)
predictions = np.zeros(test_size * 10).reshape(test_size, 10)
for m_idx, m in enumerate(models):
print(m_idx, 'Accuracy:', m.get_accuracy(mnist.test.images, mnist.test.labels))
p = m.predict(mnist.test.images)
predictions += p
ensemble_correct_prediction = tf.equal(
tf.argmax(predictions, 1), tf.argmax(mnist.test.labels, 1))
ensemble_accuracy = tf.reduce_mean(
tf.cast(ensemble_correct_prediction, tf.float32))
print('Ensemble accuracy:', sess.run(ensemble_accuracy))
Ensemble prediction
0 Accuracy: 0.9933
1 Accuracy: 0.9946
2 Accuracy: 0.9934
3 Accuracy: 0.9935
4 Accuracy: 0.9935
5 Accuracy: 0.9949
6 Accuracy: 0.9941
Ensemble accuracy: 0.9952
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
Exercise
● Deep & Wide?
● CIFAR 10
● ImageNet
Lab 12
RNN
Sung Kim <hunkim+ml@gmail.com>
http://hunkim.github.io/ml/
Lab 12
RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
Call for comments
Please feel free to add comments directly on these slides
Other slides: https://goo.gl/jPtWNt
Picture from http://www.tssablog.org/archives/3280
Lab 12-1
RNN Basics
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
RNN in TensorFlow
cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size)
...
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
RNN in TensorFlow
cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size)
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size)
...
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
One node: 4 (input-dim) in 2 (hidden_size)
One node: 4 (input-dim) in 2 (hidden_size)
# One cell RNN input_dim (4) -> output_dim (2)
hidden_size = 2
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size)
x_data = np.array([[[1,0,0,0]]], dtype=np.float32)
outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
array([[[-0.42409304, 0.64651132]]])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
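(The exact output values vary from run to run, since the cell's weights are randomly initialized.)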
One node: 4 (input-dim) in 2 (hidden_size)
Unfolding to n sequences
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
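The h, e, l, o used in x_data below are one-hot rows defined earlier in the notebook; a minimal sketch consistent with the printed arrays:
# one-hot rows for the 4-symbol vocabulary (h, e, l, o)
h = [1, 0, 0, 0]
e = [0, 1, 0, 0]
l = [0, 0, 1, 0]
o = [0, 0, 0, 1]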
Unfolding to n sequences
# One cell RNN input_dim (4) -> output_dim (2). sequence: 5
hidden_size = 2
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size)
x_data = np.array([[h, e, l, l, o]], dtype=np.float32)
print(x_data.shape)
pp.pprint(x_data)
outputs, states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
X_data = array
([[[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]]], dtype=float32)
Outputs = array
([[[ 0.19709368, 0.24918222],
[-0.11721198, 0.1784237 ],
[-0.35297349, -0.66278851],
[-0.70915914, -0.58334434],
[-0.38886023, 0.47304463]]], dtype=float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
Batching input
Hidden_size=2
sequence_length=5
batch_size=3
Batching input
# One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3
# 3 batches 'hello', 'eolll', 'lleel'
x_data = np.array([[h, e, l, l, o],
[e, o, l, l, l],
[l, l, e, e, l]], dtype=np.float32)
pp.pprint(x_data)
cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, x_data,
dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
array([[[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]],
[[ 0., 1., 0., 0.],
[ 0., 0., 0., 1.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.]],
[[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 1., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.]]],
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
batch_size=3
Batching input
# One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3
# 3 batches 'hello', 'eolll', 'lleel'
x_data = np.array([[h, e, l, l, o],
[e, o, l, l, l],
[l, l, e, e, l]], dtype=np.float32)
pp.pprint(x_data)
cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, x_data,
dtype=tf.float32)
sess.run(tf.global_variables_initializer())
pp.pprint(outputs.eval())
array([[[ 1., 0., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 0., 1.]],
[[ 0., 1., 0., 0.],
[ 0., 0., 0., 1.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.]],
[[ 0., 0., 1., 0.],
[ 0., 0., 1., 0.],
[ 0., 1., 0., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 1., 0.]]],
array([[[-0.0173022 , -0.12929453],
[-0.14995177, -0.23189341],
[ 0.03294011, 0.01962204],
[ 0.12852104, 0.12375218],
[ 0.13597946, 0.31746736]],
[[-0.15243632, -0.14177315],
[ 0.04586344, 0.12249056],
[ 0.14292534, 0.15872268],
[ 0.18998367, 0.21004884],
[ 0.21788891, 0.24151592]],
[[ 0.10713603, 0.11001928],
[ 0.17076059, 0.1799853 ],
[-0.03531617, 0.08993293],
[-0.1881337 , -0.08296411],
[-0.00404597, 0.07156041]]],
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Hidden_size=2
sequence_length=5
batch_size=3
Lab 12-2
Hi Hello RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Teach RNN ‘hihello’
[Figure: RNN unrolled over 'hihello' — each step reads a character of "hihell" and predicts the next character of "ihello"]
One-hot encoding
[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 0, 1], # o 4
● text: ‘hihello’
● unique chars (vocabulary, voc):
h, i, e, l, o
● voc index:
h:0, i:1, e:2, l:3, o:4
[Figure: the same unrolled RNN with one-hot inputs below each step]
[1, 0, 0, 0, 0] [0, 1, 0, 0, 0] [1, 0, 0, 0, 0] [0, 0, 1, 0, 0] [0, 0, 0, 1, 0] [0, 0, 0, 1, 0] # input: h i h e l l
[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 0, 1], # o 4
Teach RNN ‘hihello’
[0, 1, 0, 0, 0] [1, 0, 0, 0, 0] [0, 0, 1, 0, 0] [0, 0, 0, 1, 0] [0, 0, 0, 1, 0] [0, 0, 0, 0, 1] # target: i h e l l o
[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 0, 1], # o 4
Teach RNN ‘hihello’
Creating rnn cell
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
rnn_cell = rnn_cell.BasicLSTMCell(rnn_size)
rnn_cell = rnn_cell.GRUCell(rnn_size)
Creating rnn cell
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
rnn_cell = rnn_cell.BasicLSTMCell(rnn_size)
rnn_cell = rnn_cell.GRUCell(rnn_size)
Execute RNN
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
outputs, _states = tf.nn.dynamic_rnn(
rnn_cell,
X,
initial_state=initial_state,
dtype=tf.float32)
(rnn_size is the hidden size of the RNN)
RNN parameters
hidden_size = 5 # output from the LSTM
input_dim = 5 # one-hot size
batch_size = 1 # one sentence
sequence_length = 6 # |ihello| == 6
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Data creation
idx2char = ['h', 'i', 'e', 'l', 'o'] # h=0, i=1, e=2, l=3, o=4
x_data = [[0, 1, 0, 2, 3, 3]] # hihell
x_one_hot = [[[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[1, 0, 0, 0, 0], # h 0
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 1, 0]]] # l 3
y_data = [[1, 0, 2, 3, 3, 4]] # ihello
X = tf.placeholder(tf.float32,
[None, sequence_length, input_dim]) # X one-hot
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Feed to RNN
X = tf.placeholder(
tf.float32, [None, sequence_length, hidden_size]) # X one-hot
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size,
state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
cell, X, initial_state=initial_state, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
x_one_hot = [[[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[1, 0, 0, 0, 0], # h 0
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 1, 0]]] # l 3
y_data = [[1, 0, 2, 3, 3, 4]] # ihello
Cost: sequence_loss
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
# [batch_size, sequence_length]
y_data = tf.constant([[1, 1, 1]])
# [batch_size, sequence_length, emb_dim ]
prediction1 = tf.constant([[[0.3, 0.7], [0.3, 0.7], [0.3, 0.7]]],
dtype=tf.float32)
prediction2 = tf.constant([[[0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]],
dtype=tf.float32)
# [batch_size, sequence_length]
weights = tf.constant([[1, 1, 1]], dtype=tf.float32)
sequence_loss1 = tf.contrib.seq2seq.sequence_loss(prediction1, y_data,
weights)
sequence_loss2 = tf.contrib.seq2seq.sequence_loss(prediction2, y_data,
weights)
sess.run(tf.global_variables_initializer())
print("Loss1: ", sequence_loss1.eval(),
"Loss2: ", sequence_loss2.eval())
Cost: sequence_loss
Loss1: 0.513015 Loss2: 0.371101
(prediction2 puts more probability on the true class 1, so its loss is lower)
Cost: sequence_loss
outputs, _states = tf.nn.dynamic_rnn(
cell, X, initial_state=initial_state, dtype=tf.float32)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.AdamOptimizer(learning_rate=0.1).minimize(loss)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Training
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(2000):
l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
result = sess.run(prediction, feed_dict={X: x_one_hot})
print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)
# print char using dic
result_str = [idx2char[c] for c in np.squeeze(result)]
print("tPrediction str: ", ''.join(result_str))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Results
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(2000):
l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
result = sess.run(prediction, feed_dict={X: x_one_hot})
print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)
# print char using dic
result_str = [idx2char[c] for c in np.squeeze(result)]
print("tPrediction str: ", ''.join(result_str))
0 loss: 1.55474 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
1 loss: 1.55081 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
2 loss: 1.54704 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
3 loss: 1.54342 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
...
1998 loss: 0.75305 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
1999 loss: 0.752973 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Lab 12-3
RNN with long sequences
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Manual data creation
idx2char = ['h', 'i', 'e', 'l', 'o']
x_data = [[0, 1, 0, 2, 3, 3]] # hihell
x_one_hot = [[[1, 0, 0, 0, 0], # h 0
[0, 1, 0, 0, 0], # i 1
[1, 0, 0, 0, 0], # h 0
[0, 0, 1, 0, 0], # e 2
[0, 0, 0, 1, 0], # l 3
[0, 0, 0, 1, 0]]] # l 3
y_data = [[1, 0, 2, 3, 3, 4]] # ihello
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
Better data creation
sample = " if you want you"
idx2char = list(set(sample)) # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx
sample_idx = [char2idx[c] for c in sample] # char to index
x_data = [sample_idx[:-1]] # X data sample (0 ~ n-1) hello: hell
y_data = [sample_idx[1:]] # Y label sample (1 ~ n) hello: ello
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Hyper parameters
sample = " if you want you"
idx2char = list(set(sample)) # index -> char
char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx
# hyper parameters
dic_size = len(char2idx) # RNN input size (one hot size)
rnn_hidden_size = len(char2idx) # RNN output size
num_classes = len(char2idx) # final output size (RNN or softmax, etc.)
batch_size = 1 # one sample data, one batch
sequence_length = len(sample) - 1 # number of lstm unfolding (unit #)
LSTM and Loss
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
cell = tf.contrib.rnn.BasicLSTMCell(num_units=rnn_hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
cell, X_one_hot, initial_state=initial_state, dtype=tf.float32)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)
prediction = tf.argmax(outputs, axis=2)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Training and Results
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(3000):
l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data})
result = sess.run(prediction, feed_dict={X: x_data})
# print char using dic
result_str = [idx2char[c] for c in np.squeeze(result)]
print(i, "loss:", l, "Prediction:", ''.join(result_str))
0 loss: 2.29895 Prediction: nnuffuunnuuuyuy
1 loss: 2.29675 Prediction: nnuffuunnuuuyuy
...
1418 loss: 1.37351 Prediction: if you want you
1419 loss: 1.37331 Prediction: if you want you
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Really long sentence?
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Really long sentence?
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Making dataset
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
dataX = []
dataY = []
for i in range(0, len(sentence) - seq_length):
x_str = sentence[i:i + seq_length]
y_str = sentence[i + 1: i + seq_length + 1]
print(i, x_str, '->', y_str)
x = [char_dic[c] for c in x_str] # x str to index
y = [char_dic[c] for c in y_str] # y str to index
dataX.append(x)
dataY.append(y)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
RNN parameters
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10 # Any arbitrary number
batch_size = len(dataX)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
LSTM and Loss
X = tf.placeholder(tf.int32, [None, sequence_length]) # X data
Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label
X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0
cell = tf.contrib.rnn.BasicLSTMCell(num_units=rnn_hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
cell, X_one_hot, initial_state=initial_state, dtype=tf.float32)
weights = tf.ones([batch_size, sequence_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)
prediction = tf.argmax(outputs, axis=2)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
Exercise
● Run long sequence RNN
● Why doesn't it work?
Lab 12-4
RNN with long sequences: Stacked
RNN + Softmax layer
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Really long sentence?
sentence = ("if you want to build a ship, don't drum up people together to "
"collect wood and don't assign them tasks and work, but rather "
"teach them to long for the endless immensity of the sea.")
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Making dataset
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
dataX = []
dataY = []
for i in range(0, len(sentence) - seq_length):
x_str = sentence[i:i + seq_length]
y_str = sentence[i + 1: i + seq_length + 1]
print(i, x_str, '->', y_str)
x = [char_dic[c] for c in x_str] # x str to index
y = [char_dic[c] for c in y_str] # y str to index
dataX.append(x)
dataY.append(y)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
RNN parameters
char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}
data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10 # Any arbitrary number
batch_size = len(dataX)
# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Wide & Deep
https://www.tensorflow.org/versions/r0.11/tutorials/wide_and_deep/index.html
Stacked RNN
X = tf.placeholder(tf.int32, [None, seq_length])
Y = tf.placeholder(tf.int32, [None, seq_length])
# One-hot encoding
X_one_hot = tf.one_hot(X, num_classes)
print(X_one_hot) # check out the shape
# Make a lstm cell with hidden_size (each unit output vector size)
cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)
cell = rnn.MultiRNNCell([cell] * 2, state_is_tuple=True)
# outputs: unfolding size x hidden size, state = hidden size
outputs, _states = tf.nn.dynamic_rnn(cell, X_one_hot, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
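Note: MultiRNNCell([cell] * 2) stacks two LSTM layers. Reusing the same cell object this way worked in TF 1.0, but later TensorFlow versions require a separate cell instance per layer.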
Softmax (FC) in Deep CNN
Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
Softmax
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
X_for_softmax = tf.reshape(outputs,
[-1, hidden_size])
outputs = tf.reshape(outputs,
[batch_size, seq_length, num_classes])
Softmax
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Softmax
# (optional) softmax layer
X_for_softmax = tf.reshape(outputs, [-1, hidden_size])
softmax_w = tf.get_variable("softmax_w",
                            [hidden_size, num_classes])
softmax_b = tf.get_variable("softmax_b", [num_classes])
outputs = tf.matmul(X_for_softmax, softmax_w) + softmax_b
outputs = tf.reshape(outputs,
[batch_size, seq_length, num_classes])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
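Reshaping to [-1, hidden_size] stacks every (batch, time) step into rows so one shared softmax weight applies to all time steps; the second reshape restores [batch_size, seq_length, num_classes] for sequence_loss.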
Loss
# reshape out for sequence_loss
outputs = tf.reshape(outputs,
[batch_size, seq_length, num_classes])
# All weights are 1 (equal weights)
weights = tf.ones([batch_size, seq_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
logits=outputs, targets=Y, weights=weights)
mean_loss = tf.reduce_mean(sequence_loss)
train_op = tf.train.AdamOptimizer(learning_rate=0.1).minimize(mean_loss)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
Training and print results
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(500):
_, l, results = sess.run(
[train_op, mean_loss, outputs],
feed_dict={X: dataX, Y: dataY})
for j, result in enumerate(results):
index = np.argmax(result, axis=1)
print(i, j, ''.join([char_set[t] for t in index]), l)
0 167 tttttttttt 3.23111
0 168 tttttttttt 3.23111
0 169 tttttttttt 3.23111
…
499 167 oof the se 0.229306
499 168 tf the sea 0.229306
499 169 n the sea. 0.229306
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
# Let's print the last char of each result to check it works
results = sess.run(outputs, feed_dict={X: dataX})
for j, result in enumerate(results):
index = np.argmax(result, axis=1)
if j == 0: # print all for the first result to make a sentence
print(''.join([char_set[t] for t in index]), end='')
else:
print(char_set[index[-1]], end='')
Training and print results
g you want to build a ship, don't drum up people together to collect wood and don't
assign them tasks and work, but rather teach them to long for the endless immensity
of the sea.
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
char-rnn
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
char/word rnn (char/word level n to n model)
https://github.com/sherjilozair/char-rnn-tensorflow
https://github.com/hunkim/word-rnn-tensorflow
Lab 12-5
Dynamic RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Different sequence length
h e l l o
h i
w h y
...
Different sequence length
h e l l o
h i <pad> <pad> <pad>
w h y <pad> <pad>
...
Different sequence length
h e l l o
h i
w h y
...
sequence_length=[5,2,3]
Dynamic RNN
# 3 batches 'hello', 'eolll', 'lleel'
x_data = np.array([[[...]]], dtype=np.float32)
hidden_size = 2
cell = rnn.BasicLSTMCell(num_units=hidden_size,
state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(
cell, x_data, sequence_length=[5,3,4],
dtype=tf.float32)
sess.run(tf.global_variables_initializer())
print(outputs.eval())
array([[[-0.17904168, -0.08053244],
[-0.01294809, 0.01660814],
[-0.05754048, -0.1368292 ],
[-0.08655578, -0.20553185],
[ 0.07297077, -0.21743253]],
[[ 0.10272847, 0.06519825],
[ 0.20188759, -0.05027055],
[ 0.09514933, -0.16452041],
[ 0. , 0. ],
[ 0. , 0. ]],
[[-0.04893036, -0.14655617],
[-0.07947272, -0.20996611],
[ 0.06466491, -0.02576563],
[ 0.15087658, 0.05166111],
[ 0. , 0. ]]],
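(dynamic_rnn zeroes the outputs past each sequence's given length, which is why the trailing rows above are [0., 0.])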
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
Lab 12-6
RNN with time series data (stock)
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
https://github.com/hunkim/DeepLearningZeroToAll/
Time series data
Time series data
Open High Low Volume Close
828.659973 833.450012 828.349976 1247700 831.659973
823.02002 828.070007 821.655029 1597800 828.070007
819.929993 824.400024 818.97998 1281700 824.159973
819.359985 823 818.469971 1304000 818.97998
819 823 816 1053600 820.450012
816 820.958984 815.48999 1198100 819.23999
811.700012 815.25 809.780029 1129100 813.669983
809.51001 810.659973 804.539978 989700 809.559998
807 811.840027 803.190002 1155300 808.380005
'data-02-stock_daily.csv'
Many to one
[Figure: many-to-one RNN — seven input steps (1–7) predict the value at step 8]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
Open High Low Volume Close
828.659973 833.450012 828.349976 1247700 831.659973
823.02002 828.070007 821.655029 1597800 828.070007
819.929993 824.400024 818.97998 1281700 824.159973
819.359985 823 818.469971 1304000 818.97998
819 823 816 1053600 820.450012
816 820.958984 815.48999 1198100 819.23999
811.700012 815.25 809.780029 1129100 813.669983
809.51001 810.659973 804.539978 989700 ?
807 811.840027 803.190002 1155300 ?
Reading data
timesteps = seq_length = 7
data_dim = 5
output_dim = 1
# Open, High, Low, Volume, Close
xy = np.loadtxt('data-02-stock_daily.csv', delimiter=',')
xy = xy[::-1] # reverse order (chronologically ordered)
xy = MinMaxScaler(xy)
x = xy
y = xy[:, [-1]] # Close as label
dataX = []
dataY = []
for i in range(0, len(y) - seq_length):
_x = x[i:i + seq_length]
_y = y[i + seq_length] # Next close price
print(_x, "->", _y)
dataX.append(_x)
dataY.append(_y)
[[ 0.18667876  0.20948057  0.20878184  0.          0.21744815]
 [ 0.30697388  0.31463414  0.21899367  0.01247647  0.21698189]
 [ 0.21914211  0.26390721  0.2246864   0.45632338  0.22496747]
 [ 0.23312993  0.23641916  0.16268272  0.57017119  0.14744274]
 [ 0.13431201  0.15175877  0.11617252  0.39380658  0.13289962]
 [ 0.13973232  0.17060429  0.15860382  0.28173344  0.18171679]
 [ 0.18933069  0.20057799  0.19187983  0.29783096  0.2086465 ]]
-> [ 0.14106001]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
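MinMaxScaler here is a small helper defined in the lab file, not sklearn's class; a minimal sketch consistent with this usage (column-wise scaling to [0, 1]):
import numpy as np

def MinMaxScaler(data):
    # scale each column to [0, 1]; the tiny epsilon avoids division by zero
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    return numerator / (denominator + 1e-7)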
Training and test datasets
# split to train and testing
train_size = int(len(dataY) * 0.7)
test_size = len(dataY) - train_size
trainX, testX = (np.array(dataX[0:train_size]),
                 np.array(dataX[train_size:len(dataX)]))
trainY, testY = (np.array(dataY[0:train_size]),
                 np.array(dataY[train_size:len(dataY)]))
# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
LSTM and Loss
# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_dim, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
Y_pred = tf.contrib.layers.fully_connected(
outputs[:, -1], output_dim, activation_fn=None)
# We use the last cell's output
# cost/loss
loss = tf.reduce_sum(tf.square(Y_pred - Y)) # sum of the squares
# optimizer
optimizer = tf.train.AdamOptimizer(0.01)
train = optimizer.minimize(loss)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
Training and Results
sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(1000):
_, l = sess.run([train, loss],
feed_dict={X: trainX, Y: trainY})
print(i, l)
testPredict = sess.run(Y_pred, feed_dict={X: testX})
import matplotlib.pyplot as plt
plt.plot(testY)
plt.plot(testPredict)
plt.show()
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
Exercise
● Implement stock prediction using linear regression only
● Improve results using more features such as keywords and/or
sentiments in top news
Other RNN applications
● Language Modeling
● Speech Recognition
● Machine Translation
● Conversation Modeling/Question Answering
● Image/Video Captioning
● Image/Music/Dance Generation
http://jiwonkim.org/awesome-rnn/
Google Cloud ML
Examples
Sung Kim <hunkim+ml@gmail.com>
https://github.com/hunkim/GoogleCloudMLExamples
Local TensorFlow Tasks
[Diagram: a TensorFlow task reading and writing the local disk]
Cloud ML TensorFlow Tasks
[Diagram: the same TensorFlow task running inside Google Cloud ML]
Setup your environment
https://cloud.google.com/ml/docs/how-tos/getting-set-up
Google Cloud Console
https://console.cloud.google.com/
https://cloud.google.com/ml/docs/how-tos/getting-set-up
Google Cloud commands
• gcloud: command-line interface to Google Cloud Platform
- Google Cloud ML jobs (`gcloud beta ml`)
- Google Compute Engine virtual machine instances and other resources
- Google Cloud Dataproc clusters and jobs
- Google Cloud Deployment manager deployments
- …
• gsutil: command-line interface to Google Cloud Storage
https://cloud.google.com/sdk/gcloud/
https://cloud.google.com/storage/docs/gsutil
Example
Example git repository
git clone https://github.com/hunkim/GoogleCloudMLExamples.git
Simple Multiplication
Run locally
Run on Cloud ML
Machine Learning Console
Jobs
Jobs/Task
Jobs/task7/logs
Input Example
https://www.tensorflow.org/versions/r0.11/how_tos/reading_data/index.html
CSV File
Reading
Run locally
Cloud ML TensorFlow Tasks
[Diagram: the TensorFlow task running inside Google Cloud ML]
Setting and file copy
JOB_NAME="task9"
PROJECT_ID=`gcloud config list project --format "value(core.project)"`
STAGING_BUCKET=gs://${PROJECT_ID}-ml
INPUT_PATH=${STAGING_BUCKET}/input
gsutil cp input/input.csv $INPUT_PATH/input.csv
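Submitting the packaged trainer used the era's beta command; a sketch only — the package path, module name, and region are assumptions, and the exact flags may differ across SDK versions:
gcloud beta ml jobs submit training ${JOB_NAME} \
  --package-path=trainer \
  --module-name=trainer.task \
  --staging-bucket=${STAGING_BUCKET} \
  --region=us-central1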
Google Storage
Run on Cloud ML
Jobs
Logs
Output Example
TensorFlow Saver
Local Run
Configuration
Create/Check the output folder
Run on Cloud ML
Job completed
Generated checkpoint files
With Great Power Comes Great Responsibility
Check your bills!
Next
• Cloud ML deploy
• Hyper-parameter tuning
• Distributed training tasks
Tensor flow description of ML Lab. document

  • 1. Lab 1 TensorFlow Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 2. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 3. Lab 1 TensorFlow Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 5. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 6. Lab 1 TensorFlow Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 10. TensorFlow ● TensorFlow™ is an open source software library for numerical computation using data flow graphs. ● Python! https://www.tensorflow.org/
  • 11. What is a Data Flow Graph? ● Nodes in the graph represent mathematical operations ● Edges represent the multidimensional data arrays (tensors) communicated between them. https://www.tensorflow.org/
  • 12. Installing TensorFlow ● Linux, Max OSX, Windows • (sudo -H) pip install --upgrade tensorflow • (sudo -H) pip install --upgrade tensorflow-gpu ● From source • bazel ... • https://www.tensorflow.org/install/install_sources ● Google search/Community help • https://www.facebook.com/groups/TensorFlowKR/ https://www.tensorflow.org/install/
  • 13. Check installation and version Sungs-MacBook-Pro:hunkim$ python3 Python 3.6.0 (v3.6.0:41df79263a11, Dec 22 2016, 17:23:13) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import tensorflow as tf >>> tf.__version__ '1.0.0' >>>
  • 15. TensorFlow Hello World! https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb b’String’ ‘b’ indicates Bytes literals. http://stackoverflow.com/questions/6269765/
  • 17. TensorFlow Mechanics feed data and run graph (operation) sess.run (op) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 18. Computational Graph https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-01-basics.ipynb (1) Build graph (tensors) using TensorFlow operations (2) feed data and run graph (operation) sess.run (op) (3) update variables in the graph (and return values)
  • 20. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 21. Everything is Tensor t = tf.Constant([1., 2., 3.])
  • 22. Tensor Ranks, Shapes, and Types https://www.tensorflow.org/programmers_guide/dims_types
  • 23. Tensor Ranks, Shapes, and Types https://www.tensorflow.org/programmers_guide/dims_types
  • 24. Tensor Ranks, Shapes, and Types https://www.quora.com/When-should-I-use-tf-float32-vs-tf-float64-in-TensorFlow ...
  • 25. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 26. Lab 2 Linear Regression Sung Kim <hunkim+ml@gmail.com>
  • 29. Lab 2 Linear Regression Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 30. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 31. Lab 2 Linear Regression Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 34. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations
  • 35. Build graph using TF operations # X and Y data x_train = [1, 2, 3] y_train = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Our hypothesis XW+b hypothesis = x_train * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - y_train))
  • 36. Build graph using TF operations # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - y_train)) GradientDescent https://www.tensorflow.org/api_docs/python/tf/reduce_mean
  • 37. Run/update graph and get results # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line for step in range(2001): sess.run(train) if step % 20 == 0: print(step, sess.run(cost), sess.run(W), sess.run(b))
  • 38. import tensorflow as tf # X and Y data x_train = [1, 2, 3] y_train = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Our hypothesis XW+b hypothesis = x_train * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - y_train)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line for step in range(2001): sess.run(train) if step % 20 == 0: print(step, sess.run(cost), sess.run(W), sess.run(b)) ''' 0 2.82329 [ 2.12867713] [-0.85235667] 20 0.190351 [ 1.53392804] [-1.05059612] 40 0.151357 [ 1.45725465] [-1.02391243] ... 1920 1.77484e-05 [ 1.00489295] [-0.01112291] 1940 1.61197e-05 [ 1.00466311] [-0.01060018] 1960 1.46397e-05 [ 1.004444] [-0.01010205] 1980 1.32962e-05 [ 1.00423515] [-0.00962736] 2000 1.20761e-05 [ 1.00403607] [-0.00917497] ''' Full code (less than 20 lines)
  • 40. Placeholders # X and Y data x_train = [1, 2, 3] y_train = [1, 2, 3] # Now we can use X and Y in place of x_data and y_data # # placeholders for a tensor that will be always fed using feed_dict # See http://stackoverflow.com/questions/36693740/ X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) ... # Fit the line # Fit the line for step in range(2001): cost_val, W_val, b_val, _ = sess.run([cost, W, b, train], feed_dict={X: [1, 2, 3], Y: [1, 2, 3]}) if step % 20 == 0: print(step, cost_val, W_val, b_val)
  • 41. import tensorflow as tf W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') X = tf.placeholder(tf.float32, shape=[None]) Y = tf.placeholder(tf.float32, shape=[None]) # Our hypothesis XW+b hypothesis = X * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line for step in range(2001): cost_val, W_val, b_val, _ = sess.run([cost, W, b, train], feed_dict={X: [1, 2, 3], Y: [1, 2, 3]}) if step % 20 == 0: print(step, cost_val, W_val, b_val) ... 1980 1.32962e-05 [ 1.00423515] [-0.00962736] 2000 1.20761e-05 [ 1.00403607] [-0.00917497] # Testing our model print(sess.run(hypothesis, feed_dict={X: [5]})) print(sess.run(hypothesis, feed_dict={X: [2.5]})) print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]})) [ 5.0110054] [ 2.50091505] [ 1.49687922 3.50495124] Full code with placeholders
  • 42. import tensorflow as tf W = tf.Variable(tf.random_normal([1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') X = tf.placeholder(tf.float32, shape=[None]) Y = tf.placeholder(tf.float32, shape=[None]) # Our hypothesis XW+b hypothesis = X * W + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Fit the line with new training data for step in range(2001): cost_val, W_val, b_val, _ = sess.run([cost, W, b, train], feed_dict={X: [1, 2, 3, 4, 5], Y: [2.1, 3.1, 4.1, 5.1, 6.1]}) if step % 20 == 0: print(step, cost_val, W_val, b_val) … 1960 3.32396e-07 [ 1.00037301] [ 1.09865296] 1980 2.90429e-07 [ 1.00034881] [ 1.09874094] 2000 2.5373e-07 [ 1.00032604] [ 1.09882331] # Testing our model print(sess.run(hypothesis, feed_dict={X: [5]})) print(sess.run(hypothesis, feed_dict={X: [2.5]})) print(sess.run(hypothesis, feed_dict={X: [1.5, 3.5]})) [ 6.10045338] [ 3.59963846] [ 2.59931231 4.59996414] Full code with placeholders
  • 43. TensorFlow Mechanics feed data and run graph (operation) sess.run (op, feed_dict={x: x_data}) update variables in the graph (and return values) Build graph using TensorFlow operations feed_dict={X: [1, 2, 3, 4, 5], Y: [2.1, 3.1, 4.1, 5.1, 6.1]})
  • 44. Lab 3 Minimizing Cost Sung Kim <hunkim+ml@gmail.com> With TF 1.0!
  • 45. Lab 3 Minimizing Cost With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 46. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 47. Lab 3 Minimizing Cost With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 50. import tensorflow as tf import matplotlib.pyplot as plt X = [1, 2, 3] Y = [1, 2, 3] W = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Variables for plotting cost function W_val = [] cost_val = [] for i in range(-30, 50): feed_W = i * 0.1 curr_cost, curr_W = sess.run([cost, W], feed_dict={W: feed_W}) W_val.append(curr_W) cost_val.append(curr_cost) # Show the cost function plt.plot(W_val, cost_val) plt.show() https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-1-minimizing_cost_show_graph.py http://matplotlib.org/users/installing.html
  • 51. W cost (W) import tensorflow as tf import matplotlib.pyplot as plt X = [1, 2, 3] Y = [1, 2, 3] W = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Variables for plotting cost function W_val = [] cost_val = [] for i in range(-30, 50): feed_W = i * 0.1 curr_cost, curr_W = sess.run([cost, W], feed_dict={W: feed_W}) W_val.append(curr_W) cost_val.append(curr_cost) # Show the cost function plt.plot(W_val, cost_val) plt.show() https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-1-minimizing_cost_show_graph.py
  • 53. W cost (W) Gradient descent # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent)
  • 54. import tensorflow as tf x_data = [1, 2, 3] y_data = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_sum(tf.square(hypothesis - Y)) # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(21): sess.run(update, feed_dict={X: x_data, Y: y_data}) print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py
  • 55. import tensorflow as tf x_data = [1, 2, 3] y_data = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_sum(tf.square(hypothesis - Y)) # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(21): sess.run(update, feed_dict={X: x_data, Y: y_data}) print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py 0 5.81756 [ 1.64462376] 1 1.65477 [ 1.34379935] 2 0.470691 [ 1.18335962] 3 0.133885 [ 1.09779179] 4 0.0380829 [ 1.05215561] 5 0.0108324 [ 1.0278163] 6 0.00308123 [ 1.01483536] 7 0.000876432 [ 1.00791216] 8 0.00024929 [ 1.00421977] 9 7.09082e-05 [ 1.00225055] 10 2.01716e-05 [ 1.00120032] 11 5.73716e-06 [ 1.00064015] 12 1.6319e-06 [ 1.00034142] 13 4.63772e-07 [ 1.00018203] 14 1.31825e-07 [ 1.00009704] 15 3.74738e-08 [ 1.00005174] 16 1.05966e-08 [ 1.00002754] 17 2.99947e-09 [ 1.00001466] 18 8.66635e-10 [ 1.00000787] 19 2.40746e-10 [ 1.00000417] 20 7.02158e-11 [ 1.00000226]
  • 56. # Minimize: Gradient Descent Magic optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) train = optimizer.minimize(cost) import tensorflow as tf x_data = [1, 2, 3] y_data = [1, 2, 3] W = tf.Variable(tf.random_normal([1]), name='weight') X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) # Our hypothesis for linear model X * W hypothesis = X * W # cost/loss function cost = tf.reduce_sum(tf.square(hypothesis - Y)) # Minimize: Gradient Descent using derivative: W -= learning_rate * derivative learning_rate = 0.1 gradient = tf.reduce_mean((W * X - Y) * X) descent = W - learning_rate * gradient update = W.assign(descent) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(21): sess.run(update, feed_dict={X: x_data, Y: y_data}) print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-2-minimizing_cost_gradient_update.py
  • 57. Output when W=5 import tensorflow as tf # tf Graph Input X = [1, 2, 3] Y = [1, 2, 3] # Set wrong model weights W = tf.Variable(5.0) # Linear model hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize: Gradient Descent Magic optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(100): print(step, sess.run(W)) sess.run(train) 0 5.0 1 1.26667 2 1.01778 3 1.00119 4 1.00008 5 1.00001 6 1.0 7 1.0 8 1.0 9 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
  • 58. Output when W=-3 0 -3.0 1 0.733334 2 0.982222 3 0.998815 4 0.999921 5 0.999995 6 1.0 7 1.0 8 1.0 9 1.0 import tensorflow as tf # tf Graph Input X = [1, 2, 3] Y = [1, 2, 3] # Set wrong model weights W = tf.Variable(-3.0) # Linear model hypothesis = X * W # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize: Gradient Descent Magic optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(100): print(step, sess.run(W)) sess.run(train) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-3-minimizing_cost_tf_optimizer.py
  • 59. import tensorflow as tf X = [1, 2, 3] Y = [1, 2, 3] # Set wrong model weights W = tf.Variable(5.) # Linear model hypothesis = X * W # Manual gradient gradient = tf.reduce_mean((W * X - Y) * X) * 2 # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01) # Get gradients gvs = optimizer.compute_gradients(cost, [W]) # Apply gradients apply_gradients = optimizer.apply_gradients(gvs) # Launch the graph in a session. sess = tf.Session() sess.run(tf.global_variables_initializer()) for step in range(100): print(step, sess.run([gradient, W, gvs])) sess.run(apply_gradients) Optional: compute_gradient and apply_gradient 0 [37.333332, 5.0, [(37.333336, 5.0)]] 1 [33.848888, 4.6266665, [(33.848888, 4.6266665)]] 2 [30.689657, 4.2881775, [(30.689657, 4.2881775)]] 3 [27.825287, 3.9812808, [(27.825287, 3.9812808)]] 4 [25.228262, 3.703028, [(25.228264, 3.703028)]] ... 96 [0.0030694802, 1.0003289, [(0.0030694804, 1.0003289)]] 97 [0.0027837753, 1.0002983, [(0.0027837753, 1.0002983)]] 98 [0.0025234222, 1.0002704, [(0.0025234222, 1.0002704)]] 99 [0.0022875469, 1.0002451, [(0.0022875469, 1.0002451)]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-03-X-minimizing_cost_tf_gradient.py
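A common reason to split compute_gradients and apply_gradients, rather than calling minimize, is to transform the gradients before they are applied. A minimal sketch, assuming the same TF 1.x setup as above, that clips each gradient into [-1, 1] (the clipping range is illustrative, not part of the lab code):

import tensorflow as tf

X = [1, 2, 3]
Y = [1, 2, 3]
W = tf.Variable(5.)
hypothesis = X * W
cost = tf.reduce_mean(tf.square(hypothesis - Y))

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
gvs = optimizer.compute_gradients(cost, [W])
# Clip every (gradient, variable) pair before applying it
capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
apply_gradients = optimizer.apply_gradients(capped_gvs)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(100):
    sess.run(apply_gradients)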
  • 60. Lab 4 Multi-variable linear regression Sung Kim <hunkim+ml@gmail.com> With TF 1.0!
  • 61. Lab 4 Multi-variable linear regression With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 62. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 63. Lab 4-1 Multi-variable linear regression With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
• 65. Hypothesis using matrix (Test Scores for General Psychology)
x1  x2  x3  |  Y
73  80  75  | 152
93  88  93  | 185
89  91  90  | 180
96  98  100 | 196
73  66  70  | 142
  • 66. Hypothesis using matrix x1 x2 x3 Y 73 80 75 152 93 88 93 185 89 91 90 180 96 98 100 196 73 66 70 142 Test Scores for General Psychology x1_data = [73., 93., 89., 96., 73.] x2_data = [80., 88., 91., 98., 66.] x3_data = [75., 93., 90., 100., 70.] y_data = [152., 185., 180., 196., 142.] # placeholders for a tensor that will be always fed. x1 = tf.placeholder(tf.float32) x2 = tf.placeholder(tf.float32) x3 = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) w1 = tf.Variable(tf.random_normal([1]), name='weight1') w2 = tf.Variable(tf.random_normal([1]), name='weight2') w3 = tf.Variable(tf.random_normal([1]), name='weight3') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
• 67. import tensorflow as tf x1_data = [73., 93., 89., 96., 73.] x2_data = [80., 88., 91., 98., 66.] x3_data = [75., 93., 90., 100., 70.] y_data = [152., 185., 180., 196., 142.] # placeholders for a tensor that will be always fed. x1 = tf.placeholder(tf.float32) x2 = tf.placeholder(tf.float32) x3 = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) w1 = tf.Variable(tf.random_normal([1]), name='weight1') w2 = tf.Variable(tf.random_normal([1]), name='weight2') w3 = tf.Variable(tf.random_normal([1]), name='weight3') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = x1 * w1 + x2 * w2 + x3 * w3 + b # cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize. Need a very small learning rate for this data set optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run([cost, hypothesis, train], feed_dict={x1: x1_data, x2: x2_data, x3: x3_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) 0 Cost: 19614.8 Prediction: [ 21.69748688 39.10213089 31.82624626 35.14236832 32.55316544] 10 Cost: 14.0682 Prediction: [ 145.56100464 187.94958496 178.50236511 194.86721802 146.08096313] ... 1990 Cost: 4.9197 Prediction: [ 148.15084839 186.88632202 179.6293335 195.81796265 144.46044922] 2000 Cost: 4.89449 Prediction: [ 148.15931702 186.8805542 179.63194275 195.81971741 144.45298767] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-1-multi_variable_linear_regression.py
  • 69. x_data = [[73., 80., 75.], [93., 88., 93.], [89., 91., 90.], [96., 98., 100.], [73., 66., 70.]] y_data = [[152.], [185.], [180.], [196.], [142.]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py Matrix
• 70. import tensorflow as tf x_data = [[73., 80., 75.], [93., 88., 93.], [89., 91., 90.], [96., 98., 100.], [73., 66., 70.]] y_data = [[152.], [185.], [180.], [196.], [142.]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b # Simplified cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) 0 Cost: 7105.46 Prediction: [[ 80.82241058] [ 92.26364136] [ 93.70250702] [ 98.09217834] [ 72.51759338]] 10 Cost: 5.89726 Prediction: [[ 155.35159302] [ 181.85691833] [ 181.97254944] [ 194.21760559] [ 140.85707092]] ... 1990 Cost: 3.18588 Prediction: [[ 154.36352539] [ 182.94833374] [ 181.85189819] [ 194.35585022] [ 142.03240967]] 2000 Cost: 3.1781 Prediction: [[ 154.35881042] [ 182.95147705] [ 181.85035706] [ 194.35533142] [ 142.036026 ]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-2-multi_variable_matmul_linear_regression.py
  • 71. Lab 4-2 Loading Data from File With TF 1.0! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/
  • 73. Loading data from file data-01-test-score.csv # EXAM1,EXAM2,EXAM3,FINAL 73,80,75,152 93,88,93,185 89,91,90,180 96,98,100,196 73,66,70,142 53,46,55,101 import numpy as np xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # Make sure the shape and data are OK print(x_data.shape, x_data, len(x_data)) print(y_data.shape, y_data) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
• 77. import tensorflow as tf import numpy as np tf.set_random_seed(777) # for reproducibility xy = np.loadtxt('data-01-test-score.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # Make sure the shape and data are OK print(x_data.shape, x_data, len(x_data)) print(y_data.shape, y_data) # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b # Simplified cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Set up feed_dict variables inside the loop. for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) # Ask my score print("Your score will be ", sess.run(hypothesis, feed_dict={X: [[100, 70, 101]]})) print("Other scores will be ", sess.run(hypothesis, feed_dict={X: [[60, 70, 110], [90, 100, 80]]})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py
• 78. Your score will be [[ 181.73277283]] Other scores will be [[ 145.86265564] [ 187.23129272]] Output https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-3-file_input_linear_regression.py # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Set up feed_dict variables inside the loop. for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) # Ask my score print("Your score will be ", sess.run(hypothesis, feed_dict={X: [[100, 70, 101]]})) print("Other scores will be ", sess.run(hypothesis, feed_dict={X: [[60, 70, 110], [90, 100, 80]]}))
  • 80. filename_queue = tf.train.string_input_producer( ['data-01-test-score.csv', 'data-02-test-score.csv', ... ], shuffle=False, name='filename_queue') reader = tf.TextLineReader() key, value = reader.read(filename_queue) record_defaults = [[0.], [0.], [0.], [0.]] xy = tf.decode_csv(value, record_defaults=record_defaults) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
  • 81. tf.train.batch # collect batches of csv in train_x_batch, train_y_batch = tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10) sess = tf.Session() ... # Start populating the filename queue. coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(sess=sess, coord=coord) for step in range(2001): x_batch, y_batch = sess.run([train_x_batch, train_y_batch]) ... coord.request_stop() coord.join(threads) https://www.tensorflow.org/programmers_guide/reading_data
• 82. import tensorflow as tf filename_queue = tf.train.string_input_producer( ['data-01-test-score.csv'], shuffle=False, name='filename_queue') reader = tf.TextLineReader() key, value = reader.read(filename_queue) # Default values, in case of empty columns. Also specifies the type of the # decoded result. record_defaults = [[0.], [0.], [0.], [0.]] xy = tf.decode_csv(value, record_defaults=record_defaults) # collect batches of csv in train_x_batch, train_y_batch = tf.train.batch([xy[0:-1], xy[-1:]], batch_size=10) # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 3]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([3, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis hypothesis = tf.matmul(X, W) + b # Simplified cost/loss function cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) # Launch the graph in a session. sess = tf.Session() # Initializes global variables in the graph. sess.run(tf.global_variables_initializer()) # Start populating the filename queue. coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(sess=sess, coord=coord) for step in range(2001): x_batch, y_batch = sess.run([train_x_batch, train_y_batch]) cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_batch, Y: y_batch}) if step % 10 == 0: print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) coord.request_stop() coord.join(threads) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-04-4-tf_reader_linear_regression.py
  • 83. shuffle_batch # min_after_dequeue defines how big a buffer we will randomly sample # from -- bigger means better shuffling but slower start up and more # memory used. # capacity must be larger than min_after_dequeue and the amount larger # determines the maximum we will prefetch. Recommendation: # min_after_dequeue + (num_threads + a small safety margin) * batch_size min_after_dequeue = 10000 capacity = min_after_dequeue + 3 * batch_size example_batch, label_batch = tf.train.shuffle_batch( [example, label], batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue) https://www.tensorflow.org/programmers_guide/reading_data
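The shuffle_batch snippet above drops into the same reader pipeline from slide 80 in place of tf.train.batch. A sketch of the wiring, with illustrative buffer sizes (the lab files themselves use tf.train.batch):

import tensorflow as tf

filename_queue = tf.train.string_input_producer(
    ['data-01-test-score.csv'], shuffle=False, name='filename_queue')
reader = tf.TextLineReader()
key, value = reader.read(filename_queue)
xy = tf.decode_csv(value, record_defaults=[[0.], [0.], [0.], [0.]])

batch_size = 10
min_after_dequeue = 100  # illustrative; bigger buffers shuffle better
capacity = min_after_dequeue + 3 * batch_size
# Same interface as tf.train.batch, but examples are dequeued in shuffled order
train_x_batch, train_y_batch = tf.train.shuffle_batch(
    [xy[0:-1], xy[-1:]], batch_size=batch_size,
    capacity=capacity, min_after_dequeue=min_after_dequeue)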
  • 84. Lab 5 Logistic (regression) classifier Sung Kim <hunkim+ml@gmail.com>
  • 85. Lab 5 Logistic (regression) classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 86. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 87. Lab 5 Logistic (regression) classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 90. Training Data x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]] y_data = [[0], [0], [0], [1], [1], [1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 2]) Y = tf.placeholder(tf.float32, shape=[None, 1]) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
  • 91. X = tf.placeholder(tf.float32, shape=[None, 2]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W) + b)) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
• 92. Train the model # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data}) if step % 200 == 0: print(step, cost_val) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
• 93. x_data = [[1, 2], [2, 3], [3, 1], [4, 3], [5, 3], [6, 2]] y_data = [[0], [0], [0], [1], [1], [1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 2]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): cost_val, _ = sess.run([cost, train], feed_dict={X: x_data, Y: y_data}) if step % 200 == 0: print(step, cost_val) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a) # step, cost 0 1.73078 200 0.571512 400 0.507414 ... 9600 0.154132 9800 0.151778 10000 0.149496 Hypothesis: [[ 0.03074029] [ 0.15884677] [ 0.30486736] [ 0.78138196] [ 0.93957496] [ 0.98016882]] Correct (Y): [[ 0.] [ 0.] [ 0.] [ 1.] [ 1.] [ 1.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-1-logistic_regression.py
  • 94. Classifying diabetes xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
• 95. xy = np.loadtxt('data-03-diabetes.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 8]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([8, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) feed = {X: x_data, Y: y_data} for step in range(10001): sess.run(train, feed_dict=feed) if step % 200 == 0: print(step, sess.run(cost, feed_dict=feed)) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict=feed) print("\nHypothesis: ", h, "\nCorrect (Y): ", c, "\nAccuracy: ", a) 0 0.82794 200 0.755181 400 0.726355 600 0.705179 800 0.686631 ... 9600 0.492056 9800 0.491396 10000 0.490767 [ 0.7461012 ] [ 0.79919308] [ 0.72995949] [ 0.88297188]] [ 1.] [ 1.] [ 1.]] Accuracy: 0.762846 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-05-2-logistic_regression_diabetes.py
  • 96. Exercise ● CSV reading using tf.decode_csv ● Try other classification data from Kaggle ○ https://www.kaggle.com
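As a starting point for the tf.decode_csv exercise, a sketch for a hypothetical two-feature Kaggle-style CSV (the file name, header, and column layout are assumptions, not a real dataset):

import tensorflow as tf

# Hypothetical CSV columns: feature1, feature2, label
filename_queue = tf.train.string_input_producer(
    ['my-kaggle-data.csv'], shuffle=False, name='filename_queue')
reader = tf.TextLineReader(skip_header_lines=1)  # skip the header row
key, value = reader.read(filename_queue)
record_defaults = [[0.], [0.], [0.]]
f1, f2, label = tf.decode_csv(value, record_defaults=record_defaults)
# Batch features and labels exactly as in the lab reader pipeline
x_batch, y_batch = tf.train.batch([[f1, f2], [label]], batch_size=10)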
  • 97. Lab 6 Softmax classifier Sung Kim <hunkim+ml@gmail.com>
  • 98. Lab 6 Softmax Classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 99. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 100. Lab 6-1 Softmax Classifier Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 104. Cost function: cross entropy # Cross entropy cost/loss cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) https://www.udacity.com/course/viewer#!/c-ud730/l-6370362152/m-6379811817
  • 105. x_data = [[1, 2, 1, 1], [2, 1, 3, 2], [3, 1, 3, 4], [4, 1, 5, 5], [1, 7, 5, 5], [1, 2, 5, 6], [1, 6, 6, 6], [1, 7, 7, 7]] y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]] X = tf.placeholder("float", [None, 4]) Y = tf.placeholder("float", [None, 3]) nb_classes = 3 W = tf.Variable(tf.random_normal([4, nb_classes]), name='weight') b = tf.Variable(tf.random_normal([nb_classes]), name='bias') # tf.nn.softmax computes softmax activations # softmax = exp(logits) / reduce_sum(exp(logits), dim) hypothesis = tf.nn.softmax(tf.matmul(X, W) + b) # Cross entropy cost/loss cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for step in range(2001): sess.run(optimizer, feed_dict={X: x_data, Y: y_data}) if step % 200 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
  • 106. Test & one-hot encoding # Testing & One-hot encoding a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9]]}) print(a, sess.run(tf.arg_max(a, 1))) hypothesis = tf.nn.softmax(tf.matmul(X,W)+b) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py [[ 1.38904958e-03 9.98601854e-01 9.06129117e-06]] [1]
  • 107. Test & one-hot encoding all = sess.run(hypothesis, feed_dict={X: [[1, 11, 7, 9], [1, 3, 4, 3], [1, 1, 0, 1]]}) print(all, sess.run(tf.arg_max(all, 1))) [[ 1.38904958e-03 9.98601854e-01 9.06129117e-06] [ 9.31192040e-01 6.29020557e-02 5.90589503e-03] [ 1.27327668e-08 3.34112905e-04 9.99665856e-01]] [1 0 2] hypothesis = tf.nn.softmax(tf.matmul(X,W)+b) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-1-softmax_classifier.py
  • 108. Lab 6-2 Fancy Softmax Classifier cross_entropy, one_hot, reshape Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 110. softmax_cross_entropy_with_logits logits = tf.matmul(X, W) + b hypothesis = tf.nn.softmax(logits) # Cross entropy cost/loss cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) # Cross entropy cost/loss cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot) cost = tf.reduce_mean(cost_i)
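The two cost formulations should agree numerically. A small sketch that checks this on made-up logits and a one-hot label (the values are illustrative):

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1]])
labels = tf.constant([[1.0, 0.0, 0.0]])  # one-hot

hypothesis = tf.nn.softmax(logits)
cost_manual = tf.reduce_mean(-tf.reduce_sum(labels * tf.log(hypothesis), axis=1))
cost_builtin = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))

with tf.Session() as sess:
    print(sess.run([cost_manual, cost_builtin]))  # both come out ~0.417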
  • 113. Animal classification with softmax_cross_entropy_with_logits https://kr.pinterest.com/explore/animal-classification-activity/ # Predicting animal type based on various features xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]]
  • 114. tf.one_hot and reshape Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6, shape=(?, 1) Y_one_hot = tf.one_hot(Y, nb_classes) # one hot shape=(?, 1, 7) Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes]) # shape=(?, 7) If the input indices is rank N, the output will have rank N+1. The new axis is created at dimension axis (default: the new axis is appended at the end). https://www.tensorflow.org/api_docs/python/tf/one_hot
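A tiny sketch of that rank bump and the reshape fix, on made-up labels:

import tensorflow as tf

nb_classes = 7
Y = tf.constant([[0], [3]])                          # shape (2, 1), rank 2
Y_one_hot = tf.one_hot(Y, nb_classes)                # shape (2, 1, 7), rank 3
Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes])  # back to shape (2, 7)

with tf.Session() as sess:
    print(sess.run(Y_one_hot))
    # [[1. 0. 0. 0. 0. 0. 0.]
    #  [0. 0. 0. 1. 0. 0. 0.]]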
  • 115. # Predicting animal type based on various features xy = np.loadtxt('data-04-zoo.csv', delimiter=',', dtype=np.float32) x_data = xy[:, 0:-1] y_data = xy[:, [-1]] nb_classes = 7 # 0 ~ 6 X = tf.placeholder(tf.float32, [None, 16]) Y = tf.placeholder(tf.int32, [None, 1]) # 0 ~ 6 Y_one_hot = tf.one_hot(Y, nb_classes) # one hot Y_one_hot = tf.reshape(Y_one_hot, [-1, nb_classes]) W = tf.Variable(tf.random_normal([16, nb_classes]), name='weight') b = tf.Variable(tf.random_normal([nb_classes]), name='bias') # tf.nn.softmax computes softmax activations # softmax = exp(logits) / reduce_sum(exp(logits), dim) logits = tf.matmul(X, W) + b hypothesis = tf.nn.softmax(logits) # Cross entropy cost/loss cost_i = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y_one_hot) cost = tf.reduce_mean(cost_i) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
• 116. cost = tf.reduce_mean(cost_i) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) prediction = tf.argmax(hypothesis, 1) correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for step in range(2000): sess.run(optimizer, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: loss, acc = sess.run([cost, accuracy], feed_dict={ X: x_data, Y: y_data}) print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format( step, loss, acc)) # Let's see if we can predict pred = sess.run(prediction, feed_dict={X: x_data}) # y_data: (N,1) = flatten => (N, ) matches pred.shape for p, y in zip(pred, y_data.flatten()): print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y))) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
• 117. cost = tf.reduce_mean(cost_i) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) prediction = tf.argmax(hypothesis, 1) correct_prediction = tf.equal(prediction, tf.argmax(Y_one_hot, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) # Launch graph with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for step in range(2000): sess.run(optimizer, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: loss, acc = sess.run([cost, accuracy], feed_dict={ X: x_data, Y: y_data}) print("Step: {:5}\tLoss: {:.3f}\tAcc: {:.2%}".format( step, loss, acc)) # Let's see if we can predict pred = sess.run(prediction, feed_dict={X: x_data}) # y_data: (N,1) = flatten => (N, ) matches pred.shape for p, y in zip(pred, y_data.flatten()): print("[{}] Prediction: {} True Y: {}".format(p == int(y), p, int(y))) Step: 1100 Loss: 0.101 Acc: 99.01% Step: 1200 Loss: 0.092 Acc: 100.00% Step: 1300 Loss: 0.084 Acc: 100.00% ... [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 3 True Y: 3 [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 0 True Y: 0 [True] Prediction: 3 True Y: 3 [True] Prediction: 3 True Y: 3 [True] Prediction: 0 True Y: 0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-06-2-softmax_zoo_classifier.py
  • 118. Lab 7 Learning rate, Evaluation Sung Kim <hunkim+ml@gmail.com>
  • 119. Lab 7-1 Learning rate, Evaluation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 120. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 121. Lab 7-1 Learning rate, Evaluation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
• 123. Training and Test datasets x_data = [[1, 2, 1], [1, 3, 2], [1, 3, 4], [1, 5, 5], [1, 7, 5], [1, 2, 5], [1, 6, 6], [1, 7, 7]] y_data = [[0, 0, 1], [0, 0, 1], [0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]] # Evaluate our model using this test dataset x_test = [[2, 1, 1], [3, 1, 2], [3, 3, 4]] y_test = [[0, 0, 1], [0, 0, 1], [0, 0, 1]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
  • 124. X = tf.placeholder("float", [None, 3]) Y = tf.placeholder("float", [None, 3]) W = tf.Variable(tf.random_normal([3, 3])) b = tf.Variable(tf.random_normal([3])) hypothesis = tf.nn.softmax(tf.matmul(X, W)+b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Correct prediction Test model prediction = tf.arg_max(hypothesis, 1) is_correct = tf.equal(prediction, tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(201): cost_val, W_val, _ = sess.run([cost, W, optimizer], feed_dict={X: x_data, Y: y_data}) print(step, cost_val, W_val) # predict print("Prediction:", sess.run(prediction, feed_dict={X: x_test})) # Calculate the accuracy print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test})) 199 0.672261 [[-1.15377033 0.28146935 1.13632679] [ 0.37484586 0.18958236 0.33544877] [-0.35609841 -0.43973011 -1.25604188]] 200 0.670909 [[-1.15885413 0.28058422 1.14229572] [ 0.37609792 0.19073224 0.33304682] [-0.35536593 -0.44033223 -1.2561723 ]] Prediction: [2 2 2] Accuracy: 1.0 Training and Test datasets https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
  • 126. Big learning rate 2 27.2798 [[ 0.44451016 0.85699677 -1.03748143] [ 0.48429942 0.98872018 -0.57314301] [ 1.52989244 1.16229868 -4.74406147]] 3 8.668 [[ 0.12396193 0.61504567 -0.47498202] [ 0.22003263 -0.2470119 0.9268558 ] [ 0.96035379 0.41933775 -3.43156195]] 4 5.77111 [[-0.9524312 1.13037777 0.08607888] [-3.78651619 2.26245379 2.42393875] [-3.07170963 3.14037919 -2.12054014]] 5 inf [[ nan nan nan] [ nan nan nan] [ nan nan nan]] 6 nan [[ nan nan nan] [ nan nan nan] [ nan nan nan]] ... Prediction: [0 0 0] Accuracy: 0.0 X = tf.placeholder("float", [None, 3]) Y = tf.placeholder("float", [None, 3]) W = tf.Variable(tf.random_normal([3, 3])) b = tf.Variable(tf.random_normal([3])) hypothesis = tf.nn.softmax(tf.matmul(X, W)+b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer (learning_rate=1.5).minimize(cost) # Correct prediction Test model prediction = tf.arg_max(hypothesis, 1) is_correct = tf.equal(prediction, tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(201): cost_val, W_val, _ = sess.run([cost, W, optimizer], feed_dict={X: x_data, Y: y_data}) print(step, cost_val, W_val) # predict print("Prediction:", sess.run(prediction, feed_dict={X: x_test})) # Calculate the accuracy print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py
  • 127. Small learning rate X = tf.placeholder("float", [None, 3]) Y = tf.placeholder("float", [None, 3]) W = tf.Variable(tf.random_normal([3, 3])) b = tf.Variable(tf.random_normal([3])) hypothesis = tf.nn.softmax(tf.matmul(X, W)+b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer (learning_rate=1e-10).minimize(cost) # Correct prediction Test model prediction = tf.arg_max(hypothesis, 1) is_correct = tf.equal(prediction, tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(201): cost_val, W_val, _ = sess.run([cost, W, optimizer], feed_dict={X: x_data, Y: y_data}) print(step, cost_val, W_val) # predict print("Prediction:", sess.run(prediction, feed_dict={X: x_test})) # Calculate the accuracy print("Accuracy: ", sess.run(accuracy, feed_dict={X: x_test, Y: y_test})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-1-learning_rate_and_evaluation.py 0 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] 1 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] ... 198 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] 199 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] 200 5.73203 [[ 0.80269563 0.67861295 -1.21728313] [-0.3051686 -0.3032113 1.50825703] [ 0.75722361 -0.7008909 -2.10820389]] Prediction: [0 0 0] Accuracy: 0.0
  • 128. Non-normalized inputs xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973], [823.02002, 828.070007, 1828100, 821.655029, 828.070007], [819.929993, 824.400024, 1438100, 818.97998, 824.159973], [816, 820.958984, 1008100, 815.48999, 819.23999], [819.359985, 823, 1188100, 818.469971, 818.97998], [819, 823, 1198100, 816, 820.450012], [811.700012, 815.25, 1098100, 809.780029, 813.669983], [809.51001, 816.659973, 1398100, 804.539978, 809.559998]]) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
• 129. Non-normalized inputs xy=... x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 4]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([4, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = tf.matmul(X, W) + b cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) sess = tf.Session() sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) 5 Cost: inf Prediction: [[ inf] [ inf] [ inf] ... 6 Cost: nan Prediction: [[ nan] [ nan] [ nan] ... https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-2-linear_regression_without_min_max.py
  • 130. Normalized inputs (min-max scale) xy = np.array([[828.659973, 833.450012, 908100, 828.349976, 831.659973], [823.02002, 828.070007, 1828100, 821.655029, 828.070007], [819.929993, 824.400024, 1438100, 818.97998, 824.159973], [816, 820.958984, 1008100, 815.48999, 819.23999], [819.359985, 823, 1188100, 818.469971, 818.97998], [819, 823, 1198100, 816, 820.450012], [811.700012, 815.25, 1098100, 809.780029, 813.669983], [809.51001, 816.659973, 1398100, 804.539978, 809.559998]]) [[ 0.99999999 0.99999999 0. 1. 1. ] [ 0.70548491 0.70439552 1. 0.71881782 0.83755791] [ 0.54412549 0.50274824 0.57608696 0.606468 0.6606331 ] [ 0.33890353 0.31368023 0.10869565 0.45989134 0.43800918] [ 0.51436 0.42582389 0.30434783 0.58504805 0.42624401] [ 0.49556179 0.42582389 0.31521739 0.48131134 0.49276137] [ 0.11436064 0. 0.20652174 0.22007776 0.18597238] [ 0. 0.07747099 0.5326087 0. 0. ]] xy = MinMaxScaler(xy) print(xy) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
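MinMaxScaler here is a small helper defined alongside the lab code, not a library import. A sketch of one possible column-wise implementation (the epsilon guarding against zero division is an assumption):

import numpy as np

def MinMaxScaler(data):
    # Scale each column into [0, 1]
    numerator = data - np.min(data, 0)
    denominator = np.max(data, 0) - np.min(data, 0)
    return numerator / (denominator + 1e-7)  # epsilon avoids division by zero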
• 131. Normalized inputs xy=... x_data = xy[:, 0:-1] y_data = xy[:, [-1]] # placeholders for a tensor that will be always fed. X = tf.placeholder(tf.float32, shape=[None, 4]) Y = tf.placeholder(tf.float32, shape=[None, 1]) W = tf.Variable(tf.random_normal([4, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') hypothesis = tf.matmul(X, W) + b cost = tf.reduce_mean(tf.square(hypothesis - Y)) # Minimize optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-5) train = optimizer.minimize(cost) sess = tf.Session() sess.run(tf.global_variables_initializer()) for step in range(2001): cost_val, hy_val, _ = sess.run( [cost, hypothesis, train], feed_dict={X: x_data, Y: y_data}) print(step, "Cost: ", cost_val, "\nPrediction:\n", hy_val) Prediction: [[ 1.63450289] [ 0.06628087] [ 0.35014752] [ 0.67070574] [ 0.61131608] [ 0.61466062] [ 0.23175186] [-0.13716528]] https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-3-linear_regression_min_max.py
  • 132. Lab 7-2 MNIST data Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 135. 28x28x1 image http://derindelimavi.blogspot.hk/2015/04/mnist-el-yazs-rakam-veri-seti.html # MNIST data image of shape 28 * 28 = 784 X = tf.placeholder(tf.float32, [None, 784]) # 0 - 9 digits recognition = 10 classes Y = tf.placeholder(tf.float32, [None, nb_classes])
  • 136. MNIST Dataset from tensorflow.examples.tutorials.mnist import input_data # Check out https://www.tensorflow.org/get_started/mnist/beginners for # more information about the mnist dataset mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) … batch_xs, batch_ys = mnist.train.next_batch(100) … print("Accuracy: ", accuracy.eval(session=sess, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 137. Reading data and set variables from tensorflow.examples.tutorials.mnist import input_data # Check out https://www.tensorflow.org/get_started/mnist/beginners for # more information about the mnist dataset mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) nb_classes = 10 # MNIST data image of shape 28 * 28 = 784 X = tf.placeholder(tf.float32, [None, 784]) # 0 - 9 digits recognition = 10 classes Y = tf.placeholder(tf.float32, [None, nb_classes]) W = tf.Variable(tf.random_normal([784, nb_classes])) b = tf.Variable(tf.random_normal([nb_classes])) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 138. Softmax! # Hypothesis (using softmax) hypothesis = tf.nn.softmax(tf.matmul(X, W) + b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Test model is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1)) # Calculate accuracy accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 139. Training epoch/batch # parameters training_epochs = 15 batch_size = 100 with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) # Training cycle for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys}) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 140. Training epoch/batch In the neural network terminology: ● one epoch = one forward pass and one backward pass of all the training examples ● batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need. ● number of iterations = number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes). Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to complete 1 epoch. http://stackoverflow.com/questions/4752626/epoch-vs-iteration-when-training-neural-networks
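Those definitions map directly onto the loop bounds in the MNIST code; a sketch of the arithmetic, using the numbers from the Stack Overflow example:

num_examples = 1000
batch_size = 500
iterations_per_epoch = num_examples // batch_size      # 2 iterations = 1 epoch
training_epochs = 15
total_iterations = training_epochs * iterations_per_epoch
print(iterations_per_epoch, total_iterations)          # 2 30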
  • 141. Training epoch/batch # parameters training_epochs = 15 batch_size = 100 with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) # Training cycle for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys}) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 142. Report results on test dataset # Test the model using test sets print("Accuracy: ", accuracy.eval(session=sess, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 143. hypothesis = tf.nn.softmax(tf.matmul(X, W) + b) cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1)) optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) is_correct = tf.equal(tf.arg_max(hypothesis, 1), tf.arg_max(Y, 1)) accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32)) # parameters training_epochs = 15 batch_size = 100 with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) # Training cycle for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) c, _ = sess.run([cost, optimizer], feed_dict={X: batch_xs, Y: batch_ys}) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) Epoch: 0001 cost = 2.868104637 Epoch: 0002 cost = 1.134684615 Epoch: 0003 cost = 0.908220728 Epoch: 0004 cost = 0.794199896 Epoch: 0005 cost = 0.721815854 Epoch: 0006 cost = 0.670184430 Epoch: 0007 cost = 0.630576546 Epoch: 0008 cost = 0.598888191 Epoch: 0009 cost = 0.573027079 Epoch: 0010 cost = 0.550497213 Epoch: 0011 cost = 0.532001859 Epoch: 0012 cost = 0.515517795 Epoch: 0013 cost = 0.501175288 Epoch: 0014 cost = 0.488425370 Epoch: 0015 cost = 0.476968593 Learning finished Accuracy: 0.888 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
• 144. Sample image show and prediction import matplotlib.pyplot as plt import random # Get one and predict r = random.randint(0, mnist.test.num_examples - 1) print("Label:", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1))) print("Prediction:", sess.run(tf.argmax(hypothesis, 1), feed_dict={X: mnist.test.images[r:r + 1]})) plt.imshow(mnist.test.images[r:r + 1].reshape(28, 28), cmap='Greys', interpolation='nearest') plt.show() https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-07-4-mnist_introduction.py
  • 145. Lab 8 Tensor Manipulation Sung Kim <hunkim+ml@gmail.com>
  • 146. Lab 8 Tensor Manipulation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 147. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 148. Lab 8 Tensor Manipulation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 150. Simple 1D array and slicing Image from http://www.frosteye.net/1233 t = np.array([0., 1., 2., 3., 4., 5., 6.])
  • 151. Simple 1D array and slicing Image from http://www.frosteye.net/1233
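The figure on this slide is not reproduced here; a sketch of the indexing and slicing it illustrates, on the array from the previous slide:

import numpy as np

t = np.array([0., 1., 2., 3., 4., 5., 6.])
print(t.ndim, t.shape)    # 1 (7,)
print(t[0], t[1], t[-1])  # 0.0 1.0 6.0
print(t[2:5], t[4:-1])    # [2. 3. 4.] [4. 5.]
print(t[:2], t[3:])       # [0. 1.] [3. 4. 5. 6.]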
  • 167. Ones and Zeros like https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-08-tensor_manipulation.ipynb
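The notebook examples are not reproduced on this slide; a minimal sketch of the two ops:

import tensorflow as tf

x = tf.constant([[0, 1, 2], [2, 1, 0]])
with tf.Session() as sess:
    print(sess.run(tf.ones_like(x)))   # [[1 1 1] [1 1 1]]
    print(sess.run(tf.zeros_like(x)))  # [[0 0 0] [0 0 0]]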
  • 170. Lab 9-1 NN for XOR Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 171. Lab 9 NN for XOR Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 172. Call for comments Please feel free to add comments directly on these slides Other slides: https://goo.gl/jPtWNt Picture from http://www.tssablog.org/archives/3280
  • 173. Lab 9-1 NN for XOR Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 175. XOR data set x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) http://templ25.mandaringardencity.com/xor-gate-truth-table-2/
• 176. x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): sess.run(train, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a) XOR with logistic regression? But it doesn’t work https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
• 177. XOR with logistic regression? But it doesn’t work! x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): sess.run(train, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W)) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a) Hypothesis: [[ 0.5] [ 0.5] [ 0.5] [ 0.5]] Correct: [[ 0.] [ 0.] [ 0.] [ 0.]] Accuracy: 0.5 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
  • 178. Neural Net W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-1-xor.py
  • 179. Neural Net W = tf.Variable(tf.random_normal([2, 1]), name='weight') b = tf.Variable(tf.random_normal([1]), name='bias') # Hypothesis using sigmoid: tf.div(1., 1. + tf.exp(tf.matmul(X, W))) hypothesis = tf.sigmoid(tf.matmul(X, W) + b) W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1') b1 = tf.Variable(tf.random_normal([2]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
• 180. NN for XOR x_data = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32) y_data = np.array([[0], [1], [1], [0]], dtype=np.float32) X = tf.placeholder(tf.float32) Y = tf.placeholder(tf.float32) W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1') b1 = tf.Variable(tf.random_normal([2]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) # cost/loss function cost = -tf.reduce_mean(Y * tf.log(hypothesis) + (1 - Y) * tf.log(1 - hypothesis)) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) # Accuracy computation # True if hypothesis>0.5 else False predicted = tf.cast(hypothesis > 0.5, dtype=tf.float32) accuracy = tf.reduce_mean(tf.cast(tf.equal(predicted, Y), dtype=tf.float32)) # Launch graph with tf.Session() as sess: # Initialize TensorFlow variables sess.run(tf.global_variables_initializer()) for step in range(10001): sess.run(train, feed_dict={X: x_data, Y: y_data}) if step % 100 == 0: print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run([W1, W2])) # Accuracy report h, c, a = sess.run([hypothesis, predicted, accuracy], feed_dict={X: x_data, Y: y_data}) print("\nHypothesis: ", h, "\nCorrect: ", c, "\nAccuracy: ", a) Hypothesis: [[ 0.01338218] [ 0.98166394] [ 0.98809403] [ 0.01135799]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-2-xor-nn.py
  • 181. Wide NN for XOR W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1') b1 = tf.Variable(tf.random_normal([10]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([10, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) [2,10], [10,1] Hypothesis: [[ 0.00358802] [ 0.99366933] [ 0.99204296] [ 0.0095663 ]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 [2,2], [2,1] Hypothesis: [[ 0.01338218] [ 0.98166394] [ 0.98809403] [ 0.01135799]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
  • 182. Deep NN for XOR W1 = tf.Variable(tf.random_normal([2, 10]), name='weight1') b1 = tf.Variable(tf.random_normal([10]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([10, 10]), name='weight2') b2 = tf.Variable(tf.random_normal([10]), name='bias2') layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2) W3 = tf.Variable(tf.random_normal([10, 10]), name='weight3') b3 = tf.Variable(tf.random_normal([10]), name='bias3') layer3 = tf.sigmoid(tf.matmul(layer2, W3) + b3) W4 = tf.Variable(tf.random_normal([10, 1]), name='weight4') b4 = tf.Variable(tf.random_normal([1]), name='bias4') hypothesis = tf.sigmoid(tf.matmul(layer3, W4) + b4) 4 layers Hypothesis: [[ 7.80e-04] [ 9.99e-01] [ 9.98e-01] [ 1.55e-03]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 2 layers Hypothesis: [[ 0.01338218] [ 0.98166394] [ 0.98809403] [ 0.01135799]] Correct: [[ 0.] [ 1.] [ 1.] [ 0.]] Accuracy: 1.0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-3-xor-nn-wide-deep.py
  • 183. Exercise ● Wide and Deep NN for MNIST
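One possible starting point for the exercise, a sketch of a wider and deeper softmax model for MNIST (the layer sizes and learning rate are illustrative choices, not from the lab code):

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

X = tf.placeholder(tf.float32, [None, 784])
Y = tf.placeholder(tf.float32, [None, 10])

# Two wide hidden layers (256 units each), then the 10-class output layer
W1 = tf.Variable(tf.random_normal([784, 256]), name='weight1')
b1 = tf.Variable(tf.random_normal([256]), name='bias1')
layer1 = tf.sigmoid(tf.matmul(X, W1) + b1)

W2 = tf.Variable(tf.random_normal([256, 256]), name='weight2')
b2 = tf.Variable(tf.random_normal([256]), name='bias2')
layer2 = tf.sigmoid(tf.matmul(layer1, W2) + b2)

W3 = tf.Variable(tf.random_normal([256, 10]), name='weight3')
b3 = tf.Variable(tf.random_normal([10]), name='bias3')
logits = tf.matmul(layer2, W3) + b3

cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost)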
  • 184. Lab 9-2 Tensorboard for XOR NN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 185. TensorBoard: TF logging/debugging tool ●Visualize your TF graph ●Plot quantitative metrics ●Show additional data https://www.tensorflow.org/get_started/summaries_and_tensorboard
• 186. Old fashioned: print, print, print 9400 0.0151413 [array([[ 6.21692038, 6.05913448], [-6.33773184, -5.75189114]], dtype=float32), array([[ 9.93581772], [-9.43034935]], dtype=float32)] 9500 0.014909 [array([[ 6.22498751, 6.07049847], [-6.34637976, -5.76352596]], dtype=float32), array([[ 9.96414757], [-9.45942593]], dtype=float32)] 9600 0.0146836 [array([[ 6.23292685, 6.08166742], [-6.35489035, -5.77496052]], dtype=float32), array([[ 9.99207973], [-9.48807526]], dtype=float32)] 9700 0.0144647 [array([[ 6.24074268, 6.09264851], [-6.36326933, -5.78619957]], dtype=float32), array([[ 10.01962471], [ -9.51631165]], dtype=float32)] 9800 0.0142521 [array([[ 6.24843407, 6.10344648], [-6.37151814, -5.79724932]], dtype=float32), array([[ 10.04679298], [ -9.54414845]], dtype=float32)] 9900 0.0140456 [array([[ 6.25601053, 6.11406422], [-6.3796401 , -5.80811596]], dtype=float32), array([[ 10.07359505], [ -9.57159519]], dtype=float32)] 10000 0.0138448 [array([[ 6.26347113, 6.12451124], [-6.38764334, -5.81880617]], dtype=float32), array([[ 10.10004139], [ -9.59866238]], dtype=float32)]
• 188. 5 steps of using TensorBoard From TF graph, decide which tensors you want to log w2_hist = tf.summary.histogram("weights2", W2) cost_summ = tf.summary.scalar("cost", cost) Merge all summaries summary = tf.summary.merge_all() Create writer and add graph # Create summary writer writer = tf.summary.FileWriter('./logs') writer.add_graph(sess.graph) Run summary merge and add_summary s, _ = sess.run([summary, optimizer], feed_dict=feed_dict) writer.add_summary(s, global_step=global_step) Launch TensorBoard tensorboard --logdir=./logs
  • 189. Scalar tensors cost_summ = tf.summary.scalar("cost", cost)
  • 190. Histogram (multi-dimensional tensors) W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) w2_hist = tf.summary.histogram("weights2", W2) b2_hist = tf.summary.histogram("biases2", b2) hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
  • 191. Add scope for better graph hierarchy with tf.name_scope("layer1") as scope: W1 = tf.Variable(tf.random_normal([2, 2]), name='weight1') b1 = tf.Variable(tf.random_normal([2]), name='bias1') layer1 = tf.sigmoid(tf.matmul(X, W1) + b1) w1_hist = tf.summary.histogram("weights1", W1) b1_hist = tf.summary.histogram("biases1", b1) layer1_hist = tf.summary.histogram("layer1", layer1) with tf.name_scope("layer2") as scope: W2 = tf.Variable(tf.random_normal([2, 1]), name='weight2') b2 = tf.Variable(tf.random_normal([1]), name='bias2') hypothesis = tf.sigmoid(tf.matmul(layer1, W2) + b2) w2_hist = tf.summary.histogram("weights2", W2) b2_hist = tf.summary.histogram("biases2", b2) hypothesis_hist = tf.summary.histogram("hypothesis", hypothesis) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
  • 192. Merge summaries and create writer after creating session # Summary summary = tf.summary.merge_all() # initialize sess = tf.Session() sess.run(tf.global_variables_initializer()) # Create summary writer writer = tf.summary.FileWriter(TB_SUMMARY_DIR) writer.add_graph(sess.graph) # Add graph in the tensorboard https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
  • 193. Run merged summary and write (add summary) s, _ = sess.run([summary, optimizer], feed_dict=feed_dict) writer.add_summary(s, global_step=global_step) global_step += 1 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-09-4-xor_tensorboard.py
• 194. Launch tensorboard (local) writer = tf.summary.FileWriter("./logs/xor_logs") $ tensorboard --logdir=./logs/xor_logs Starting TensorBoard b'41' on port 6006 (You can navigate to http://127.0.0.1:6006)
• 195. Launch tensorboard (remote server) ssh -L local_port:127.0.0.1:remote_port username@server.com local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com server> $ tensorboard --logdir=./logs/xor_logs (You can navigate to http://127.0.0.1:7007)
• 197. Multiple runs: learning_rate=0.1 vs. learning_rate=0.01
• 198. Multiple runs
tensorboard --logdir=./logs/xor_logs
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) ... writer = tf.summary.FileWriter("./logs/xor_logs")
tensorboard --logdir=./logs/xor_logs_r0_01
train = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost) ... writer = tf.summary.FileWriter("./logs/xor_logs_r0_01")
tensorboard --logdir=./logs
• 200. 5 steps of using TensorBoard (recap)
(1) From the TF graph, decide which tensors you want to log:
w2_hist = tf.summary.histogram("weights2", W2)
cost_summ = tf.summary.scalar("cost", cost)
(2) Merge all summaries:
summary = tf.summary.merge_all()
(3) Create a writer and add the graph:
writer = tf.summary.FileWriter('./logs')
writer.add_graph(sess.graph)
(4) Run the merged summary and add_summary:
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
(5) Launch TensorBoard:
tensorboard --logdir=./logs
• 201. Exercise ● Wide and Deep NN for MNIST ● Add TensorBoard
  • 202. Lab 9-2-E Tensorboard for MNIST Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 203. Visualizing your Deep learning using TensorBoard (TensorFlow) Sung Kim <hunkim+ml@gmail.com>
  • 204. TensorBoard: TF logging/debugging tool ●Visualize your TF graph ●Plot quantitative metrics ●Show additional data https://www.tensorflow.org/get_started/summaries_and_tensorboard
• 205. Old-fashioned way: print, print, print
• 207. 5 steps of using TensorBoard
(1) From the TF graph, decide which tensors you want to log:
with tf.variable_scope('layer1') as scope:
    tf.summary.image('input', x_image, 3)
    tf.summary.histogram("layer", L1)
tf.summary.scalar("loss", cost)
(2) Merge all summaries:
summary = tf.summary.merge_all()
(3) Create a writer and add the graph:
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
(4) Run the merged summary and add_summary:
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
(5) Launch TensorBoard:
tensorboard --logdir=/tmp/mnist_logs
  • 208. Image Input # Image input x_image = tf.reshape(X, [-1, 28, 28, 1]) tf.summary.image('input', x_image, 3)
  • 209. Histogram (multi-dimensional tensors) with tf.variable_scope('layer1') as scope: W1 = tf.get_variable("W", shape=[784, 512]) b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) L1 = tf.nn.dropout(L1, keep_prob=keep_prob) tf.summary.histogram("X", X) tf.summary.histogram("weights", W1) tf.summary.histogram("bias", b1) tf.summary.histogram("layer", L1)
  • 211. Add scope for better hierarchy with tf.variable_scope('layer1') as scope: W1 = tf.get_variable("W", shape=[784, 512],... b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) L1 = tf.nn.dropout(L1, keep_prob=keep_prob) tf.summary.histogram("X", X) tf.summary.histogram("weights", W1) tf.summary.histogram("bias", b1) tf.summary.histogram("layer", L1) with tf.variable_scope('layer2') as scope: ... with tf.variable_scope('layer3') as scope: ... with tf.variable_scope('layer4') as scope: ... with tf.variable_scope('layer5') as scope: ...
  • 212. Merge summaries and create writer after creating session # Summary summary = tf.summary.merge_all() # initialize sess = tf.Session() sess.run(tf.global_variables_initializer()) # Create summary writer writer = tf.summary.FileWriter(TB_SUMMARY_DIR) writer.add_graph(sess.graph)
  • 213. Run merged summary and write (add summary) s, _ = sess.run([summary, optimizer], feed_dict=feed_dict) writer.add_summary(s, global_step=global_step) global_step += 1
• 214. Launch tensorboard (local) writer = tf.summary.FileWriter("/tmp/mnist_logs") $ tensorboard --logdir=/tmp/mnist_logs Starting TensorBoard b'41' on port 6006 (You can navigate to http://127.0.0.1:6006)
• 215. Launch tensorboard (remote server) ssh -L local_port:127.0.0.1:remote_port username@server.com local> $ ssh -L 7007:127.0.0.1:6006 hunkim@server.com server> $ tensorboard --logdir=/tmp/mnist_logs (You can navigate to http://127.0.0.1:7007)
• 216. Multiple runs
tensorboard --logdir=/tmp/mnist_logs/run1
writer = tf.summary.FileWriter("/tmp/mnist_logs/run1")
tensorboard --logdir=/tmp/mnist_logs/run2
writer = tf.summary.FileWriter("/tmp/mnist_logs/run2")
tensorboard --logdir=/tmp/mnist_logs
• 217. 5 steps of using TensorBoard (recap)
(1) From the TF graph, decide which tensors you want to log:
with tf.variable_scope('layer1') as scope:
    tf.summary.image('input', x_image, 3)
    tf.summary.histogram("layer", L1)
tf.summary.scalar("loss", cost)
(2) Merge all summaries:
summary = tf.summary.merge_all()
(3) Create a writer and add the graph:
writer = tf.summary.FileWriter(TB_SUMMARY_DIR)
writer.add_graph(sess.graph)
(4) Run the merged summary and add_summary:
s, _ = sess.run([summary, optimizer], feed_dict=feed_dict)
writer.add_summary(s, global_step=global_step)
(5) Launch TensorBoard:
tensorboard --logdir=/tmp/mnist_logs
  • 218. Lab 9-3 (optional) NN Backpropagation Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 223. “Yes you should understand backprop” https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b • “If you try to ignore how it works under the hood because TensorFlow automagically makes my networks learn” - “You will not be ready to wrestle with the dangers it presents” - “You will be much less effective at building and debugging neural networks.” • “The good news is that backpropagation is not that difficult to understand” - “if presented properly.”
• 224. Back propagation (chain rule): gate-by-gate worked example from http://cs231n.stanford.edu/ (figure slides)
• 226. Logistic Regression Network (figure): input a0 flows through a multiply gate (* w), an add gate (+ b), a sigmoid gate, and a loss gate.
• 227. Network forward pass. Forward pass, OK? Just follow (1), (2), (3) and (4):
(1) o = a0 * w
(2) l = o + b
(3) a1 = sigmoid(l)
(4) E = loss(a1, t)
• 230. Let's do back propagation! dE/da1 will be given. What would be dE/dw and dE/db? We can use the chain rule, walking the same gates backward.
• 231. In the same manner, we can get the backward prop through gates (4), (3), (2) and (1)!
• 232. Gate derivatives: the local derivative of each gate (multiply, add, sigmoid, loss) will be given; we can just use them in the chain rule.
• 233. Derivatives (chain rule): dE/da1 is given from the pre-computed loss derivative; just apply the gate derivatives one by one and solve each upstream derivative in turn:
dE/dl = (dE/da1) * (da1/dl)
dE/db = dE/dl (the add gate passes the gradient through)
dE/do = dE/dl
dE/dw = (dE/do) * (do/dw) = (dE/do) * a0
• 236. For matrix inputs the same chain rule applies, with transposes in the right places; see http://cs231n.github.io/optimization-2/#staged
• 237. Network update (learning rate alpha): w = w - alpha * dE/dw, b = b - alpha * dE/db
• 238. Done! Let's update our network using the derivatives!
• 239. Backward prop in code, for (1) o = a0*w, (2) l = o+b, (3) a1 = sigmoid(l), (4) E = loss(a1, t):
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1)  # sigma prime
d_l = d_a1 * d_sigma  # simplifies to (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
    tf.assign(W, W - learning_rate * d_W),
    tf.assign(b, b - learning_rate * tf.reduce_sum(d_b))]
• 241. The same derivatives, with the update averaged over the batch:
d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)
d_sigma = a1 * (1 - a1)  # sigma prime
d_l = d_a1 * d_sigma  # simplifies to (a1 - t)
d_b = d_l * 1
d_o = d_l * 1
d_W = tf.matmul(tf.transpose(a0), d_o)
# Updating network using gradients
learning_rate = 0.01
train_step = [
    tf.assign(W, W - learning_rate * d_W / N),  # N: sample size
    tf.assign(b, b - learning_rate * tf.reduce_mean(d_b))]
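For reference, a self-contained NumPy sketch of the same forward and backward pass; the toy OR data and names are illustrative assumptions, not the lab code:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy OR data: 4 samples, 2 features, binary target t
a0 = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([[0.], [1.], [1.], [1.]])
N = a0.shape[0]

W = np.random.randn(2, 1)
b = np.zeros(1)
learning_rate = 0.1

for step in range(5000):
    # forward: (1) o = a0*W, (2) l = o + b, (3) a1 = sigmoid(l)
    o = a0.dot(W)
    l = o + b
    a1 = sigmoid(l)
    # backward (chain rule), mirroring the slide code
    d_a1 = (a1 - t) / (a1 * (1. - a1) + 1e-7)  # dE/da1 for cross-entropy loss
    d_l = d_a1 * (a1 * (1. - a1))              # through the sigmoid gate: (a1 - t)
    d_b = d_l
    d_o = d_l
    d_W = a0.T.dot(d_o)
    # update, averaging over the batch
    W -= learning_rate * d_W / N
    b -= learning_rate * d_b.mean(axis=0)

print(a1.round(2))  # approaches [[0], [1], [1], [1]]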
  • 242. Exercise ● See more backprop code samples at https://github.com/hunkim/DeepLearningZeroToAll ● https://github.com/hunkim/DeepLearningZeroToAll/blob/mast er/lab-09-7-sigmoid_back_prop.py ● Solve XOR using NN backprop
• 243. Lab 10 NN, ReLU, Xavier, Dropout, and Adam Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 247. Softmax classifier for MNIST # weights & bias for nn layers W = tf.Variable(tf.random_normal([784, 10])) b = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(X, W) + b # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) # initialize sess = tf.Session() sess.run(tf.global_variables_initializer()) # train my model for epoch in range(training_epochs): avg_cost = 0 total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) feed_dict = {X: batch_xs, Y: batch_ys} c, _ = sess.run([cost, optimizer], feed_dict=feed_dict) avg_cost += c / total_batch print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost)) print('Learning Finished!') # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) Epoch: 0001 cost = 5.888845987 Epoch: 0002 cost = 1.860620173 Epoch: 0003 cost = 1.159035648 Epoch: 0004 cost = 0.892340870 Epoch: 0005 cost = 0.751155428 Epoch: 0006 cost = 0.662484806 Epoch: 0007 cost = 0.601544010 Epoch: 0008 cost = 0.556526115 Epoch: 0009 cost = 0.521186961 Epoch: 0010 cost = 0.493068354 Epoch: 0011 cost = 0.469686249 Epoch: 0012 cost = 0.449967254 Epoch: 0013 cost = 0.433519321 Epoch: 0014 cost = 0.419000337 Epoch: 0015 cost = 0.406490815 Learning Finished! Accuracy: 0.9035 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-1-mnist_softmax.py
  • 250. NN for MNIST # input place holders X = tf.placeholder(tf.float32, [None, 784]) Y = tf.placeholder(tf.float32, [None, 10]) # weights & bias for nn layers W1 = tf.Variable(tf.random_normal([784, 256])) b1 = tf.Variable(tf.random_normal([256])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.Variable(tf.random_normal([256, 256])) b2 = tf.Variable(tf.random_normal([256])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.Variable(tf.random_normal([256, 10])) b3 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b3 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 141.207671860 Epoch: 0002 cost = 38.788445864 Epoch: 0003 cost = 23.977515479 Epoch: 0004 cost = 16.315132428 Epoch: 0005 cost = 11.702554882 Epoch: 0006 cost = 8.573139748 Epoch: 0007 cost = 6.370995680 Epoch: 0008 cost = 4.537178684 Epoch: 0009 cost = 3.216900532 Epoch: 0010 cost = 2.329708954 Epoch: 0011 cost = 1.715552875 Epoch: 0012 cost = 1.189857912 Epoch: 0013 cost = 0.820965160 Epoch: 0014 cost = 0.624131458 Epoch: 0015 cost = 0.454633765 Learning Finished! Accuracy: 0.9455 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-2-mnist_nn.py
  • 252. Xavier for MNIST # input place holders X = tf.placeholder(tf.float32, [None, 784]) Y = tf.placeholder(tf.float32, [None, 10]) # weights & bias for nn layers # http://stackoverflow.com/questions/33640581 W1 = tf.get_variable("W1", shape=[784, 256], initializer=tf.contrib.layers.xavier_initializer()) b1 = tf.Variable(tf.random_normal([256])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.get_variable("W2", shape=[256, 256], initializer=tf.contrib.layers.xavier_initializer()) b2 = tf.Variable(tf.random_normal([256])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.get_variable("W3", shape=[256, 10], initializer=tf.contrib.layers.xavier_initializer()) b3 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b3 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 0.301498963 Epoch: 0002 cost = 0.107252513 Epoch: 0003 cost = 0.064888892 Epoch: 0004 cost = 0.044463030 Epoch: 0005 cost = 0.029951642 Epoch: 0006 cost = 0.020663404 Epoch: 0007 cost = 0.015853033 Epoch: 0008 cost = 0.011764387 Epoch: 0009 cost = 0.008598264 Epoch: 0010 cost = 0.007383116 Epoch: 0011 cost = 0.006839140 Epoch: 0012 cost = 0.004672963 Epoch: 0013 cost = 0.003979437 Epoch: 0014 cost = 0.002714260 Epoch: 0015 cost = 0.004707661 Learning Finished! Accuracy: 0.9783 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-3-mnist_nn_xavier.py
  • 253. Xavier for MNIST # input place holders X = tf.placeholder(tf.float32, [None, 784]) Y = tf.placeholder(tf.float32, [None, 10]) # weights & bias for nn layers # http://stackoverflow.com/questions/33640581 W1 = tf.get_variable("W1", shape=[784, 256], initializer=tf.contrib.layers.xavier_initializer()) b1 = tf.Variable(tf.random_normal([256])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.get_variable("W2", shape=[256, 256], initializer=tf.contrib.layers.xavier_initializer()) b2 = tf.Variable(tf.random_normal([256])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.get_variable("W3", shape=[256, 10], initializer=tf.contrib.layers.xavier_initializer()) b3 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b3 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 0.301498963 Epoch: 0002 cost = 0.107252513 Epoch: 0003 cost = 0.064888892 Epoch: 0004 cost = 0.044463030 Epoch: 0005 cost = 0.029951642 Epoch: 0006 cost = 0.020663404 Epoch: 0007 cost = 0.015853033 Epoch: 0008 cost = 0.011764387 Epoch: 0009 cost = 0.008598264 Epoch: 0010 cost = 0.007383116 Epoch: 0011 cost = 0.006839140 Epoch: 0012 cost = 0.004672963 Epoch: 0013 cost = 0.003979437 Epoch: 0014 cost = 0.002714260 Epoch: 0015 cost = 0.004707661 Learning Finished! Accuracy: 0.9783 (xavier) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-3-mnist_nn_xavier.py Epoch: 0001 cost = 141.207671860 Epoch: 0002 cost = 38.788445864 Epoch: 0003 cost = 23.977515479 Epoch: 0004 cost = 16.315132428 Epoch: 0005 cost = 11.702554882 Epoch: 0006 cost = 8.573139748 Epoch: 0007 cost = 6.370995680 Epoch: 0008 cost = 4.537178684 Epoch: 0009 cost = 3.216900532 Epoch: 0010 cost = 2.329708954 Epoch: 0011 cost = 1.715552875 Epoch: 0012 cost = 1.189857912 Epoch: 0013 cost = 0.820965160 Epoch: 0014 cost = 0.624131458 Epoch: 0015 cost = 0.454633765 Learning Finished! Accuracy: 0.9455 (normal dist)
  • 254. Deep NN for MNIST W1 = tf.get_variable("W1", shape=[784, 512], initializer=tf.contrib.layers.xavier_initializer()) b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) W2 = tf.get_variable("W2", shape=[512, 512], initializer=tf.contrib.layers.xavier_initializer()) b2 = tf.Variable(tf.random_normal([512])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) W3 = tf.get_variable("W3", shape=[512, 512], initializer=tf.contrib.layers.xavier_initializer()) b3 = tf.Variable(tf.random_normal([512])) L3 = tf.nn.relu(tf.matmul(L2, W3) + b3) W4 = tf.get_variable("W4", shape=[512, 512], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([512])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) W5 = tf.get_variable("W5", shape=[512, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Epoch: 0001 cost = 0.266061549 Epoch: 0002 cost = 0.080796588 Epoch: 0003 cost = 0.049075800 Epoch: 0004 cost = 0.034772298 Epoch: 0005 cost = 0.024780529 Epoch: 0006 cost = 0.017072763 Epoch: 0007 cost = 0.014031383 Epoch: 0008 cost = 0.013763446 Epoch: 0009 cost = 0.009164047 Epoch: 0010 cost = 0.008291388 Epoch: 0011 cost = 0.007319742 Epoch: 0012 cost = 0.006434021 Epoch: 0013 cost = 0.005684378 Epoch: 0014 cost = 0.004781207 Epoch: 0015 cost = 0.004342310 Learning Finished! Accuracy: 0.9742 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-4-mnist_nn_deep.py
  • 255. Dropout for MNIST # dropout (keep_prob) rate 0.7 on training, but should be 1 for testing keep_prob = tf.placeholder(tf.float32) W1 = tf.get_variable("W1", shape=[784, 512]) b1 = tf.Variable(tf.random_normal([512])) L1 = tf.nn.relu(tf.matmul(X, W1) + b1) L1 = tf.nn.dropout(L1, keep_prob=keep_prob) W2 = tf.get_variable("W2", shape=[512, 512]) b2 = tf.Variable(tf.random_normal([512])) L2 = tf.nn.relu(tf.matmul(L1, W2) + b2) L2 = tf.nn.dropout(L2, keep_prob=keep_prob) … # train my model for epoch in range(training_epochs): ... for i in range(total_batch): batch_xs, batch_ys = mnist.train.next_batch(batch_size) feed_dict = {X: batch_xs, Y: batch_ys, keep_prob: 0.7} c, _ = sess.run([cost, optimizer], feed_dict=feed_dict) avg_cost += c / total_batch # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={ X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1})) Epoch: 0001 cost = 0.447322626 Epoch: 0002 cost = 0.157285590 Epoch: 0003 cost = 0.121884535 Epoch: 0004 cost = 0.098128681 Epoch: 0005 cost = 0.082901778 Epoch: 0006 cost = 0.075337573 Epoch: 0007 cost = 0.069752543 Epoch: 0008 cost = 0.060884363 Epoch: 0009 cost = 0.055276413 Epoch: 0010 cost = 0.054631256 Epoch: 0011 cost = 0.049675195 Epoch: 0012 cost = 0.049125314 Epoch: 0013 cost = 0.047231930 Epoch: 0014 cost = 0.041290121 Epoch: 0015 cost = 0.043621063 Learning Finished! Accuracy: 0.9804!! https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-5-mnist_nn_dropout.py
  • 257. Optimizers train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(cost) ● tf.train.AdadeltaOptimizer ● tf.train.AdagradOptimizer ● tf.train.AdagradDAOptimizer ● tf.train.MomentumOptimizer ● tf.train.AdamOptimizer ● tf.train.FtrlOptimizer ● tf.train.ProximalGradientDescentOptimizer ● tf.train.ProximalAdagradOptimizer ● tf.train.RMSPropOptimizer https://www.tensorflow.org/api_guides/python/train
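All of these optimizers expose the same minimize() interface, so switching between them is a one-line change. A toy sketch (the quadratic cost and values are illustrative, not from the labs):

import tensorflow as tf

# Toy bowl-shaped cost; any optimizer from the list above plugs in the same way
w = tf.Variable(5.0)
cost = tf.square(w - 1.0)
train = tf.train.AdamOptimizer(learning_rate=0.1).minimize(cost)
# train = tf.train.RMSPropOptimizer(learning_rate=0.01).minimize(cost)
# train = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9).minimize(cost)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for _ in range(500):
    sess.run(train)
print(sess.run(w))  # approaches 1.0

As the next slides show, Adam is the default choice used throughout these labs.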
  • 259. ADAM: a method for stochastic optimization [Kingma et al. 2015]
  • 260. Use Adam Optimizer # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits( logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
• 261. Summary ●Softmax vs. Neural Nets for MNIST: 90% and 94.5% ●Xavier initialization: 97.8% ●Deep Neural Nets with Dropout: 98% ●Adam and other optimizers ●Exercise: Batch Normalization - https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-10-6-mnist_nn_batchnorm.ipynb
  • 262. Lecture and Lab 11 CNN Sung Kim <hunkim+ml@gmail.com> http://hunkim.github.io/ml/
  • 263. Lab 11 CNN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 265. Lab 11-1 CNN Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 268. CNN for CT images Asan Medical Center & Microsoft Medical Bigdata Contest Winner by GeunYoung Lee and Alex Kim https://www.slideshare.net/GYLee3/ss-72966495
  • 269. Convolution layer and max pooling
• 270. Simple convolution layer (figure): a 3x3x1 image convolved with a 2x2x1 filter at stride 1x1.
• 272. Simple convolution layer. Image: shape (1, 3, 3, 1) with values [[1, 2, 3], [4, 5, 6], [7, 8, 9]]. Filter: shape (2, 2, 1, 1), all ones: [[[[1.]], [[1.]]], [[[1.]], [[1.]]]]. Stride: 1x1, Padding: VALID, so the filter only visits positions fully inside the image and the output is 2x2.
• 274. Simple convolution layer with the same image and filter but Padding: SAME. Zeros are padded around the border so the output keeps the 3x3 input size.
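These padding rules are easy to check by running the slide's example through tf.nn.conv2d; a small sketch, where the printed values follow from the sliding-window sums:

import numpy as np
import tensorflow as tf

# The 3x3 image with values 1..9 and the 2x2 filter of ones from the slides
image = np.arange(1., 10., dtype=np.float32).reshape(1, 3, 3, 1)
weight = np.ones((2, 2, 1, 1), dtype=np.float32)

conv_valid = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='VALID')
conv_same = tf.nn.conv2d(image, weight, strides=[1, 1, 1, 1], padding='SAME')

sess = tf.Session()
print(sess.run(conv_valid).reshape(2, 2))  # [[12. 16.] [24. 28.]]
print(sess.run(conv_same).reshape(3, 3))   # 3x3 output; border windows include zero padding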
  • 281. Lab 11-2 CNN MNIST: 99%! Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 285. # input placeholders X = tf.placeholder(tf.float32, [None, 784]) X_img = tf.reshape(X, [-1, 28, 28, 1]) # img 28x28x1 (black/white) Y = tf.placeholder(tf.float32, [None, 10]) # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') ''' Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) ''' Conv layer 1 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
  • 286. ''' Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) ''' # L2 ImgIn shape=(?, 14, 14, 32) W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01)) # Conv ->(?, 14, 14, 64) # Pool ->(?, 7, 7, 64) L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME') L2 = tf.nn.relu(L2) L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L2 = tf.reshape(L2, [-1, 7 * 7 * 64]) ''' Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32) Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32) Conv layer 2 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
  • 287. ''' Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32) Tensor("Reshape_1:0", shape=(?, 3136), dtype=float32) ''' L2 = tf.reshape(L2, [-1, 7 * 7 * 64]) # Final FC 7x7x64 inputs -> 10 outputs W3 = tf.get_variable("W3", shape=[7 * 7 * 64, 10], initializer=tf.contrib.layers.xavier_initializer()) b = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L2, W3) + b # define cost/loss & optimizer cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=hypothesis, labels=Y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost) Fully Connected (FC, Dense) layer https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
• 288. Training and Evaluation https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-1-mnist_cnn.py
# initialize
sess = tf.Session()
sess.run(tf.global_variables_initializer())
# train my model
print('Learning started. It takes some time.')
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        feed_dict = {X: batch_xs, Y: batch_ys}
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)
        avg_cost += c / total_batch
    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))
print('Learning Finished!')
# Test model and check accuracy
correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels}))
Results: Epoch: 0001 cost = 0.340291267, Epoch: 0002 cost = 0.090731326, Epoch: 0003 cost = 0.064477619, Epoch: 0004 cost = 0.050683064, ..., Epoch: 0011 cost = 0.017758641, Epoch: 0012 cost = 0.014156652, Epoch: 0013 cost = 0.012397016, Epoch: 0014 cost = 0.010693789, Epoch: 0015 cost = 0.009469977, Learning Finished! Accuracy: 0.9885
  • 290. Deep CNN Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
  • 291. # L3 ImgIn shape=(?, 7, 7, 64) W3 = tf.Variable(tf.random_normal([3, 3, 64, 128], stddev=0.01)) # Conv ->(?, 7, 7, 128) # Pool ->(?, 4, 4, 128) # Reshape ->(?, 4 * 4 * 128) # Flatten them for FC L3 = tf.nn.conv2d(L2, W3, strides=[1, 1, 1, 1], padding='SAME') L3 = tf.nn.relu(L3) L3 = tf.nn.max_pool(L3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L3 = tf.nn.dropout(L3, keep_prob=keep_prob) L3 = tf.reshape(L3, [-1, 128 * 4 * 4]) '''Tensor("Conv2D_2:0", shape=(?, 7, 7, 128), dtype=float32) Tensor("Relu_2:0", shape=(?, 7, 7, 128), dtype=float32) Tensor("MaxPool_2:0", shape=(?, 4, 4, 128), dtype=float32) Tensor("dropout_2/mul:0", shape=(?, 4, 4, 128), dtype=float32) Tensor("Reshape_1:0", shape=(?, 2048), dtype=float32)''' # L4 FC 4x4x128 inputs -> 625 outputs W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([625])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) L4 = tf.nn.dropout(L4, keep_prob=keep_prob) '''Tensor("Relu_3:0", shape=(?, 625), dtype=float32) Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)''' # L5 Final FC 625 inputs -> 10 outputs W5 = tf.get_variable("W5", shape=[625, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 '''Tensor("add_1:0", shape=(?, 10), dtype=float32)''' Deep CNN https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=keep_prob) '''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)''' # L2 ImgIn shape=(?, 14, 14, 32) W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01)) # Conv ->(?, 14, 14, 64) # Pool ->(?, 7, 7, 64) L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding='SAME') L2 = tf.nn.relu(L2) L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L2 = tf.nn.dropout(L2, keep_prob=keep_prob) '''Tensor("Conv2D_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("Relu_1:0", shape=(?, 14, 14, 64), dtype=float32) Tensor("MaxPool_1:0", shape=(?, 7, 7, 64), dtype=float32) Tensor("dropout_1/mul:0", shape=(?, 7, 7, 64), dtype=float32)'''
  • 292. Deep CNN https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=keep_prob) '''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)''' ... ... # L4 FC 4x4x128 inputs -> 625 outputs W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([625])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) L4 = tf.nn.dropout(L4, keep_prob=keep_prob) '''Tensor("Relu_3:0", shape=(?, 625), dtype=float32) Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)''' # L5 Final FC 625 inputs -> 10 outputs W5 = tf.get_variable("W5", shape=[625, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 '''Tensor("add_1:0", shape=(?, 10), dtype=float32)''' # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1})) Epoch: 0013 cost = 0.027188021 Epoch: 0014 cost = 0.023604777 Epoch: 0015 cost = 0.024607201 Learning Finished! Accuracy: 0.9938
  • 293. Lab 11-3 Class, Layers, Ensemble Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 295. CNN https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-2-mnist_deep_cnn.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=keep_prob) '''Tensor("Conv2D:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("Relu:0", shape=(?, 28, 28, 32), dtype=float32) Tensor("MaxPool:0", shape=(?, 14, 14, 32), dtype=float32) Tensor("dropout/mul:0", shape=(?, 14, 14, 32), dtype=float32)''' ... ... # L4 FC 4x4x128 inputs -> 625 outputs W4 = tf.get_variable("W4", shape=[128 * 4 * 4, 625], initializer=tf.contrib.layers.xavier_initializer()) b4 = tf.Variable(tf.random_normal([625])) L4 = tf.nn.relu(tf.matmul(L3, W4) + b4) L4 = tf.nn.dropout(L4, keep_prob=keep_prob) '''Tensor("Relu_3:0", shape=(?, 625), dtype=float32) Tensor("dropout_3/mul:0", shape=(?, 625), dtype=float32)''' # L5 Final FC 625 inputs -> 10 outputs W5 = tf.get_variable("W5", shape=[625, 10], initializer=tf.contrib.layers.xavier_initializer()) b5 = tf.Variable(tf.random_normal([10])) hypothesis = tf.matmul(L4, W5) + b5 '''Tensor("add_1:0", shape=(?, 10), dtype=float32)''' # Test model and check accuracy correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32)) print('Accuracy:', sess.run(accuracy, feed_dict={X: mnist.test.images, Y: mnist.test.labels, keep_prob: 1})) Epoch: 0013 cost = 0.027188021 Epoch: 0014 cost = 0.023604777 Epoch: 0015 cost = 0.024607201 Learning Finished! Accuracy: 0.9938
• 296. Python Class https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-3-mnist_cnn_class.py
class Model:
    def __init__(self, sess, name):
        self.sess = sess
        self.name = name
        self._build_net()
    def _build_net(self):
        with tf.variable_scope(self.name):
            # input place holders
            self.X = tf.placeholder(tf.float32, [None, 784])
            # img 28x28x1 (black/white)
            X_img = tf.reshape(self.X, [-1, 28, 28, 1])
            self.Y = tf.placeholder(tf.float32, [None, 10])
            # L1 ImgIn shape=(?, 28, 28, 1)
            W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
            ...
    def predict(self, x_test, keep_prop=1.0):
        return self.sess.run(self.logits, feed_dict={self.X: x_test, self.keep_prob: keep_prop})
    def get_accuracy(self, x_test, y_test, keep_prop=1.0):
        return self.sess.run(self.accuracy, feed_dict={self.X: x_test, self.Y: y_test, self.keep_prob: keep_prop})
    def train(self, x_data, y_data, keep_prop=0.7):
        return self.sess.run([self.cost, self.optimizer], feed_dict={self.X: x_data, self.Y: y_data, self.keep_prob: keep_prop})
# initialize
sess = tf.Session()
m1 = Model(sess, "m1")
sess.run(tf.global_variables_initializer())
print('Learning Started!')
# train my model
for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        c, _ = m1.train(batch_xs, batch_ys)
        avg_cost += c / total_batch
  • 298. tf.layers https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-4-mnist_cnn_layers.py # L1 ImgIn shape=(?, 28, 28, 1) W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01)) # Conv -> (?, 28, 28, 32) # Pool -> (?, 14, 14, 32) L1 = tf.nn.conv2d(X_img, W1, strides=[1, 1, 1, 1], padding='SAME') L1 = tf.nn.relu(L1) L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') L1 = tf.nn.dropout(L1, keep_prob=self.keep_prob) … # L2 ImgIn shape=(?, 14, 14, 32) W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01)) # Convolutional Layer #1 conv1 = tf.layers.conv2d(inputs=X_img,filters=32,kernel_size=[3,3],padding="SAME",activation=tf.nn.relu) pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], padding="SAME", strides=2) dropout1 = tf.layers.dropout(inputs=pool1,rate=0.7, training=self.training) # Convolutional Layer #2 conv2 = tf.layers.conv2d(inputs=dropout1,filters=64,kernel_size=[3,3],padding="SAME",activation=tf.nn.relu) … flat = tf.reshape(dropout3, [-1, 128 * 4 * 4]) dense4 = tf.layers.dense(inputs=flat, units=625, activation=tf.nn.relu) dropout4 = tf.layers.dropout(inputs=dense4, rate=0.5, training=self.training) ...
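A self-contained sketch of the same stack written purely with tf.layers; shapes follow the slide, and the names are illustrative. One caution: tf.layers.dropout takes the drop probability, so keep_prob=0.7 from the earlier slides corresponds to rate=0.3, and the rate=0.7 shown above would drop 70% of activations:

import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 784])
X_img = tf.reshape(X, [-1, 28, 28, 1])
training = tf.placeholder(tf.bool)  # True while training, False for test

conv1 = tf.layers.conv2d(X_img, filters=32, kernel_size=[3, 3], padding="SAME", activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, pool_size=[2, 2], strides=2, padding="SAME")
drop1 = tf.layers.dropout(pool1, rate=0.3, training=training)  # rate = fraction dropped

conv2 = tf.layers.conv2d(drop1, filters=64, kernel_size=[3, 3], padding="SAME", activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, pool_size=[2, 2], strides=2, padding="SAME")
drop2 = tf.layers.dropout(pool2, rate=0.3, training=training)

flat = tf.reshape(drop2, [-1, 7 * 7 * 64])  # 28 -> 14 -> 7 after two 2x2 pools
dense = tf.layers.dense(flat, units=625, activation=tf.nn.relu)
logits = tf.layers.dense(dense, units=10)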
  • 300. models = [] num_models = 7 for m in range(num_models): models.append(Model(sess, "model" + str(m))) sess.run(tf.global_variables_initializer()) print('Learning Started!') # train my model for epoch in range(training_epochs): avg_cost_list = np.zeros(len(models)) total_batch = int(mnist.train.num_examples / batch_size) for i in range(total_batch): batch_xs, batch_ys =mnist.train.next_batch(batch_size) # train each model for m_idx, m in enumerate(models): c, _ = m.train(batch_xs, batch_ys) avg_cost_list[m_idx] += c / total_batch print('Epoch:','%04d'%(epoch + 1),'cost =', avg_cost_list) print('Learning Finished!') class Model: def __init__(self, sess, name): self.sess = sess self.name = name self._build_net() def _build_net(self): with tf.variable_scope(self.name): ... Ensemble training https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
• 303. Ensemble prediction: sum each model's per-class predictions over classes 0..9, then take the argmax.
Model 1: [0.1, 0.01, 0.02, 0.8, ...]
Model 2: [0.01, 0.5, 0.02, 0.4, ...]
Model 3: [0.01, 0.01, 0.1, 0.7, ...]
Sum: [0.12, 0.52, 0.14, 1.9, ...] -> argmax
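In NumPy terms, the table above boils down to a sum and an argmax; a tiny sketch using the slide's own numbers, truncated to four classes for illustration:

import numpy as np

# Per-class scores from three models; summing and argmax implements the ensemble vote
p1 = np.array([0.1, 0.01, 0.02, 0.8])
p2 = np.array([0.01, 0.5, 0.02, 0.4])
p3 = np.array([0.01, 0.01, 0.1, 0.7])
total = p1 + p2 + p3     # [0.12, 0.52, 0.14, 1.9]
print(np.argmax(total))  # 3: the class with the largest summed score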
  • 304. # Test model and check accuracy test_size = len(mnist.test.labels) predictions = np.zeros(test_size * 10).reshape(test_size, 10) for m_idx, m in enumerate(models): print(m_idx, 'Accuracy:', m.get_accuracy(mnist.test.images, mnist.test.labels)) p = m.predict(mnist.test.images) predictions += p ensemble_correct_prediction = tf.equal( tf.argmax(predictions, 1), tf.argmax(mnist.test.labels, 1)) ensemble_accuracy = tf.reduce_mean( tf.cast(ensemble_correct_prediction, tf.float32)) print('Ensemble accuracy:', sess.run(ensemble_accuracy)) Ensemble prediction 0 Accuracy: 0.9933 1 Accuracy: 0.9946 2 Accuracy: 0.9934 3 Accuracy: 0.9935 4 Accuracy: 0.9935 5 Accuracy: 0.9949 6 Accuracy: 0.9941 Ensemble accuracy: 0.9952 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-11-5-mnist_cnn_ensemble_layers.py
  • 305. Exercise ● Deep & Wide? ● CIFAR 10 ● ImageNet
  • 306. Lab 12 RNN Sung Kim <hunkim+ml@gmail.com> http://hunkim.github.io/ml/
  • 307. Lab 12 RNN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 309. Lab 12-1 RNN Basics Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 311. RNN in TensorFlow cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size) ... outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
  • 312. RNN in TensorFlow cell = tf.contrib.rnn.BasicRNNCell(num_units=hidden_size) cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size) ... outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
• 313. One node: 4 (input dim) in, 2 (hidden_size) out (figure)
  • 315. # One cell RNN input_dim (4) -> output_dim (2) hidden_size = 2 cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size) x_data = np.array([[[1,0,0,0]]], dtype=np.float32) outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) array([[[-0.42409304, 0.64651132]]]) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb One node: 4 (input-dim) in 2 (hidden_size)
  • 316. Unfolding to n sequences https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5
  • 317. Unfolding to n sequences # One cell RNN input_dim (4) -> output_dim (2). sequence: 5 hidden_size = 2 cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size) x_data = np.array([[h, e, l, l, o]], dtype=np.float32) print(x_data.shape) pp.pprint(x_data) outputs, states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) X_data = array ([[[ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 0., 1.]]], dtype=float32) Outputs = array ([[[ 0.19709368, 0.24918222], [-0.11721198, 0.1784237 ], [-0.35297349, -0.66278851], [-0.70915914, -0.58334434], [-0.38886023, 0.47304463]]], dtype=float32) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5
  • 319. Batching input # One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3 # 3 batches 'hello', 'eolll', 'lleel' x_data = np.array([[h, e, l, l, o], [e, o, l, l, l], [l, l, e, e, l]], dtype=np.float32) pp.pprint(x_data) cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True) outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) array([[[ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 0., 1.]], [[ 0., 1., 0., 0.], [ 0., 0., 0., 1.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.]], [[ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 1., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.]]], https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5 batch_size=3
  • 320. Batching input # One cell RNN input_dim (4) -> output_dim (2). sequence: 5, batch 3 # 3 batches 'hello', 'eolll', 'lleel' x_data = np.array([[h, e, l, l, o], [e, o, l, l, l], [l, l, e, e, l]], dtype=np.float32) pp.pprint(x_data) cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True) outputs, _states = tf.nn.dynamic_rnn(cell, x_data, dtype=tf.float32) sess.run(tf.global_variables_initializer()) pp.pprint(outputs.eval()) array([[[ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 0., 1.]], [[ 0., 1., 0., 0.], [ 0., 0., 0., 1.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 0., 1., 0.]], [[ 0., 0., 1., 0.], [ 0., 0., 1., 0.], [ 0., 1., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.]]], array([[[-0.0173022 , -0.12929453], [-0.14995177, -0.23189341], [ 0.03294011, 0.01962204], [ 0.12852104, 0.12375218], [ 0.13597946, 0.31746736]], [[-0.15243632, -0.14177315], [ 0.04586344, 0.12249056], [ 0.14292534, 0.15872268], [ 0.18998367, 0.21004884], [ 0.21788891, 0.24151592]], [[ 0.10713603, 0.11001928], [ 0.17076059, 0.1799853 ], [-0.03531617, 0.08993293], [-0.1881337 , -0.08296411], [-0.00404597, 0.07156041]]], https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb Hidden_size=2 sequence_length=5 batch_size=3
  • 321. Lab 12-2 Hi Hello RNN Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
• 323. Teach an RNN 'hihello' (figure): feed the characters of 'hihell' one step at a time and train the RNN to emit the next character at each step, producing 'ihello'.
  • 324. One-hot encoding [1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 0, 1], # o 4 ● text: ‘hihello’ ● unique chars (vocabulary, voc): h, i, e, l, o ● voc index: h:0, i:1, e:2, l:3, o:4
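One way to build these vectors in code; a small sketch, where the np.eye trick is a convenience assumption rather than the lab's code:

import numpy as np

idx2char = ['h', 'i', 'e', 'l', 'o']
char2idx = {c: i for i, c in enumerate(idx2char)}  # h:0, i:1, e:2, l:3, o:4

indices = [char2idx[c] for c in 'hihell']  # [0, 1, 0, 2, 3, 3]
one_hot = np.eye(len(idx2char))[indices]   # each index picks a row of the identity matrix
print(one_hot)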
• 325. Teach RNN 'hihello' (figure): each input character goes in one-hot, and the target is the next character.
Input 'hihell': h [1,0,0,0,0], i [0,1,0,0,0], h [1,0,0,0,0], e [0,0,1,0,0], l [0,0,0,1,0], l [0,0,0,1,0]
Target 'ihello': i [0,1,0,0,0], h [1,0,0,0,0], e [0,0,1,0,0], l [0,0,0,1,0], l [0,0,0,1,0], o [0,0,0,0,1]
  • 326. [1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 0, 1], # o 4 Teach RNN ‘hihello’
• 327. Creating an RNN cell
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
# or swap in other cell types:
rnn_cell = rnn_cell.BasicLSTMCell(rnn_size)
rnn_cell = rnn_cell.GRUCell(rnn_size)
• 329. Execute RNN
# RNN model
rnn_cell = rnn_cell.BasicRNNCell(rnn_size)
outputs, _states = tf.nn.dynamic_rnn(rnn_cell, X, initial_state=initial_state, dtype=tf.float32)
  • 330. RNN parameters hidden_size = 5 # output from the LSTM input_dim = 5 # one-hot size batch_size = 1 # one sentence sequence_length = 6 # |ihello| == 6 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
  • 331. Data creation idx2char = ['h', 'i', 'e', 'l', 'o'] # h=0, i=1, e=2, l=3, o=4 x_data = [[0, 1, 0, 2, 3, 3]] # hihell x_one_hot = [[[1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [1, 0, 0, 0, 0], # h 0 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 1, 0]]] # l 3 y_data = [[1, 0, 2, 3, 3, 4]] # ihello X = tf.placeholder(tf.float32, [None, sequence_length, input_dim]) # X one-hot Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
• 332. Feed to RNN
X = tf.placeholder(tf.float32, [None, sequence_length, input_dim])  # X one-hot
Y = tf.placeholder(tf.int32, [None, sequence_length])  # Y label
cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(cell, X, initial_state=initial_state, dtype=tf.float32)
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
x_one_hot = [[[1, 0, 0, 0, 0],  # h 0
[0, 1, 0, 0, 0],  # i 1
[1, 0, 0, 0, 0],  # h 0
[0, 0, 1, 0, 0],  # e 2
[0, 0, 0, 1, 0],  # l 3
[0, 0, 0, 1, 0]]]  # l 3
y_data = [[1, 0, 2, 3, 3, 4]]  # ihello
  • 334. # [batch_size, sequence_length] y_data = tf.constant([[1, 1, 1]]) # [batch_size, sequence_length, emb_dim ] prediction1 = tf.constant([[[0.3, 0.7], [0.3, 0.7], [0.3, 0.7]]], dtype=tf.float32) prediction2 = tf.constant([[[0.1, 0.9], [0.1, 0.9], [0.1, 0.9]]], dtype=tf.float32) # [batch_size * sequence_length] weights = tf.constant([[1, 1, 1]], dtype=tf.float32) sequence_loss1 = tf.contrib.seq2seq.sequence_loss(prediction1, y_data, weights) sequence_loss2 = tf.contrib.seq2seq.sequence_loss(prediction2, y_data, weights) sess.run(tf.global_variables_initializer()) print("Loss1: ", sequence_loss1.eval(), "Loss2: ", sequence_loss2.eval()) Cost: sequence_loss Loss1: 0.513015 Loss2: 0.371101
  • 335. Cost: sequence_loss outputs, _states = tf.nn.dynamic_rnn( cell, X, initial_state=initial_state, dtype=tf.float32) weights = tf.ones([batch_size, sequence_length]) sequence_loss = tf.contrib.seq2seq.sequence_loss( logits=outputs, targets=Y, weights=weights) loss = tf.reduce_mean(sequence_loss) train = tf.train.AdamOptimizer(learning_rate=0.1).minimize(loss) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
• 336. Training
prediction = tf.argmax(outputs, axis=2)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        l, _ = sess.run([loss, train], feed_dict={X: x_one_hot, Y: y_data})
        result = sess.run(prediction, feed_dict={X: x_one_hot})
        print(i, "loss:", l, "prediction: ", result, "true Y: ", y_data)
        # print char using dic
        result_str = [idx2char[c] for c in np.squeeze(result)]
        print("\tPrediction str: ", ''.join(result_str))
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
• 337. Results (from the same training loop):
0 loss: 1.55474 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
1 loss: 1.55081 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
2 loss: 1.54704 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
3 loss: 1.54342 prediction: [[3 3 3 3 4 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: lllloo
...
1998 loss: 0.75305 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
1999 loss: 0.752973 prediction: [[1 0 2 3 3 4]] true Y: [[1, 0, 2, 3, 3, 4]] Prediction str: ihello
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
  • 338. Lab 12-3 RNN with long sequences Sung Kim <hunkim+ml@gmail.com> Code: https://github.com/hunkim/DeepLearningZeroToAll/ With TF 1.0!
  • 340. Manual data creation idx2char = ['h', 'i', 'e', 'l', 'o'] x_data = [[0, 1, 0, 2, 3, 3]] # hihell x_one_hot = [[[1, 0, 0, 0, 0], # h 0 [0, 1, 0, 0, 0], # i 1 [1, 0, 0, 0, 0], # h 0 [0, 0, 1, 0, 0], # e 2 [0, 0, 0, 1, 0], # l 3 [0, 0, 0, 1, 0]]] # l 3 y_data = [[1, 0, 2, 3, 3, 4]] # ihello https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-1-hello-rnn.py
  • 341. Better data creation sample = " if you want you" idx2char = list(set(sample)) # index -> char char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx sample_idx = [char2idx[c] for c in sample] # char to index x_data = [sample_idx[:-1]] # X data sample (0 ~ n-1) hello: hell y_data = [sample_idx[1:]] # Y label sample (1 ~ n) hello: ello X = tf.placeholder(tf.int32, [None, sequence_length]) # X data Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0 https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
  • 342. Hyper parameters sample = " if you want you" idx2char = list(set(sample)) # index -> char char2idx = {c: i for i, c in enumerate(idx2char)} # char -> idx # hyper parameters dic_size = len(char2idx) # RNN input size (one hot size) rnn_hidden_size = len(char2idx) # RNN output size num_classes = len(char2idx) # final output size (RNN or softmax, etc.) batch_size = 1 # one sample data, one batch sequence_length = len(sample) - 1 # number of lstm unfolding (unit #)
  • 343. LSTM and Loss X = tf.placeholder(tf.int32, [None, sequence_length]) # X data Y = tf.placeholder(tf.int32, [None, sequence_length]) # Y label X_one_hot = tf.one_hot(X, num_classes) # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0 cell = tf.contrib.rnn.BasicLSTMCell(num_units=rnn_hidden_size, state_is_tuple=True) initial_state = cell.zero_state(batch_size, tf.float32) outputs, _states = tf.nn.dynamic_rnn( cell, X_one_hot, initial_state=initial_state, dtype=tf.float32) weights = tf.ones([batch_size, sequence_length]) sequence_loss = tf.contrib.seq2seq.sequence_loss(logits=outputs, targets=Y,weights=weights) loss = tf.reduce_mean(sequence_loss) train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss) prediction = tf.argmax(outputs, axis=2) https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
  • 344. Training and Results with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for i in range(3000): l, _ = sess.run([loss, train], feed_dict={X: x_data, Y: y_data}) result = sess.run(prediction, feed_dict={X: x_data}) # print char using dic result_str = [idx2char[c] for c in np.squeeze(result)] print(i, "loss:", l, "Prediction:", ''.join(result_str)) 0 loss: 2.29895 Prediction: nnuffuunnuuuyuy 1 loss: 2.29675 Prediction: nnuffuunnuuuyuy ... 1418 loss: 1.37351 Prediction: if you want you 1419 loss: 1.37331 Prediction: if you want you https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
  • 345. Really long sentence? sentence = ("if you want to build a ship, don't drum up people together to " "collect wood and don't assign them tasks and work, but rather " "teach them to long for the endless immensity of the sea.") https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
  • 346. Really long sentence? sentence = ("if you want to build a ship, don't drum up people together to " "collect wood and don't assign them tasks and work, but rather " "teach them to long for the endless immensity of the sea.") # training dataset 0 if you wan -> f you want 1 f you want -> you want 2 you want -> you want t 3 you want t -> ou want to … 168 of the se -> of the sea 169 of the sea -> f the sea. https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
  • 347. Making dataset char_set = list(set(sentence)) char_dic = {w: i for i, w in enumerate(char_set)} dataX = [] dataY = [] for i in range(0, len(sentence) - seq_length): x_str = sentence[i:i + seq_length] y_str = sentence[i + 1: i + seq_length + 1] print(i, x_str, '->', y_str) x = [char_dic[c] for c in x_str] # x str to index y = [char_dic[c] for c in y_str] # y str to index dataX.append(x) dataY.append(y) # training dataset 0 if you wan -> f you want 1 f you want -> you want 2 you want -> you want t 3 you want t -> ou want to … 168 of the se -> of the sea 169 of the sea -> f the sea. https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 348. RNN parameters

char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}

data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10  # any arbitrary number
batch_size = len(dataX)

# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 349. LSTM and Loss (the Lab 12-2 model reused unchanged, now fed the long sentence)

X = tf.placeholder(tf.int32, [None, seq_length])  # X data
Y = tf.placeholder(tf.int32, [None, seq_length])  # Y label
X_one_hot = tf.one_hot(X, num_classes)  # one hot: 1 -> 0 1 0 0 0 0 0 0 0 0

cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
initial_state = cell.zero_state(batch_size, tf.float32)
outputs, _states = tf.nn.dynamic_rnn(
    cell, X_one_hot, initial_state=initial_state, dtype=tf.float32)

weights = tf.ones([batch_size, seq_length])
sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
loss = tf.reduce_mean(sequence_loss)
train = tf.train.GradientDescentOptimizer(learning_rate=0.1).minimize(loss)

prediction = tf.argmax(outputs, axis=2)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-2-char-seq-rnn.py
• 350. Exercise
● Run the RNN on the long sequence
● Why doesn't it work? (Hint: a single shallow cell whose raw outputs are used directly as logits has too little capacity for a long sequence — the next lab fixes this with a stacked RNN plus a softmax layer.)
• 351. Lab 12-4 RNN with long sequences: Stacked RNN + Softmax layer
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
• 353. Really long sentence?

sentence = ("if you want to build a ship, don't drum up people together to "
            "collect wood and don't assign them tasks and work, but rather "
            "teach them to long for the endless immensity of the sea.")

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 354. Making dataset

char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}

dataX = []
dataY = []
for i in range(0, len(sentence) - seq_length):
    x_str = sentence[i:i + seq_length]
    y_str = sentence[i + 1: i + seq_length + 1]
    print(i, x_str, '->', y_str)

    x = [char_dic[c] for c in x_str]  # x str to index
    y = [char_dic[c] for c in y_str]  # y str to index

    dataX.append(x)
    dataY.append(y)

# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 355. RNN parameters

char_set = list(set(sentence))
char_dic = {w: i for i, w in enumerate(char_set)}

data_dim = len(char_set)
hidden_size = len(char_set)
num_classes = len(char_set)
seq_length = 10  # any arbitrary number
batch_size = len(dataX)

# training dataset
0 if you wan -> f you want
1 f you want -> you want
2 you want -> you want t
3 you want t -> ou want to
…
168 of the se -> of the sea
169 of the sea -> f the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 357. Stacked RNN

X = tf.placeholder(tf.int32, [None, seq_length])
Y = tf.placeholder(tf.int32, [None, seq_length])

# One-hot encoding
X_one_hot = tf.one_hot(X, num_classes)
print(X_one_hot)  # check out the shape

# Make a LSTM cell with hidden_size (each unit output vector size)
cell = rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)
cell = rnn.MultiRNNCell([cell] * 2, state_is_tuple=True)  # stack 2 layers
# (note: TF releases after 1.0 require a fresh cell instance per layer;
#  see the sketch below)

# outputs: unfolding size x hidden size, state = hidden size
outputs, _states = tf.nn.dynamic_rnn(cell, X_one_hot, dtype=tf.float32)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
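In TF releases after 1.0, reusing the same cell object for both layers raises an error, since the two layers would try to share one set of variables. A minimal sketch of the safer pattern, assuming the same hidden_size as above:

from tensorflow.contrib import rnn

hidden_size = 25  # illustrative; len(char_set) in this lab

def make_cell():
    # one fresh LSTM cell per layer, so each layer gets its own variables
    return rnn.BasicLSTMCell(hidden_size, state_is_tuple=True)

cell = rnn.MultiRNNCell([make_cell() for _ in range(2)], state_is_tuple=True)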
• 358. Softmax (FC) in Deep CNN
Image credit: http://personal.ie.cuhk.edu.hk/~ccloy/project_target_code/index.html
• 360. Softmax

# reshape in: fold batch and time into one axis before the FC layer
X_for_softmax = tf.reshape(outputs, [-1, hidden_size])
# reshape out: restore (batch, time, class) for sequence_loss
outputs = tf.reshape(outputs, [batch_size, seq_length, num_classes])

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 361. Softmax

# (optional) softmax layer
X_for_softmax = tf.reshape(outputs, [-1, hidden_size])
softmax_w = tf.get_variable("softmax_w", [hidden_size, num_classes])
softmax_b = tf.get_variable("softmax_b", [num_classes])
outputs = tf.matmul(X_for_softmax, softmax_w) + softmax_b
outputs = tf.reshape(outputs, [batch_size, seq_length, num_classes])

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
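The shape round trip is the key idea here. A runnable sketch with illustrative sizes (batch_size=170 and hidden_size=num_classes=25 roughly match this lab, but the exact dictionary size depends on the sentence):

import tensorflow as tf

batch_size, seq_length, hidden_size, num_classes = 170, 10, 25, 25  # illustrative
outputs = tf.zeros([batch_size, seq_length, hidden_size])
flat = tf.reshape(outputs, [-1, hidden_size])
print(flat.shape)    # (1700, 25): batch and time folded together for the matmul
w = tf.zeros([hidden_size, num_classes])
logits = tf.matmul(flat, w)
back = tf.reshape(logits, [batch_size, seq_length, num_classes])
print(back.shape)    # (170, 10, 25): ready for sequence_loss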
• 362. Loss

# reshape out for sequence_loss
outputs = tf.reshape(outputs, [batch_size, seq_length, num_classes])

# All weights are 1 (equal weights)
weights = tf.ones([batch_size, seq_length])

sequence_loss = tf.contrib.seq2seq.sequence_loss(
    logits=outputs, targets=Y, weights=weights)
mean_loss = tf.reduce_mean(sequence_loss)
train_op = tf.train.AdamOptimizer(learning_rate=0.1).minimize(mean_loss)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 363. Training and print results

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(500):
    _, l, results = sess.run(
        [train_op, mean_loss, outputs], feed_dict={X: dataX, Y: dataY})
    for j, result in enumerate(results):
        index = np.argmax(result, axis=1)
        print(i, j, ''.join([char_set[t] for t in index]), l)

0 167 tttttttttt 3.23111
0 168 tttttttttt 3.23111
0 169 tttttttttt 3.23111
…
499 167 oof the se 0.229306
499 168 tf the sea 0.229306
499 169 n the sea. 0.229306

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 364. Training and print results

# Let's print the last char of each result to check it works
results = sess.run(outputs, feed_dict={X: dataX})
for j, result in enumerate(results):
    index = np.argmax(result, axis=1)
    if j == 0:  # print all of the first result to seed the sentence
        print(''.join([char_set[t] for t in index]), end='')
    else:
        print(char_set[index[-1]], end='')

g you want to build a ship, don't drum up people together to collect wood and don't assign them tasks and work, but rather teach them to long for the endless immensity of the sea.

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-4-rnn_long_char.py
• 367. char/word RNN (char/word level n-to-n model)
https://github.com/sherjilozair/char-rnn-tensorflow
https://github.com/hunkim/word-rnn-tensorflow
• 368. Lab 12-5 Dynamic RNN
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
• 370. Different sequence length

h e l l o
h i
w h y
...
• 371. Different sequence length

h e l l o
h i <pad> <pad> <pad>
w h y <pad> <pad>
...
• 372. Different sequence length

h e l l o
h i
w h y
...
sequence_length=[5, 2, 3]
• 373. Dynamic RNN

# 3 batches: 'hello', 'eolll', 'lleel'
x_data = np.array([[[...]]], dtype=np.float32)

hidden_size = 2
cell = rnn.BasicLSTMCell(num_units=hidden_size, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(
    cell, x_data, sequence_length=[5, 3, 4], dtype=tf.float32)
sess.run(tf.global_variables_initializer())
print(outputs.eval())  # steps beyond each declared length come back as zeros

array([[[-0.17904168, -0.08053244],
        [-0.01294809,  0.01660814],
        [-0.05754048, -0.1368292 ],
        [-0.08655578, -0.20553185],
        [ 0.07297077, -0.21743253]],

       [[ 0.10272847,  0.06519825],
        [ 0.20188759, -0.05027055],
        [ 0.09514933, -0.16452041],
        [ 0.        ,  0.        ],
        [ 0.        ,  0.        ]],

       [[-0.04893036, -0.14655617],
        [-0.07947272, -0.20996611],
        [ 0.06466491, -0.02576563],
        [ 0.15087658,  0.05166111],
        [ 0.        ,  0.        ]]],

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-0-rnn_basics.ipynb
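The slide elides the one-hot x_data. A self-contained sketch of how it might be built and fed (assuming TF 1.x; the four-character vocabulary is an illustration, not the notebook's exact array):

import numpy as np
import tensorflow as tf
from tensorflow.contrib import rnn

chars = ['h', 'e', 'l', 'o']
char2idx = {c: i for i, c in enumerate(chars)}

def one_hot(word):
    # each character becomes a length-4 one-hot row vector
    return np.eye(len(chars), dtype=np.float32)[[char2idx[c] for c in word]]

x_data = tf.constant(np.stack([one_hot(w) for w in ['hello', 'eolll', 'lleel']]))  # (3, 5, 4)
cell = rnn.BasicLSTMCell(num_units=2, state_is_tuple=True)
outputs, _ = tf.nn.dynamic_rnn(cell, x_data, sequence_length=[5, 3, 4],
                               dtype=tf.float32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(outputs))  # rows past each sequence's length are zeros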
• 374. Lab 12-6 RNN with time series data (stock)
Sung Kim <hunkim+ml@gmail.com>
Code: https://github.com/hunkim/DeepLearningZeroToAll/
With TF 1.0!
• 377. Time series data

Open        High        Low         Volume    Close
828.659973  833.450012  828.349976  1247700   831.659973
823.02002   828.070007  821.655029  1597800   828.070007
819.929993  824.400024  818.97998   1281700   824.159973
819.359985  823         818.469971  1304000   818.97998
819         823         816         1053600   820.450012
816         820.958984  815.48999   1198100   819.23999
811.700012  815.25      809.780029  1129100   813.669983
809.51001   810.659973  804.539978  989700    809.559998
807         811.840027  803.190002  1155300   808.380005

'data-02-stock_daily.csv'
• 378. Many to one
[Diagram: a many-to-one RNN — inputs at time steps 1–7 feed a single prediction at step 8]
https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 379.

Open        High        Low         Volume    Close
828.659973  833.450012  828.349976  1247700   831.659973
823.02002   828.070007  821.655029  1597800   828.070007
819.929993  824.400024  818.97998   1281700   824.159973
819.359985  823         818.469971  1304000   818.97998
819         823         816         1053600   820.450012
816         820.958984  815.48999   1198100   819.23999
811.700012  815.25      809.780029  1129100   813.669983
809.51001   810.659973  804.539978  989700    ?
807         811.840027  803.190002  1155300   ?
• 380. Reading data

timesteps = seq_length = 7
data_dim = 5
output_dim = 1

# Open, High, Low, Volume, Close
xy = np.loadtxt('data-02-stock_daily.csv', delimiter=',')
xy = xy[::-1]  # reverse order (to chronological order)
xy = MinMaxScaler(xy)
x = xy
y = xy[:, [-1]]  # Close as label

dataX = []
dataY = []
for i in range(0, len(y) - seq_length):
    _x = x[i:i + seq_length]
    _y = y[i + seq_length]  # next close price
    print(_x, "->", _y)
    dataX.append(_x)
    dataY.append(_y)

[ 0.18667876 0.20948057 0.20878184 0. 0.21744815]
[ 0.30697388 0.31463414 0.21899367 0.01247647 0.21698189]
[ 0.21914211 0.26390721 0.2246864 0.45632338 0.22496747]
[ 0.23312993 0.23641916 0.16268272 0.57017119 0.14744274]
[ 0.13431201 0.15175877 0.11617252 0.39380658 0.13289962]
[ 0.13973232 0.17060429 0.15860382 0.28173344 0.18171679]
[ 0.18933069 0.20057799 0.19187983 0.29783096 0.2086465 ]] -> [ 0.14106001]

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
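The slides never show MinMaxScaler; in the companion repo it is a small helper, likely along these lines (a sketch, not the verbatim repo code):

import numpy as np

def MinMaxScaler(data):
    # scale each column to the [0, 1] range; the small epsilon avoids
    # division by zero when a column is constant
    numerator = data - np.min(data, axis=0)
    denominator = np.max(data, axis=0) - np.min(data, axis=0)
    return numerator / (denominator + 1e-7)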
• 381. Training and test datasets

# split to train and testing
train_size = int(len(dataY) * 0.7)
test_size = len(dataY) - train_size
trainX, testX = np.array(dataX[0:train_size]), np.array(dataX[train_size:len(dataX)])
trainY, testY = np.array(dataY[0:train_size]), np.array(dataY[train_size:len(dataY)])

# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 382. LSTM and Loss

# input placeholders
X = tf.placeholder(tf.float32, [None, seq_length, data_dim])
Y = tf.placeholder(tf.float32, [None, 1])

cell = tf.contrib.rnn.BasicLSTMCell(num_units=hidden_dim, state_is_tuple=True)
outputs, _states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
Y_pred = tf.contrib.layers.fully_connected(
    outputs[:, -1], output_dim, activation_fn=None)  # we use the last cell's output

# cost/loss
loss = tf.reduce_sum(tf.square(Y_pred - Y))  # sum of the squares
# optimizer
optimizer = tf.train.AdamOptimizer(0.01)
train = optimizer.minimize(loss)

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 383. Training and Results

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for i in range(1000):
    _, l = sess.run([train, loss], feed_dict={X: trainX, Y: trainY})
    print(i, l)

testPredict = sess.run(Y_pred, feed_dict={X: testX})

import matplotlib.pyplot as plt
plt.plot(testY)
plt.plot(testPredict)
plt.show()

https://github.com/hunkim/DeepLearningZeroToAll/blob/master/lab-12-5-rnn_stock_prediction.py
• 384. Exercise
● Implement stock prediction using linear regression only (see the baseline sketch below)
● Improve the results with more features, such as keywords and/or sentiment from top news
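A minimal linear-regression baseline sketch to get started — this is one possible answer, not the deck's solution; it simply flattens each 7-day window into one feature vector:

import tensorflow as tf

seq_length, data_dim = 7, 5  # same window as the RNN version
X = tf.placeholder(tf.float32, [None, seq_length * data_dim])  # flattened window
Y = tf.placeholder(tf.float32, [None, 1])
W = tf.Variable(tf.random_normal([seq_length * data_dim, 1]))
b = tf.Variable(tf.random_normal([1]))
Y_pred = tf.matmul(X, W) + b
loss = tf.reduce_sum(tf.square(Y_pred - Y))
train = tf.train.AdamOptimizer(0.01).minimize(loss)
# feed trainX.reshape(-1, seq_length * data_dim) in place of the 3-D tensor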
• 385. Other RNN applications
● Language Modeling
● Speech Recognition
● Machine Translation
● Conversation Modeling/Question Answering
● Image/Video Captioning
● Image/Music/Dance Generation
http://jiwonkim.org/awesome-rnn/
• 386. Google Cloud ML Examples
Sung Kim <hunkim+ml@gmail.com>
https://github.com/hunkim/GoogleCloudMLExamples
• 389. Cloud ML TensorFlow Tasks
[Diagram: a TensorFlow task submitted to and run on Google Cloud ML]
• 397. Google Cloud commands
• gcloud: command-line interface to Google Cloud Platform
  - Google Cloud ML jobs (`gcloud beta ml`)
  - Google Compute Engine virtual machine instances and other resources
  - Google Cloud Dataproc clusters and jobs
  - Google Cloud Deployment Manager deployments
  - …
• gsutil: command-line interface to Google Cloud Storage
https://cloud.google.com/sdk/gcloud/
https://cloud.google.com/storage/docs/gsutil
  • 399. Example git repository git clone https://github.com/hunkim/GoogleCloudMLExamples.git
• 404. Jobs
[Screenshot: job list in the Google Cloud console]
• 410. Cloud ML TensorFlow Tasks
[Diagram: a TensorFlow task submitted to and run on Google Cloud ML]
• 411. Setting and file copy

JOB_NAME="task9"
PROJECT_ID=`gcloud config list project --format "value(core.project)"`
STAGING_BUCKET=gs://${PROJECT_ID}-ml
INPUT_PATH=${STAGING_BUCKET}/input

gsutil cp input/input.csv $INPUT_PATH/input.csv
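After the input is copied, a training job would typically be submitted with the `gcloud beta ml` CLI of that era. A sketch — the package path, module name, region, and the trailing user flag are assumptions, not taken from the example repo:

gcloud beta ml jobs submit training ${JOB_NAME} \
    --package-path=trainer \
    --module-name=trainer.task \
    --staging-bucket=${STAGING_BUCKET} \
    --region=us-central1 \
    -- \
    --input_path=${INPUT_PATH}/input.csv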
• 414. Jobs
[Screenshot: the submitted job running in the Google Cloud console]
• 415. Logs
[Screenshot: training logs for the job in the Google Cloud console]
  • 424. With Great Power Comes Great Responsibility
• 426. Next
• Cloud ML deploy
• Hyper-parameter tuning
• Distributed training tasks