Transfer learning
Keras & TensorFlow
Hichem Felouat
hichemfel@gmail.com
Transfer learning
Transfer learning (TL) is a research problem in ML that focuses on
storing knowledge gained while solving one problem and applying it
to a different but related problem. For example, knowledge gained
while learning to recognize Cats could apply when trying to
recognize Tigers.
Building Complex Models - Functional API
• The Keras functional API is a way to create
models that is more flexible than the
tf.keras.Sequential API.
• The functional API can handle models with
non-linear topology, models with shared
layers, and models with multiple inputs or
outputs.
• One example is the wide & deep architecture: it makes it possible for the neural network to learn both deep patterns (using the deep path) and simple rules (through the short path).
Building Complex Models - Functional API
from tensorflow import keras

input_ = keras.layers.Input(shape=X_train.shape[1:])
hidden1 = keras.layers.Dense(30, activation="relu")(input_)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
# Concatenate the input directly with the top hidden layer (the short path)
concat = keras.layers.Concatenate()([input_, hidden2])
output = keras.layers.Dense(1)(concat)
model = keras.Model(inputs=[input_], outputs=[output])
Once you have built the Keras model, everything is exactly like earlier, so
there’s no need to repeat it here: you must compile the model, train it, evaluate
it, and use it to make predictions.
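A minimal sketch of these remaining steps (assuming a regression task with scalar targets, and held-out sets X_valid, X_test, and X_new):
model.compile(loss="mse", optimizer="sgd")
history = model.fit(X_train, y_train, epochs=20,
                    validation_data=(X_valid, y_valid))
mse_test = model.evaluate(X_test, y_test)
y_pred = model.predict(X_new)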
Building Complex Models - Functional API
What if you want to send a subset of the
features through the wide path and a different
subset (possibly overlapping) through the
deep path?!
• In this case, one solution is to use
multiple inputs.
Building Complex Models - Functional API
input_A = keras.layers.Input(shape=[5], name="wide_input")
input_B = keras.layers.Input(shape=[6], name="deep_input")
hidden1 = keras.layers.Dense(30, activation="relu")(input_B)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
# Concatenate the wide input with the output of the deep path
concat = keras.layers.concatenate([input_A, hidden2])
output = keras.layers.Dense(1, name="output")(concat)
model = keras.Model(inputs=[input_A, input_B], outputs=[output])
For example, suppose we want to send five features through the wide path (features 0 to 4), and six features through the deep path (features 2 to 7):
• You should name at least the most important layers.
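A sketch of how the two input matrices could be built (assuming X_train, X_valid, and X_test are NumPy arrays with at least eight feature columns):
X_train_A, X_train_B = X_train[:, :5], X_train[:, 2:8]
X_valid_A, X_valid_B = X_valid[:, :5], X_valid[:, 2:8]
X_test_A, X_test_B = X_test[:, :5], X_test[:, 2:8]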
Building Complex Models - Functional API
We can compile the model as usual, but when we call the fit() method, instead of passing a single input matrix X_train, we must pass a pair of matrices (X_train_A, X_train_B): one per input. The same is true for X_valid, and also for X_test and X_new when you call evaluate() or predict():
history = model.fit((X_train_A, X_train_B), y_train, epochs=20,
                    validation_data=((X_valid_A, X_valid_B), y_valid))
model_evaluate = model.evaluate((X_test_A, X_test_B), y_test)
y_pred = model.predict((X_new_A, X_new_B))
Building Complex Models - Functional API
Another use case is as a regularization technique (i.e., a training
constraint whose objective is to reduce overfitting and thus improve
the model’s ability to generalize).
For example, you may want to add some
auxiliary outputs in a neural network
architecture to ensure that the underlying
part of the network learns something useful
on its own, without relying on the rest of the
network.
Building Complex Models - Functional API
[...] # Same as before, up to the main output layer
output = keras.layers.Dense(1, name="main_output")(concat)
aux_output = keras.layers.Dense(1, name="aux_output")(hidden2)
model = keras.Model(inputs=[input_A, input_B], outputs=[output, aux_output])
• Each output will need its own loss function. Therefore, when we compile the model, we should pass a list of losses.
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer="sgd")
Building Complex Models - Functional API
• We care much more about the main output than about the auxiliary output, so we give the main output’s loss a much greater weight (0.9 vs. 0.1 above).
• We need to provide labels for each output (in this example, we use the same labels).
history = model.fit([X_train_A, X_train_B], [y_train, y_train], epochs=20,
                    validation_data=([X_valid_A, X_valid_B], [y_valid, y_valid]))
• When we evaluate the model, Keras will return the total loss, as well as all the individual losses:
total_loss, main_loss, aux_loss = model.evaluate([X_test_A, X_test_B], [y_test, y_test])
y_pred_main, y_pred_aux = model.predict([X_new_A, X_new_B])
Reusing Pretrained Layers
• It is generally not a good idea to train a very large DNN from scratch: instead, you should always try to find an existing neural network that accomplishes a task similar to the one you are trying to tackle, then reuse the lower layers of this network.
• This technique is called transfer learning.
• It will not only speed up training considerably but also require significantly
less training data.
• The output layer of the original model should usually be replaced
because it is most likely not useful at all for the new task, and it may not
even have the right number of outputs for the new task.
Reusing Pretrained Layers
• Transfer learning will work best when the inputs have similar low-level features.
Reusing Pretrained Layers
• Try freezing all the reused layers first (i.e., make their weights non-trainable
so that Gradient Descent won’t modify them), then train your model and see
how it performs. Then try unfreezing one or two of the top hidden layers to
let backpropagation tweak them and see if performance improves.
model_A = keras.models.load_model("my_model_A.h5")
# Reuse every layer except the output layer, then add a new output layer
model_B_on_A = keras.models.Sequential(model_A.layers[:-1])
model_B_on_A.add(keras.layers.Dense(1, activation="sigmoid"))
# Note that model_B_on_A now shares layers with model_A, so training it will
# also modify model_A. If you want to avoid affecting model_A, clone it first
# (clone_model() copies the architecture, set_weights() copies the weights)
# and build model_B_on_A from the clone instead:
model_A_clone = keras.models.clone_model(model_A)
model_A_clone.set_weights(model_A.get_weights())
Reusing Pretrained Layers
• The new output layer was initialized randomly, so it will make large errors at first.
• Freeze the reused layers during the first few epochs, giving the new layer some time to learn reasonable weights.
# Freeze the reused layers for the first few epochs
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = False
model_B_on_A.compile( ... )  # always recompile after (un)freezing layers
history = model_B_on_A.fit( ..., epochs=5, ... )

# Then unfreeze the reused layers and continue training to fine-tune them
for layer in model_B_on_A.layers[:-1]:
    layer.trainable = True
model_B_on_A.compile( ... )
history = model_B_on_A.fit( ... )
model_B_on_A.evaluate( ... )
Pretraining on an Auxiliary Task
If you do not have much labeled training data, one last option is
to train a first neural network on an auxiliary task for which you
can easily obtain or generate labeled training data, then reuse the
lower layers of that network for your actual task. The first neural
network’s lower layers will learn feature detectors that will likely be
reusable by the second neural network.
CNN Variations
Over the years, variants of the CNN architecture have been developed, leading to amazing advances in the field. A good measure of this progress is the error rate in competitions such as the ILSVRC ImageNet challenge.
1. LeNet-5 (1998)
2. AlexNet (2012)
3. VGG-16 (2014)
4. Inception-v1 (2014)
5. Inception-v3 (2015)
6. ResNet-50 (2015)
7. Xception (2016)
8. Inception-v4 (2016)
9. Inception-ResNet-V2 (2016)
10. ResNeXt-50 (2017)
https://towardsdatascience.com/illustrated-10-cnn-architectures-95d78ace614d
CNN Variations - LeNet-5
• It was created by Yann LeCun in 1998 and has been widely used for
handwritten digit recognition (MNIST).
• This architecture has become the standard “template”: stacking
convolutions and pooling layers and ending the network with one or more
fully-connected layers.
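A minimal sketch of this template in Keras, in the spirit of LeNet-5 on 28 × 28 MNIST digits (the layer sizes follow the original paper, but this is an illustration, not an exact reproduction):
from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Conv2D(6, kernel_size=5, activation="tanh", padding="same",
                        input_shape=(28, 28, 1)),
    keras.layers.AveragePooling2D(pool_size=2),   # convolutions + pooling ...
    keras.layers.Conv2D(16, kernel_size=5, activation="tanh"),
    keras.layers.AveragePooling2D(pool_size=2),
    keras.layers.Flatten(),
    keras.layers.Dense(120, activation="tanh"),   # ... ending with dense layers
    keras.layers.Dense(84, activation="tanh"),
    keras.layers.Dense(10, activation="softmax"),
])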
CNN Variations - AlexNet
• The AlexNet CNN architecture won the 2012 ImageNet ILSVRC challenge.
• To reduce overfitting, the authors used two regularization techniques. First,
they applied dropout with a 50% dropout rate during training to the outputs
of layers F1 and F2. Second, they performed data augmentation.
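For illustration, here is what those two techniques can look like in Keras (the dense layer sizes mirror AlexNet's F1 and F2 layers; the augmentation parameters are arbitrary examples):
from tensorflow import keras

# 50% dropout applied to the outputs of the two fully connected layers
classifier_head = keras.models.Sequential([
    keras.layers.Dense(4096, activation="relu"),  # F1
    keras.layers.Dropout(0.5),
    keras.layers.Dense(4096, activation="relu"),  # F2
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1000, activation="softmax"),
])

# Simple data augmentation: random shifts and horizontal flips
datagen = keras.preprocessing.image.ImageDataGenerator(
    width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True)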
CNN Variations - VGG
• The runner-up in the ILSVRC 2014 challenge was VGG, developed by
Karen Simonyan and Andrew Zisserman from the Visual Geometry Group
(VGG) research lab at Oxford University.
• Its two best-known variants are the VGG-16 and VGG-19 architectures, with 16 and 19 weight layers respectively.
CNN Variations - GoogLeNet (Inception)
• Inception modules allow GoogLeNet to use parameters much more efficiently than previous architectures.
• This 22-layer architecture with 5M parameters is called Inception-v1.
• Having parallel towers of convolutions with different filters, followed by concatenation, captures different features at the 1×1, 3×3, and 5×5 scales, thereby "clustering" them (see the sketch below).
• 1×1 convolutions are used for dimensionality reduction to remove computational bottlenecks.
• Because each 1×1 convolution is followed by an activation function, it also adds nonlinearity.
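A minimal sketch of an Inception-style module in the functional API (the filter counts f1, f3_in, f3, f5_in, f5, and f_pool are hypothetical parameters, not the exact GoogLeNet values):
from tensorflow import keras

def inception_module(x, f1, f3_in, f3, f5_in, f5, f_pool):
    # Branch 1: 1x1 convolution
    b1 = keras.layers.Conv2D(f1, 1, padding="same", activation="relu")(x)
    # Branch 2: 1x1 reduction, then 3x3 convolution
    b2 = keras.layers.Conv2D(f3_in, 1, padding="same", activation="relu")(x)
    b2 = keras.layers.Conv2D(f3, 3, padding="same", activation="relu")(b2)
    # Branch 3: 1x1 reduction, then 5x5 convolution
    b3 = keras.layers.Conv2D(f5_in, 1, padding="same", activation="relu")(x)
    b3 = keras.layers.Conv2D(f5, 5, padding="same", activation="relu")(b3)
    # Branch 4: 3x3 max pooling, then 1x1 projection
    b4 = keras.layers.MaxPooling2D(3, strides=1, padding="same")(x)
    b4 = keras.layers.Conv2D(f_pool, 1, padding="same", activation="relu")(b4)
    # Concatenate all branches along the channel axis
    return keras.layers.Concatenate()([b1, b2, b3, b4])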
CNN Variations - GoogLeNet (Inception V3)
CNN Variations - ResNet-50
• The basic building blocks of ResNets are the convolutional and identity blocks.
• It uses skip connections (also called shortcut connections); see the sketch below.
• If you add many skip connections, the network can start making progress even if several layers have not started learning yet.
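A simplified sketch of a residual block with a skip connection (ResNet-50 itself uses three-layer bottleneck blocks, but the principle is the same):
from tensorflow import keras

def residual_block(x, filters, strides=1):
    shortcut = x
    out = keras.layers.Conv2D(filters, 3, strides=strides, padding="same",
                              use_bias=False)(x)
    out = keras.layers.BatchNormalization()(out)
    out = keras.layers.Activation("relu")(out)
    out = keras.layers.Conv2D(filters, 3, padding="same", use_bias=False)(out)
    out = keras.layers.BatchNormalization()(out)
    if strides > 1:
        # Project the shortcut with a 1x1 convolution when the shape changes
        shortcut = keras.layers.Conv2D(filters, 1, strides=strides,
                                       use_bias=False)(x)
        shortcut = keras.layers.BatchNormalization()(shortcut)
    out = keras.layers.Add()([out, shortcut])  # the skip connection
    return keras.layers.Activation("relu")(out)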
CNN Variations - Xception (2016)
CNN Variations - Inception-v4 (2016)
CNN Variations - Inception-ResNet-V2 (2016)
CNN Variations - ResNeXt-50 (2017)
Using Pretrained Models from Keras
In general, you won’t have to implement standard models like GoogLeNet or ResNet manually, since pretrained networks are readily available with a single line of code in the keras.applications package.
For example, you can load the ResNet-50 model, pretrained on ImageNet, with the following line of code:
model = keras.applications.resnet50.ResNet50(weights="imagenet")
Using Pretrained Models from Keras
Available models:
Models for image classification with weights trained on ImageNet:
1) Xception
2) VGG16
3) VGG19
4) ResNet, ResNetV2
5) InceptionV3
6) InceptionResNetV2
7) MobileNet
8) MobileNetV2
9) DenseNet
10) NASNet
https://keras.io/applications/
Using Pretrained Models from Keras
model = keras.applications.resnet50.ResNet50(weights="imagenet")
# You first need to ensure that the images have the right size:
# ResNet-50 expects 224 × 224 images
images_resized = tf.image.resize(images, [224, 224])
# Each model provides a preprocess_input() function. It expects pixel values
# in the 0-255 range, so if images holds floats in [0, 1], scale them by 255
inputs = keras.applications.resnet50.preprocess_input(images_resized * 255)
# Now we can use the pretrained model to make predictions
Y_proba = model.predict(inputs)
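To interpret Y_proba, you can map the class probabilities to human-readable ImageNet labels with decode_predictions() (top=3 is an arbitrary choice):
top_k = keras.applications.resnet50.decode_predictions(Y_proba, top=3)
for image_index in range(len(top_k)):
    print("Image #{}:".format(image_index))
    for class_id, name, proba in top_k[image_index]:
        print("  {} - {}: {:.2%}".format(class_id, name, proba))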
Using Pretrained Models from Keras
import tensorflow as tf
from tensorflow import keras

model = keras.applications.vgg16.VGG16(weights=None)
# The model’s summary() method displays all the model’s layers
model.summary()

include_top: whether to include the top layers of the network (True or False).
weights: one of None (random initialization) or "imagenet" (pretraining on ImageNet).
Pretrained Models for Transfer Learning
• If you want to build an image classifier but you do not have
enough training data, then it is often a good idea to reuse the
lower layers of a pretrained model.
• For example, with the Xception model we exclude the top of the network by setting include_top=False: this excludes the global average pooling layer and the dense output layer. We then add our own layers, and finally we create the Keras Model:
Pretrained Models for Transfer Learning
base_model = keras.applications.xception.Xception(weights="imagenet", include_top=False)
avg = keras.layers.GlobalAveragePooling2D()(base_model.output)
output = keras.layers.Dense(n_classes, activation="softmax")(avg)
model = keras.Model(inputs=base_model.input, outputs=output)

# First phase: freeze the pretrained layers and train only the new top layers
for layer in base_model.layers:
    layer.trainable = False
# (in recent Keras versions, use learning_rate= instead of lr=)
optimizer = keras.optimizers.SGD(lr=0.2, momentum=0.9, decay=0.01)
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
history = model.fit(train_set, epochs=5, validation_data=valid_set)

# Second phase: unfreeze all layers and fine-tune with a lower learning rate
for layer in base_model.layers:
    layer.trainable = True
optimizer = keras.optimizers.SGD(lr=0.01, momentum=0.9, decay=0.001)
model.compile(...)
history = model.fit(...)
Thank you for your
attention
Link to the code:
https://github.com/hichemfelouat/my-codes-of-machine-learning/blob/master/Transfer_learning.py