An Introduction to Deep Learning
Julien Simon
Principal Evangelist, Artificial Intelligence & Machine Learning
@julsimon
May 2018
What to expect
• An introduction to Deep Learning
• Common network architectures and use cases
• Resources
• Artificial Intelligence: design software applications which exhibit
human-like behavior, e.g. speech, natural language processing,
reasoning or intuition
• Machine Learning: teach machines to learn without being
explicitly programmed
• Deep Learning: using neural networks, teach machines to learn
from complex data where features cannot be explicitly expressed
Myth: AI is dark magic
aka « You’re not smart enough »
Fact: AI is math, code and chips
A bit of Science, a lot of Engineering
An introduction to Deep Learning
The neuron
$\sum_{i=1}^{I} x_i \cdot w_i = u$
“Multiply and Accumulate”
Activation functions
Source: Wikipedia
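As a rough illustration of the neuron above, the sketch below multiplies each input by its weight, accumulates the products into u, and passes u through an activation function. The input values, the weights, and the choice of ReLU and sigmoid are assumptions made for the example; the slide itself does not name specific activation functions, and no bias term is used because the slide does not show one.

```python
import math

def neuron(inputs, weights, activation):
    """Multiply and accumulate, then apply an activation function."""
    # u = sum of x_i * w_i over all input features
    u = sum(x * w for x, w in zip(inputs, weights))
    return activation(u)

def sigmoid(u):
    # Squashes u into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-u))

def relu(u):
    # Passes positive values through, clips negative values to 0
    return max(0.0, u)

x = [1.0, 2.0, 3.0]      # hypothetical input features
w = [0.5, -1.0, 0.25]    # hypothetical weights
out_relu = neuron(x, w, relu)        # u = -0.75, so ReLU outputs 0.0
out_sigmoid = neuron(x, w, sigmoid)  # sigmoid(-0.75), roughly 0.32
```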
x =
x11, x12, …. x1I
x21, x22, …. x2I
… … …
xm1, xm2, …. xmI
I features
m samples
y =
2
0
…
4
m labels,
N2 categories
0,0,1,0,0,…,0
1,0,0,0,0,…,0
…
0,0,0,0,1,…,0
One-hot encoding
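One-hot encoding can be sketched in a few lines. The helper name `one_hot` and the category count of 6 are assumptions for illustration; the labels 2, 0 and 4 are the ones visible on the slide.

```python
def one_hot(labels, num_categories):
    """Turn integer class labels into one-hot vectors."""
    encoded = []
    for label in labels:
        row = [0] * num_categories
        row[label] = 1          # a single 1 at the index of the label
        encoded.append(row)
    return encoded

y = [2, 0, 4]                             # labels shown on the slide
encoded = one_hot(y, num_categories=6)    # 6 is an assumed category count
```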
Neural networks
$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$
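The accuracy ratio is simple enough to compute by hand. A minimal sketch, with made-up predictions and labels:

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)

# Hypothetical predictions and labels: 3 of the 4 predictions are correct
acc = accuracy([2, 0, 1, 4], [2, 0, 4, 4])
```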
Neural networks
Initially, the network will not predict correctly
f(X1) = Y’1
A loss function measures the difference between
the real label Y1 and the predicted label Y’1
error = loss(Y1, Y’1)
For a batch of samples:
$\sum_{i=1}^{\text{batch size}} \text{loss}(Y_i, Y'_i) = \text{batch error}$
The purpose of the training process is to
minimize error by gradually adjusting weights.
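The slide does not say which loss function is used. As one possible choice, the sketch below assumes cross-entropy between a one-hot label and a vector of predicted probabilities, and sums it over a batch to get the batch error. The function names and the sample values are hypothetical.

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss for one sample: one-hot label vs. predicted probabilities."""
    # eps avoids log(0) when a predicted probability is exactly zero
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def batch_error(labels, predictions, loss=cross_entropy):
    """Sum the per-sample loss over a batch."""
    return sum(loss(y, y_hat) for y, y_hat in zip(labels, predictions))

labels = [[0, 0, 1], [1, 0, 0]]                    # one-hot labels Y_i
predictions = [[0.1, 0.2, 0.7], [0.5, 0.3, 0.2]]   # predicted Y'_i
error = batch_error(labels, predictions)           # -ln(0.7) - ln(0.5)
```

The better the predicted probability for the correct category, the smaller the error: a perfect prediction contributes zero.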
Training
Training data set → Training → Trained neural network
Hyperparameters: batch size, learning rate, number of epochs
Backpropagation
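To make the three hyperparameters concrete, here is a minimal sketch of a training loop for a single linear neuron, y = w*x + b. It is a simplification: the gradient is derived by hand for this one neuron rather than backpropagated through several layers, and the toy data set, the squared-error loss, and the function name `train` are all assumptions.

```python
import random

def train(samples, batch_size, learning_rate, epochs, seed=0):
    """Fit y = w*x + b with mini-batch gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(samples)
    for _ in range(epochs):                # one epoch = one full pass over the data
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over the batch
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # Adjust the weights against the gradient, scaled by the learning rate
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b

# Hypothetical toy data generated from y = 3x + 1
samples = [(x / 10, 3 * (x / 10) + 1) for x in range(-10, 11)]
w, b = train(samples, batch_size=4, learning_rate=0.1, epochs=200)
```

On this noise-free data the learned w and b end up very close to 3 and 1.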
Stochastic Gradient Descent (1951)
Imagine you stand on top of a mountain with
skis strapped to your feet. You want to get down
to the valley as quickly as possible, but there is
fog and you can only see your immediate
surroundings. How can you get down the
mountain as quickly as possible? You look around
and identify the steepest path down, go down
that path for a bit, again look around and find
the new steepest path, go down that path, and
repeat—this is exactly what gradient descent
does.
Tim Dettmers, University of Lugano, 2015
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-history-training/
The « step size » depends
on the learning rate
z=f(x,y)
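The effect of the learning rate on the step size can be seen on a small example. The surface f(x, y) = x² + 2y² below is an assumed stand-in for the z = f(x, y) plot, chosen because its gradient is easy to write by hand.

```python
def f(x, y):
    return x ** 2 + 2 * y ** 2     # hypothetical bowl-shaped surface

def gradient(x, y):
    return 2 * x, 4 * y            # partial derivatives of f

def descend(x, y, learning_rate, steps):
    """Repeatedly step in the direction of steepest descent."""
    for _ in range(steps):
        gx, gy = gradient(x, y)
        x -= learning_rate * gx    # step size = learning rate * gradient
        y -= learning_rate * gy
    return x, y

start = (3.0, 2.0)
small = descend(*start, learning_rate=0.01, steps=50)     # slow progress
large = descend(*start, learning_rate=0.1, steps=50)      # close to the minimum
too_large = descend(*start, learning_rate=0.6, steps=50)  # overshoots and diverges
```

With the same number of steps, the small learning rate is still far from the minimum, the larger one is almost there, and the largest one bounces away from it along the y axis.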
Local minima and saddle points
« Do neural networks enter and
escape a series of local minima? Do
they move at varying speed as they
approach and then pass a variety of
saddle points? Answering these
questions definitively is difficult, but
we present evidence strongly
suggesting that the answer to all of
these questions is no. »
« Qualitatively characterizing neural network
optimization problems », Goodfellow et al, 2015
https://arxiv.org/abs/1412.6544
Optimizers
https://medium.com/@julsimon/tumbling-down-the-sgd-rabbit-hole-part-1-740fa402f0d7
SGD works remarkably well and is still widely used.
Adaptive optimizers use a variable learning rate.
Some even use a learning rate per dimension (Adam).
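A single Adam update shows what "a learning rate per dimension" means in practice. This is a plain-Python sketch of the standard update rule with its commonly used default coefficients; the function name `adam_step`, the starting point, and the gradient values are assumptions.

```python
import math

def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; m and v hold per-parameter running averages."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # running mean of gradients
        v_i = beta2 * v_i + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m_i / (1 - beta1 ** t)             # bias correction for early steps
        v_hat = v_i / (1 - beta2 ** t)
        # Each dimension is scaled by its own sqrt(v_hat)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v

# Hypothetical gradients that differ by almost two orders of magnitude
params, m, v = [3.0, 2.0], [0.0, 0.0], [0.0, 0.0]
grads = [6.0, 400.0]
params, m, v = adam_step(params, grads, m, v, t=1, lr=0.05)
```

After this first step both parameters have moved by about 0.05, even though one gradient is far larger than the other. Plain SGD with the same learning rate would have moved them by 0.3 and 20.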
Validation
Validation data set (also called dev set) → Neural network in training → Validation accuracy
Prediction at the end of each epoch
This data set must have the same distribution as real-life samples,
or else validation accuracy won’t reflect real-life accuracy.
Test
Test data set → Fully trained neural network → Test accuracy
Prediction at the end of experimentation
This data set must have the same distribution as real-life samples,
or else test accuracy won’t reflect real-life accuracy.
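One common way to obtain the validation and test sets is to shuffle the available samples and set aside a fraction for each. The sketch below assumes a 80/10/10 split and a hypothetical helper named `split_dataset`. Shuffling makes the three sets resemble each other; it does not by itself guarantee that they resemble real-life samples.

```python
import random

def split_dataset(samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle, then carve out validation and test sets."""
    data = list(samples)
    random.Random(seed).shuffle(data)     # fixed seed for a reproducible split
    n_val = int(len(data) * val_fraction)
    n_test = int(len(data) * test_fraction)
    val_set = data[:n_val]
    test_set = data[n_val:n_val + n_test]
    train_set = data[n_val + n_test:]     # everything else is used for training
    return train_set, val_set, test_set

# 100 hypothetical samples, identified here by their index
train_set, val_set, test_set = split_dataset(range(100))
```

The test set is kept aside until experimentation is over, so that test accuracy is measured on samples the network has never influenced or been tuned against.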
Early stopping
[Figure: accuracy (up to 100%) and loss plotted against epochs, showing training accuracy, validation accuracy, and the loss function, with the best epoch and the overfitting region marked.]
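Early stopping can be sketched as a scan over the validation accuracy recorded at the end of each epoch. The `patience` parameter (how many epochs without improvement to tolerate) and the accuracy values are assumptions for the example, not something the slide specifies.

```python
def best_epoch(val_accuracies, patience=2):
    """Return (epoch, accuracy) of the best epoch seen before stopping."""
    best_idx, best_acc = 0, float("-inf")
    for epoch, acc in enumerate(val_accuracies):
        if acc > best_acc:
            best_idx, best_acc = epoch, acc     # new best: remember this epoch
        elif epoch - best_idx >= patience:
            break                               # no improvement for `patience` epochs
    return best_idx, best_acc

# Hypothetical validation accuracy per epoch
history = [0.62, 0.71, 0.78, 0.81, 0.80, 0.79, 0.85]
epoch, acc = best_epoch(history, patience=2)
```

Here training stops after epoch 5 and keeps the model from epoch 3. The later value of 0.85 is never seen, which is the usual trade-off: a small patience saves time but may stop too early.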
« Deep Learning ultimately is about finding a
minimum that generalizes well, with bonus points for
finding one fast and reliably », Sebastian Ruder
Common network architectures
and use cases
Convolutional Neural Networks (CNN)
Le Cun, 1998: handwritten digit recognition, 32x32 pixels
https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
Source: http://timdettmers.com
Extracting features with convolution
Convolution extracts features automatically.
Kernel parameters are learned during the training process.
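The sliding-window operation behind a convolution layer can be written directly. This sketch uses no padding and a stride of 1, and, like most deep learning frameworks, it does not flip the kernel. The image and the vertical-edge kernel are hand-written assumptions; in a CNN the kernel values would be learned during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the patch under it and accumulate
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        output.append(row)
    return output

# Hypothetical 4x5 image: dark on the left, bright on the right
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# Hand-written kernel that responds to dark-to-bright vertical edges
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
feature_map = convolve2d(image, vertical_edge)   # [[0, 3, 3], [0, 3, 3]]
```

The feature map is zero over the flat dark region and non-zero wherever the kernel window covers the edge.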
Object Detection
https://github.com/precedenceguo/mx-rcnn
https://github.com/zhreshold/mxnet-yolo
Object Segmentation
https://github.com/TuSimple/mx-maskrcnn
Text Detection and Recognition
https://github.com/Bartzi/stn-ocr
Face Detection
https://github.com/tornadomeet/mxnet-face
Real-Time Pose Estimation
https://github.com/dragonfly90/mxnet_Realtime_Multi-Person_Pose_Estimation
Long Short Term Memory Networks (LSTM)
• An LSTM neuron computes the output based on the input and a previous state
• LSTM networks have memory
• They’re great at predicting sequences, e.g. machine translation
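The idea that the output depends on both the input and a previous state can be sketched with a single-unit LSTM cell working on scalars. The gate equations are the standard ones; the parameter names and the value 0.5 used for every weight are arbitrary, untrained assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single-unit LSTM cell (scalar input and state)."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate cell value
    c = f * c_prev + i * g     # keep part of the old state, add part of the new
    h = o * math.tanh(c)       # output, also fed back at the next step
    return h, c

# Hypothetical, untrained parameter values
params = {k: 0.5 for k in ("wf", "uf", "bf", "wi", "ui", "bi",
                           "wo", "uo", "bo", "wg", "ug", "bg")}

def run(sequence):
    """Feed a sequence through the cell and return the final output."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h

out_a = run([1.0, 0.0, 1.0])
out_b = run([0.0, 1.0, 1.0])
```

Both sequences end with the same input, yet the two outputs differ because the cell state carries information about what came before.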
Machine Translation
https://github.com/awslabs/sockeye
GAN: Welcome to the (un)real world, Neo
Generating new “celebrity” faces
https://github.com/tkarras/progressive_growing_of_gans
From semantic map to 2048x1024 picture
https://tcwang0509.github.io/pix2pixHD/
Wait! There’s more!
Models can also generate text from text, text from images, text
from video, images from text, sound from video,
3D models from 2D images, etc.
An Introduction to Deep Learning (May 2018)
https://aws.amazon.com/machine-learning
https://aws.amazon.com/blogs/ai
https://mxnet.incubator.apache.org | https://github.com/apache/incubator-mxnet
https://gluon.mxnet.io | https://github.com/gluon-api
https://medium.com/@julsimon
https://youtube.com/juliensimonfr
https://github.com/juliensimon/dlnotebooks
Getting started
Thank you!
Julien Simon
Principal Evangelist, Artificial Intelligence & Machine Learning
@julsimon
