Deep Learning Crash Course
By: Vishwas Narayan
Deep Learning is Everywhere
What you should learn
● Build intelligent systems, inspired by human intelligence, through Deep Learning
● Build neural networks with your own approach and have them make sound decisions
● Understand, and develop your own approach to, effective training of a deep learning model
How does AI differ from related fields?
For example:
1. Deep Learning vs. Machine Learning and AI
2. Machine Learning vs. AI
Whichever term applies, never forget: somehow, you train on data to get a model.
What will you learn?
● Loss Function and Optimizers
● Gradient Descent Algorithm
● Neural Network Architecture
What is Deep Learning?
Microsoft Word - Turing Test.doc (umbc.edu)
Machine learning is basically
Teaching a machine to learn patterns in data.
What is Deep Learning?
A machine learning technique that learns features and tasks directly from data.
Inputs are run through “neural networks”.
Neural networks have hidden layers.
Why Deep Learning?
● Machines never get fatigued.
● They need to be trained from human intelligence.
● They simply extract patterns.
In Deep Learning
Features can be learned directly from raw data.
What do they really mean to us?
[Diagram: a neuron as a black box - inputs X and Y go in, and some functional output, inspired by the brain, comes out.]
[Diagram: traditionally, data + algorithm -> output; in machine learning, data + output -> model, and the trained model turns data into predictions, insight, and intent.]
Why do we need it now?
● Data is prevalent
● Improved hardware architectures
● New software architectures
Neural Networks
Inspired by the neurons in the brain.
The building block of the neural network is the neuron.
Neural Networks
● Take data as input
● Train themselves to understand the patterns in the data
A simple Neural Network
Learning Process of the Neural Network
1. Forward Propagation
2. Back Propagation
Forward Propagation
Weights and Biases
● Weights - how much importance the network gives to each piece of input information
● Bias - an offset that allows the right decision threshold to be taken into consideration
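To make this concrete, here is a minimal NumPy sketch of one neuron's forward pass; the input values, weights, bias, and the sigmoid choice are illustrative assumptions, not values from the slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.8, 0.1, -0.4])   # weights: how much each input matters
b = 0.2                          # bias: shifts the activation threshold

z = np.dot(w, x) + b             # weighted sum plus bias
y = sigmoid(z)                   # activation produces the neuron's output
print(y)
```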
Back Propagation
A feedback loop.
In backpropagation
The loss function helps the neural network quantify the deviation from the expected output.
The parameters are initialized randomly.
Back Propagation
● Uses the loss function.
● Goes backwards and self-tunes the initial weights and biases.
● Values are adjusted to better fit the model's predictions to the data it is trained on.
Learning Algorithm for the Neural Network
● Initialize the parameters (randomly at first; later runs may start from tuned values).
● Feed input data to the network.
● Compare the predicted values with the expected values and calculate the loss.
● Perform backpropagation to propagate this loss back through the network.
● Update the parameters based on the loss.
● Iterate the previous steps until the loss is minimized (a minimal sketch of this loop follows).
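A minimal sketch of this learning loop for a one-parameter linear model y = w·x with squared-error loss; the synthetic data and all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 3.0 * x + rng.normal(scale=0.1, size=100)  # ground truth: w = 3

w = rng.normal()          # 1. random initialization
lr = 0.1                  # learning rate
for step in range(100):
    y_pred = w * x                        # 2. forward pass
    loss = np.mean((y_pred - y) ** 2)     # 3. compute the loss
    grad = np.mean(2 * (y_pred - y) * x)  # 4. backpropagate (here: one derivative)
    w -= lr * grad                        # 5. update the parameter
print(w, loss)            # w approaches 3; loss approaches the noise floor
```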
Terms used in Neural Networks
Activation Function
● Helps decide whether a neuron should drop out or contribute to the next layer, based on the dataset it is trained on.
● Introduces non-linearity into the neural network.
Which activation function should you use?
● For binary classification, sigmoid or ReLU is used for the best results.
● In the case of classifiers, sigmoid functions and their combinations often perform better.
● Because of the vanishing-gradient issue, sigmoid and tanh functions are sometimes avoided.
● The ReLU function is a generic activation function that is employed in the majority of applications these days.
Activation function conditions
● If we have dead neurons in our network, the leaky ReLU function is the best option.
● Remember that the ReLU function should only be used in the hidden layers.
● As a general guideline, start with the ReLU function and move on to other activation functions only if ReLU does not produce the best results.
(Minimal sketches of these functions follow.)
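For reference, here are minimal NumPy sketches of the activation functions discussed above; the alpha value in leaky ReLU is an illustrative default.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # A small slope for z < 0 keeps "dead" neurons trainable.
    return np.where(z > 0, z, alpha * z)

z = np.linspace(-3, 3, 7)
print(relu(z))
print(leaky_relu(z))
```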
Data is both the king and the queen here
No matter what, you are training a model on whatever dataset is available to you; the neural network becomes only as good as that data.
Loss Function
We know the neural network starts by making decisions from random weights and biases; by comparing against the expected output, the weights and biases are corrected.
A loss function thus quantifies the deviation of the neural network's predicted output from the expected output.
The loss functions in Regression are
● Absolute Error Loss
● Huber Loss
● Squared Error Loss
The loss functions in binary classification
● Binary Cross-Entropy
● Hinge Loss
Multi-class classification loss functions
● Multi-Class Cross-Entropy Loss
● KL (Kullback–Leibler) Divergence
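Minimal NumPy sketches of a few of the losses named above; y_true and p are illustrative arrays, not data from the slides.

```python
import numpy as np

def squared_error(y_true, y_pred):             # regression
    return np.mean((y_true - y_pred) ** 2)

def absolute_error(y_true, y_pred):            # regression
    return np.mean(np.abs(y_true - y_pred))

def binary_cross_entropy(y_true, p, eps=1e-12):  # binary classification
    p = np.clip(p, eps, 1 - eps)               # avoid log(0)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.2, 0.7, 0.6])
print(squared_error(y_true, p), binary_cross_entropy(y_true, p))
```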
Optimizers
During the training process we adjust the parameters to minimize the loss function and make the model as well optimized as possible for its use.
Optimizers are basically
Functions that tie the loss function to the model parameters, updating the neural network based on the output of the loss function.
Gradient Descent
An iterative procedure that starts at a random point on the loss function and travels down its slope in steps (of a size set by the user-chosen learning rate) until it reaches the lowest point of the function.
Again, exact behaviour depends on the data, but gradient descent is:
1. The most popular optimizer
2. Fast, robust, and flexible
The algorithm in layman's terms
1. Calculate what a small change in each individual weight would do to the loss function.
2. Adjust each parameter based on its gradient (derivative).
3. Repeat steps one and two until the loss computed by the neural network stops decreasing.
(A toy sketch of these steps follows.)
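A toy sketch of these three steps on the loss L(w) = (w - 4)², whose minimum sits at w = 4; the starting point and learning rate are illustrative.

```python
def loss(w):
    return (w - 4.0) ** 2

def grad(w):                  # step 1: effect of a small change in w
    return 2.0 * (w - 4.0)

w, lr = 0.0, 0.1              # arbitrary start, user-chosen learning rate
for _ in range(50):
    w -= lr * grad(w)         # step 2: adjust w against its gradient
print(w, loss(w))             # step 3 repeated: w converges toward 4
```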
To avoid getting stuck in a local minimum, we tune the learning rate
● Usually a small number that scales the gradients, so that any changes made to the weights are quite small.
● If the learning rate makes the steps too large, the algorithm will tend to overshoot the global minimum.
● At the same time, we don't want the algorithm to take forever to train and converge to the global minimum.
Stochastic Gradient Descent (SGD) - why is it more robust?
● Like gradient descent, except it uses a subset of the training examples rather than the entire set.
● SGD is gradient descent that uses a batch at each training step.
● Momentum can be used to accumulate gradients.
● Less intensive computation, since the work is batched.
Backpropagation
In essence, an implementation of gradient descent on the neural network.
AdaGrad
● Adapts the learning rate to individual features.
● Different weights will have different learning rates.
● Ideal for sparse datasets in which many input values are missing.
● Each learning rate tends to decay accordingly over training (a sketch of the update follows).
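A minimal sketch of the AdaGrad update rule for a parameter vector; the toy gradient function and all constants are illustrative assumptions.

```python
import numpy as np

w = np.zeros(3)
cache = np.zeros(3)                 # running sum of squared gradients
lr, eps = 0.1, 1e-8

def grad_fn(w):
    # Toy gradient of L(w) = ||w - [1, 2, 3]||^2 / 2
    return w - np.array([1.0, 2.0, 3.0])

for _ in range(500):
    g = grad_fn(w)
    cache += g ** 2                        # frequently-updated weights accumulate faster
    w -= lr * g / (np.sqrt(cache) + eps)   # so their effective learning rate shrinks
print(w)                                   # approaches [1, 2, 3]
```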
Parameters and Hyperparameters
What are model parameters?
● Variables of the neural network whose values can be estimated from the data.
● Required by the model to make predictions.
● Their values define what the model has learned from the data.
● Not set manually; they are saved as the neural network is trained.
Example - weights and biases.
What are model hyperparameters?
● They are configured externally to the neural network: their values cannot be estimated until we train on the dataset.
● There is no clear way to find the best value.
● When a DL algorithm is tuned, you are really tuning the hyperparameters.
● They are tuned manually.
Example - learning rate; C and alpha in SVMs; epochs; k in k-Nearest Neighbours.
Summary
Model parameters -> estimated from the data
Model hyperparameters -> can't be estimated from the data
Hyperparameters are often simply called parameters, since they are the part of machine learning that must be set manually and tuned.
Epochs, Batches, Batch Size and Iterations
You need these when the dataset is too big to pass to the neural network at once.
Break the dataset into smaller chunks and feed those chunks to the neural network one by one.
Epochs
When the Entire dataset is passed forward to the Neural Network and only once
they get trained in the Network.
We use more than one epoch to help model generalize better and accurate.
There is no absolute count for the dataset as its different for different datasets.
Batch and Batch Size
We divide a large dataset into smaller batches and feed those batches to the neural network.
Batch size - the total number of training examples in a single batch.
Iterations
The number of batches needed to complete one epoch.
Number of batches = number of iterations in one epoch.
Let's get some more insight
Suppose we have 1 million training examples and we divide the dataset into batches of 500; completing one epoch would then take 1,000,000 / 500 = 2,000 iterations (checked in the snippet below).
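The same arithmetic as a snippet:

```python
num_examples = 1_000_000
batch_size = 500
iterations_per_epoch = num_examples // batch_size
print(iterations_per_epoch)   # 2000 iterations to complete one epoch
```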
Conclusion for the terms used in NNs
How do you design an architecture? Which activation function should you use?
The only real answer is to experiment.
Types of Learning
There are three main types:
● Supervised Learning
● Unsupervised Learning
● Reinforcement Learning
Supervised Learning
● Algorithms designed to learn from examples.
● Models are trained on well-labelled data.
● Each example consists of:
○ an input object - typically a vector
○ a desired output value - the supervisory signal
During training
The model searches for patterns that correlate with the desired output.
After training
It takes unseen inputs and determines which label to classify them as.
The objective of a supervised learning model
Is to predict the correct label for unseen data.
Supervised learning is of two types
● Classification
● Regression
Classification
● Take Input data and assign it to a class/category.
● Models finds features in the data that correlates to the class and creates a
mapping function
● This mapping function will be used to classify unseen data from testing and
the validation set from the cross validation of the data
Binary and Multiclass classification
Definition and Example
Popular Classification Algorithms
● Logistic Regression.
● Naïve Bayes.
● Stochastic Gradient Descent.
● K-Nearest Neighbours.
● Decision Tree.
● Random Forest.
● Support Vector Machine
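As an illustration of one of these algorithms, here is a hedged scikit-learn sketch of logistic regression, using the library's built-in iris dataset purely as example data.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(clf.score(X_test, y_test))   # accuracy on unseen data
```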
Regression
The model tries to find a relationship between the dependent and independent variables.
The goal is always to predict continuous values, such as a test score.
The learned equation is always continuous.
Simple Linear Regression
Different Regression Algorithm
● Linear Regression.
● Logistic Regression.
● Ridge Regression.
● Lasso Regression.
● Polynomial Regression.
● Bayesian Linear Regression.
Application of Supervised Learning
● Text categorization
● Face Detection
● Signature recognition
● Customer discovery
● Spam detection
● Weather forecasting
● Predicting housing prices based on the prevailing market price
● Stock price predictions, among others
Unsupervised Learning
● Used to reveal underlying patterns in data.
● Used in exploratory data analysis.
● Needs no labelled data; it works from the features of the data itself.
Unsupervised learning comes in two forms
● Clustering
● Association
Clustering - Partitional Clustering
● Each data point can belong to exactly one cluster.
Clustering - Hierarchical Clustering
● Clusters nested within clusters.
● A data point may belong to several clusters.
Association
Attempts to find relationships between different entities.
Example - market basket analysis.
Some Clustering Algorithms
1. Affinity Propagation
2. Agglomerative Clustering
3. BIRCH
4. DBSCAN
5. K-Means
6. Mini-Batch K-Means
7. Mean Shift
8. OPTICS
9. Spectral Clustering
10. Gaussian Mixture Models
(A K-Means sketch follows.)
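A quick K-Means sketch with scikit-learn on synthetic blobs; the blob parameters and cluster count are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)   # one centre per cluster
print(km.labels_[:10])       # each point belongs to exactly one cluster
```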
Application of the Unsupervised Learning
● Fraud detection
● Malware detection
● Identification of human errors during data entry
● Conducting accurate basket analysis, etc.
Reinforcement Learning
Enables an intelligent agent to learn in an interactive environment by trial and error (via a policy and a reward network), based on its own actions and experience.
This is a comparatively new way of getting things learnt.
If supervised training of a neural network doesn't work well for your problem, reinforcement learning is the alternative to consider.
Reward and punishment are the key here
Positive and negative signals are used as behavioural feedback to understand what has been learnt.
The goal of reinforcement learning is to
● Find a suitable policy that maximizes the total cumulative reward, producing results that support better decisions.
● Maximize the points won during training over many examples.
● Penalize the agent when it makes wrong decisions.
● Reward the agent when it makes right decisions.
It is usually modelled as a “Markov Decision Process”.
[Diagram: the reinforcement learning loop - the agent takes an action, and the environment returns the next state and a reward/penalty.]
Applications of Reinforcement Learning
● Robotics
● Business strategy
● Traffic Light Control
● Web system configuration
● NLP
○ to personalize suggestions
○ deliver more meaningful notifications to users
○ optimize video streaming quality.
● Gaming
● Bidding
Some core and canonical problems in Deep Learning
The basic situation: a model should perform well both on the training data and on new test data.
The most common problem you will face is overfitting.
Consider the data points themselves:
● Data can be skewed.
● Data can be random.
● Raw data does not care about anything or anybody; it is simply generated.
● Yet it is collected to make sense of and to build a model.
● And it is collected so the model can drive the right decisions.
[Diagrams: underfitting vs. over-fitting.]
Ways of tackling overfitting:
1. Hold-out
2. Cross-validation
3. Data augmentation
4. Feature selection
5. L1 / L2 regularization
6. Remove layers / number of units per layer
7. Dropout
8. Early stopping
Data Augmentation
Create as much extra (synthetic) training data as possible from the data itself (a minimal sketch follows).
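A minimal sketch of this idea using NumPy alone; the batch of fake "images" is generated in place purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((8, 28, 28))            # pretend batch of 8 images

flipped = images[:, :, ::-1]                # horizontal flips
noisy = np.clip(images + rng.normal(scale=0.05, size=images.shape), 0, 1)

augmented = np.concatenate([images, flipped, noisy])
print(augmented.shape)                      # (24, 28, 28): 3x the data
```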
Early Stopping
Use Early Stopping to Halt the Training of Neural Networks At the Right Time (machinelearningmastery.com)
When do you need it?
When the training error decreases steadily but the validation error starts increasing after a certain point (a hedged sketch follows).
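Following the linked article's approach, here is a hedged Keras sketch; the model, data, and patience value are illustrative assumptions, not details from the slides.

```python
import numpy as np
import tensorflow as tf

X = np.random.rand(1000, 10)
y = (X.sum(axis=1) > 5).astype("float32")   # toy binary labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",            # watch the validation error...
    patience=5,                    # ...and halt if it stops improving
    restore_best_weights=True)

model.fit(X, y, validation_split=0.2, epochs=100,
          callbacks=[stop], verbose=0)
```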
Neural networks come in plenty of varieties
Follow the sources mentioned in this stream to explore them.
We have talked a lot about models
Now let's get to know how we can build one.
Gathering Data
Picking the right data is very important. A good way to start is to state your assumptions about the data you need.
The size of the dataset also matters
No one size fits all; a common rule of thumb is:
amount of data needed ≈ 10 times the number of model parameters
The quality of the data also matters
Data has to be accurate and reliable, with no adversarial examples.
Features should be noise-free.
Some dataset repositories
I will list them out here.
Pre-processing the dataset
Split the dataset into subsets:
● Training dataset
● Testing dataset
● Validation dataset
We can split the dataset randomly (a sketch follows).
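A sketch of a random train/validation/test split with scikit-learn's train_test_split; the 60/20/20 ratio and the toy arrays are illustrative, since the right split is use-case specific.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(200).reshape(100, 2), np.arange(100)

X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, random_state=0)            # 60% train
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)  # 20% val, 20% test
print(len(X_train), len(X_val), len(X_test))        # 60 20 20
```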
This process depends on
● the number of samples in the data
● the model being trained
A simple rule of thumb
● Few hyperparameters: a small validation set suffices
● Many hyperparameters: use a large validation set
The ratio in which you split the dataset is specific to your use case.
[Diagram: the dataset is split into train and test portions; the training portion is further divided into folds, with one fold held out as the validation set.]
Look for missing data
● NaN or null values
● Either eliminate the affected features or values
● Or impute the missing data (a sketch follows)
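A sketch of these options with pandas; the tiny DataFrame is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [25, np.nan, 31], "income": [50.0, 60.0, np.nan]})

print(df.isna().sum())           # 1. find NaN/null values
dropped = df.dropna()            # 2. eliminate rows with missing values
imputed = df.fillna(df.mean())   # 3. impute with the column mean
print(imputed)
```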
Sampling
Use a sample of the dataset.
Why do we need this?
● Faster convergence
● Reduced disk space
Preprocessing is required for feature scaling
A crucial step for model training:
● Normalization
● Standardization
(Sketches of both follow.)
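Sketches of both scaling schemes with scikit-learn; the sample matrix is illustrative.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])

normalized = MinMaxScaler().fit_transform(X)      # rescale each column to [0, 1]
standardized = StandardScaler().fit_transform(X)  # zero mean, unit variance
print(normalized)
print(standardized)
```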
Then, of course, train and evaluate.
Optimization
Will be Continued ...

Editor's Notes

• What is Deep Learning?: Microsoft Word - Turing Test.doc (umbc.edu)
• Which Activation Function to use?: Activation Functions In Neural Network | by Gaurav Rajpal | Analytics Vidhya | Medium
• Gradient Descent: Calculating Gradient Descent Manually | by Chi-Feng Wang | Towards Data Science