2. ● The term "Artificial neural network" refers to a biologically inspired sub-field of artificial intelligence modeled after the brain.
● An artificial neural network is a computational network inspired by the biological neural networks that make up the structure of the human brain.
● Just as the human brain has neurons interconnected with one another, artificial neural networks also have neurons that are linked to each other in the various layers of the network.
● These neurons are known as nodes.
6. The architecture of an artificial neural network:
An artificial neural network primarily consists of three layers:
7. ● An artificial neural network, in the field of artificial
intelligence, attempts to mimic the network of neurons that
makes up the human brain, so that computers can
understand things and make decisions in a human-like
manner.
● The artificial neural network is designed by programming
computers to behave simply like interconnected brain cells.
● There are on the order of 100 billion neurons in the human brain.
Each neuron has somewhere between 1,000 and 100,000
connection points.
● In the human brain, data is stored in a distributed manner, and we
can retrieve more than one piece of this data from memory in
parallel when needed.
8. ● To understand the concept of the architecture of an artificial
neural network, we have to understand what a neural network
consists of.
● A neural network consists of a large number of artificial neurons,
which are termed units, arranged in a sequence of layers.
● Let us look at the various types of layers available in an artificial
neural network.
9. Input Layer:
As the name suggests, it accepts inputs in several different formats
provided by the programmer.
Hidden Layer:
The hidden layer sits between the input and output layers. It
performs all the calculations needed to find hidden features and patterns.
Output Layer:
The input goes through a series of transformations using the
hidden layer, which finally results in output that is conveyed using
this layer.
10. ● The artificial neural network takes the inputs, computes the
weighted sum of the inputs, and adds a bias.
● This computation is represented in the form of a transfer
function.
● The weighted total is then passed as input to an activation
function to produce the output.
● Activation functions decide whether a node should fire or not.
● Only the nodes that fire contribute to the output layer.
● There are different activation functions available, chosen
according to the type of task we are performing.
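As a minimal illustration of this computation (not from the original slides; the input, weight, bias, and threshold values below are arbitrary assumptions), a single artificial neuron can be sketched in Python as:

```python
import numpy as np

def neuron_output(x, w, b, threshold=0.0):
    """Weighted sum of inputs plus bias (transfer function), then a step activation."""
    y_sum = np.dot(w, x) + b                 # weighted sum of the inputs + bias
    return 1 if y_sum >= threshold else 0    # activation: does the node fire?

# Arbitrary illustrative values
x = np.array([0.5, 1.0, 0.2])   # inputs
w = np.array([0.4, 0.3, 0.9])   # weights
b = -0.5                        # bias
print(neuron_output(x, w, b))   # -> 1, since 0.2 + 0.3 + 0.18 - 0.5 = 0.18 >= 0
```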
32. Types of Artificial Neural Network:
Feedforward Neural Network:
● The feedforward neural network is one of the most basic artificial
neural networks.
● In this ANN, the data or the input provided travels in a single
direction.
● It enters into the ANN through the input layer and exits through
the output layer while hidden layers may or may not exist.
● So the feedforward neural network has a forward-propagated wave of
signals only and has no feedback (backward) connections.
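A minimal sketch of such a forward-only pass (the layer sizes, random weights, and tanh activation are illustrative assumptions, not taken from the slides):

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Input -> hidden layer -> output layer; the signal flows forward only."""
    h = np.tanh(W1 @ x + b1)   # hidden layer activations
    return W2 @ h + b2         # output layer (linear here)

rng = np.random.default_rng(0)
x = rng.random(3)                             # 3 input features
W1, b1 = rng.random((4, 3)), rng.random(4)    # hidden layer with 4 units
W2, b2 = rng.random((2, 4)), rng.random(2)    # output layer with 2 units
print(forward(x, W1, b1, W2, b2))
```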
33. Types of Artificial Neural Network:
Convolutional Neural Network:
● A Convolutional neural network has some similarities to the feed-
forward neural network, where the connections between units have
weights that determine the influence of one unit on another unit.
● But a CNN has one or more than one convolutional layer that uses
a convolution operation on the input and then passes the result
obtained in the form of output to the next layer.
● CNNs have applications in speech and image processing and are
particularly useful in computer vision.
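The convolution operation mentioned above can be sketched as a naive 2-D "valid" convolution in Python (the toy image and kernel are made-up values for illustration):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy 2x2 kernel
print(conv2d_valid(image, kernel))                 # 3x3 feature map passed to the next layer
```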
34. Types of Artificial Neural Network:
Modular Neural Network:
● A Modular Neural Network contains a collection of different
neural networks that work independently towards obtaining the
output with no interaction between them.
● Each of the different neural networks performs a different sub-
task by obtaining unique inputs compared to other networks.
● The advantage of this modular neural network is that it breaks
down a large and complex computational process into smaller
components, thus decreasing its complexity while still obtaining
the required output.
35. Types of Artificial Neural Network:
Radial basis function Neural Network:
● Radial basis functions are functions whose value depends on the
distance of a point from a centre.
● RBF networks have two layers.
● In the first layer, the input is mapped into all the Radial basis
functions in the hidden layer and then the output layer computes
the output in the next step.
● Radial basis function nets are normally used to model the data that
represents any underlying trend or function.
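A sketch of this two-layer computation, assuming Gaussian radial basis functions (the centres, width, and output weights below are arbitrary illustrative values):

```python
import numpy as np

def rbf_forward(x, centers, width, out_weights):
    """Hidden layer: Gaussian RBF of the distance to each centre; output layer: weighted sum."""
    dists = np.linalg.norm(centers - x, axis=1)        # distance of x from each centre
    hidden = np.exp(-(dists ** 2) / (2 * width ** 2))  # radial basis activations
    return out_weights @ hidden                        # linear output layer

x = np.array([0.5, 0.5])
centers = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]])   # assumed RBF centres
print(rbf_forward(x, centers, width=0.5, out_weights=np.array([0.2, 0.3, 0.5])))
```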
36. Types of Artificial Neural Network:
Recurrent Neural Network:
● The Recurrent Neural Network saves the output of a layer and
feeds this output back to the input to better predict the outcome
of the layer.
● The first layer of an RNN is quite similar to a feed-forward neural
network, and the recurrence starts once the output of the first layer
is computed.
● After this layer, each unit will remember some information from the
previous step so that it can act as a memory cell in performing
computations.
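A minimal sketch of this memory behaviour: at each step the hidden state is computed from the current input and the previous hidden state (the layer sizes, random weights, and tanh activation are assumptions for illustration):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: the new state depends on the input AND the previous state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

rng = np.random.default_rng(1)
W_x, W_h, b = rng.random((3, 2)), rng.random((3, 3)), rng.random(3)
h = np.zeros(3)                                           # initial hidden state (the "memory")
for x_t in (np.array([1.0, 0.0]), np.array([0.0, 1.0])):  # a short input sequence
    h = rnn_step(x_t, h, W_x, W_h, b)
print(h)
```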
37. Applications of Artificial Neural Networks
● Social Media
● Marketing and Sales
● Healthcare
● Personal Assistants
38. McCulloch-Pitts Model of Neuron
● The McCulloch-Pitts neural model was the earliest ANN model.
39. ● X1…..Xn are Inputs.
● W1…..Wn are the weights for those inputs; they are of two types,
excitatory and inhibitory.
● The excitatory inputs have weights of positive magnitude
and the inhibitory weights have weights of negative
magnitude.
● The inputs could be either 0 or 1.
y_sum = X1·W1 + X2·W2 + … + Xn·Wn (the weighted sum of the inputs)
If y_sum >= θ, then y_out is 1
If y_sum < θ, then y_out is 0
40. Adder Function(Summation Junction):
Sums the weighted inputs: each input is multiplied by its weight,
and the products are added together.
Activation Function (Threshold Function):
● It has a threshold function as an activation function.
● So, the output signal y_out is
○ 1 if the input y_sum >= threshold value,
○ else 0
41. ● First scenario: It is not raining, nor is it sunny
● Second scenario: It is not raining, but it is sunny
● Third scenario: It is raining, and it is not sunny
● Fourth scenario: It is raining and it is also sunny
To analyse the situations using the McCulloch-Pitts neural model, I can consider
the input signals as follows:
● X1: Is it raining?
● X2 : Is it sunny?
Simple McCulloch-Pitts neurons can be used to design logical operations.
For that purpose, the connection weights need to be correctly decided along
with the threshold function.
42. ● So, the value of both inputs can be either 0 or 1.
● We can set the weights of both X1 and X2 to 1 and the threshold
to 1.
● So, the neural network model will look like:
43. Truth Table for this case will be:
Situation | x1 | x2 | y_sum | y_out
1         | 0  | 0  | 0     | 0
2         | 0  | 1  | 1     | 1
3         | 1  | 0  | 1     | 1
4         | 1  | 1  | 2     | 1
From the truth table, I can conclude
that in the situations where the
value of yout is 1, John needs to
carry an umbrella.
Hence, he will need to carry an
umbrella in scenarios 2, 3 and 4.
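The truth table above can be reproduced with a few lines of Python using the McCulloch-Pitts rule from the earlier slides (weights of 1 for both inputs and a threshold of 1, as chosen on slide 42):

```python
def mcculloch_pitts(inputs, weights, theta):
    """McCulloch-Pitts neuron: fire (output 1) if the weighted sum reaches the threshold."""
    y_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if y_sum >= theta else 0

# x1 = "is it raining?", x2 = "is it sunny?"
for x1 in (0, 1):
    for x2 in (0, 1):
        y_out = mcculloch_pitts((x1, x2), weights=(1, 1), theta=1)
        print(x1, x2, "->", y_out)   # y_out = 1 means John needs to carry an umbrella
```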
44. Numerical:
The following inputs and weights are fed to a McCulloch-Pitts neuron model.
Assuming that the threshold θ is 0.3, does the neuron fire?
Inputs (X) | Weights (W)
1          | 1
0          | -0.5
0.5        | -1
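A quick check of this numerical with the same rule (the arithmetic follows directly from the table: y_sum = 1·1 + 0·(−0.5) + 0.5·(−1) = 0.5):

```python
inputs = (1, 0, 0.5)
weights = (1, -0.5, -1)
theta = 0.3

y_sum = sum(x * w for x, w in zip(inputs, weights))   # 1*1 + 0*(-0.5) + 0.5*(-1) = 0.5
y_out = 1 if y_sum >= theta else 0
print(y_sum, y_out)   # 0.5 1 -> the neuron fires, because 0.5 >= 0.3
```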
46. Perceptron with Bias
● The perceptron is a single processing unit of a neural network. First
proposed by Frank Rosenblatt in 1958, it is a simple neuron used to
classify its input into one of two categories.
● Perceptron is a linear classifier, and is used in supervised learning. It helps
to organize the given input data.
● Perceptron is one of the simplest Artificial neural network
architectures.
● It is the simplest type of feedforward neural network, consisting of a
single layer of input nodes that are fully connected to a layer of output
nodes.
● It can learn the linearly separable patterns.
● It uses a slightly different type of artificial neuron known as a threshold
logic unit (TLU).
49. Basic Components of Perceptron
● Input Features: The perceptron takes multiple input features,
each input feature represents a characteristic or attribute of the
input data.
● Weights: Each input feature is associated with a weight,
determining the significance of each input feature in influencing
the perceptron’s output. During training, these weights are
adjusted to learn the optimal values.
● Summation Function: The perceptron calculates the weighted
sum of its inputs using the summation function. The summation
function combines the inputs with their respective weights to
produce a weighted sum.
50. Basic Components of Perceptron
Activation Function:
The weighted sum is then passed through an activation function.
The perceptron uses the Heaviside step function, which takes the
summed value as input, compares it with the threshold, and
provides the output as 0 or 1.
Output:
The final output of the perceptron is determined by the activation
function’s result. For example, in binary classification problems,
the output might represent a predicted class (0 or 1).
51. Basic Components of Perceptron
Bias:
A bias term is often included in the perceptron model. The bias
allows the model to make adjustments that are independent of the
input. It is an additional parameter that is learned during training.
Learning Algorithm (Weight Update Rule):
During training, the perceptron learns by adjusting its weights and
bias based on a learning algorithm. A common approach is the
perceptron learning algorithm, which updates weights based on
the difference between the predicted output and the true output.
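A minimal sketch of this update rule (the classic perceptron rule, w ← w + α·(t − y)·x and b ← b + α·(t − y); the toy data, learning rate, and epoch count below are illustrative assumptions):

```python
import numpy as np

def train_perceptron(X, t, alpha=0.1, epochs=10):
    """Adjust weights and bias from the difference between predicted and true output."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x_i, t_i in zip(X, t):
            y_i = 1 if np.dot(w, x_i) + b >= 0 else 0    # Heaviside step activation
            w += alpha * (t_i - y_i) * x_i               # weight update rule
            b += alpha * (t_i - y_i)                     # bias update rule
    return w, b

# Toy linearly separable data: learn the OR function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 1, 1, 1])
print(train_perceptron(X, t))
```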
52. These components work together to enable a perceptron to learn
and make predictions. While a single perceptron can perform
binary classification, more complex tasks require the use of
multiple perceptrons organized into layers, forming a neural
network.
53. Types of Perceptron
Single-Layer Perceptron: This type of perceptron is limited to
learning linearly separable patterns. It is effective for tasks where the
data can be divided into distinct categories by a straight line.
Multilayer Perceptron: Multilayer perceptrons possess enhanced
processing capabilities as they consist of two or more layers, adept
at handling more complex patterns and relationships within the data.
54. Single-Layer Perceptron:
● It is one of the oldest and first introduced neural networks.
● It was proposed by Frank Rosenblatt in 1958.
● The perceptron is also known as an artificial neuron.
● The perceptron is mainly used to compute logic gates such as AND,
OR, and NOR, which have binary inputs and binary outputs.
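For example, an AND gate can be realised with a single perceptron by choosing suitable fixed weights and bias (the values below are one common illustrative choice, not the only one):

```python
def perceptron(x1, x2, w1=1.0, w2=1.0, b=-1.5):
    """Single perceptron with a step activation; these weights and bias implement AND."""
    return 1 if (w1 * x1 + w2 * x2 + b) >= 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", perceptron(x1, x2))   # output is 1 only when both inputs are 1
```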
55. Multi-Layer Perceptron:
● The multi-layer perceptron is also known as an MLP.
● It consists of fully connected dense layers, which transform any input dimension to
the desired dimension.
● A multi-layer perceptron is a neural network that has multiple layers.
● To create a neural network we combine neurons together so that the
outputs of some neurons are inputs of other neurons.
● The multi-layer perceptron defines the most complex architecture of artificial
neural networks.
● It is essentially formed from multiple layers of perceptrons.
● TensorFlow is a very popular deep learning framework, released by Google, that
can be used to build such a neural network.
56. Multi-Layer Perceptron:
● MLP networks are used in a supervised learning setting. The typical learning
algorithm for MLP networks is the backpropagation algorithm.
● A multilayer perceptron (MLP) is a feed forward artificial neural network
that generates a set of outputs from a set of inputs.
● An MLP is characterized by several layers of nodes connected as a
directed graph between the input and output layers.
● MLP uses backpropagation for training the network. MLP is a deep
learning method.
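Since the slides mention TensorFlow, here is a minimal sketch of an MLP built with its Keras API (the layer sizes, activations, toy dataset, and training settings are illustrative assumptions, not taken from the slides):

```python
import numpy as np
import tensorflow as tf

# Toy binary-classification data (illustrative only)
X = np.random.rand(200, 4).astype("float32")
y = (X.sum(axis=1) > 2.0).astype("float32")

# MLP: input -> two fully connected hidden layers -> single sigmoid output
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Training uses backpropagation under the hood
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[:3], verbose=0))
```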
58. Limitations of Perceptron
● The perceptron was an important development in the history of neural
networks, as it demonstrated that simple neural networks could learn to
classify patterns. However, its capabilities are limited:
● The perceptron model has some limitations that can make it unsuitable for
certain types of problems:
● Limited to linearly separable problems.
● Convergence issues with non-separable data.
● Requires labeled data.
● Sensitivity to input scaling.
59. Activation Functions
● The activation function refers to the set of transfer functions used to
achieve the desired output.
● If the weighted sum is equal to zero, a bias is added to make the output
non-zero, or to otherwise scale up the system's response.
● The bias acts like an additional input whose weight is equal to 1.
● Here the total of the weighted inputs can lie anywhere in the range from 0 to
positive infinity.
● To keep the response within the limits of the desired value, a certain
maximum value is set as a benchmark, and the total of the weighted inputs is
passed through the activation function.
61. Types of Activation Functions
● Linear or Identity Activation Function
● Non-linear Activation Function
○ Sigmoid or Logistic Activation Function
○ Tanh or hyperbolic tangent Activation Function
○ ReLU (Rectified Linear Unit) Activation Function
○ Leaky ReLU
62. Linear or Identity Activation Function
● As you can see the function is a line or linear. Therefore, the output of the
functions will not be confined between any range.
Equation : f(x) = x
Range : (-infinity to infinity)
It does not help with the
complexity or the varied
parameters of the usual data that
is fed to neural networks.
63. Sigmoid or Logistic Activation Function
● The sigmoid function curve looks like an S-shape.
The main reason we use the
sigmoid function is that its
output lies between 0 and 1.
Therefore, it is especially used for
models where we have to predict
the probability as an output.
Since probability of anything exists
only between the range of 0 and
1, sigmoid is the right choice.
64. ● The function is differentiable. That means we can find the slope of the
sigmoid curve at any point.
● The function is monotonic, but the function's derivative is not.
● The logistic sigmoid function can cause a neural network to get stuck
during training (the vanishing-gradient problem).
● The softmax function is a more generalized logistic activation function
which is used for multiclass classification.
65. Tanh or hyperbolic tangent Activation Function
tanh is also like logistic
sigmoid but better.
The range of the tanh
function is from (-1 to 1).
tanh is also sigmoidal (s -
shaped).
The advantage is that the
negative inputs will be mapped
strongly negative and the zero
inputs will be mapped near
zero in the tanh graph.
66. ReLU (Rectified Linear Unit) Activation Function
The ReLU function is the
rectified linear unit. It is the
most widely used activation
function. It is defined as
f(z) = max(0, z).
The main advantage of using the
ReLU function over other
activation functions is that it does
not activate all the neurons at the
same time.
67. ● As you can see, the ReLU is half rectified (from bottom). f(z) is zero
when z is less than zero and f(z) is equal to z when z is above or equal to
zero.
● Range: [ 0 to infinity)
● The function and its derivative both are monotonic.
● But the issue is that all the negative values become zero immediately
which decreases the ability of the model to fit or train from the data
properly.
● That means any negative input given to the ReLU activation function
turns the value into zero immediately in the graph, which in turn affects
the resulting graph by not mapping the negative values appropriately.
68. Leaky ReLU
The Leaky ReLU function is an improved version of the ReLU
function. Instead of defining the ReLU function as 0 for x less than 0, we
define it as a small linear component of x. It can be defined as
f(x) = x for x >= 0 and f(x) = a·x for x < 0, where a is a small constant (commonly 0.01).
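The activation functions listed above can be written in a few lines of NumPy (a minimal sketch; the leaky-ReLU slope of 0.01 is a commonly used default, stated here as an assumption):

```python
import numpy as np

def identity(x):            # linear / identity: f(x) = x
    return x

def sigmoid(x):             # logistic: output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):                # hyperbolic tangent: output in (-1, 1)
    return np.tanh(x)

def relu(x):                # rectified linear unit: max(0, x)
    return np.maximum(0.0, x)

def leaky_relu(x, a=0.01):  # small linear slope for negative inputs
    return np.where(x >= 0, x, a * x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (identity, sigmoid, tanh, relu, leaky_relu):
    print(f.__name__, f(z))
```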
70. Error Backpropagation Algorithm
● Backpropagation is an algorithm that backpropagates the errors
from the output nodes to the input nodes.
● Therefore, it is simply referred to as the backward propagation of
errors.
● It is used in many applications of neural networks in data mining,
such as character recognition and signature verification.
● Backpropagation is a widely used algorithm for training
feedforward neural networks.
71. The backpropagation algorithm works by
● computing the gradient of the loss function with respect to each
weight via the chain rule,
● computing the gradient layer by layer,
● and iterating backward from the last layer to avoid redundant
computation of intermediate terms in the chain rule.
73. Working of Backpropagation:
● Neural networks use supervised learning to generate output
vectors from input vectors that the network operates on.
● It compares the generated output to the desired output and
generates an error report if the result does not match the
desired output vector.
● Then it adjusts the weights according to the error report to get
the desired output.
75. ● x = input training vector x = (x1, x2, …, xn).
● t = target vector t = (t1, t2, …, tn).
● δk = error at output unit.
● δj = error at hidden layer.
● α = learning rate.
● V0j = bias of hidden unit j.
76. Features of Backpropagation:
● It is a gradient descent method, as used in the case of the simple
perceptron network, but with differentiable units.
● It is different from other networks in respect to the process by
which the weights are calculated during the learning period of the
network.
● Training is done in the three stages :
○ the feed-forward of input training pattern
○ the calculation and backpropagation of the error
○ updating of the weights
77. Backpropagation Algorithm:
Step 1: Inputs X arrive through the preconnected path.
Step 2: The input is modeled using true weights W. Weights are usually
chosen randomly.
Step 3: Calculate the output of each neuron from the input layer to the hidden
layer to the output layer.
Step 4: Calculate the error in the outputs
Backpropagation Error = Actual Output – Desired Output
Step 5: From the output layer, go back to the hidden layer to adjust the
weights to reduce the error.
Step 6: Repeat the process until the desired output is achieved.
78. Training Algorithm :
Step 1: Initialize weights to small random values.
Step 2: While the stopping condition is false, do steps 3 to 10.
Step 3: For each training pair, do steps 4 to 9 (feed-forward).
Step 4: Each input unit receives the input signal xi and transmits this
signal to all the units of the hidden layer.
79. Step 5: Each hidden unit Zj (j=1 to a) sums its weighted input signals to
calculate its net input, z_inj = v0j + Σi xi·vij, and applies its activation function to
compute its output signal, zj = f(z_inj).
80. Backpropagation Error:
Step 6: Each output unit yk (k=1 to n) receives a target pattern
corresponding to an input pattern, and the error term is calculated as:
δk = (tk − yk) · f′(y_ink)
Step 7: Each hidden unit Zj (j=1 to a) sums its delta inputs from the units in the
layer above, δ_inj = Σk δk · wjk, and multiplies by the derivative of its activation
function to obtain its error term, δj = δ_inj · f′(z_inj).
81. Updation of weight and bias:
Step 8: Each output unit yk (k=1 to m) updates its bias and weights (j=1 to a). The
weight correction term is given by Δwjk = α · δk · zj and the bias correction term by
Δw0k = α · δk, so that wjk(new) = wjk(old) + Δwjk and w0k(new) = w0k(old) + Δw0k.
Similarly, each hidden unit Zj updates its bias and weights with Δvij = α · δj · xi and
Δv0j = α · δj.
82. Step 9: Test the stopping condition. The stopping condition can be the
minimization of the error or reaching a set number of epochs.
Step 10: Stop
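A compact sketch of the whole training algorithm in NumPy for one hidden layer with sigmoid activations (the XOR data, layer sizes, learning rate, and epoch count are illustrative assumptions; the update rules follow the steps above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Step 1: initialise weights and biases to small random values (2 inputs, 3 hidden, 1 output)
V, v0 = rng.normal(0, 0.5, (2, 3)), np.zeros(3)   # input  -> hidden
W, w0 = rng.normal(0, 0.5, (3, 1)), np.zeros(1)   # hidden -> output

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # training inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # targets (XOR, illustrative)
alpha = 0.5                                                   # learning rate

for epoch in range(5000):                         # Step 2: repeat until stopping condition
    for x, t in zip(X, T):                        # Step 3: for each training pair
        # Steps 4-5: feed-forward
        z = sigmoid(v0 + x @ V)                   # hidden layer output
        y = sigmoid(w0 + z @ W)                   # output layer output
        # Step 6: output error term  δk = (tk - yk) · f'(y_ink)
        delta_k = (t - y) * y * (1 - y)
        # Step 7: hidden error term  δj = (Σk δk·wjk) · f'(z_inj)
        delta_j = (delta_k @ W.T) * z * (1 - z)
        # Step 8: weight and bias updates  Δw = α·δk·zj,  Δv = α·δj·xi
        W += alpha * np.outer(z, delta_k);  w0 += alpha * delta_k
        V += alpha * np.outer(x, delta_j);  v0 += alpha * delta_j

# Steps 9-10: after training, inspect the network's outputs
print(np.round(sigmoid(w0 + sigmoid(v0 + X @ V) @ W), 2))
```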