Machine learningiwijshdbebhehehshshsj.pdf

Mattu University
Engineering and Technology College
Department of Computer Science
Target Dept.:- Computer Science 4th year Evening
By: Tekalign B. (takkee525@gmail.com)
Elective II
(Introduction to Machine Learning)
ML

Lecture Four
Neural Networks

Overview
 Neural networks are a subset of machine
learning that are modeled loosely after the
human brain and are used to recognize
patterns and solve a variety of complex
problems in areas such as image
recognition, speech recognition, and
natural language processing.
What is a Neural Network?
At its core, a neural network is a collection
of neurons, or nodes, linked together in a
fashion that mimics the human brain. Each
neuron receives inputs, processes those
inputs, and produces outputs based on those
inputs. Neural networks are typically
organized into layers: an input layer, one or
more hidden layers, and an output layer.

What are Neural Networks?
 Neural networks mimic the basic functioning of the human brain and
are inspired by how the human brain interprets information. They are used
for various real-time tasks because they can perform computations
quickly and respond fast.
A neural network can be viewed as a function
$\mathrm{NN} : X \to Y$,
where $X$ can be a continuous space $[0, 1]^n$ or a discrete space
$\{0, 1\}^n$, and $Y = [0, 1]$ or $Y = \{0, 1\}$, correspondingly. In
this way, it can be thought of as a classifier,
but it can also be used to
approximate other real-valued
functions.

Importance of Neural Networks
 We use artificial neural networks because they learn
efficiently and adaptively. They have the capability to learn “how”
to solve a specific problem from the training data they receive. After
learning, they can be used to solve that specific problem quickly
and with high accuracy.
 Some real-life applications of neural networks include Air Traffic
Control, Optical Character Recognition as used by some scanning
apps like Google Lens, Voice Recognition, etc.
How Do Neural Networks Learn?
 Neural networks learn through a process called training. During
training, the network is fed data that is already labeled with correct
answers. It makes predictions based on its current weights
and biases, compares these predictions to the correct answers, and
then adjusts its weights and biases to improve its predictions. This
process is repeated many times, and the network improves
gradually.

Training Process
1. Forward Propagation: The data is passed through the network
to generate an output.
2. Loss Function: This calculates the difference between the
network's prediction and the actual outcome.
3. Backpropagation: This process adjusts the weights and biases
of the network to minimize the loss.
4. Optimization Algorithms: These are methods like gradient
descent that guide the adjustment of weights to reduce the loss
most efficiently.
Types of Learning in Neural Networks
 Supervised Learning
 Unsupervised Learning
 Reinforcement Learning

What are Neural Networks Used For?
Neural networks are employed across various domains for:
 Identifying objects, faces, and understanding spoken language in
applications like self-driving cars and voice assistants.
 Analyzing and understanding human language, enabling sentiment
analysis, chatbots, language translation, and text generation.
 Diagnosing diseases from medical images, predicting patient outcomes,
and drug discovery.
 Predicting stock prices, credit risk assessment, fraud detection, and
algorithmic trading.
 Personalizing content and recommendations in e-commerce, streaming
platforms, and social media.

What are Neural Networks Used For…
 Powering robotics and autonomous vehicles by processing
sensor data and making real-time decisions.
 Enhancing game AI, generating realistic graphics, and creating
immersive virtual environments.
 Monitoring and optimizing manufacturing processes,
predictive maintenance, and quality control.
 Analyzing complex datasets, simulating scientific phenomena,
and aiding in research across disciplines.
 Generating music, art, and other creative content.

Example Applications of NN
1. Image Recognition: Neural networks can identify objects within
images with great accuracy. For example, a neural network trained on
thousands of images of dogs and cats can learn to distinguish between
the two animals.
2. Speech Recognition: Neural networks are used in voice-activated
systems, like digital assistants, to interpret human speech.
3. Natural Language Processing: Models like GPT (which powers chatbots
such as ChatGPT) use neural networks to understand and generate human-like
text based on the input they receive.

How Does a Neural Network Work?
 Neural networks are inspired by the
structure and function of the human brain.
They consist of interconnected nodes called
artificial neurons, arranged in layers: an
input layer, one or more hidden layers, and
an output layer. The basic process has two
stages, called forward propagation and
backpropagation.
Layers
Input Layer: Receives raw data, like pixels
from an image.
Hidden Layers: Perform the bulk of the
processing; often multiple layers are stacked.
Output Layer: Produces the final result, like a
classification (cat or dog) or a prediction
(temperature tomorrow).
Connections
 Each neuron connects to others in the next
layer with associated weights.
 Weights determine the influence of one
neuron on another.

Forward Propagation in Neural Network
 Input Layer: Each feature of the
input data is represented by a node
in the input layer, which receives
that feature's value.
 Weights and Connections: The
weight of each neuronal connection
indicates how strong the connection
is. Throughout training, these
weights are adjusted.
 Hidden Layers: Each hidden layer
neuron processes inputs by
multiplying them by weights, adding
them up, and then passing the sum
through an activation function. By
doing this, non-linearity is
introduced, enabling the network to
recognize intricate patterns.
 Output: The final result is produced
by repeating the process layer by layer until the
output layer is reached.
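The forward propagation steps on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the lecture's own code: the function names, the choice of sigmoid as the activation, and the weights of the small 2-2-1 network are all assumptions made for the example.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense_forward(inputs, weights, biases, activation):
    """Compute one layer: each neuron takes a weighted sum of the inputs,
    adds its bias, and passes the result through the activation function."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        weighted_sum = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(weighted_sum))
    return outputs

def forward(inputs, layers):
    """Pass the inputs through every layer in turn and return the final output."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_forward(activations, weights, biases, sigmoid)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]),  # hidden layer weights and biases
    ([[1.0, -1.0]], [0.0]),                     # output layer weights and biases
]
prediction = forward([1.0, 1.0], layers)
```

Each entry of `layers` holds one row of weights per neuron, so adding a hidden layer only means appending another `(weights, biases)` pair.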

Backpropagation
 Loss Calculation: The network’s output is
compared with the true target values, and a loss
function is used to compute the difference. For a
regression problem, the Mean Squared Error
(MSE) is commonly used as the cost function.
 Loss Function: $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$,
where $y_i$ is the target value and $\hat{y}_i$ is the network's prediction for sample $i$.
 Gradient Descent: Gradient descent is then used
by the network to reduce the loss. To lower the
error, weights are changed based on the
derivative of the loss with respect to each weight.
 Adjusting weights: The weights at each connection
are adjusted by applying this process backward
across the network, layer by layer; this is backpropagation.
 Training: During training with different data
samples, the entire process of forward
propagation, loss calculation, and backpropagation
is done iteratively, enabling the network to adapt
and learn patterns from the data.
 Activation Functions: Non-linearity is
introduced into the model by activation functions like the
rectified linear unit (ReLU) or sigmoid. They
determine whether, and how strongly, a neuron “fires” based on
its total weighted input.
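To make the loop of forward propagation, loss calculation, and weight adjustment concrete, here is a minimal sketch that trains a single linear neuron with MSE and gradient descent. The data (points on the line y = 2x + 1), the learning rate, the number of epochs, and all function names are assumptions chosen for illustration; a real network repeats the same idea across many weights and layers.

```python
def mse(targets, predictions):
    """Mean Squared Error: the average of the squared differences."""
    return sum((y - p) ** 2 for y, p in zip(targets, predictions)) / len(targets)

def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=500):
    """Fit prediction = w * x + b by gradient descent on the MSE."""
    w, b = 0.0, 0.0
    history = []
    n = len(xs)
    for _ in range(epochs):
        predictions = [w * x + b for x in xs]   # forward propagation
        history.append(mse(ys, predictions))    # loss calculation
        # Derivatives of the MSE with respect to w and b.
        grad_w = sum(-2 * (y - p) * x for x, y, p in zip(xs, ys, predictions)) / n
        grad_b = sum(-2 * (y - p) for y, p in zip(ys, predictions)) / n
        # Move each parameter in the direction opposite to its gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history

# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b, history = train_linear_neuron(xs, ys)
```

After training, `w` and `b` should be close to 2 and 1, and `history` shows the loss shrinking from one epoch to the next.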

NN-Working Explained
 An artificial neuron can be thought of as a simple or multiple linear regression
model with an activation function at the end. A neuron in layer i takes the
outputs of all the neurons in layer i-1 as inputs, calculates their weighted sum,
and adds a bias to it.
 The first neuron in the first hidden layer is connected to all the inputs from the
previous layer. Similarly, the second neuron in the first hidden layer is also
connected to all the inputs from the previous layer, and so on for all the
neurons in the first hidden layer.
 For neurons in the second hidden layer, the outputs of the first hidden layer
are the inputs, and each of these neurons is likewise connected to all neurons
in the previous layer. This whole process is called forward propagation.

NN-Working Explanation Contd..
• Once the output has been predicted, it is compared to the actual output. We then
calculate the loss and try to minimize it. But how can we minimize this loss? This is
where backpropagation comes in. First, the loss is calculated; then the
weights and biases are adjusted in such a way that the loss is reduced.
Weights and biases are updated with the help of another algorithm called gradient
descent: we move in the direction opposite to the gradient of the loss. This update
rule can be motivated by a first-order Taylor series expansion of the loss.
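The link between the Taylor series and the gradient descent update can be written out as a short derivation. The symbols here are assumptions introduced for illustration: $L$ is the loss, $w$ the weights, $b$ a bias, and $\eta > 0$ a small learning rate.

```latex
% First-order Taylor expansion of the loss L around the current weights w
L(w + \Delta w) \approx L(w) + \nabla L(w)^{\top} \Delta w

% Choosing the step opposite to the gradient, \Delta w = -\eta \nabla L(w), gives
L\bigl(w - \eta \nabla L(w)\bigr) \approx L(w) - \eta \,\lVert \nabla L(w) \rVert^{2} \le L(w)

% which yields the gradient descent update for weights and biases
w \leftarrow w - \eta \, \frac{\partial L}{\partial w}, \qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
```

The approximation only holds for small steps, which is why the learning rate has to be kept small enough.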

Types of Neural Networks
Several types of neural networks are in common use.
• Feedforward Networks (Perceptron): A feedforward neural network is a
simple artificial neural network architecture in which data moves from input to
output in a single direction; feedback loops are absent. Its straightforward
architecture makes it appropriate for a number of applications, such as regression
and pattern recognition. The perceptron is the simplest type of feedforward
network, consisting of a single layer of neurons with no hidden layers. Perceptrons
can only learn linear relationships between inputs and outputs.
The perceptron consists of 4 parts.
 Input values or One input layer
 Weights and Bias
 Net sum
 Activation Function

How does it work?
The perceptron works in these
simple steps:
a. Each input x is multiplied by its weight w.
Let’s call the result k.
b. All the multiplied values are added together, along with the bias;
call the result the Weighted Sum.
c. The weighted sum is passed to the chosen
Activation Function. Activation functions are
used to map the input to the required
range, such as (0, 1) or (-1, 1).
A weight shows the strength of a particular input.
The bias value shifts the weighted sum, and with it the point
at which the activation function switches on.
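Steps a to c can be sketched directly in Python. This is an illustrative example rather than the lecture's own code: the function names, the step activation, and the hand-picked weights and bias (chosen so that the perceptron behaves like a logical AND gate) are assumptions.

```python
def step(weighted_sum):
    """Step activation: output 1 when the weighted sum is non-negative, else 0."""
    return 1 if weighted_sum >= 0 else 0

def perceptron_output(inputs, weights, bias):
    products = [x * w for x, w in zip(inputs, weights)]  # step a: multiply inputs by weights
    weighted_sum = sum(products) + bias                  # step b: add everything up, plus the bias
    return step(weighted_sum)                            # step c: apply the activation function

# Hypothetical weights and bias that make the perceptron act as an AND gate.
and_weights, and_bias = [1.0, 1.0], -1.5
results = [perceptron_output([a, b], and_weights, and_bias)
           for a in (0, 1) for b in (0, 1)]
# results == [0, 0, 0, 1]
```

Changing the bias to -0.5 turns the same perceptron into an OR gate, which shows how the bias moves the point at which the neuron switches on.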

Types of Neural Networks Contd..
• Multilayer Perceptron (MLP): An MLP is a type of feedforward neural network
with three or more layers: an input layer, one or more hidden layers,
and an output layer, with every layer fully connected to the next. It uses nonlinear
activation functions, so MLPs can learn non-linear relationships between inputs and
outputs, making them more powerful than perceptrons. MLPs are trained with the
backpropagation algorithm. MLPs with several hidden layers are a basic form of
deep learning model.

MLP Contd..
Input Layer
It is the initial or starting layer of the Multilayer perceptron. It takes input from the training data set and forwards it to the hidden
layer. There are n input nodes in the input layer. The number of input nodes depends on the number of dataset features. Each input
vector variable is distributed to each of the nodes of the hidden layer.
Hidden Layer
It is the heart of all artificial neural networks. This layer performs most of the computation of the neural network. The edges into the
hidden layer carry weights, which are multiplied by the node values. This layer applies the activation function.
There can be one or more hidden layers in the model.
The number of hidden-layer nodes should be chosen carefully: too few nodes leave the model unable to work well with
complex data, while too many nodes can lead to overfitting.
Output Layer
This layer gives the estimated output of the neural network. The number of nodes in the output layer depends on the type of
problem. For a single target variable, use one node. For an N-class classification problem, the network uses N nodes in the output layer.

Working of MultiLayer Perceptron Neural Network
• The input node represents the feature of the dataset.
• Each input node passes the vector input value to the hidden layer.
• In the hidden layer, each edge has a weight that is multiplied by the input value. At each
hidden node, these weighted values are summed together to generate the node's output.
• The activation function is applied in the hidden layer to determine how strongly each node is activated.
• The output is passed to the output layer.
• Calculate the difference between predicted and actual output at the output layer.
• The model then uses backpropagation to adjust the weights based on this difference.
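The working steps above can be put together in a small sketch of an MLP with one hidden layer, trained by backpropagation on the XOR problem (a task a single perceptron cannot learn). Everything specific here is an assumption for illustration: the sigmoid activation, the squared-error loss, three hidden nodes, the learning rate, the random seed, and the function names.

```python
import math
import random

def sigmoid(z):
    """Numerically stable sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_mlp(samples, hidden_size=3, learning_rate=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer MLP with per-sample gradient descent.
    Returns the mean squared error recorded for each epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden_size)]
    b_hidden = [0.0] * hidden_size
    w_out = [rng.uniform(-1, 1) for _ in range(hidden_size)]
    b_out = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:
            # Forward propagation: input -> hidden layer -> output node.
            hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + bias)
                      for ws, bias in zip(w_hidden, b_hidden)]
            output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
            # Difference between predicted and actual output.
            error = output - target
            total += error ** 2
            # Backpropagation: gradients of the squared error (chain rule).
            delta_out = 2 * error * output * (1 - output)
            delta_hidden = [delta_out * w * h * (1 - h) for w, h in zip(w_out, hidden)]
            # Gradient descent updates for every weight and bias.
            for j in range(hidden_size):
                w_out[j] -= learning_rate * delta_out * hidden[j]
                b_hidden[j] -= learning_rate * delta_hidden[j]
                for i in range(n_in):
                    w_hidden[j][i] -= learning_rate * delta_hidden[j] * x[i]
            b_out -= learning_rate * delta_out
        losses.append(total / len(samples))
    return losses

# XOR truth table: the output is 1 only when exactly one input is 1.
xor_samples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
losses = train_mlp(xor_samples)
```

Comparing `losses[0]` with `losses[-1]` shows how much the error dropped during training; how low it gets depends on the assumed seed and hyperparameters.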

• Convolutional Neural Network (CNN): is a specialized artificial neural
network designed for image processing. It employs convolutional layers to
automatically learn hierarchical features from input images, enabling effective
image recognition and classification. CNNs have revolutionized computer vision
and are pivotal in tasks like object detection and image analysis.

Working of Convolutional Neural Network(CNN)
 Convolutional Neural Networks (CNNs) are a type of deep learning neural network
architecture that is particularly well suited to image classification and object recognition
tasks. A CNN works by transforming an input image into a feature map, which is then
processed through multiple convolutional and pooling layers to produce a predicted
output.
 A convolutional neural network starts by taking an input image, which is then transformed
into a feature map through a series of convolutional and pooling layers. The convolutional
layer applies a set of filters to the input image, each filter producing a feature map that
highlights a specific aspect of the input image. The pooling layer then down samples the
feature map to reduce its size, while retaining the most important information.
 The feature map produced by the convolutional layer is then passed through multiple
additional convolutional and pooling layers, each layer learning increasingly complex
features of the input image. The final output of the network is a predicted class label or
probability score for each class, depending on the task.
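The convolution and pooling operations described above can be illustrated on a tiny image using plain Python lists. This is a sketch under stated assumptions: the 5x5 image, the 2x2 vertical-edge filter, the use of ReLU between the two steps, and the function names are all invented for the example, and the "convolution" is the sliding-window product used in deep learning (no kernel flip).

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    sum the element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(feature_map):
    """Replace negative values with zero."""
    return [[max(0, value) for value in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Down-sample by keeping the largest value in each size x size block."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]

# Hypothetical 5x5 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]  # responds where brightness increases left to right

feature_map = relu(convolve2d(image, vertical_edge))  # 4x4, each row is [0, 2, 0, 0]
pooled = max_pool(feature_map)                        # 2x2: [[2, 0], [2, 0]]
```

The feature map lights up only at the column where the edge sits, and pooling halves its size while keeping that response.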

Layers of Convolutional Neural Network
The layers of a Convolutional Neural Network (CNN) can be broadly classified into the following
categories:
1. Convolutional Layer: The convolutional layer is responsible for extracting features from the input
image. It performs a convolution operation on the input image, where a filter or kernel is applied to
the image to identify and extract specific features.
2. Pooling Layer: The pooling layer is responsible for reducing the spatial dimensions of the feature
maps produced by the convolutional layer. It performs a down-sampling operation to reduce the size
of the feature maps and reduce computational complexity.
3. Activation Layer: The activation layer applies a non-linear activation function, such as the ReLU
function, typically to the output of the convolutional layer. This function helps to introduce non-linearity into the
model, allowing it to learn more complex representations of the input data.

Layers of CNN Contd..
4.Fully Connected Layer: The fully connected layer is a traditional neural
network layer that connects all the neurons in the previous layer to all the neurons
in the next layer. This layer is responsible for combining the features learned by the
convolutional and pooling layers to make a prediction.
5.Normalization Layer: The normalization layer performs normalization
operations, such as batch normalization or layer normalization, to keep the
activations of each layer well-conditioned. This stabilizes training and can also
have a mild regularizing effect.
6. Dropout Layer: The dropout layer is used to prevent overfitting by randomly
dropping out neurons during training. This helps to ensure that the model does not
memorize the training data but instead generalizes to new, unseen data.

Layers of CNN Contd..
7. Dense Layer: "Dense layer" is another name for a fully connected layer. After the
convolutional and pooling layers have extracted features
from the input image, the dense layer can then be used to combine those features
and make a final prediction. In a CNN, the dense layer is usually the final layer and
plays a key role in producing the output predictions. The activations from the
previous layers are flattened and passed as inputs to the dense layer, which
performs a weighted sum of the inputs and applies an activation function to
produce the final output.
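The last stages of a CNN (flattening, dropout, and a dense output layer) can be sketched as follows. Several choices here are assumptions not stated in the lecture: softmax is used as the output activation to turn scores into class probabilities, dropout is implemented in its "inverted" form (surviving values are scaled up during training), and the function names, weights, and input feature map are invented for the example.

```python
import math
import random

def flatten(feature_maps):
    """Turn a list of 2-D feature maps into one flat list of values."""
    return [value for fmap in feature_maps for row in fmap for value in row]

def dropout(activations, drop_probability, rng, training=True):
    """Inverted dropout: during training, zero each value with the given
    probability and scale the survivors; at inference, pass values through."""
    if not training or drop_probability == 0.0:
        return list(activations)
    keep = 1.0 - drop_probability
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense_softmax(inputs, weights, biases):
    """Dense layer: one weighted sum per class, followed by softmax."""
    scores = [sum(w * x for w, x in zip(ws, inputs)) + b
              for ws, b in zip(weights, biases)]
    return softmax(scores)

# Hypothetical pooled feature map and weights for a two-class problem.
features = flatten([[[2, 0], [2, 0]]])
features = dropout(features, 0.5, random.Random(0), training=False)  # inference mode
class_weights = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]
probabilities = dense_softmax(features, class_weights, [0.0, 0.0])
```

Here the first class gets the higher probability because its weights line up with the non-zero features.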

Benefits of Convolutional Neural Network
1.Feature extraction: CNNs are capable of automatically extracting relevant
features from an input image, reducing the need for manual feature engineering.
2.Spatial invariance: CNNs can recognize objects in an image largely regardless of their
location, making them well-suited to object recognition tasks. Robustness to changes in
size or orientation usually has to be learned from varied or augmented training data.
3.Robust to noise: CNNs can often handle noisy or cluttered images, making them
useful for real-world applications where image quality may be variable.
4.Transfer learning: CNNs can leverage pre-trained models, reducing the amount
of data and computational resources required to train a new model.
5.Performance: CNNs have demonstrated state-of-the-art performance on a range
of computer vision tasks, including image classification, object detection, and
semantic segmentation.

Limitations of Convolutional Neural Network
Computational cost: Training a deep CNN can be computationally expensive,
requiring significant amounts of data and computational resources.
Overfitting: Deep CNNs are prone to overfitting, especially when trained on small
datasets, where the model may memorize the training data rather than generalize to
new, unseen data.
Lack of interpretability: CNNs are considered to be a “black box” model, making
it difficult to understand why a particular prediction was made.
Limited to grid-like structures: CNNs are designed for grid-like data such as images and
do not directly handle irregular shapes or non-grid-like data structures.

• Recurrent Neural Network (RNN): A Recurrent Neural Network (RNN) is a type of
artificial neural network intended for sequential data processing, such as text or
time series data. It is appropriate for applications where contextual dependencies
are critical, such as time series prediction and natural language processing, since
it makes use of feedback loops, which enable information to persist within the
network. RNNs can process information from previous steps and use it to influence
future predictions, so they work better than a simple feedforward network when the
data is sequential.

How Do Recurrent Neural Networks Work?
 The input layer ‘x’ takes in the input to the neural network,
processes it, and passes it on to the middle layer.
 The middle layer ‘h’ can consist of multiple hidden layers,
each with its own activation functions, weights, and
biases. In such a network the parameters of the different hidden
layers are independent of one another and nothing is carried
over from earlier inputs, i.e. the network does not have
memory. A recurrent neural network addresses this.
 The Recurrent Neural Network standardizes the different
activation functions, weights, and biases so that each
hidden layer has the same parameters. Then, instead of
creating multiple hidden layers, it creates one and
loops over it as many times as required.
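The idea of one hidden layer that is looped over with the same parameters can be sketched with a deliberately tiny RNN whose hidden state is a single number. The tanh activation, the scalar weights, and the function name are assumptions made to keep the example short; real RNNs use vectors and weight matrices.

```python
import math

def rnn_forward(sequence, w_input, w_hidden, bias, initial_state=0.0):
    """Run a one-unit recurrent layer over a sequence.
    The same three parameters are reused at every time step."""
    hidden = initial_state
    states = []
    for x in sequence:
        # The new state depends on the current input AND the previous state.
        hidden = math.tanh(w_input * x + w_hidden * hidden + bias)
        states.append(hidden)
    return states

# Hypothetical sequence: a single pulse followed by silence.
states = rnn_forward([1.0, 0.0, 0.0], w_input=1.0, w_hidden=0.5, bias=0.0)

# Without the recurrent connection the pulse is forgotten immediately.
no_memory = rnn_forward([1.0, 0.0, 0.0], w_input=1.0, w_hidden=0.0, bias=0.0)
```

In `states` the later values stay above zero even though the later inputs are zero, which is the "memory" of the network; in `no_memory` they fall straight back to zero.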

Why Recurrent Neural Networks?
RNNs were created because of a few issues with feed-forward neural
networks:
• Cannot handle sequential data
• Considers only the current input
• Cannot memorize previous inputs
The solution to these issues is the RNN. An RNN can handle sequential data,
taking into account both the current input and previously received inputs. RNNs can
memorize previous inputs due to their internal memory.

Advantages of Recurrent Neural Network
Recurrent Neural Networks (RNNs) have several advantages over other types of
neural networks, including:
• Ability To Handle Variable-Length Sequences
RNNs are designed to handle input sequences of variable length, which makes
them well-suited for tasks such as speech recognition, natural language processing,
and time series analysis.
• Memory Of Past Inputs
RNNs have a memory of past inputs, which allows them to capture information
about the context of the input sequence. This makes them useful for tasks such as
language modeling, where the meaning of a word depends on the context in which
it appears.
• Parameter Sharing
RNNs share the same set of parameters across all time steps, which reduces the
number of parameters that need to be learned and can lead to better generalization.
• Non-Linear Mapping
RNNs use non-linear activation functions, which allows them to learn complex,
non-linear mappings between inputs and outputs.

Advantages of Recurrent Neural Network…
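• Sequential Processing
RNNs process input sequences one step at a time, so the amount of computation grows
only linearly with sequence length. The trade-off is that each step depends on the
previous one, which makes RNNs hard to parallelize across time steps.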
• Flexibility
RNNs can be adapted to a wide range of tasks and input types, including text,
speech, and image sequences.
• Improved Accuracy
RNNs have been shown to achieve strong performance, and were for years the state of
the art, on a variety of sequence modeling tasks, including language modeling,
speech recognition, and machine translation.
These advantages make RNNs a powerful tool for sequence modeling and analysis,
and have led to their widespread use in a variety of applications, including natural
language processing, speech recognition, and time series analysis.

• Long Short-Term Memory (LSTM): LSTM is a special type of RNN that is designed
to overcome the vanishing gradient problem in training RNNs, which makes it better
at learning long-term dependencies in data. It uses memory cells and gates to
selectively read, write, and erase information. LSTMs are widely used for tasks
such as machine translation and speech recognition.

How LSTMs work
Core Concept: Memory and Gates
• LSTMs introduce a concept called a memory cell. This cell acts like a conveyor
belt that can store information for long periods. Crucial information from past
inputs can persist in the cell, while irrelevant details can be selectively forgotten.
To control the flow of information through the cell, LSTMs use three special gates:
1.Forget Gate: Decides what information to forget from the cell's current state.
2.Input Gate: Determines what new information to store in the cell.
3.Output Gate: Controls what information from the cell state to include in the
output of the LSTM.

Step-by-Step Processing of LSTM
1.Input Layer: Receives the current input in the sequence.
2.Forget Gate: Analyzes the previous hidden state and current input. It outputs a value between
0 and 1 for each element in the cell state, indicating how much to forget (0) or retain (1).
3.Input Gate: Processes the previous hidden state and current input. It generates a candidate
value for the new information to be stored in the cell and another value deciding which
parts of the candidate value to incorporate.
4.Cell State Update: The forget gate's output is multiplied by the previous cell state to
determine what to forget. The input gate's output is multiplied by the candidate value to
decide what new information to add. These are summed to update the cell state.
5.Output Gate: In the standard LSTM, examines the previous hidden state and current input. It outputs a value
between 0 and 1 for each element in the cell state, determining how much of the cell state
to include in the output.
6.Hidden Layer: The output gate's output is multiplied by the updated cell state (after it is squashed
with tanh) to create the output of the LSTM cell, which is then passed to the next LSTM cell in the
sequence or used for prediction.
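The six steps can be condensed into one function for a single LSTM time step. To keep the sketch short, the input, hidden state, and cell state are single numbers and each gate has just three scalar parameters; the function name, the `params` layout, and the extreme bias values in the usage example are assumptions for illustration, following the standard LSTM formulation.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar states.
    params maps a gate name to (weight_for_x, weight_for_h_prev, bias)."""
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    forget = gate("forget", sigmoid)          # step 2: how much of the old cell state to keep
    write = gate("input", sigmoid)            # step 3: how much of the candidate to store
    candidate = gate("candidate", math.tanh)  # step 3: proposed new information
    c = forget * c_prev + write * candidate   # step 4: cell state update
    expose = gate("output", sigmoid)          # step 5: how much of the cell state to output
    h = expose * math.tanh(c)                 # step 6: new hidden state (the cell's output)
    return h, c

# Hypothetical parameters: forget gate fully open, input gate fully closed,
# so the cell simply remembers its previous state.
remember = {
    "forget": (0.0, 0.0, 100.0),
    "input": (0.0, 0.0, -100.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 100.0),
}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, params=remember)
```

With these settings the new cell state stays at about 0.7 whatever the input is; swapping the two large biases would instead make the cell forget its old state and store the candidate.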

Benefits of LSTMs
• Learning Long-Term Dependencies: LSTMs can effectively learn important information
from far back in the sequence, making them suitable for tasks like machine translation,
speech recognition, and time series forecasting.
• Selective Memory: LSTMs can forget irrelevant details while remembering crucial
information, improving their performance on long sequences.
Example of LSTM Working
Bob is a nice person. Dan, on the other hand, is evil.
Let’s take an example to understand how an LSTM works. Here we have two sentences
separated by a full stop. The first sentence is “Bob is a nice person,” and the second sentence
is “Dan, on the other hand, is evil”. It is clear that in the first sentence we are talking
about Bob, and as soon as we encounter the full stop (.), we start talking about Dan.
As we move from the first sentence to the second, our network should realize that
we are no longer talking about Bob. Now our subject is Dan. Here, the forget gate of the
network allows it to forget about Bob, while the input gate lets it store Dan as the
new subject.

Tekalign B. (Department of Computer Science)
End of Chapter Four
