Introduction to machine learning chapter

UNIT I - Introduction to Machine Learning
& Preparing to Model

What is Human Learning?
1. The Sensory System
2. The Central Nervous System
3. The Short-term (Working) Memory Process
4. The Long-Term Memory Process
5. DIY Learning
Types of Human Learning
1. Learning under expert guidance
2. Learning guided by knowledge gained from experts
3. Learning by self

Human Learning
• Learning is referred to as the process of
gaining information through observation.
• And why do we need to learn?
• In our daily life, we need to carry out
multiple activities.
• Simple learning: walking down the street
or doing homework.
• Complex learning: deciding the angle at which
a rocket should be launched so that it
follows a particular route.

Human Learning ...
The Sensory System – quickly gathers all input that enters the human
brain and body – most are discarded immediately.
The Central Nervous System – (the brain and spinal cord) responsible
for coordinating selected sensory inputs into the rest of our body.
It rapidly gathers, organizes, interprets, and makes sense of the inputs, to
prepare our body and mind to adapt and take action based on need or
circumstance.

Human Learning ...
• The Short-Term (Working) Memory Process links prior knowledge or
previous experiences with new information to build understanding.
• It creates mental visualizations and recognizes familiar patterns, which in
turn prepares the brain to establish relationships, organize information,
create categories, and consider new understandings.

Human Learning ...
The Long-Term Memory Process – Throughout life, the human brain
develops critical connections between short-term and long-term
memories,
which in turn expand our ideas, thoughts, interactions, feelings and
visualizations of past, present, future, or imagined events.
As the brain “pulls” retained long-term memories into working memory,
further consideration is given to select thoughts.

Human Learning ...
DIY Learning – occurs when humans apply and transfer new learnings to
other and varied circumstances.
When learning environments and conditions engage multiple
connections to the brain, humans are more likely to attempt to process,
take action on, and apply new learnings, which in turn strengthens long-term
memory and sustained understanding.

Human Learning ...
• Innovative learning – humans explore authentic ways to transfer new ideas and
feelings to other and varied circumstances.
• It provides a rich opportunity for increased recall, transfer, insight, and
applications that reach beyond the limits of the subject matter into
real-world problems.
• Innovative learning requires reasoning, problem solving, and communication.

Types of Human Learning
Human learning happens in one of three ways:
1. Learning under expert guidance
• An expert teaches us the subject directly.
2. Learning guided by knowledge gained from experts
• We build our own notion indirectly, based on what we have learnt from the expert in the
past.
3. Learning by self
• We do it ourselves, maybe after multiple attempts, some being unsuccessful
(self-learning).

1. Learning Under Expert Guidance
• An infant learns most things straight from its parents:
->parts of the body, relations, colours, fruits, birds, animals, etc.
• When the child starts going to school, he learns from teachers:
->alphabets and digits, forming words from the alphabets and numbers
from the digits, forming sentences and paragraphs, complex mathematics,
science, etc.
• Then he learns other school subjects
• Then starts higher studies where the person learns about more complex,
application-oriented skills.
• The child is able to learn all these things from teachers who already have
knowledge of these areas.

1. Learning Under Expert Guidance …
• Then the person starts working as a professional in some field.
• He needs to learn more about the hands-on application of the knowledge that
he has acquired.
• In all phases of life of a human being, there is an element of guided learning.
• So guided learning is the process of gaining information from a person having
sufficient knowledge.

2. Learning Guided by Knowledge Gained from Experts
• Learning from past lessons (learning to make decisions).
• Knowledge taught by a teacher or mentor at some point in time, in some
other form or context, proves helpful in a different situation.
• For example, a baby can group together all objects of same colour, because
at some point of time or other his parents have told him which colour is blue,
which is red, which is green, etc.
• An older child can pick the odd word out of a set of words because of his
ability to label the words as verbs or nouns, taught by his English teacher
long ago.
• In all these situations, there is no direct learning.
• Past information shared in a different context is used
as learning to make decisions.

3. Learning by self
• In many situations, humans are left to learn on their own.
• A classic example is a baby learning to walk through obstacles.
• Another is learning to ride a cycle as a kid.
• Not all things are taught by others.
• A lot of things need to be learnt only from mistakes made in the past.
• We form a checklist of dos and don’ts based on our experiences.

What is Machine Learning?
• How do machines learn?
1. Data input
2. Abstraction
3. Generalization

Machine Learning
• Machine learning is a branch of Artificial Intelligence (AI) and computer
science that focuses on the use of data and algorithms to imitate the way
that humans learn, gradually improving its accuracy.
(Source: IBM)

What is Machine learning?
• Machine learning is a subfield of artificial intelligence, which enables
machines to learn from past data or experiences without being explicitly
programmed.
• Machine learning uses statistical techniques to enable computers to learn and
make decisions.
• It is predicated on the idea that computers can learn from data, spot patterns,
and make judgments with little assistance from humans.

Machine Learning...
• A computer program is said to learn from experience E, with respect to
some class of tasks T and performance measure P,
if its performance at tasks in T, as measured by P, improves with experience
E. (Tom M. Mitchell)
• A machine can be considered to learn if it is able to gather experience by
doing a certain task and improve its performance in doing similar tasks
in the future.
• Past experience here means past data related to the task; this data is
input to the machine from some source.

Difference between Traditional Programming and Machine Learning
• In traditional programming, we feed the input data and a well-written,
tested program into a machine to generate output.
• In machine learning, input data, along with the output, is fed into the
machine during the learning phase, and it works out a program for itself.
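The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not a real learning algorithm: the pass-mark scenario, the function names `rule_based_pass` and `learn_pass_mark`, and the data are all hypothetical. The first function is a rule written by hand; the second works out its own cut-off from inputs paired with their outputs.

```python
def rule_based_pass(mark):
    """Traditional programming: the programmer hard-codes the rule."""
    return mark >= 40


def learn_pass_mark(marks, passed):
    """Machine learning: derive the cut-off from (input, output) examples.

    Tries every observed mark as a cut-off and keeps the one that
    misclassifies the fewest training examples.
    """
    best_cutoff, best_errors = None, len(marks) + 1
    for cutoff in sorted(set(marks)):
        errors = sum((m >= cutoff) != p for m, p in zip(marks, passed))
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


# Hypothetical training data: marks (input) and pass/fail (output).
marks = [22, 35, 38, 41, 47, 60, 75]
passed = [False, False, False, True, True, True, True]

learned_cutoff = learn_pass_mark(marks, passed)
print("Learned cut-off:", learned_cutoff)
```

The learned cut-off depends entirely on the examples supplied, which is why the quality of the input data matters so much in machine learning.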

How do machines learn?
The basic machine learning process can be divided into three parts.
1. Data Input: Past data or information is utilized as a basis for future decision-
making
2. Abstraction: The input data is represented in a broader way through the
underlying algorithm
3. Generalization: The abstracted representation is generalized to form a
framework for making decisions.

Data input
• Data is gathered from the environment using sensors and/or taken from a dataset of past data.
• A better learning strategy needs to be adopted:
1. To be able to deal with the vastness of the subject matter and the related issues in
memorizing it
2. To be able to answer questions where a direct answer has not been learnt
• The strategy is to figure out the key points or ideas amongst a vast pool of knowledge.
• This helps in creating an outline of topics and a conceptual mapping of those
outlined topics with the entire knowledge pool.

Abstraction
• During the machine learning process, knowledge is fed in the form of input data.
• The data cannot be used in the original shape and form.
• Abstraction helps in deriving a conceptual map based on the input data.
• This map, known as a model in the machine learning paradigm, is a summarized
knowledge representation of the raw data.

Abstraction
• The model may be in any one of the following forms
1. Computational blocks like if/else rules
2. Mathematical equations
3. Specific data structures like trees or graphs
4. Logical groupings of similar observations

Abstraction
• A model must be chosen to solve a specific learning problem.
• The decision related to the choice of model is taken based on multiple aspects,
some of which are listed below:
• The type of problem to be solved:
• Whether the problem is related to forecast or prediction, analysis of trend,
understanding the different segments or groups of objects, etc.
• Nature of the input data:
• How exhaustive (complete) the input data is, the data types, etc.
• Domain of the problem:
• A critical domain with a high rate of data input and need for immediate decision
making.
• e.g. fraud detection problem in banking domain.

Generalization
• The abstraction process, or training the model, abstracts the knowledge
that comes as input data into the form of a model.
• Generalization turns the abstracted knowledge into a form which can be used to
take future decisions.
• The model is trained on a finite set of data, which may possess a limited set
of characteristics.
• When the model is applied to take decisions on a set of unknown data, usually termed
test data, some problems can occur.

Generalization
Applying a trained model to test data raises two problems:
• 1. The trained model is aligned with the training data too closely and hence may not
portray the actual trend.
• 2. The test data sometimes has characteristics that were absent from the training
data.
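The first problem can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the data points, and the two hypothetical models `fit_memorizer` (which simply stores the training data) and `fit_line` (a least-squares line through the origin). The memorizer fits the training data perfectly but does far worse than the simple line on unseen test data.

```python
# Hypothetical data that roughly follows y = 2x.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
test = [(5, 10.1), (6, 11.9)]


def fit_memorizer(data):
    """A model aligned too closely with the training data: a lookup table.

    For inputs it has never seen, it falls back to the average training output.
    """
    table = dict(data)
    fallback = sum(table.values()) / len(table)
    return lambda x: table.get(x, fallback)


def fit_line(data):
    """A simpler model: least-squares fit of y = slope * x."""
    slope = sum(x * y for x, y in data) / sum(x * x for x, _ in data)
    return lambda x: slope * x


def mean_squared_error(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


memorizer = fit_memorizer(train)
line = fit_line(train)

print("memorizer  train/test error:",
      mean_squared_error(memorizer, train), mean_squared_error(memorizer, test))
print("line model train/test error:",
      mean_squared_error(line, train), mean_squared_error(line, test))
```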

Types of Machine Learning
• Based on the methods and way of learning, machine learning is divided into
four main types: supervised, unsupervised, semi-supervised, and reinforcement learning.

Supervised Machine Learning
• Supervised machine learning is based on supervision.
• The machine is trained using a labelled dataset and, based on that training,
predicts the output.
• Labelled data means that the inputs in the training set are already mapped to the
correct output.
• We train the machine with inputs and their corresponding outputs, and then the machine
predicts the output for the test dataset.

Supervised Machine Learning
• Suppose the input dataset consists of cat and dog images.
• First, we train the machine to understand
the images using features such as the shape and size of the tail,
the shape of the eyes, colour, height (dogs are taller, cats are smaller), etc.
• After training is complete, we input the picture of a cat and ask the machine
to identify the object and predict the output.
• Now that the machine is trained, it checks all the features of the object,
such as height, shape, colour, eyes, ears, tail, etc., and finds that it is a cat.
• So it puts the picture in the cat category.
• This is how the machine identifies objects in supervised learning.
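The cat/dog process above can be sketched as a tiny nearest-neighbour classifier. This is only one possible approach, chosen for brevity; the two numeric features (height in cm, weight in kg), the training values, and the function name `predict` are hypothetical, and real image classifiers work on far richer features.

```python
# Labelled training data: (height_cm, weight_kg) -> label. Values are made up.
training_data = [
    ((25, 4.0), "cat"),
    ((23, 3.5), "cat"),
    ((28, 5.0), "cat"),
    ((55, 22.0), "dog"),
    ((60, 30.0), "dog"),
    ((45, 15.0), "dog"),
]


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict(features):
    """Return the label of the closest labelled training example."""
    nearest = min(training_data,
                  key=lambda example: squared_distance(example[0], features))
    return nearest[1]


print(predict((26, 4.2)))   # a small, light animal
print(predict((50, 20.0)))  # a taller, heavier animal
```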

Types of Supervised Machine learning
Supervised machine learning can be classified into two types of problems,
• Classification
• Regression

Classification
• Classification algorithms are used to solve classification problems, in which the
output variable is categorical,
such as Yes or No, Male or Female, Red or Blue, etc.
• The classification algorithms predict the categories present in the dataset.
• Some applications of classification algorithms are
spam detection, email filtering, etc.
• Some popular classification algorithms are given below:
• Random Forest Algorithm
• Decision Tree Algorithm
• Logistic Regression Algorithm
• Support Vector Machine Algorithm

Regression
• Regression algorithms are used to solve regression problems, in which the output
variable is continuous and is modelled as a function of the input variables (a linear
relationship in the simplest case).
• They are used to predict continuous output variables, such as market trends,
weather values, student marks, etc.
• Some popular Regression algorithms are given below:
• Simple Linear Regression Algorithm
• Multivariate Regression Algorithm
• Decision Tree Algorithm
• Lasso Regression
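As a rough illustration of simple linear regression, the sketch below fits a straight line by ordinary least squares. The hours-studied and marks data are hypothetical and deliberately lie on a perfect line so the result is easy to check by hand; the function name `fit_simple_linear_regression` is also an assumption.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (input) and marks obtained (output).
hours = [1, 2, 3, 4, 5]
marks = [35, 45, 55, 65, 75]

slope, intercept = fit_simple_linear_regression(hours, marks)
predicted_mark = slope * 6 + intercept  # prediction for 6 hours of study
print(slope, intercept, predicted_mark)
```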

Advantages and Disadvantages of Supervised Learning
Advantages:
• Since supervised learning works with a labelled dataset, we can have an exact
idea about the classes of objects.
• These algorithms are helpful in predicting the output on the basis of prior
experience.
Disadvantages:
• These algorithms can struggle with very complex tasks.
• They may predict the wrong output if the test data differs from the training data.
• Training the algorithm can require a lot of computational time.

Applications of Supervised Learning
Some common applications of Supervised Learning are given below:
• Image Segmentation
• Medical Diagnosis
• Fraud Detection
• Spam detection
• Speech Recognition

Unsupervised Machine Learning
• In unsupervised machine learning, the machine is trained using an unlabeled
dataset and predicts the output without any supervision.
• The main aim of an unsupervised learning algorithm is to group or categorize the
unsorted dataset according to similarities, patterns, and differences.
• Machines are instructed to find the hidden patterns in the input dataset.

Unsupervised Machine Learning
• The input to the machine learning model
is a basket of fruit images.
• The images are totally unknown to the model, and the
task of the machine is to find the patterns and
categories of the objects.
• The machine discovers patterns and differences on its own,
such as differences in colour and shape.
• When it is tested with the test dataset, it predicts
the group of each new image.

Categories of Unsupervised Machine learning
Unsupervised learning can be further classified into two types, which are given
below:
• Clustering
• Association

Clustering
• Clustering is used to find the inherent groups in the data.
• The objects in a group are most similar to each other and have little or no similarity to the
objects of other groups.
• An example of the clustering algorithm is grouping the customers by their
purchasing behaviour.
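The customer-grouping idea can be sketched with a one-dimensional k-means loop. This is a simplified illustration under stated assumptions: the monthly spend figures are invented, the function name `k_means_1d` is hypothetical, and the initial centroids are simply spread evenly between the smallest and largest value (real implementations usually initialise differently).

```python
def k_means_1d(values, k=2, iterations=10):
    """Group numbers into k clusters by repeatedly assigning each value to
    its nearest centroid and then moving each centroid to its cluster mean.
    Assumes k >= 2."""
    lowest, highest = min(values), max(values)
    centroids = [lowest + i * (highest - lowest) / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical monthly spend per customer.
spend = [10, 12, 11, 50, 52, 49]
centroids, clusters = k_means_1d(spend, k=2)
print(clusters)
```

No labels are supplied anywhere; the low-spend and high-spend groups emerge from the data alone.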

The Popular Clustering Algorithms
Some of the popular clustering algorithms are given below:
• K-Means Clustering algorithm
• Mean-shift algorithm
• DBSCAN Algorithm
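• Principal Component Analysis (strictly a dimensionality-reduction technique, often used alongside clustering)
• Independent Component Analysis (likewise a decomposition technique rather than a clustering algorithm)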

Association
• Association finds interesting relations among variables within
a large dataset.
• It is used to find the dependency of one data item on another
data item.
• Based on the dependency, it maps those variables so that it can
generate maximum profit.
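Dependency between items is commonly measured with support and confidence, two standard association-rule measures that the slides do not name. The sketch below computes them for a made-up set of shopping baskets; the basket contents and function names are hypothetical.

```python
# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also
    contain `consequent`."""
    return support(antecedent | consequent) / support(antecedent)


bread_support = support({"bread"})
rule_confidence = confidence({"bread"}, {"butter"})
print(bread_support, rule_confidence)
```

A high confidence for the rule bread -> butter suggests that buying butter depends on buying bread, which is the kind of relation association learning looks for.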

Advantages and Disadvantages of Unsupervised Learning
Algorithm
Advantages:
• These algorithms can be used for more complicated tasks than the supervised ones
because they work on unlabeled data.
• Unsupervised algorithms are preferable for various tasks because an unlabeled
dataset is easier to obtain than a labelled one.
Disadvantages:
• The output of an unsupervised algorithm can be less accurate, as the dataset is not
labelled
and the algorithms are not trained with the exact output beforehand.
• Working with an unlabeled dataset is more difficult, as the data does not map to a known
output.

Applications of Unsupervised Learning
• Network Analysis
• Recommendation Systems
• Anomaly Detection
• Singular Value Decomposition

Semi-Supervised Learning
• To overcome the drawbacks of supervised and unsupervised learning
algorithms, the concept of semi-supervised learning was introduced.
• It is the middle ground between supervised learning (with labelled training data)
and unsupervised learning (with no labelled training data).
• Hence, it uses a combination of labelled and unlabeled datasets during the
training period.
• Initially, similar data is clustered using an unsupervised learning algorithm,
and this clustering then helps to label the unlabeled data.
• This is useful because labelled data is comparatively more expensive to obtain than unlabeled data.
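A very rough sketch of the labelling step is shown below. As a simplification, it swaps the clustering step for a nearest-labelled-example rule (often called pseudo-labelling): each unlabeled value receives the label of the closest labelled value. The data, labels, and the function name `pseudo_label` are hypothetical.

```python
# A few expensive labelled examples and many cheap unlabeled ones (made up).
labelled = [(1.0, "small"), (9.0, "large")]
unlabelled = [1.5, 2.0, 8.0, 8.5, 0.5]


def pseudo_label(labelled_examples, unlabelled_values):
    """Give each unlabeled value the label of its nearest labelled example."""
    result = []
    for value in unlabelled_values:
        _, nearest_label = min(labelled_examples,
                               key=lambda pair: abs(pair[0] - value))
        result.append((value, nearest_label))
    return result


newly_labelled = pseudo_label(labelled, unlabelled)
# The enlarged labelled set could now be used to train a supervised model.
training_set = labelled + newly_labelled
print(training_set)
```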

Advantages and Disadvantages of Semi-supervised Learning
• Advantages:
• The algorithm is simple and easy to understand.
• It is highly efficient.
• It addresses drawbacks of supervised and unsupervised learning algorithms.
• Disadvantages:
• Iteration results may not be stable.
• We cannot apply these algorithms to network-level data.
• Accuracy is low.

Reinforcement Learning
• Reinforcement learning works on a feedback-based process: the agent learns from
experience and improves its performance.
• The agent gets rewarded for each good action and punished for each bad action;
hence the goal of a reinforcement learning agent is to maximize the rewards.
• A reinforcement learning problem can be formalized using a Markov Decision
Process (MDP).
• In an MDP, the agent constantly interacts with the environment and performs actions;
after each action, the environment responds and generates a new state.
• In RL, there is no labelled data as in supervised learning, and agents learn from
their experiences only.
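The reward-driven loop can be sketched with Q-learning, one common reinforcement learning algorithm that the slides do not name. The setting is a hypothetical four-cell corridor in which the agent earns a reward of 1 for reaching the last cell. To keep the sketch deterministic, the agent tries every action in every state in turn instead of exploring at random; the learning rate, discount factor, and all names are assumptions.

```python
GOAL = 3                # cells are numbered 0..3; cell 3 is the goal
ACTIONS = (-1, +1)      # move left or move right


def step(state, action):
    """Environment: return the new state and the reward for an action."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def q_learning(sweeps=200, alpha=0.5, gamma=0.9):
    """Learn action values Q(state, action) from rewards alone."""
    q = {(s, a): 0.0 for s in range(GOAL) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(GOAL):
            for a in ACTIONS:
                next_state, reward = step(s, a)
                # No future value once the goal (terminal state) is reached.
                future = 0.0 if next_state == GOAL else max(
                    q[(next_state, b)] for b in ACTIONS)
                q[(s, a)] += alpha * (reward + gamma * future - q[(s, a)])
    return q


q = q_learning()
# Greedy policy: in each state, pick the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The agent is never told that moving right is correct; it infers this from the rewards it receives.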

Categories of Reinforcement Learning
Reinforcement learning is categorized mainly into two types of
methods/algorithms:
• Passive Reinforcement Learning:
• The agent's policy (its rule for choosing an action in each state) is fixed, which means that it is told
what to do.
• Therefore, the goal of a passive RL agent is to execute a fixed policy and
evaluate it.
• Active Reinforcement Learning:
• The agent needs to decide what to do, as there is no fixed policy that it can
act on.
• The goal of an active RL agent is to act and learn an optimal policy.

Applications of Reinforcement learning
• Video Games
• Resource Management
• Robotics
• Text Mining, etc.

Advantages and Disadvantages of Reinforcement Learning
Advantages
• It helps in solving complex real-world problems which are difficult to solve
with general techniques.
• The learning model of RL is similar to the way human beings learn; hence
it can produce highly accurate results.
• It helps in achieving long-term results.
Disadvantages
• RL algorithms are not preferred for simple problems.
• RL algorithms require large amounts of data and computation.
