International Journal of Research and Scientific Innovation (IJRSI) | Volume IV, Issue VII, July 2017 | ISSN 2321–2705
www.rsisinternational.org Page 90
Comparative Analysis: Effective Information
Retrieval Using Different Learning Approach
Vijay Kumar Mishra
Asst. Professor, Department of Computer Application, Feroze Gandhi Institute of Engg. And Technology, Raebareli, India
Abstract – Information Retrieval is the activity of searching for
meaningful information in a collection of information
resources such as documents, relational databases and the
World Wide Web. An information retrieval system mainly consists
of two phases: storing indexed documents and retrieving
relevant results. Retrieving information effectively from huge
data stores requires machine learning. The objective of machine
learning is to instruct computers to
use data or past experience to solve a given problem. Machine
learning has a number of applications, including classifiers
trained on email messages to distinguish
between spam and non-spam messages, systems that analyze past
sales data to predict customer buying behavior, and fraud detection.
Machine learning can be applied as association analysis
through supervised learning, unsupervised learning and
reinforcement learning. The goal of these three learning approaches is to
provide an effective way of retrieving information from a data
warehouse while avoiding problems such as ambiguity. This study
compares the strengths and weaknesses of these learning
approaches.
Keywords – Information Retrieval (IR), Machine Learning (ML),
Supervised Machine Learning (SML), Reinforcement Learning
(RL).
I. INTRODUCTION
Machine learning is a type of artificial intelligence that
provides computers with the ability to learn to behave
more intelligently, rather than just storing and retrieving data
items as a database system and other applications would do.
Machine learning draws on a variety of academic fields,
including statistics, biology, computer science and
psychology. The basic function of machine learning is to tell
computers how to automatically find a good predictor based
on past experience, and this job is done by a good classifier.
The process of using a model to forecast unknown values
(output variables) from a number of known values (input
variables) is called classification. The classification process is
performed on a data set D which holds the following objects:
1. Attribute set → A = {A1, A2, …, A|A|}, where |A| denotes the
number of attributes, i.e. the size of the set A.
2. Class label → C: target attribute; C = {c1, c2, …, c|C|},
where |C| is the number of classes and |C| >= 2.
Here ML provides the classification function, which relates the
attribute values in A to the classes in C on the given data set D. The
most important application of ML is in data mining. It is
difficult to find the solution to certain problems when multiple
alternatives are available, and it is possible to make the
wrong choice among them. Machine learning can often
be successfully applied to these problems, improving the
efficiency of systems and the designs of machines. In
machine learning algorithms, every instance of a particular
dataset is represented using the same set of features. These
features may be continuous, categorical or
binary. If instances are given with known labels (i.e. the
corresponding correct outputs), the learning scheme is
known as supervised, while in the unsupervised learning
approach the instances are unlabeled. By applying
unsupervised (clustering) algorithms, researchers hope
to discover unknown, but useful, classes of items
[3]. Another kind of machine learning is reinforcement
learning. Here the training information provided to the
learning system by the environment (i.e. external trainer) is in
the form of a scalar reinforcement signal. The learner is not
told which action to take, as in most forms of machine
learning, but instead must discover which actions yield the
most reward by trying them [2][1].
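The data set D, attribute set A and class set C described above can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than part of the original study: the `Instance` type, the toy values, the "spam"/"ham" labels and the choice of a 1-nearest-neighbour rule as the classification function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    attributes: tuple  # values of A1..A|A| for one object in D
    label: str         # one of the classes in C


# Hypothetical toy data set D with |A| = 2 attributes and |C| = 2 classes.
D = [
    Instance((1.0, 1.2), "spam"),
    Instance((0.9, 1.0), "spam"),
    Instance((3.1, 2.9), "ham"),
    Instance((3.0, 3.2), "ham"),
]

# The class set C is recovered from the labels present in D.
CLASSES = sorted({inst.label for inst in D})


def squared_distance(a, b):
    """Squared Euclidean distance between two attribute vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(dataset, attributes):
    """Assumed classification function: label of the nearest instance (1-NN)."""
    nearest = min(
        dataset,
        key=lambda inst: squared_distance(inst.attributes, attributes),
    )
    return nearest.label
```

Calling `classify(D, (1.1, 1.1))` relates an unseen attribute vector to one of the classes in C. Any other learner could stand in for the nearest-neighbour rule.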
II. SUPERVISED LEARNING
Supervised learning is the machine learning approach that consists of:
1. Supervised training data from which a function is inferred.
2. A set of training examples that makes up the training
data.
3. Examples that are each a pair consisting of an input object
and a desired output value.
A supervised learning algorithm analyzes the training data
and produces an inferred function, which is called a classifier
(if the output is discrete) or a regression
function (if the output is continuous). The
inferred function should predict the correct output value for
any valid input object. This requires the learning algorithm to
generalize from the training data to unseen situations in a
"reasonable" way.
Supervised learning implies learning a mapping between a set
of input variables X and an output variable Y and applying
this mapping to predict the outputs for unseen data.
Supervised learning is the most important methodology in
machine learning and it also has a central importance in the
processing of multimedia data [9]. In face recognition,
supervised learning is applied as learning by example in
terms of face color, structure, etc., so that the system learns to
identify a face after several repetitions.
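The distinction drawn above between a classifier (discrete output) and a regression function (continuous output) can be sketched as follows. The two learners are assumptions chosen for brevity, not the methods of the original study: ordinary least squares on one input variable, and a nearest-class-mean rule. All function names are hypothetical.

```python
def fit_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The inferred function maps any input to a continuous output.
    return lambda x: slope * x + intercept


def fit_classifier(xs, labels):
    """Fit a nearest-class-mean classifier on a single feature."""
    sums, counts = {}, {}
    for x, label in zip(xs, labels):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    # The inferred function maps any input to one of the discrete labels.
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


# Training pairs (input object, desired output value); toy values only.
regress = fit_regression([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
classify = fit_classifier([1.0, 2.0, 8.0, 9.0], ["low", "low", "high", "high"])
```

Both inferred functions can then be applied to inputs that never appeared in the training data, for example `regress(5.0)` or `classify(7.0)`. This is the generalization step the text refers to.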
The classification architecture in supervised learning is
illustrated in Figure 1:
Fig. 1 Classification Architecture
A. HOW SUPERVISED ALGORITHMS WORK
Given a set of training examples of the form
{(x1, y1), …, (xn, yn)}, a learning algorithm seeks a function
g: X → Y, where X is the input space and Y is the output space.
The function g is an element of some space of possible
functions G, usually called the hypothesis space. It is
sometimes convenient to represent g using a scoring function
f: X × Y → ℝ such that g is defined as returning the y value that
gives the highest score: g(x) = arg max_y f(x, y). Let F denote the
space of scoring functions.
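The relation g(x) = arg max_y f(x, y) can be written directly in code. This is a minimal sketch: the scoring function `f`, the `PROTOTYPES` table and the two labels are hypothetical, chosen only so that the arg max has something to range over.

```python
def make_classifier(score, labels):
    """Build g from a scoring function f: g(x) = arg max over y of f(x, y)."""
    def g(x):
        return max(labels, key=lambda y: score(x, y))
    return g


# Hypothetical per-class prototypes on a single numeric input.
PROTOTYPES = {"spam": 1.0, "ham": 3.0}


def f(x, y):
    """Assumed scoring function: higher when x is closer to the prototype of y."""
    return -abs(x - PROTOTYPES[y])


g = make_classifier(f, list(PROTOTYPES))
```

Here the output space Y is the set of keys of `PROTOTYPES`, and choosing a different `f` from the space F yields a different `g` without changing `make_classifier`.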
B. PROBLEMS OF SUPERVISED LEARNING
ALGORITHMS
Human beings have a natural ability to learn from past
experience, but this is not the case with computer systems. In
supervised or inductive machine learning, the main goal is to
learn a target function that can be used to predict the values of
a class. The process of applying supervised ML to a real-world
problem is shown in Figure 2.
Fig. 2 Supervised Learning Model
The first step in this process is to collect the dataset. To
prepare the dataset well, an appropriate expert
can suggest which features to select. If no such expert
is within reach, the other approach is “brute force”, which
means considering every possible choice available in the hope
that the right (informative, relevant) features can be isolated.
However, a dataset collected by the “brute-force” method is
not directly suitable for induction. In most cases it
contains noise and missing feature values, and therefore
requires significant pre-processing [2]. The next step, data
preparation and preprocessing, is a key task of the
researcher in Supervised Machine Learning (SML).
Researchers have proposed techniques to deal with the missing data
issue. Hodge & Austin [4] conducted a survey of
contemporary techniques for outlier (noise) detection. Karanjit
& Shuchita [6] also discussed different outlier detection
methods used in different machine learning settings.
H. Jair [5] compared six different outlier
detection methods through experiments on benchmark
datasets and a synthetic astronomical domain [1].
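The two pre-processing concerns named above, missing feature values and outliers, can be illustrated with a short sketch. The specific techniques are assumptions made for illustration and are not taken from the surveys cited: median imputation for missing values, and the common interquartile-range (IQR) rule for flagging outliers. The function names and the sample values are hypothetical.

```python
import statistics


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def flag_outliers(values, k=1.5):
    """Flag values outside [Q1 - k * IQR, Q3 + k * IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical feature column with one missing value and one noisy reading.
raw = [10.0, None, 11.0, 9.0, 10.0, 50.0]
filled = impute_missing(raw)
flags = flag_outliers(filled)
```

The median is used instead of the mean so that the noisy reading does not distort the imputed value. A real pipeline would choose both techniques according to the data.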
III. UNSUPERVISED LEARNING APPROACH
In contrast with supervised learning and reinforcement
learning, unsupervised learning has no predefined target
outputs or environmental evaluations associated with each
input; rather, it considers only what aspects of the input
structure should be captured in the output. Unsupervised
learning studies how systems can learn to represent particular
input patterns in a way that reflects the statistical structure of
the overall collection of input patterns.
The power of unsupervised machine learning is that it can
spot important correlations and connections between data points
that no human would think to look for. This learning approach
can identify the signals, patterns and linkages in data sets.
Unsupervised learning in general has a long and distinguished
history. Some early influences were Horace Barlow (see
Barlow, 1992), who sought ways of characterising neural
codes, Donald MacKay (1956), who adopted a cybernetic-
theoretic approach, and David Marr (1970), who made an
early unsupervised learning postulate about the goal of
learning in his model of the neocortex. The Hebb rule (Hebb,
1949), which links statistical methods to neurophysiological
experiments on plasticity, has also cast a long shadow. [7]
Unsupervised learning is much more common in the brain than
supervised learning. For example, there are around 10^6
photoreceptors in each human eye, whose activities are
constantly changing with the world we see and which
provide all the information about what objects there are, how they are
presented, what the lighting conditions are, etc. [7]
An example of unsupervised learning is clustering, or cluster
analysis. Clustering is the process of grouping objects of the
same type (in some sense) so that an object in a group is
more similar to the other objects in that group than to objects in
other groups. Assume the situation where
the inputs are the photoreceptor activities generated by
various images of a guava and an apple. In the space of all
possible activities, these particular inputs form two clusters,
with many fewer degrees of variation than 10^6, i.e. lower
dimension. One natural task for unsupervised learning is to
find and characterize these separate, low-dimensional clusters.
Unsupervised learning is applicable in face recognition. Since
there is no desired output in this situation,
categorization is done to differentiate correctly between the
faces of a horse, a human and a cat. Two classes of method have
been suggested for unsupervised learning. Density estimation
techniques explicitly build statistical models (such as
Bayesian networks) of how underlying causes could
create the input. Feature extraction techniques try to extract
statistical regularities (or sometimes irregularities) directly
from the inputs [7].
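The task of finding two separate clusters among unlabeled inputs, as in the guava and apple example, can be sketched with k-means. This algorithm is an assumption made for illustration, since the text does not name a specific clustering method. The two-dimensional points stand in for the far higher-dimensional photoreceptor activities, and the initial centroids are fixed by hand so that the run is reproducible.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Alternate assignment and centroid update for a fixed number of rounds."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster received no points.
        centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical unlabeled inputs forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(points, [points[0], points[-1]])
```

No labels are supplied at any point. The grouping in `labels` comes only from the statistical structure of the inputs.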
IV. REINFORCEMENT LEARNING
Reinforcement learning is a machine learning approach
concerned with how agents ought to take actions in an
environment so as to maximize reward.
The basic reinforcement model consists of:
1. A set of environment states S.
2. A set of actions A.
3. Rules of transitioning between states.
4. Rules that determine the scalar immediate reward of a
transition, and
5. Rules that describe what the agent observes.
The RL method demonstrates a rather targeted generation of
candidate links between textual requirements artifacts (high […]).
The agent interacts with the
environment through a discrete sequence of steps and actions
over time t, where t = 0, 1, 2, 3, etc. At each step t, the agent
evaluates the state s_t ∈ S, where S is the set of all possible states.
Based on the state s_t, the agent selects an action a_t ∈ A(s_t),
where A(s_t) is the set of possible actions available to the agent in
state s_t. As a result of the action taken at time t, the
agent receives reward r_{t+1} and moves to state s_{t+1}. Figure 3
displays the reinforcement learning process:
Fig. 3 Reinforcement Learning
As shown in Figure 3, both the environment model and the reward
function are factors in finding the optimal solution.
The mapping of the state s_t to the action a_t is determined by a
policy π_t. Since each state s_t can present a set of possible
actions A(s_t), the policy π_t gives the probabilities of
selecting each of the possible actions in state s_t.
The mapping of states to actions is represented as π_t(s, a), the
probability of selecting action a = a_t when the state is s = s_t. The
agent’s goal is to maximize the total reward acquired in the
long run. The reward the agent collects depends upon the
actions it takes [8].
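The interaction loop described above (observe the state, select an action according to the policy's probabilities, receive a reward, move to the next state) can be sketched as follows. The environment is a hypothetical four-state chain invented for illustration, and the names `step`, `select_action` and `run_episode` are assumptions, not taken from [8].

```python
import random

GOAL = 3  # hypothetical terminal state of a chain 0, 1, 2, 3


def step(state, action):
    """Assumed environment: move along the chain; reward 1.0 on reaching GOAL."""
    if action == "right":
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def select_action(policy, state, rng):
    """Sample an action a with probability policy[state][a]."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def run_episode(policy, rng, max_steps=50):
    """Run one episode from state 0 and return the total reward collected."""
    state, total_reward = 0, 0.0
    for _ in range(max_steps):
        action = select_action(policy, state, rng)
        state, reward = step(state, action)
        total_reward += reward
        if state == GOAL:
            break
    return total_reward


# A fixed stochastic policy: in every non-goal state, prefer "right".
policy = {s: {"left": 0.2, "right": 0.8} for s in range(GOAL)}
```

Calling `run_episode(policy, random.Random(0))` plays one episode. The policy is held fixed here, so the sketch shows only the interaction and not yet any learning.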
Two components make reinforcement learning powerful:
1. The use of samples to optimize performance.
2. The use of function approximation to deal with large
environments.
Reinforcement learning can be used in large environments in
any of the following situations:
1. A model of environment is known but an analytic
solution is not available.
2. Only a simulation model of the environment is given.
3. The only way to collect information about the
environment is by interacting with it.
The first two of these problems could be considered planning
problems (since some form of model is available), while the
last one could be considered as a genuine learning problem.
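The third situation, where information about the environment can be collected only by interacting with it, can be illustrated with tabular Q-learning. Q-learning is a standard technique chosen here as an assumption. The text does not name it, and the chain environment, the uniformly random behaviour policy and the parameter values are all hypothetical.

```python
import random

ACTIONS = ("left", "right")
GOAL = 3  # hypothetical terminal state of a chain 0, 1, 2, 3


def step(state, action):
    """Assumed environment, hidden from the learner except through samples."""
    if action == "right":
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    return next_state, (1.0 if next_state == GOAL else 0.0)


def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Estimate action values from sampled transitions only."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(GOAL) for a in ACTIONS}
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            # Q-learning is off-policy, so a random behaviour policy suffices.
            action = rng.choice(ACTIONS)
            next_state, reward = step(state, action)
            if next_state == GOAL:
                best_next = 0.0  # no future reward after the terminal state
            else:
                best_next = max(q[(next_state, a)] for a in ACTIONS)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = next_state
            if state == GOAL:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}


q_table = q_learning()
learned_policy = greedy_policy(q_table)
```

The learner never reads the transition rules inside `step` directly; it sees only the sampled next states and rewards. A table suffices for four states, whereas large environments would need the function approximation mentioned above.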
V. COMPARISON
The three learning approaches are compared in
tabular form:
TABLE I. Comparison of the three learning approaches
| Factors | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
|---|---|---|---|
| Form of information provided | Each instance has a known label. Input is provided with corresponding correct output. | Instances are unlabeled; there is no error or reward signal to evaluate an optimal solution. | Information is provided by environment in the form of reinforcement signals. Correct input/output pairs are never presented. |
| Application | Neural networks. | Cluster based retrieval of images. | Robot control, elevator scheduling, telecommunications. |
| Drawback | Learning from past experience is the challenging task for computers. | It does not necessarily provide any insight into what the correlation and connection between data points mean. | Model of the environment must be known. |
VI. CONCLUSION
This paper presents a comparative analysis of three learning
approaches for IR: supervised learning, unsupervised
learning and reinforcement learning. Given the scope of this
paper, it is difficult to discuss the strengths and
weaknesses of each learning approach in depth. The selection of a
learning approach in ML depends mainly on the nature of the task.
All three learning approaches have applications in
different fields for effective information retrieval. We cannot
declare any approach the best or the worst,
as each is applicable to the extent that its features
help to handle the problem at hand.
REFERENCES
[1]. Iqbal Muhammad and Zhu Yan, “Supervised Machine Learning
Approaches: A Survey”, School of Information Sciences and
Technology, Southwest Jiaotong University, China, pp. 946-947.
[2]. S. B. Kotsiantis, “Supervised Machine Learning: A Review of
Classification Techniques”, Informatics, Vol. 31, No. 3, pp. 249-
268, 2007.
[3]. James Cussens, “Machine Learning”, IEEE Journal of Computing
and Control, Vol. 7, No. 4, pp 164-168, 1996.
[4]. Victoria J. Hodge and Jim Austin, “A Survey of Outlier Detection
Methodologies”, Artificial Intelligence Review, Vol. 22, No. 2,
pp. 85-126, 2004.
[5]. Hugo Jair Escalante, “A Comparison of Outlier Detection
Algorithms for Machine Learning”, CIC-2005 Congreso
Internacional en Computation-IPN, 2005.
[6]. Karanjit Singh and Shuchita Upadhyaya, “Outlier Detection:
Applications and Techniques”, International Journal of Computer
Science Issues, Vol. 9, Issue. 1, No. 3, pp. 307-323, 2012.
[7]. Peter Dayan, “Unsupervised Learning”, MIT, pp. 1-2.
[8]. Hakim Sultanov and Jane Huffman Hayes, “Application of
Reinforcement Learning to Requirements Engineering:
Requirements Tracing”, IEEE 2013, pp. 52-53.
[9]. Padraig Cunningham, Mathew Cord and Sarah Jane Delany,
“Supervised learning”, pp.21.

More Related Content

PDF
MACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDY
PDF
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
PDF
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATION
PDF
Comparative Study on Machine Learning Algorithms for Network Intrusion Detect...
PDF
IRJET- Design an Approach for Prediction of Human Activity Recognition us...
PDF
IRJET - Disease Detection in Plant using Machine Learning
PDF
IRJET- Student Placement Prediction using Machine Learning
MACHINE LEARNING ALGORITHMS FOR HETEROGENEOUS DATA: A COMPARATIVE STUDY
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
EDGE DETECTION IN DIGITAL IMAGE USING MORPHOLOGY OPERATION
Comparative Study on Machine Learning Algorithms for Network Intrusion Detect...
IRJET- Design an Approach for Prediction of Human Activity Recognition us...
IRJET - Disease Detection in Plant using Machine Learning
IRJET- Student Placement Prediction using Machine Learning

What's hot (20)

PDF
A Survey on Machine Learning Algorithms
PPT
Detection of plant diseases
PDF
A Study on Machine Learning and Its Working
PDF
Smart Fruit Classification using Neural Networks
PDF
The International Journal of Engineering and Science (The IJES)
PDF
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
PDF
REVIEWING PROCESS MINING APPLICATIONS AND TECHNIQUES IN EDUCATION
PDF
Identification of Disease in Leaves using Genetic Algorithm
PDF
DEEP-LEARNING-BASED HUMAN INTENTION PREDICTION WITH DATA AUGMENTATION
PDF
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR ML
PDF
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
PDF
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
PDF
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
PPTX
Kapil dikshit ppt
PPTX
Regression with Microsoft Azure & Ms Excel
PDF
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
PDF
IRJET- The Machine Learning: The method of Artificial Intelligence
PDF
LEAF DISEASE DETECTION USING IMAGE PROCESSING AND SUPPORT VECTOR MACHINE (SVM)
PDF
A one decade survey of autonomous mobile robot systems
A Survey on Machine Learning Algorithms
Detection of plant diseases
A Study on Machine Learning and Its Working
Smart Fruit Classification using Neural Networks
The International Journal of Engineering and Science (The IJES)
EMPIRICAL APPLICATION OF SIMULATED ANNEALING USING OBJECT-ORIENTED METRICS TO...
REVIEWING PROCESS MINING APPLICATIONS AND TECHNIQUES IN EDUCATION
Identification of Disease in Leaves using Genetic Algorithm
DEEP-LEARNING-BASED HUMAN INTENTION PREDICTION WITH DATA AUGMENTATION
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR ML
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
Kapil dikshit ppt
Regression with Microsoft Azure & Ms Excel
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...
IRJET- The Machine Learning: The method of Artificial Intelligence
LEAF DISEASE DETECTION USING IMAGE PROCESSING AND SUPPORT VECTOR MACHINE (SVM)
A one decade survey of autonomous mobile robot systems
Ad

Similar to Comparative Analysis: Effective Information Retrieval Using Different Learning Approach (20)

PPTX
Session 17-18 machine learning very important and good type student favour.pptx
PDF
machine learning
DOCX
Introduction to Machine Learning for btech 7th sem
PDF
Machine Learning Basics_Dr.Balamurugan.pdf
PDF
Chapter 5 - Machine which of Learning.pdf
PDF
Machine Learning - Deep Learning
PPTX
unit 1.2 supervised learning.pptx
PPTX
Mal8iiiiiiiiiiiiiiiii8iiiiii Unit-I.pptx
PDF
Chapter 5 - Machine of it Learning (1).pdf
PPTX
types of learningyjhjfhnjfnfnhnnnnn.pptx
PDF
Introduction to machine learning
PDF
Unit1_Types of MACHINE LEARNING 2020pattern.pdf
PDF
Mlmlmlmlmlmlmlmlmlmlmlmlmlmlmlml.lmlmlmlmlm
PDF
An Introduction to Machine Learning
PPTX
Machine Learning with Python- Methods for Machine Learning.pptx
PPTX
Day1-Introdtechhnology of techuction.pptx
PPTX
chapter Three artificial intelligence 1.pptx
PPTX
It's Machine Learning Basics -- For You!
PDF
An Overview of Supervised Machine Learning Paradigms and their Classifiers
PPTX
Machine Learning Contents.pptx
Session 17-18 machine learning very important and good type student favour.pptx
machine learning
Introduction to Machine Learning for btech 7th sem
Machine Learning Basics_Dr.Balamurugan.pdf
Chapter 5 - Machine which of Learning.pdf
Machine Learning - Deep Learning
unit 1.2 supervised learning.pptx
Mal8iiiiiiiiiiiiiiiii8iiiiii Unit-I.pptx
Chapter 5 - Machine of it Learning (1).pdf
types of learningyjhjfhnjfnfnhnnnnn.pptx
Introduction to machine learning
Unit1_Types of MACHINE LEARNING 2020pattern.pdf
Mlmlmlmlmlmlmlmlmlmlmlmlmlmlmlml.lmlmlmlmlm
An Introduction to Machine Learning
Machine Learning with Python- Methods for Machine Learning.pptx
Day1-Introdtechhnology of techuction.pptx
chapter Three artificial intelligence 1.pptx
It's Machine Learning Basics -- For You!
An Overview of Supervised Machine Learning Paradigms and their Classifiers
Machine Learning Contents.pptx
Ad

More from RSIS International (20)

PDF
Teacher’s Accomplishment Level of The Components of an E-Learning Module: A B...
PDF
Development Administration and the Challenges of Neo-liberal Reforms in the E...
PDF
The Nexus of Street Trading and Juvenile Delinquency: A Study of Chanchaga Lo...
PDF
Determination of Bacteriological and Physiochemical Properties of Som-Breiro ...
PDF
Power and Delay Analysis of Logic Circuits Using Reversible Gates
PDF
Innovative ICT Solutions and Entrepreneurship Development in Rural Area Such ...
PDF
Indigenous Agricultural Knowledge and the Sustenance of Local Livelihood Stra...
PDF
Wireless radio signal drop due to foliage in illuba bore zone ethiopia
PDF
The Bridging Process: Filipino Teachers’ View on Mother Tongue
PDF
Optimization of tungsten inert gas welding on 6063 aluminum alloy on taguchi ...
PDF
Investigation of mechanical properties of carbidic ductile cast iron
PDF
4th international conference on multidisciplinary research & practice (4ICMRP...
PDF
Six Sigma Methods and Formulas for Successful Quality Management
PDF
Task Performance Analysis in Virtual Cloud Environment
PDF
Design and Fabrication of Manually Operated Wood Sawing Machine: Save Electri...
PDF
Effect of Surface Treatment on Settlement of Coir Mat Reinforced Sand
PDF
Augmentation of Customer’s Profile Dataset Using Genetic Algorithm
PDF
System Development for Verification of General Purpose Input Output
PDF
De-noising of Fetal ECG for Fetal Heart Rate Calculation and Variability Anal...
PDF
Active Vibration Control of Composite Plate
Teacher’s Accomplishment Level of The Components of an E-Learning Module: A B...
Development Administration and the Challenges of Neo-liberal Reforms in the E...
The Nexus of Street Trading and Juvenile Delinquency: A Study of Chanchaga Lo...
Determination of Bacteriological and Physiochemical Properties of Som-Breiro ...
Power and Delay Analysis of Logic Circuits Using Reversible Gates
Innovative ICT Solutions and Entrepreneurship Development in Rural Area Such ...
Indigenous Agricultural Knowledge and the Sustenance of Local Livelihood Stra...
Wireless radio signal drop due to foliage in illuba bore zone ethiopia
The Bridging Process: Filipino Teachers’ View on Mother Tongue
Optimization of tungsten inert gas welding on 6063 aluminum alloy on taguchi ...
Investigation of mechanical properties of carbidic ductile cast iron
4th international conference on multidisciplinary research & practice (4ICMRP...
Six Sigma Methods and Formulas for Successful Quality Management
Task Performance Analysis in Virtual Cloud Environment
Design and Fabrication of Manually Operated Wood Sawing Machine: Save Electri...
Effect of Surface Treatment on Settlement of Coir Mat Reinforced Sand
Augmentation of Customer’s Profile Dataset Using Genetic Algorithm
System Development for Verification of General Purpose Input Output
De-noising of Fetal ECG for Fetal Heart Rate Calculation and Variability Anal...
Active Vibration Control of Composite Plate

Recently uploaded (20)

PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PPTX
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
PPTX
OOP with Java - Java Introduction (Basics)
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PPTX
Geodesy 1.pptx...............................................
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PPTX
UNIT 4 Total Quality Management .pptx
PDF
PPT on Performance Review to get promotions
PDF
Structs to JSON How Go Powers REST APIs.pdf
PPTX
web development for engineering and engineering
PPTX
CH1 Production IntroductoryConcepts.pptx
PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PPTX
Lecture Notes Electrical Wiring System Components
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
OOP with Java - Java Introduction (Basics)
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
Geodesy 1.pptx...............................................
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
UNIT 4 Total Quality Management .pptx
PPT on Performance Review to get promotions
Structs to JSON How Go Powers REST APIs.pdf
web development for engineering and engineering
CH1 Production IntroductoryConcepts.pptx
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
Lecture Notes Electrical Wiring System Components
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk

Comparative Analysis: Effective Information Retrieval Using Different Learning Approach

  • 1. International Journal of Research and Scientific Innovation (IJRSI) | Volume IV, Issue VII, July 2017 | ISSN 2321–2705 www.rsisinternational.org Page 90 Comparative Analysis: Effective Information Retrieval Using Different Learning Approach Vijay Kumar Mishra Asst. Professor, Department of Computer Application, Feroze Gandhi Institute of Engg. And Technology, Raebareli, India Abstract – Information Retrieval is the activity of searching meaningful information from a collection of information resources such as Documents, relational databases and the World Wide Web. Information retrieval system mainly consists of two phases, storing indexed documents and retrieval of relevant result. Retrieving information effectively from huge data storage, it requires Machine Learning for computer systems. Machine learning has objective to instruct computers to use data or past experience to solve a given problem. Machine learning has number of applications, including classifier to be trained on email messages to learn in order to distinguish between spam and non-spam messages, systems that analyze past sales data to predict customer buying behavior, fraud detection etc. Machine learning can be applied as association analysis through supervised learning, unsupervised learning and Reinforcement Learning. The goal of these three learning is to provide an effective way of information retrieval from data warehouse to avoid problems such as ambiguity. This study will compare the effectiveness and impuissance of these learning approaches. Keywords – Information Retrieval (IR), Machine Learning (ML), Supervised Machine Learning (SML) , Reinforcement Learning (RL). I. INTRODUCTION achine Learning ,is a type of Artificial Intelligence provides computers with the ability to learn to behave more intelligently rather than just storing and retrieving data items like a database system and other applications would do. 
Machine learning is inspired with a variety of academic fields, including statistics, biology, computer science and psychology. The basic function of Machine learning is to tell computers how to automatically find a good predictor based on past experiences and this job is done by good classifier. The process of using a model to forecast unknown values (output variables), using a number of known values (input variables) is called Classification. The classification process is performed on data set D which holds following objects: 1. Set size → A = {A1,A2,…..,A|A|} , where |A| denotes the number of attributes or the size of the set A. 2. Class label→ C: Target attribute; C = {c1, c2,….,c|C|} , where |C| is the number of classes and |C|>=2 . Here ML provides the classification function which relates the attribute values in A and classes in C on given data set D. The most important application of ML is in Data mining. As It is difficult to find the solution of certain problem when multiple alternatives are available and it may be possible to make wrong choice of alternative. So, Machine learning can often be successfully applied to these problems, improving the efficiency of systems and the designs of machines . In machine learning algorithms, every instance of particular dataset is represented by using the same set of features. The nature of these features could be continuous, categorical or binary. If instances are given with known labels (i.e. the corresponding correct outputs) then the learning scheme is known as supervised , while in unsupervised learning approach the instances are unlabeled. Through applying these unsupervised (clustering) algorithms, researchers are optimistic to discover unknown, but useful, classes of items [3]. Another kind of machine learning is reinforcement learning. Here the training information provided to the learning system by the environment (i.e. external trainer) is in the form of a scalar reinforcement signal. 
The learner is not told which action has to take, as in most forms of machine learning, but instead must discover which actions yield the most reward by trying them[2][1]. II. SUPERVISED LEARNING Supervised learning is the machine learning approach consists of: 1. Supervised training data to infer a function. 2. Each training data consists of a set of training examples 3. Each example is a pair consisting of an input object and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred function, which is called a classifier (if the output is discrete, see classification) or a regression function (if the output is continuous, see regression). The inferred function should predict the correct output value for any valid input object. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way. Supervised learning implies learning a mapping between a set of input variables X and an output variable Y and applying this mapping to predict the outputs for unseen data. Supervised learning is the most important methodology in machine learning and it also has a central importance in the M
  • 2. International Journal of Research and Scientific Innovation (IJRSI) | Volume IV, Issue VII, July 2017 | ISSN 2321–2705 www.rsisinternational.org Page 91 processing of multimedia data.[10] In face recognition, supervised learning is applied as learning by examples in terms of face color, structure etc so that it learn to define a face after several repetition. Classification architecture in supervised learning can be understood by the following figure 1: Fig. 1 Classification Architecture A. HOW SUPERVISED ALGORITHMS WORKS Given a set of training examples of the form {(x1,y1),…..,(xn,yn)}, a learning algorithm seeks a function, g:X->Y where X is the input space and Y is the output space. The function g is an element of some space of possible functions G , usually called the hypothesis space. It is sometimes convenient to represent g using a scoring function f:X*Y->R such that R is defined as returning the value that gives the highest score: g(x)=arg max f(x,y) . Let F denote the space of scoring functions. B. PROBLEMS OF SUPERVISED LEARNING ALGORITHMS Human beings have natural ability to learn from past experiences but this is not the case with computer systems.. In supervised or Inductive machine learning, our main goal is to learn a target function that can be used to predict the values of a class. The process of applying supervised ML to a real- world problem is described in below figure 2. Fig. 2 Supervised Learning Model The first step in this learning is to deal with dataset.To prepare the dataset in a better way, an appropriate expert could suggest better selection of features. If concerned expert is not in reach, then the other approach is “brute-force”, which means to consider each possible choices available in the hope that the right (informative, relevant) features can be isolated. However, a dataset collected by the “brute-force” method is not directly suitable for induction. 
In most cases such a dataset contains noise and missing feature values, and therefore requires significant pre-processing [2]. The next step, data preparation and pre-processing, is a key task for the researcher in Supervised Machine Learning (SML). Several techniques have been proposed to deal with missing data. Hodge & Austin [4] conducted a survey of contemporary techniques for outlier (noise) detection. Singh & Upadhyaya [6] also discussed the outlier detection methods used in different areas of machine learning. Escalante [5] compared six outlier detection methods experimentally on benchmark datasets and a synthetic astronomical domain [1].

III. UNSUPERVISED LEARNING APPROACH

In contrast with supervised learning and reinforcement learning, unsupervised learning has no predefined target outputs or environmental evaluations associated with each input; rather, it considers only which aspects of the structure of the input should be captured in the output. Unsupervised learning studies how systems can learn to represent particular input patterns in a way that reflects the statistical structure of the overall collection of input patterns. The power of unsupervised machine learning is that it can spot important correlations and connections between data points that a human might not think to look for. This learning approach can identify signals, patterns and linkages in data sets.

Unsupervised learning in general has a long and distinguished history. Some early influences were Horace Barlow (see Barlow, 1992), who sought ways of characterising neural codes, Donald MacKay (1956), who adopted a cybernetic-theoretic approach, and David Marr (1970), who made an early unsupervised learning postulate about the goal of learning in his model of the neocortex. The Hebb rule (Hebb, 1949), which links statistical methods to neurophysiological experiments on plasticity, has also cast a long shadow [7].

Unsupervised learning is much more common in the brain than supervised learning. For example, there are around $10^6$ photoreceptors in each human eye, whose activities change constantly with the visual world and which provide all the information available about what objects are present, how they are presented, what the lighting conditions are, and so on [7].

A typical example of unsupervised learning is clustering, or cluster analysis. Clustering is the process of grouping objects so that objects in the same group are more similar (in some sense) to each other than to objects in other groups. Consider the situation where the inputs are the photoreceptor activities generated by various images of a guava and an apple. In the space of all possible activities, these particular inputs form two clusters, with many fewer degrees of variation than $10^6$, i.e. of lower dimension. One natural task for unsupervised learning is to find and characterize these separate, low-dimensional clusters.

Unsupervised learning is also applicable to face recognition. Since there is no desired output in this situation, categorization is used to differentiate correctly between the faces of a horse, a human and a cat.

Two classes of method have been suggested for unsupervised learning. Density estimation techniques explicitly build statistical models (such as Bayesian networks) of how underlying causes could create the input. Feature extraction techniques try to extract statistical regularities (or sometimes irregularities) directly from the inputs [7].

IV. REINFORCEMENT LEARNING

Reinforcement learning is a machine learning approach concerned with how agents ought to take actions in an environment so as to maximize reward. The basic reinforcement model consists of:
1. A set of environment states S.
2. A set of actions A.
3. Rules of transitioning between states.
4. Rules that determine the scalar immediate reward of a transition.
5. Rules that describe what the agent observes.

The RL method has been shown to generate candidate links between textual requirements artifacts in a rather targeted way. The agent interacts with the environment through a discrete sequence of steps and actions over time t, where t = 0, 1, 2, 3, and so on. At each step t, the agent evaluates the state $s_t \in S$, where S is the set of all possible states. Based on the state $s_t$, the agent selects an action $a_t \in A(s_t)$, where $A(s_t)$ is the set of actions available to the agent in state $s_t$.
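The five components of the basic reinforcement model listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five-cell corridor, the names `GOAL`, `actions`, `transition`, `reward`, `observe` and `run_episode`, and the uniformly random action choice are all hypothetical and do not come from the cited work.

```python
import random

GOAL = 4             # hypothetical terminal state (assumption)
STATES = range(5)    # S: cells 0..4 of a small corridor (assumption)

def actions(s):
    """A(s): the actions available to the agent in state s."""
    return ["left", "right"]

def transition(s, a):
    """Rule of transitioning between states."""
    return max(0, s - 1) if a == "left" else min(GOAL, s + 1)

def reward(s, a, s_next):
    """Rule that determines the scalar immediate reward of a transition."""
    return 1.0 if s_next == GOAL else 0.0

def observe(s):
    """Rule describing what the agent observes (here: the full state)."""
    return s

def run_episode(seed=0, max_steps=1000):
    """Interact with the environment over discrete steps t = 0, 1, 2, ..."""
    rng = random.Random(seed)
    s, total, t = 0, 0.0, 0
    while s != GOAL and t < max_steps:
        # Placeholder action selection: uniform random choice from A(s).
        a = rng.choice(actions(observe(s)))
        s_next = transition(s, a)
        total += reward(s, a, s_next)
        s, t = s_next, t + 1
    return s, total, t

final_state, total_reward, steps_taken = run_episode()
```

The random choice stands in for whatever action-selection rule the agent actually uses; a learning agent would replace it with a rule that improves as rewards are collected.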
As a result of the action taken at time t, the agent receives the reward $r_{t+1}$ and moves to the state $s_{t+1}$. Figure 3 displays the reinforcement learning process.

Fig. 3 Reinforcement Learning

As shown in Figure 3, both the environment model and the reward function contribute to finding the optimal solution. The mapping of the state $s_t$ to the action $a_t$ is determined by a policy $\pi_t$. Since each state $s_t$ can present a set of possible actions $A(s_t)$, the policy $\pi_t$ gives the probabilities of selecting each of the possible actions in the state $s_t$. The mapping of states to actions is written $\pi_t(s, a)$, the probability of selecting action $a = a_t$ when the state is $s = s_t$. The agent's goal is to maximize the total reward acquired in the long run. The reward the agent collects depends on the actions it takes [8].

Two components make reinforcement learning powerful:
1. The use of samples to optimize performance.
2. The use of function approximation to deal with large environments.

Reinforcement learning can be used in large environments in any of the following situations:
1. A model of the environment is known but an analytic solution is not available.
2. Only a simulation model of the environment is given.
3. The only way to collect information about the environment is by interacting with it.

The first two of these could be considered planning problems (since some form of model is available), while the last one could be considered a genuine learning problem.

V. COMPARISON

The three learning approaches are compared in Table I.

TABLE I. Comparison of three learning approaches

| Factors | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
|---|---|---|---|
| Form of information provided | Each instance has a known label. Input is provided with the corresponding correct output. | Instances are unlabeled; there is no error or reward signal with which to evaluate an optimal solution. | Information is provided by the environment in the form of reinforcement signals. Correct input/output pairs are never presented. |
| Application | Neural networks. | Cluster-based retrieval of images. | Robot control, elevator scheduling, telecommunications. |
| Drawback | Learning from past experience is a challenging task for computers. | It does not necessarily provide any insight into what the correlations and connections between data points mean. | A model of the environment must be known. |
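The "Form of information provided" row of Table I can be made concrete with a small Python sketch. It is a toy illustration, not a method from the paper: the one-dimensional feature values, the guava/apple labels, the fixed rewards, and the function names `fit_centroids`, `classify`, `two_means` and `epsilon_greedy` are all assumptions chosen for brevity. The reinforcement case is reduced to a single state so that only the reward signal is visible.

```python
import random

# Supervised: each instance has a known label, given as an (x, y) pair.
def fit_centroids(examples):
    """Return the mean input value per label, learned from (x, y) pairs."""
    sums, counts = {}, {}
    for x, y in examples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def classify(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

# Unsupervised: instances are unlabeled; only structure can be found.
def two_means(points, iterations=10):
    """Group 1-D points into two clusters (k-means with k = 2)."""
    low, high = min(points), max(points)
    for _ in range(iterations):
        near_low = [p for p in points if abs(p - low) <= abs(p - high)]
        near_high = [p for p in points if abs(p - low) > abs(p - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return near_low, near_high

# Reinforcement: only a scalar reward follows each chosen action.
def epsilon_greedy(rewards, steps=2000, epsilon=0.1, seed=0):
    """Estimate the value of each action from reward signals alone."""
    rng = random.Random(seed)
    value = [0.0] * len(rewards)
    pulls = [0] * len(rewards)
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(len(rewards))                       # explore
        else:
            a = max(range(len(rewards)), key=lambda i: value[i])  # exploit
        r = rewards[a]   # hypothetical fixed reward; no correct action is shown
        pulls[a] += 1
        value[a] += (r - value[a]) / pulls[a]                     # running mean
    return value

labeled = [(1.0, "guava"), (1.2, "guava"), (3.0, "apple"), (3.4, "apple")]
unlabeled = [1.0, 1.2, 3.0, 3.4]

label = classify(fit_centroids(labeled), 2.9)
clusters = two_means(unlabeled)
values = epsilon_greedy([0.2, 0.8])
best_action = max(range(len(values)), key=lambda i: values[i])
```

In this sketch the supervised learner receives the labels, the clustering routine sees the same numbers without labels and can only separate them into groups, and the reinforcement learner is never told which action is correct and must infer it from the rewards it collects.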
VI. CONCLUSION

This paper presents a comparative analysis of three learning approaches for IR: supervised learning, unsupervised learning and reinforcement learning. Given the scope of this paper, it is difficult to discuss the strengths and weaknesses of each learning approach in full. The selection of a learning approach in ML depends mainly on the nature of the task. All three learning approaches have applications in different fields for effective information retrieval. No approach can be called the best or the worst, as each is applicable according to how well its features help to handle the problem at hand.

REFERENCES

[1]. Iqbal Muhammad and Zhu Yan, "Supervised Machine Learning Approaches: A Survey", School of Information Sciences and Technology, Southwest Jiaotong University, China, pp. 946-947.
[2]. S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques", Informatica, Vol. 31, No. 3, pp. 249-268, 2007.
[3]. James Cussens, "Machine Learning", IEEE Journal of Computing and Control, Vol. 7, No. 4, pp. 164-168, 1996.
[4]. Victoria J. Hodge and Jim Austin, "A Survey of Outlier Detection Methodologies", Artificial Intelligence Review, Vol. 22, No. 2, pp. 85-126, 2004.
[5]. Hugo Jair Escalante, "A Comparison of Outlier Detection Algorithms for Machine Learning", CIC-2005 Congreso Internacional en Computation-IPN, 2005.
[6]. Karanjit Singh and Shuchita Upadhyaya, "Outlier Detection: Applications and Techniques", International Journal of Computer Science Issues, Vol. 9, Issue 1, No. 3, pp. 307-323, 2012.
[7]. Peter Dayan, "Unsupervised Learning", MIT, pp. 1-2.
[8]. Hakim Sultanov and Jane Huffman Hayes, "Application of Reinforcement Learning to Requirements Engineering: Requirements Tracing", IEEE, 2013, pp. 52-53.
[9]. Padraig Cunningham, Matthieu Cord and Sarah Jane Delany, "Supervised Learning", p. 21.