International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 03 | Mar 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 6737
Comparison of Classification Algorithms Using Machine
Learning
Ankta Pal 1, Neelesh Shrivastava2, Pradeep Tripathi3
M.Tech Scholar, Department of Computer Science & Engineering, VITS Satna, (M.P), India, Email:ankitapal964@gmail.com 1
Asst Prof, Department of Computer Science & Engineering, VITS Satna, (M.P)2
Asst Prof & Head, Department of Computer Science & Engineering, VITS Satna, (M.P)3
---------------------------------------------------------------------***---------------------------------------------------------------------
ABSTRACT
This work focuses on regression, one of the
most important methods in machine learning.
Regression is a statistical approach used to
find the relationship between variables and to
predict an outcome from a given dataset. We
discuss the regression algorithms available in
machine learning and propose an algorithm that
has lower train error and test error than the
other existing algorithms. Accuracy is measured
in the form of train and test error.
Keywords: Classification, Data Mining, Linear
Regression, Machine Learning techniques, Python.
I INTRODUCTION
Machine learning systems automatically learn
programs from data. This is often a very
attractive alternative to constructing them
manually, and in the last few years the use of
machine learning has increased rapidly in
computer science. Machine learning is used in
web search (query search), network filters,
recommender systems, ad placement, credit
scoring, fraud detection, stock trading, drug
design in the medical field, and many other
applications. Recent reports from large global
institutes such as McKinsey assert that machine
learning (a.k.a. data mining or predictive
analytics) will be the next-generation technology
for society and markets, where abundant amounts
of data are being kept [16]. Many machine
learning projects spend considerable time
processing the given data in order to give
better results in many domains. As this
technology develops, knowledge becomes fairly
easy to communicate for business requirements.
Machine learning has a number of major
components, some of which are essential to
understanding how machine learning explores
and works efficiently.
Figure 1: Evaluation of Machine Learning
Representation: A classifier must be represented
in a definite formal language so that a computer
can handle it easily.
Evaluation: An evaluation function decides
which classifier is good and which one is bad.
It is also called the objective function.
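To make the evaluation component concrete, the following minimal sketch scores two candidate models with mean squared error, which is one common choice of objective function. The data points and the two candidate lines are invented for illustration and are not taken from the experiments in this work.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; a lower value means a better fit."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy data (hypothetical): y is exactly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Two candidate representations of the relationship between x and y.
candidates = {
    "y = 2x": lambda x: 2.0 * x,
    "y = x + 1": lambda x: x + 1.0,
}

# The objective function assigns each candidate a score and ranks them.
scores = {name: mean_squared_error(ys, [f(x) for x in xs])
          for name, f in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Any other error measure could be substituted for `mean_squared_error` without changing the structure of the comparison.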
1.1 Classification of Machine Learning
There are three branches of machine learning.
This classification is described in detail
below, together with a sketch diagram.
[Figure 1 labels: Learning, Representation, Evaluation, Optimization]
Figure 2: Classification of Machine Learning
Supervised Learning: In supervised machine
learning, a system is trained with data that has
been labeled. The labels categorize each data
point into one or more groups, such as ‘apples’
or ‘oranges’. The system learns how this data,
known as training data, is structured, and uses
this to predict the categories of new or ‘test’
data.
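As a rough sketch of the supervised setting, the code below trains a nearest-centroid classifier on labeled fruit measurements and predicts the label of a new point. The features (weight and redness), the sample values and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def fit_centroids(samples, labels):
    """Average the feature vectors of each labeled group."""
    groups = defaultdict(list)
    for x, label in zip(samples, labels):
        groups[label].append(x)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in groups.items()}

def predict(centroids, x):
    """Assign x to the label whose centroid is closest (squared distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

# Hypothetical training data: (weight in grams, redness from 0 to 1).
train_x = [(150, 0.9), (170, 0.8), (140, 0.2), (130, 0.3)]
train_y = ["apples", "apples", "oranges", "oranges"]

centroids = fit_centroids(train_x, train_y)
test_prediction = predict(centroids, (160, 0.85))  # an unseen 'test' point
print(test_prediction)
```

The features are left unscaled here to keep the sketch short; in practice they would usually be normalized first.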
Unsupervised Learning: Unsupervised learning is
learning without labels. It aims to detect the
characteristics that make data points more or
less similar to each other, for example by
creating clusters and assigning data to these
clusters.
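The clustering idea can be sketched with a small k-means loop on one-dimensional data. The points and the starting centers are made up for illustration; no labels are used at any step.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on numbers: assign to nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled measurements forming two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```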
Reinforcement Learning: Reinforcement learning
focuses on learning from experience, and lies
between unsupervised and supervised learning.
In a typical reinforcement learning setting, an
agent interacts with its environment and is
given a reward function that it tries to
optimize; for example, the system might be
rewarded for winning a game. The goal of the
agent is to learn the consequences of its
decisions, such as which moves were important in
winning a game, and to use this learning to find
strategies that maximize its rewards.
1.2 Machine Learning in Daily Life
Machine learning is used in day-to-day life in
various forms. Some of them are given below
with their working behavior.
Recommender systems: Systems that recommend
products or services on the basis of previous
choices are amongst the most widely recognized
applications of machine learning.
Organizing information: Machine learning is used
in search engines and spam filtering. It also
helps provide the results of queries entered in
internet search engines, such as Google.
II REVIEW OF LITERATURE
According to the authors of [2], neural networks,
SVM and decision trees are the most popular
schemes for classification. In [3], three
techniques are compared by applying ML
techniques to the KDD CUP'99 data set. The
techniques are considered good for anomaly
detection, but the performance may differ
between algorithms.
This part reviews gradient tree boosting
algorithms. The explanation follows the same
idea as the existing literature on gradient
boosting. Specifically, the second-order method
originates from Friedman et al. [12]. We make
minor improvements to the regularized objective,
which may be helpful in implementation and use.
The work [4] notes that it is often the case
that the matrix X^T X is “close” to singular.
This phenomenon is known as multicollinearity in
a multiple regression model. In this case the
OLS estimates can still be computed, but they
will likely have “bad” statistical properties.
Slight variations in the data (such as adding or
removing a few data points) will lead to large
changes in the coefficient estimates.
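The effect of a nearly singular X^T X can be illustrated with a small sketch. It fits two almost identical predictors by solving the 2x2 normal equations directly, then changes a single observation by 0.1 and measures how far the coefficients move. The data are invented, and the ridge penalty is included only as a contrast (ridge regression appears in Table 1); this is not the estimation procedure of the cited work.

```python
def fit_two_predictors(X, y, ridge=0.0):
    """Solve (X^T X + ridge * I) b = X^T y for two predictors, no intercept."""
    a = sum(x1 * x1 for x1, _ in X) + ridge
    c = sum(x1 * x2 for x1, x2 in X)
    d = sum(x2 * x2 for _, x2 in X) + ridge
    g1 = sum(x1 * t for (x1, _), t in zip(X, y))
    g2 = sum(x2 * t for (_, x2), t in zip(X, y))
    det = a * d - c * c  # close to zero when the two columns are nearly equal
    return ((d * g1 - c * g2) / det, (a * g2 - c * g1) / det), det

# Hypothetical data: the second column is almost a copy of the first.
X = [(1.0, 1.00), (2.0, 2.01), (3.0, 3.00), (4.0, 3.99)]
y = [x1 + x2 for x1, x2 in X]      # true coefficients are (1, 1)
y_perturbed = list(y)
y_perturbed[1] += 0.1              # change a single observation slightly

ols, det = fit_two_predictors(X, y)
ols_perturbed, _ = fit_two_predictors(X, y_perturbed)
ridge_fit, _ = fit_two_predictors(X, y, ridge=1.0)
ridge_perturbed, _ = fit_two_predictors(X, y_perturbed, ridge=1.0)

# Largest change in any coefficient caused by the small perturbation.
ols_shift = max(abs(p - q) for p, q in zip(ols, ols_perturbed))
ridge_shift = max(abs(p - q) for p, q in zip(ridge_fit, ridge_perturbed))
print(det, ols_shift, ridge_shift)
```

With these numbers the determinant of X^T X is about 0.0056, the OLS coefficients move by roughly 6, and the penalized coefficients move by less than 0.01.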
III DESCRIPTION OF USED ALGORITHM
Simple Linear Regression: Simple linear
regression describes the relationship between
two variables, one independent and one dependent.
[Figure 2 labels: Machine Learning, Supervised ML, Unsupervised ML, Reinforcement ML]
Figure 3: Model of Regression
Note: Linear regression may be old, but it is
still useful. Its drawback is that it assumes
the data have a linear relationship, which is
not true in many real-world scenarios. Linear
regression is nevertheless worth understanding
because of its simplicity, and it later helps in
understanding more modern approaches and
state-of-the-art algorithms such as neural
networks.
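A minimal sketch of simple linear regression, using the standard closed-form least-squares formulas, is given here. The first data set is exactly linear; the second is quadratic and is included to show the systematic residual pattern that appears when the linearity assumption does not hold. Both data sets are invented for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0, 5.0]

# Linear data: y = 1 + 2x, recovered exactly.
slope, intercept = fit_simple_linear_regression(xs, [3.0, 5.0, 7.0, 9.0, 11.0])

# Quadratic data: y = x^2, which a straight line cannot follow.
ys_quadratic = [1.0, 4.0, 9.0, 16.0, 25.0]
q_slope, q_intercept = fit_simple_linear_regression(xs, ys_quadratic)
residuals = [y - (q_intercept + q_slope * x) for x, y in zip(xs, ys_quadratic)]
print(slope, intercept, residuals)
```

The residuals for the quadratic data are positive at both ends and negative in the middle, which is the kind of structure a residual plot reveals.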
Support Vector Machine: A Support Vector
Machine (SVM) is a supervised machine learning
algorithm which can be used for both
classification and regression challenges.
However, it is mostly used in classification
problems, that is, to assign data points to
categories.
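For the classification case, a linear SVM predicts with the sign of w . x + b and is trained by penalizing points that fall inside the margin (the hinge loss). The sketch below only evaluates these two quantities; the weights are hand-picked rather than trained, and the points are hypothetical.

```python
def decision_value(w, b, x):
    """Signed score w . x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def hinge_loss(w, b, x, label):
    """Zero when x is on the correct side with margin >= 1 (label is +1 or -1)."""
    return max(0.0, 1.0 - label * decision_value(w, b, x))

# Hand-picked separating line x1 = x2 (an assumption, not a fitted model).
w, b = (1.0, -1.0), 0.0
points = [((3.0, 1.0), +1), ((0.0, 2.0), -1), ((1.5, 1.0), +1)]

predictions = [1 if decision_value(w, b, x) >= 0 else -1 for x, _ in points]
losses = [hinge_loss(w, b, x, label) for x, label in points]
print(predictions, losses)
```

The third point is classified correctly but lies inside the margin, so it still contributes a loss of 0.5.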
Figure 4: Model of Support Vector Machine
IV EXPERIMENTAL FRAMEWORK
Python is a prominent environment used by
researchers for the development and deployment
of systems. It has a vast set of libraries with
numerous modules and packages that help
programmers complete their work efficiently in
many ways.
Python and its libraries are used very
effectively in data science and data analysis.
They are also widely used for creating scalable
machine learning algorithms.
Figure 5: Libraries of Python (Python Utility: PANDAS, SciKit-Learn, SciPy, Matplotlib)
Figure 6: GUI Anaconda
Anaconda is a completely free environment whose
source is open to all.
V ALGORITHM
1. Input / load the data set.
2. Apply feature extraction.
3. Receive the extracted data as output.
4. Generate the training and testing data sets
(by applying techniques).
5. Apply multiple classification algorithms to
the training dataset (MLR).
6. Build the reduction explanatory predictor.
7. Build models using different classifiers.
8. Perform / obtain a validity check.
9. Use the “test” set predictions to calculate
all the performance metrics (measure accuracy
and other parameters).
10. Compare the accuracy among the different
classification algorithms.
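The steps above can be sketched end to end in a few lines. The example is only an outline under stated assumptions: the data set is synthetic, the split simply holds out every fourth point, and three very simple regressors (a mean baseline, a least-squares line and a 1-nearest-neighbour predictor) stand in for the algorithms compared in this work. All function names are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Least-squares line through the training points."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return lambda x: mean_y + slope * (x - mean_x)

def fit_nearest_neighbour(xs, ys):
    """Predict the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

# Step 1: load a data set (synthetic: y = 3x + 2 plus a small repeating offset).
data = [(float(i), 3.0 * i + 2.0 + ((7 * i) % 5 - 2)) for i in range(20)]

# Step 4: split into training and test sets (every fourth point is held out).
train = [d for i, d in enumerate(data) if i % 4 != 3]
test = [d for i, d in enumerate(data) if i % 4 == 3]
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

# Steps 5-9: fit each model on the training set, then measure both errors.
models = {"mean baseline": fit_mean, "linear": fit_linear,
          "1-nearest neighbour": fit_nearest_neighbour}
results = {}
for name, fit in models.items():
    predict = fit(train_x, train_y)
    results[name] = (mse(train_y, [predict(x) for x in train_x]),
                     mse(test_y, [predict(x) for x in test_x]))

# Step 10: compare the models by test error.
best = min(results, key=lambda name: results[name][1])
print(results, best)
```

The nearest-neighbour model reaches zero train error yet has a higher test error than the line, which is why both errors are reported side by side.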
VI IMPLEMENTATION
The model employs filters for faster evaluation
and lower overall time. The pre-processing
methods and the application of filters strongly
affect the final evaluation results of the
classifiers (ML-based models). Feature
extraction, conversion of nominal attributes to
binary, and cleaning are a few of those filters.
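As an example of one such filter, the sketch below converts a nominal attribute into binary indicator columns (often called one-hot encoding). The attribute values and the function name are hypothetical.

```python
def nominal_to_binary(values):
    """Map each nominal value to a 0/1 indicator row (one column per category)."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded

colours = ["red", "green", "red", "blue"]  # hypothetical nominal attribute
categories, encoded = nominal_to_binary(colours)
print(categories)
print(encoded)
```

Each row contains exactly one 1, in the column of the category that the original value belonged to.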
Figure 7: Proposed Data Mining Framework for Classification
In this section we show the output of the
regression algorithms: their residual plots,
train error and test error.
Algorithm              Test Error    Train Error
Ridge Regression       14.296076     12.729437
KNN                     5.768323     12.492261
Bayesian Regression     0.131753     12.784852
Decision Tree           5.237878     14.264513
SVM                     4.073167      5.772826
Elastic Net            14.274904     12.816194
Proposed Regression     0.131753      5.772826
Table 1: Output Results
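The figures in Table 1 can be checked with a short script. It copies the six baseline rows, finds the algorithm with the lowest test error and the one with the lowest train error, and compares those two minima with the row reported for the proposed regression. Only the numbers come from the table; the variable names are illustrative.

```python
# (test error, train error) for the baseline algorithms, copied from Table 1.
results = {
    "Ridge Regression": (14.296076, 12.729437),
    "KNN": (5.768323, 12.492261),
    "Bayesian Regression": (0.131753, 12.784852),
    "Decision Tree": (5.237878, 14.264513),
    "SVM": (4.073167, 5.772826),
    "Elastic Net": (14.274904, 12.816194),
}
proposed = (0.131753, 5.772826)  # "Proposed Regression" row of Table 1

best_test = min(results, key=lambda name: results[name][0])
best_train = min(results, key=lambda name: results[name][1])

# The proposed row equals the best baseline test error paired with
# the best baseline train error.
matches = proposed == (results[best_test][0], results[best_train][1])
print(best_test, best_train, matches)
```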
[Figure 7 stages: Data Set (Load Data Set); Feature Extraction; Split Dataset (Training & Test Dataset); Proposed Classifier; Test Classifier; Evaluate Classifier; Knowledge (Performance)]
VII CONCLUSION AND FUTURE WORK
In this work we have tried to minimize the train
and test error. We discussed several regression
algorithms, each with its own computation
strategy. Among these algorithms, we observed
that Bayesian regression and SVM perform best in
terms of test error and train error,
respectively. Our approach is therefore to
combine the features of Bayesian regression and
SVM, so that we get the combined output of both.
After implementing the combined algorithm, we
have shown that it gives good results in terms
of both test error and train error.
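The text does not state how the outputs of the two models are combined. One simple possibility, assumed here purely for illustration and not claimed to be the method used in this work, is to average the predictions of the two trained regressors. The two stand-in predictors below are hypothetical placeholders.

```python
def combine_by_averaging(predictors, weights=None):
    """Return a predictor that outputs the weighted mean of the given predictors."""
    if weights is None:
        weights = [1.0 / len(predictors)] * len(predictors)

    def combined(x):
        return sum(w * p(x) for w, p in zip(weights, predictors))

    return combined

# Stand-ins for two already trained regressors (hypothetical, not fitted models).
model_a = lambda x: 2.0 * x + 1.0
model_b = lambda x: 2.0 * x + 3.0

combined = combine_by_averaging([model_a, model_b])
weighted = combine_by_averaging([model_a, model_b], weights=[0.25, 0.75])
print(combined(4.0), weighted(4.0))
```

Unequal weights let the combination lean towards whichever model generalizes better on held-out data.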
REFERENCES
[1] R. Bekkerman. The present and the future of the KDD
Cup competition: an outsider's perspective.
[2] R. Bekkerman, M. Bilenko, and J. Langford. Scaling Up
Machine Learning: Parallel and Distributed
Approaches. Cambridge University Press, New York, NY,
USA, 2011.
[3] J. Bennett and S. Lanning. The Netflix prize. In
Proceedings of the KDD Cup Workshop 2007, pages
3–6, New York, Aug. 2007.
[4] L. Breiman. Random forests. Machine Learning,
45(1):5–32, Oct. 2001.
[5] C. Burges. From RankNet to LambdaRank to
LambdaMART: An overview. Learning, 11:23–581, 2010.
[6] O. Chapelle and Y. Chang. Yahoo! Learning to Rank
Challenge Overview. Journal of Machine Learning
[7] T. Chen, H. Li, Q. Yang, and Y. Yu. General functional
matrix factorization using gradient boosting. In Proceeding
of 30th International Conference on Machine Learning
(ICML'13), volume 1, pages 436–444, 2013.
[8] T. Chen, S. Singh, B. Taskar, and C. Guestrin. Efficient
second-order gradient boosting for conditional random
fields. In Proceeding of 18th Artificial Intelligence and
Statistics Conference (AISTATS'15), volume 1, 2015.
