© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 908
Classification and Prediction Based Data Mining Algorithm in Weka
Tool
Renu1, Kanika2
------------------------------------------------------------------------***-----------------------------------------------------------------------
Abstract - Data mining is the process of extracting unseen
and hidden information from large sets of data. Different
techniques and algorithms are used to obtain meaningful
information from such data. Classification algorithms such
as J48, SMO, REPTree, Naïve Bayes and Multilayer
Perceptron are used to extract meaningful information from
large datasets. Predictive data mining uses historical data,
statistical modeling, data mining techniques and machine
learning to make predictions about future outcomes.
Predictive analytics is used in different areas to identify
risks and opportunities. In this paper the Weka tool is used
to predict new data using classification, and the classifiers
J48, SMO, REPTree, Naïve Bayes and Multilayer Perceptron
are applied to a dataset and compared. Multilayer
Perceptron is found to achieve the highest accuracy.
Keywords: Data mining, Weka tool, J48 algorithm
classification, Naïve Bayes
1. Introduction
A huge amount of data is collected daily in this information
era. Analyzing this data and extracting information from it
is necessary to achieve goals. Data mining involves data
cleaning, incorporating prior knowledge about the data set
and interpreting solutions from the practical results. The
data mining [1] tool Weka is used here to predict new data
using a house sales dataset. The efficiency of the different
classifiers is calculated using the confusion matrix, and the
Multilayer Perceptron classifier is found to have the highest
accuracy.
2. Related Techniques in Data Mining
Different data mining techniques [3] can be used to extract
insights from data, but the type of technique used depends
on the data and the goals. A wide variety of data mining
techniques are employed to extract information from data:
• Descriptive Modeling
  • Clustering
  • Association
  • Sequential Analysis
• Predictive Data Mining Technique
  • Classification
    1. Decision Tree
    2. Neural Network
    3. Rule Induction
  • Regression
• Prescriptive Modeling
  • Pattern Mining
  • Anomaly Detection
3. Methodology
Weka contains a collection of classifiers for data analysis
together with a graphical user interface for easy access. The
original non-Java version of Weka was a Tcl/Tk front-end to
modeling algorithms implemented in other programming
languages, plus data preprocessing utilities in C and a
Makefile-based system. The original version was designed as
a tool for analyzing data from agricultural domains. Weka 3,
the Java-based version developed in 1997, is used in
different application areas, particularly for education and
research. Weka supports several standard data mining tasks:
data preprocessing, clustering, classification, regression,
visualization and feature selection. Input to Weka is
expected to be formatted according to the Attribute-Relation
File Format (ARFF).
Figure 1 Weka Data Mining Tool
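As an illustration of the ARFF layout that Weka expects, the following minimal Python sketch writes a tiny file by hand. The relation name, attribute names, class labels and rows are hypothetical and are not taken from the paper's housing dataset.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes is a list of (name, kind) pairs, where kind is either
    the string "numeric" or a list of nominal labels.
    All names used in this example are hypothetical.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if kind == "numeric":
            lines.append(f"@attribute {name} numeric")
        else:
            # Nominal attributes list their allowed labels in braces.
            lines.append(f"@attribute {name} {{{','.join(kind)}}}")
    lines += ["", "@data"]
    for row in rows:
        # ARFF marks a missing value with a question mark.
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


attributes = [
    ("area", "numeric"),
    ("bedrooms", "numeric"),
    ("price_band", ["a", "b", "c"]),  # hypothetical class attribute
]
rows = [(1200, 3, "a"), (850, None, "b"), (2000, 4, "c")]

arff_text = to_arff("house", attributes, rows)
print(arff_text)
```

The header declares one attribute per column, and each line after @data is one instance. By convention the last attribute is the one usually chosen as the class in Weka.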
4. Dataset Collection and Preprocessing
A dataset is a collection of related data items that can be
accessed individually. Data preprocessing is the process of
preparing raw data and making it suitable for a machine
learning model, for example by applying filters, converting
the file into ARFF and handling missing data. The data used
in this paper was collected from kaggle.com.
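One common way of handling missing data is to replace each missing numeric value with the column mean and each missing nominal value with the column mode, which is the idea behind Weka's ReplaceMissingValues filter. The sketch below shows that idea in plain Python. The column names and values are hypothetical, and this is not necessarily the preprocessing applied in the paper.

```python
from collections import Counter
from statistics import mean


def impute_column(values):
    """Fill None entries with the mean (numeric) or the mode (nominal)."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to derive a replacement from
    if all(isinstance(v, (int, float)) for v in present):
        fill = mean(present)
    else:
        fill = Counter(present).most_common(1)[0][0]
    return [fill if v is None else v for v in values]


# Hypothetical columns with gaps (None marks a missing value).
area = [1200, None, 2000, 1600]
garage = ["yes", "no", None, "yes"]

area_filled = impute_column(area)
garage_filled = impute_column(garage)
print(area_filled)
print(garage_filled)
```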
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 10 Issue: 06 | Jun 2023 www.irjet.net p-ISSN: 2395-0072
Figure 2 Dataset of house
5. Predict New Data Based on Dataset and Classifier
For prediction [4], the housing dataset and the J48 classifier
are used: the dataset is supplied as training data, and a
supplied test set is used to predict the unknown attribute.
Figure 3 Predict new data j48 Classifier
6. Performance evaluation
Different machine learning and deep learning measurements
can be applied to the various classifier models. Accuracy,
recall and precision are the important criteria used to
assess a model's performance. These measurements are
calculated from the values of the confusion matrix that is
generated during testing of the model. A confusion matrix is
an N*N matrix, where N is the number of classes, used for
evaluating the performance of a classification model. After
classification, the confusion matrix compares the actual
target values with those predicted by the machine learning
model. Confusion matrices give a better idea of a model's
performance.
Accuracy = Total correctly classified / Total instances
Precision (per class) = Correctly predicted as the class / Total predicted as the class
Recall (per class) = Correctly classified as the class / Actual instances of the class
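The three measurements can be computed directly from a confusion matrix, as in the following minimal sketch. It assumes that rows hold the actual classes and columns hold the predicted classes, and it uses the J48 confusion matrix from Table 1 as input. The function and variable names are illustrative only.

```python
def evaluate(matrix):
    """Return accuracy plus per-class precision and recall.

    matrix[i][j] counts instances of actual class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))  # diagonal entries
    accuracy = correct / total
    precision, recall = [], []
    for k in range(n):
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        # None marks an undefined ratio (division by zero).
        precision.append(matrix[k][k] / predicted_k if predicted_k else None)
        recall.append(matrix[k][k] / actual_k if actual_k else None)
    return accuracy, precision, recall


# J48 confusion matrix from Table 1 (classes a, b, c).
j48 = [
    [18, 1, 1],
    [4, 10, 1],
    [0, 3, 6],
]

accuracy, precision, recall = evaluate(j48)
print(f"accuracy  = {accuracy * 100:.2f}%")
print("precision =", [round(p, 3) for p in precision])
print("recall    =", [round(r, 3) for r in recall])
```

Precision uses column sums (everything predicted as a class) while recall uses row sums (everything that actually belongs to a class), which is why the two differ for the same class.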
6.1. Classifier J48
Figure 4 Classifier J48
Accuracy, precision, recall of Classifier J48 using
confusion matrix
Table 1 Confusion Matrix J48 (rows: actual class, columns: predicted class)
            a     b     c     Total
Actual a    18    1     1     20
Actual b    4     10    1     15
Actual c    0     3     6     9
Total       22    14    8     44
Accuracy = Total correctly classified / Total instances
= ((18+10+6)/44)*100 = 77.27%
Precision = Correctly predicted / Total predicted
A = 18/22 = 0.818
B = 10/14 = 0.714
C = 6/8 = 0.75
Recall = Correctly classified / Actual
A = 18/20 = 0.9
B = 10/15 = 0.667
C = 6/9 = 0.667
6.2. Classifier SMO
Figure 5 Classifier SMO
Accuracy, precision, recall of Classifier SMO using
confusion matrix
Table 2 Confusion Matrix SMO (rows: actual class, columns: predicted class)
            a     b     c     Total
Actual a    18    2     0     20
Actual b    9     6     0     15
Actual c    6     2     1     9
Total       33    10    1     44
Accuracy = Total correctly classified / Total instances
= ((18+6+1)/44)*100 = 56.82%
Precision = Correctly predicted / Total predicted
A = 18/33 = 0.545
B = 6/10 = 0.6
C = 1/1 = 1
Recall = Correctly classified / Actual
A = 18/20 = 0.9
B = 6/15 = 0.4
C = 1/9 = 0.111
6.3. Classifier Naïve Bayes
Figure 6 Classifier Naive Bayes
Accuracy, precision, recall of Classifier Naïve Bayes using
confusion matrix
Table 3 Confusion Matrix Naïve Bayes (rows: actual class, columns: predicted class)
            a     b     c     Total
Actual a    15    3     2     20
Actual b    4     11    0     15
Actual c    5     1     3     9
Total       24    15    5     44
Accuracy = Total correctly classified / Total instances
= ((15+11+3)/44)*100 = 65.91%
Precision = Correctly predicted / Total predicted
A = 15/24 = 0.625
B = 11/15 = 0.733
C = 3/5 = 0.6
Recall = Correctly classified / Actual
A = 15/20 = 0.75
B = 11/15 = 0.733
C = 3/9 = 0.333
6.4. Classifier REPTree
Figure 7 classifier REPTree
Accuracy, precision, recall of Classifier REPTree using
confusion matrix
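Table 4 Confusion Matrix REPTree (rows: actual class, columns: predicted class)
            a     b     c     Total
Actual a    20    0     0     20
Actual b    15    0     0     15
Actual c    9     0     0     9
Total       44    0     0     44
Accuracy = Total correctly classified / Total instances
= ((20+0+0)/44)*100 = 45.45%
Precision = Correctly predicted / Total predicted
A = 20/44 = 0.455
B = 0/0 (undefined, no instance was predicted as b)
C = 0/0 (undefined, no instance was predicted as c)
Recall = Correctly classified / Actual
A = 20/20 = 1
B = 0/15 = 0
C = 0/9 = 0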
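Table 4 shows every instance being assigned to class a, so the precision for classes b and c is 0/0 and cannot be computed. The sketch below makes that case explicit and compares the REPTree accuracy with a baseline that always predicts the most frequent class. On these numbers the two coincide. The function names are hypothetical.

```python
# REPTree confusion matrix from Table 4
# (rows: actual class, columns: predicted class).
reptree = [
    [20, 0, 0],
    [15, 0, 0],
    [9, 0, 0],
]


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def precision_per_class(matrix):
    n = len(matrix)
    return [
        safe_ratio(matrix[k][k], sum(matrix[i][k] for i in range(n)))
        for k in range(n)
    ]


def majority_baseline_accuracy(matrix):
    """Accuracy of always predicting the most frequent actual class."""
    class_sizes = [sum(row) for row in matrix]
    return max(class_sizes) / sum(class_sizes)


reptree_precision = precision_per_class(reptree)
reptree_accuracy = sum(reptree[i][i] for i in range(3)) / sum(map(sum, reptree))
baseline = majority_baseline_accuracy(reptree)

print("precision:", reptree_precision)
print(f"REPTree accuracy:  {reptree_accuracy * 100:.2f}%")
print(f"majority baseline: {baseline * 100:.2f}%")
```

A result equal to the majority-class baseline suggests that the tree did not learn any split on this data, so its accuracy says little about the attributes themselves.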
6.5. Classifier Multilayer Perceptron
Figure 8 Classifier Multilayer Perceptron
Accuracy, precision, recall of Classifier Multilayer
Perceptron using confusion matrix
Table 5 Confusion Matrix Multilayer Perceptron (rows: actual class, columns: predicted class)
            a     b     c     Total
Actual a    19    0     1     20
Actual b    0     15    0     15
Actual c    1     0     8     9
Total       20    15    9     44
Accuracy = Total correctly classified / Total instances
= ((19+15+8)/44)*100 = 95.45%
Precision = Correctly predicted / Total predicted
A = 19/20 = 0.95
B = 15/15 = 1
C = 8/9 = 0.889
Recall = Correctly classified / Actual
A = 19/20 = 0.95
B = 15/15 = 1
C = 8/9 = 0.889
6.6. Different Classifier Analysis
Figure 9 Different Classifiers Analysis
7. Accuracy of Different Classifiers
The dataset is tested and analyzed with the classification
algorithms [6] Multilayer Perceptron, J48, Naïve Bayes, SMO
and REPTree. Comparing the accuracy of all classifiers
shows that the Multilayer Perceptron classifier performs
best. Accuracy is a metric for evaluating classification
models, and various methods are used to increase the
accuracy of a model. The easiest way to improve accuracy is
to handle missing values. Some methods to increase
accuracy are:
• Acquire more data.
• Missing value treatment.
• Outlier treatment.
• Feature engineering.
• Applying different models.
• Cross validation.
• Ensembling methods.
• Hyperparameter tuning.
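Of the methods listed above, cross validation is the simplest to illustrate in a few lines. The following sketch splits instance indices into k folds so that every instance is used for testing exactly once. The function name, k = 4 and the fixed seed are illustrative assumptions, and 44 matches the number of instances in the confusion matrices.

```python
import random


def k_fold_indices(n_instances, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(44, k=4))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} train, {len(test)} test")
```

A classifier would be trained on each train part and scored on the matching test part, and the k scores averaged to give a more stable accuracy estimate than a single split.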
Table 6 Accuracy of Different Classifiers
Classifier               Accuracy
Multilayer Perceptron    95.45%
J48                      77.27%
Naïve Bayes              65.91%
SMO                      56.82%
REPTree                  45.45%
As Figure 10 shows, the accuracy of the Multilayer
Perceptron classifier, 95.45%, is higher than that of the
other classifiers.
Figure 10 Accuracy of Classifier
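The accuracies in Table 6 can be recomputed from the diagonals of the confusion matrices in Tables 1 to 5, as in the sketch below. The variable and function names are illustrative.

```python
# Confusion matrices from Tables 1 to 5
# (rows: actual class, columns: predicted class).
matrices = {
    "Multilayer Perceptron": [[19, 0, 1], [0, 15, 0], [1, 0, 8]],
    "J48": [[18, 1, 1], [4, 10, 1], [0, 3, 6]],
    "Naive Bayes": [[15, 3, 2], [4, 11, 0], [5, 1, 3]],
    "SMO": [[18, 2, 0], [9, 6, 0], [6, 2, 1]],
    "REPTree": [[20, 0, 0], [15, 0, 0], [9, 0, 0]],
}


def accuracy_percent(matrix):
    """Share of instances on the diagonal, as a percentage."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100 * correct / total


# Sort classifiers from most to least accurate.
ranking = sorted(
    ((name, accuracy_percent(m)) for name, m in matrices.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, acc in ranking:
    print(f"{name:<22} {acc:.2f}%")
```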
Conclusion
In this paper the J48 classification technique is used to
predict data using a housing dataset, and various classifiers
are analyzed. Multilayer Perceptron performs best, with the
highest accuracy. The Weka data mining tool is easy to
understand and interfaces with various techniques. Hence
the future of data mining is promising for further research,
and it can be applied in different areas due to the
availability of huge databases.
References
[1] https://en.wikipedia.org/wiki/Data_mining
[2] https://medium.com
[3] Jiawei Han and Micheline Kamber, "Data Mining: Concepts
and Techniques", Morgan Kaufmann Publishers
[4] M. Ramaswami and R. Bhaskaran, "A CHAID Based
Performance Prediction Model in Educational Data
Mining", Journal of Computer Science Issues
[5] Mansi Gera and Shivani Goel, "Data mining techniques
methods and algorithms"
[6] A Michal, "IPM developers works: IBM resource for
developers and IT"