IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Special Issue: 07 | May-2014, Available @ http://www.ijret.org 816
COMPARATIVE STUDY OF VARIOUS
SUPERVISED CLASSIFICATION METHODS FOR ANALYSING
DEFORESTATION FACTORS
S.P.Rajagopalan1, C.Lalitha2
1Professor, G.K.M College of Engineering & Technology, Chennai, TamilNadu
2Research Scholar, Vels University, Chennai, TamilNadu
Abstract
In this paper, various supervised classification techniques are compared and the results are demonstrated. Classification
techniques such as Decision Tree, the Bayesian method, Neural Networks and the Rule Based method are discussed with regard to the
given data sets. Population, built-up area and agriculture are factors that play a vital role in the development of a country and
directly affect its economic condition. In this paper, factors such as road, population, built-up development, agriculture and
industry are considered as drivers of deforestation in the study area, which is located in the Erode District of TamilNadu, India.
Keywords: Supervised Classification, Decision tree, Bayesian method, Neural Network.
--------------------------------------------------------------------***------------------------------------------------------------------
1. INTRODUCTION
There are two types of classification, supervised and
unsupervised. Supervised methods classify data whose class
labels are known and have been observed by the user.
Unsupervised methods group data whose class labels are not
known. The results are obtained from the given data sets by
using WEKA. The aim of this paper is to compare the various
supervised methods using factors such as demographic,
built up, road and agriculture. The primary steps used in
data mining are data selection, data reduction and
filtration. Data mining examines and discovers algorithms
of varying computational efficiency. It integrates machine
learning, pattern recognition, statistics, databases and
visualization techniques so that information can be
extracted from large databases.
The tasks of data mining are association rule mining,
classification, prediction and cluster analysis. Generally
speaking, association rule mining and classification rule
mining are the most effective and efficient techniques in
data mining. Classification rule mining is used for the
prediction of future objects whose class label is not known.
Recently it has been determined that a primary factor in the
degradation of ecosystems is deforestation.
Classification results are the basis for interpretation,
analysis and modelling in various environmental and
socio-economic applications. Data mining techniques can be
applied to generate class association rules for
analysing deforestation. In this paper, we applied various
supervised classification techniques to our data sets.
2. WEKA
The Waikato Environment for Knowledge Analysis
(WEKA) is a tool that provides machine learning algorithms
for classification and clustering. In this paper, we use
the decision tree methods J48 and Random Forest, the Bayes
algorithm Naive Bayes, and Neural Network methods such as
the Multilayer Perceptron, all of which are implemented in
WEKA. We evaluated on our data set with 10-fold
cross-validation, and all the methods are tested and
compared on the same data.
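The 10-fold procedure mentioned above can be sketched as follows. This is a minimal illustration in Python, not WEKA's own implementation: the function name `k_fold_indices` and the seed are hypothetical, the sample count of 405 is the dataset size reported in Section 4, and the folds here are not stratified by class.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        # Every other fold contributes to the training set for this round.
        train_idx = [j for f_no, fold in enumerate(folds) if f_no != i for j in fold]
        yield train_idx, test_idx

# 405 samples (Section 4), 10 folds: each sample is tested exactly once.
splits = list(k_fold_indices(405, k=10))
```

Each classifier would be trained on `train_idx` and scored on `test_idx` in every round, and the ten scores averaged.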
3. CLASSIFICATION METHODS
3.1 Decision Tree
A Decision Tree (DT) works by processing the attributes of
the data and deciding on them in turn. Internal nodes of the
DT correspond to attributes and each leaf node to a class.
J48 and Random Forest were used in our experiments. J48
follows a recursive method for a given set of data. It
searches the attributes with a depth-first strategy. It
divides the data among several nodes and, at each node,
tests the attribute that gives the best result. It
classified the datasets invariably. This method is not
suitable for, and did not show good results on, the given
datasets. The accuracy values of the J48 method are compared
and shown in Table 1 and Figure 1.
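To make "the attribute that gives the best result" concrete, the sketch below scores a nominal attribute by information gain. This is an assumption for illustration: the paper does not state the split criterion, and J48 (WEKA's implementation of C4.5) uses the related gain ratio. The attribute values and class labels in the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on nominal attribute index `attr`."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(p) / len(labels) * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder

# Hypothetical toy data: (distance to road, population level) -> class.
rows = [("near", "low"), ("near", "high"), ("far", "low"), ("far", "high")]
labels = ["loss", "loss", "stable", "stable"]
gain_road = information_gain(rows, labels, 0)        # splits the classes perfectly
gain_population = information_gain(rows, labels, 1)  # tells us nothing
```

A tree learner would place the attribute with the larger gain at the node and recurse on each subset.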
Random Forest is an ensemble of classification or regression
trees, each induced from a bootstrap sample of the training
data. It uses random feature selection during the induction
process. It gives comparatively better results than CART and
C4.5.
It shows better performance after modelling the result.
The disadvantages of DT are its handling of continuous
attributes and its computational cost as the tree size
grows. According to a comparison of different classification
methods in emotion recognition, Random Forest is the best
classifier in that group with 5 attributes. The results are
compared and shown in Table 2 and Figure 2.
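The three ingredients named above (bootstrap samples, random feature selection, and combining the trees) can be sketched as below. This is only an outline under stated assumptions: tree induction itself is omitted, the helper names and the subset size of 2 are hypothetical, and the figures of 405 samples and 5 attributes are taken from the text.

```python
import random
from collections import Counter

def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_feature_subset(n_features, subset_size, rng):
    """Pick the candidate features considered at one tree node."""
    return sorted(rng.sample(range(n_features), subset_size))

def majority_vote(predictions):
    """Combine per-tree class predictions into the forest's prediction."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)
sample = bootstrap_sample(405, rng)          # training indices for one tree
features = random_feature_subset(5, 2, rng)  # features tried at one node
forest_prediction = majority_vote(["loss", "stable", "loss"])
```

Because sampling is with replacement, some instances appear several times in `sample` and others not at all, which is what makes the individual trees differ.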
3.2 Artificial Neural Networks
3.2.1 MLP
Artificial Neural Network (ANN) is a common
classification method in data mining. The Neural Network
based classifiers Multi Layer Perceptron (MLP) and Radial
Basis Function (RBF) were used in this work. MLP is a
feed-forward network that builds a model mapping input data
to output data. An MLP can include one or more hidden layers
between input and output. It classifies 3 factors gradually.
The accuracy results of MLP are shown in Table 3 and
Figure 3.
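The feed-forward mapping from input to output through hidden layers can be sketched as a forward pass. This is a minimal illustration, assuming sigmoid units throughout; the weights, biases and layer sizes below are hypothetical and training (weight fitting) is not shown.

```python
from math import exp

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer with sigmoid activation."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(inputs, layers):
    """Feed `inputs` through each (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden units -> 1 output unit.
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                     # output layer
]
output = mlp_forward([1.0, 0.0], layers)
```

Adding more `(weights, biases)` pairs to `layers` gives a network with more hidden layers; the forward pass is unchanged.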
3.2.2 RBF
RBF is another type of ANN. Its hidden layer applies
nonlinear radial basis functions to the input, and its
output layer is linear. Here it hides several non-sequential
values because of random input. It shows a good result for
the years 1990 and 2010, but not for 2000, so this algorithm
is not suitable for the given datasets. The RBF network is
divided into two feed-forward layers; its results are shown
in Table 4 and Figure 4.
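The two feed-forward layers of an RBF network can be sketched as follows. In the usual formulation the hidden layer is nonlinear (Gaussian units are assumed here) and the output layer is a linear combination of the hidden activations. The centers, widths and weights are hypothetical values chosen for illustration; in practice they are fitted to the training data.

```python
from math import exp

def gaussian_rbf(x, center, width):
    """Hidden-unit activation: decays with squared distance from the center."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return exp(-sq_dist / (2.0 * width ** 2))

def rbf_network(x, centers, widths, out_weights, out_bias):
    """Nonlinear hidden layer followed by a linear output layer."""
    hidden = [gaussian_rbf(x, c, w) for c, w in zip(centers, widths)]
    return sum(wt * h for wt, h in zip(out_weights, hidden)) + out_bias

# Hypothetical network with two hidden units over two input attributes.
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
score = rbf_network([1.0, 1.0], centers, widths, out_weights=[-1.0, 1.0], out_bias=0.0)
```

An input close to a unit's center activates that unit strongly, so `score` here is dominated by the second unit.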
3.3 Bayesian Methods
Bayesian methods are one of the solutions for
classification in data mining. In our work, Naive
Bayes is implemented for classification. It follows an
independent feature model with strong independence
assumptions. This method is applicable to statistical
data.
Classification is done appropriately for the attributes
Crop and Built up. It classifies these and shows an
improvement in performance of up to 10% for all the years.
This method shows good results for two attributes, as
displayed in Table 5 and Figure 5.
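The independent feature model can be illustrated with a small Naive Bayes classifier for nominal attributes. This is a sketch under stated assumptions, not the WEKA implementation: Laplace smoothing is added here to avoid zero probabilities, and the toy rows (levels of Crop and Built up) and class labels are hypothetical.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attr index, class) -> value counts
    values_seen = defaultdict(set)       # attr index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen

def predict_naive_bayes(model, row):
    """Return the class with the highest prior times product of likelihoods."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(row):
            # Laplace-smoothed P(value | class); attributes treated as independent.
            score *= (value_counts[(i, label)][value] + 1) / (count + len(values_seen[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: (crop level, built-up level) -> class.
rows = [("high", "high"), ("high", "low"), ("low", "high"), ("low", "low")]
labels = ["loss", "loss", "stable", "stable"]
model = train_naive_bayes(rows, labels)
prediction = predict_naive_bayes(model, ("high", "low"))
```

The per-attribute probabilities are simply multiplied, which is exactly the strong independence assumption described above.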
3.4 Rule Based Classification
Decision trees can be translated into a set of rules by
creating a separate rule for each path from the root to a leaf
in the tree. However, rules can also be directly induced from
training data using a variety of rule-based algorithms.
Classification accuracy of rule learning algorithms can be
improved by combining features using the background
knowledge of the user or automatic feature construction
algorithms.
3.4.1 1R Algorithm
OneR (One Rule) is a simple yet accurate classification
algorithm that generates one rule for each predictor in the
data and then selects the rule with the smallest total error
as its "one rule". It is the simplest rule-based
classification learning algorithm for discrete attributes.
It shows a gradual increase in the years 1990 and 2000 for
all the attributes but no change for the year 2010, because
OneR produces rules that are only slightly less accurate.
The result of this method is shown in Table 6 and Figure 6.
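The procedure just described (one rule per predictor, keep the one with the smallest total error) is short enough to sketch in full. The function name `one_r` and the toy data are hypothetical, and only nominal attributes are handled.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Return (attribute index, value -> class rule, training errors) for the best attribute."""
    best = None
    for attr in range(len(rows[0])):
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[attr]][label] += 1
        # Each attribute value predicts its most frequent class.
        rule = {v: counts.most_common(1)[0][0] for v, counts in by_value.items()}
        # Errors are the instances that fall outside the majority class of their value.
        errors = sum(sum(counts.values()) - max(counts.values())
                     for counts in by_value.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best

# Hypothetical toy data: (distance to road, population level) -> class.
rows = [("near", "low"), ("near", "high"), ("far", "low"), ("far", "high"), ("far", "high")]
labels = ["loss", "loss", "stable", "stable", "loss"]
attr, rule, errors = one_r(rows, labels)
```

On this toy data the first attribute wins with one training error, against two for the second attribute.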
3.4.2 Prism
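Prism is a rule-induction algorithm that follows a covering
(separate-and-conquer) strategy. For each class in turn, it
builds a rule by repeatedly adding the attribute-value test
that maximizes the proportion of covered instances belonging
to that class, until the rule covers only instances of that
class; the covered instances are then removed and the
process is repeated. It is distinct from Prim's algorithm,
the greedy method that finds a minimum spanning tree of a
connected weighted undirected graph. In our experiments it
reflects the same result for all three decades, as shown in
Table 7 and Figure 7.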
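As a rule-based classifier, WEKA's Prism is a covering rule learner, and its core loop can be sketched as below. This is a simplified illustration rather than the WEKA code: it handles nominal attributes only, induces rules for a single target class, and breaks ties by the number of target instances covered. The function name and toy data are hypothetical.

```python
def prism_rules_for_class(rows, labels, target):
    """Induce rules (lists of (attr, value) tests) that cover all `target` instances."""
    remaining = list(zip(rows, labels))
    rules = []
    while any(label == target for _, label in remaining):
        covered, tests = remaining, []
        # Grow one rule until it covers only the target class or attributes run out.
        while any(label != target for _, label in covered) and len(tests) < len(rows[0]):
            used = {a for a, _ in tests}
            candidates = {(a, row[a]) for row, _ in covered
                          for a in range(len(row)) if a not in used}

            def precision(test):
                hits = [label for row, label in covered if row[test[0]] == test[1]]
                return (hits.count(target) / len(hits), hits.count(target))

            best = max(sorted(candidates), key=precision)
            tests.append(best)
            covered = [(r, l) for r, l in covered if r[best[0]] == best[1]]
        rules.append(tests)
        # Separate: drop the instances this rule covers, then look for the next rule.
        remaining = [pair for pair in remaining if pair not in covered]
    return rules

# Hypothetical toy data: (distance to road, population level) -> class.
rows = [("near", "low"), ("near", "high"), ("far", "low"), ("far", "high")]
labels = ["loss", "loss", "stable", "stable"]
loss_rules = prism_rules_for_class(rows, labels, "loss")
```

Here a single test on the first attribute is enough to cover every "loss" instance, so one rule is returned.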
4. COMPARATIVE RESULTS
Machine learning techniques such as Naïve Bayes,
Bayes network, J48, Random Forest, Multi Layer Perceptron
(MLP), Radial Basis Function (RBF), OneR and PRISM
were used for simulation. We split our original dataset
of 405 samples into 66% for training and the remaining
34% for testing. WEKA also incorporates k-fold
cross-validation, in which the original sample is randomly
partitioned into k subsamples. The cross-validation process
is then repeated k times, with each subsample used once for
testing. Here, we have used 10-fold cross-validation. Kappa
gives a numerical rating on a scale of -1 to 1, where 1 is
perfect agreement, 0 is the agreement expected by chance,
and negative values indicate agreement less than chance.
Comparatively, PRISM gives the best result and shows the
degradation of forest. The correctly classified instances
for the various years are shown in Tables 8, 9 and 10, and
the accuracy rates are displayed in Figures 8, 9 and 10.
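The kappa rating described above can be computed from the actual and predicted class labels as sketched here. This is Cohen's kappa in its standard form; the function name and the example label sequences are hypothetical, with the class values 1, -1 and 0 following the coding mentioned in the conclusions.

```python
from collections import Counter

def cohen_kappa(actual, predicted):
    """Agreement between actual and predicted labels, corrected for chance."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_freq, predicted_freq = Counter(actual), Counter(predicted)
    # Agreement expected if predictions were independent of the true labels.
    expected = sum(actual_freq[c] * predicted_freq[c] for c in actual_freq) / (n * n)
    # Undefined (division by zero) when both sequences contain a single class.
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels using the 1 / -1 / 0 class coding.
actual = [1, 1, -1, -1, 0, 0]
predicted = [1, 1, -1, 0, 0, 0]
kappa = cohen_kappa(actual, predicted)  # approximately 0.75
```

Five of the six instances agree (observed 5/6) while chance alone would give 1/3, which yields a kappa of about 0.75.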
Fig. 8 - 1990
Fig. 9 - 2000
Fig. 10 - 2010
5. CONCLUSIONS AND FUTURE WORK
In this paper, four kinds of classifiers, namely Neural
Network, Naïve Bayesian, Rule Based and Decision Tree, were
tested to determine deforestation from the dataset of
demographic factors. All the results were classified as 1,
-1 or 0. There are many different mining and classification
algorithms, and many parameter settings in each algorithm.
The experimental results in this paper are based on the
default settings. Extensive experiments with different
settings can be carried out in WEKA. J48 is a very simple
classifier for building a decision tree, but it gave
invariable results in the experiment. The Naïve Bayesian
classifier also showed good results only for two
attributes (Crop and Built up), but RBF classified another
two attributes (Industry and Road) properly. Rule Based
classifiers such as OneR and PRISM also showed better
results than J48 or the Naïve Bayesian classifier.
From this experiment, we find that a simple Random
Forest classifier can provide the best classification result
for deforestation (except for one attribute). It is planned
to incorporate other techniques such as different ways of
feature selection and classification using ontology.

More Related Content

PDF
Data mining techniques
PDF
Performance Evaluation of Different Data Mining Classification Algorithm and ...
PDF
IRJET- Missing Data Imputation by Evidence Chain
PDF
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
PDF
IRJET- A Detailed Study on Classification Techniques for Data Mining
PDF
A ROBUST MISSING VALUE IMPUTATION METHOD MIFOIMPUTE FOR INCOMPLETE MOLECULAR ...
PDF
Hypothesis on Different Data Mining Algorithms
PDF
11.software modules clustering an effective approach for reusability
Data mining techniques
Performance Evaluation of Different Data Mining Classification Algorithm and ...
IRJET- Missing Data Imputation by Evidence Chain
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY
IRJET- A Detailed Study on Classification Techniques for Data Mining
A ROBUST MISSING VALUE IMPUTATION METHOD MIFOIMPUTE FOR INCOMPLETE MOLECULAR ...
Hypothesis on Different Data Mining Algorithms
11.software modules clustering an effective approach for reusability

What's hot (20)

PDF
G046024851
PDF
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
PDF
Ijetcas14 338
PDF
IRJET- Evidence Chain for Missing Data Imputation: Survey
PDF
A02610104
PDF
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
PDF
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...
PDF
Data Analysis and Prediction System for Meteorological Data
PDF
MULTI-PARAMETER BASED PERFORMANCE EVALUATION OF CLASSIFICATION ALGORITHMS
PDF
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
PDF
Analysis on Data Mining Techniques for Heart Disease Dataset
PDF
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
PDF
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
PDF
C LUSTERING B ASED A TTRIBUTE S UBSET S ELECTION U SING F AST A LGORITHm
PDF
Research scholars evaluation based on guides view
PDF
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...
PDF
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
PDF
Research scholars evaluation based on guides view using id3
G046024851
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
Ijetcas14 338
IRJET- Evidence Chain for Missing Data Imputation: Survey
A02610104
EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORM...
Gene Selection Based on Rough Set Applications of Rough Set on Computational ...
Data Analysis and Prediction System for Meteorological Data
MULTI-PARAMETER BASED PERFORMANCE EVALUATION OF CLASSIFICATION ALGORITHMS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
Analysis on Data Mining Techniques for Heart Disease Dataset
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
SURVEY ON CLASSIFICATION ALGORITHMS USING BIG DATASET
C LUSTERING B ASED A TTRIBUTE S UBSET S ELECTION U SING F AST A LGORITHm
Research scholars evaluation based on guides view
K-Medoids Clustering Using Partitioning Around Medoids for Performing Face Re...
CLUSTERING DICHOTOMOUS DATA FOR HEALTH CARE
Research scholars evaluation based on guides view using id3
Ad

Viewers also liked (20)

PDF
Localization based range map stitching in wireless sensor network under non l...
PDF
Umts femto access point for higher data rate and better quality of service to...
PDF
Design of passengers vehicle body on fire accidents
PDF
Cross language information retrieval in indian
PDF
Radiation pattern of yagi uda antenna using usrp on gnu
PDF
Compact quasi self complementary elliptical disc
PDF
Redeeming of processor for cyber physical systems
PDF
Websites using touchless interaction having graphical
PDF
Use of cloud federation without need of identity federation using dynamic acc...
PDF
Dorsal hand vein pattern authentication by hough peaks
PDF
Enhancement in power delay product by driver and interconnect optimization
PDF
Design of floating handlebar suspension
PDF
Process monitoring, controlling and load management system in an induction motor
PDF
A comparative study on the customer perception of the
PDF
Conceptual design of injection mould tool for inlet
PDF
Comparative study of slot loaded rectangular and triangular microstrip array ...
PDF
Natural and calcined clayey diatomite as cement replacement materials microst...
PDF
Product quality improved using triz a case study in increasing innovative opt...
PDF
Investigation of various parameters on the
PDF
The green aggregates for sustainable development in construction industry
Localization based range map stitching in wireless sensor network under non l...
Umts femto access point for higher data rate and better quality of service to...
Design of passengers vehicle body on fire accidents
Cross language information retrieval in indian
Radiation pattern of yagi uda antenna using usrp on gnu
Compact quasi self complementary elliptical disc
Redeeming of processor for cyber physical systems
Websites using touchless interaction having graphical
Use of cloud federation without need of identity federation using dynamic acc...
Dorsal hand vein pattern authentication by hough peaks
Enhancement in power delay product by driver and interconnect optimization
Design of floating handlebar suspension
Process monitoring, controlling and load management system in an induction motor
A comparative study on the customer perception of the
Conceptual design of injection mould tool for inlet
Comparative study of slot loaded rectangular and triangular microstrip array ...
Natural and calcined clayey diatomite as cement replacement materials microst...
Product quality improved using triz a case study in increasing innovative opt...
Investigation of various parameters on the
The green aggregates for sustainable development in construction industry
Ad

Similar to Comparative study of various supervisedclassification methodsforanalysing deforestation factors (20)

PPT
Data Mining In Market Research
PPT
Data Mining in Market Research
PPT
Data Mining In Market Research
 
PDF
IRJET- A Comparative Research of Rule based Classification on Dataset using W...
PDF
PREDICTING PERFORMANCE OF CLASSIFICATION ALGORITHMS
PDF
Predicting performance of classification algorithms
PDF
Comprehensive Survey of Data Classification & Prediction Techniques
PDF
Assessment of Decision Tree Algorithms on Student’s Recital
PDF
A Survey of Modern Data Classification Techniques
PDF
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
PDF
IJCSI-10-6-1-288-292
PDF
A Survey Ondecision Tree Learning Algorithms for Knowledge Discovery
PDF
Classification Techniques: A Review
PDF
J48 and JRIP Rules for E-Governance Data
PDF
Distributed Digital Artifacts on the Semantic Web
PDF
A Decision Tree Based Classifier for Classification & Prediction of Diseases
PDF
Survey on Various Classification Techniques in Data Mining
PPTX
Data mining technique (decision tree)
PDF
Gloeocercospora sorghiGloeocercospora sorghi
PDF
A Method for Vibration Testing Decision Tree-Based Classification Systems.
Data Mining In Market Research
Data Mining in Market Research
Data Mining In Market Research
 
IRJET- A Comparative Research of Rule based Classification on Dataset using W...
PREDICTING PERFORMANCE OF CLASSIFICATION ALGORITHMS
Predicting performance of classification algorithms
Comprehensive Survey of Data Classification & Prediction Techniques
Assessment of Decision Tree Algorithms on Student’s Recital
A Survey of Modern Data Classification Techniques
Analysis of Bayes, Neural Network and Tree Classifier of Classification Techn...
IJCSI-10-6-1-288-292
A Survey Ondecision Tree Learning Algorithms for Knowledge Discovery
Classification Techniques: A Review
J48 and JRIP Rules for E-Governance Data
Distributed Digital Artifacts on the Semantic Web
A Decision Tree Based Classifier for Classification & Prediction of Diseases
Survey on Various Classification Techniques in Data Mining
Data mining technique (decision tree)
Gloeocercospora sorghiGloeocercospora sorghi
A Method for Vibration Testing Decision Tree-Based Classification Systems.

More from eSAT Publishing House (20)

PDF
Likely impacts of hudhud on the environment of visakhapatnam
PDF
Impact of flood disaster in a drought prone area – case study of alampur vill...
PDF
Hudhud cyclone – a severe disaster in visakhapatnam
PDF
Groundwater investigation using geophysical methods a case study of pydibhim...
PDF
Flood related disasters concerned to urban flooding in bangalore, india
PDF
Enhancing post disaster recovery by optimal infrastructure capacity building
PDF
Effect of lintel and lintel band on the global performance of reinforced conc...
PDF
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
PDF
Wind damage to buildings, infrastrucuture and landscape elements along the be...
PDF
Shear strength of rc deep beam panels – a review
PDF
Role of voluntary teams of professional engineers in dissater management – ex...
PDF
Risk analysis and environmental hazard management
PDF
Review study on performance of seismically tested repaired shear walls
PDF
Monitoring and assessment of air quality with reference to dust particles (pm...
PDF
Low cost wireless sensor networks and smartphone applications for disaster ma...
PDF
Coastal zones – seismic vulnerability an analysis from east coast of india
PDF
Can fracture mechanics predict damage due disaster of structures
PDF
Assessment of seismic susceptibility of rc buildings
PDF
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
PDF
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...
Likely impacts of hudhud on the environment of visakhapatnam
Impact of flood disaster in a drought prone area – case study of alampur vill...
Hudhud cyclone – a severe disaster in visakhapatnam
Groundwater investigation using geophysical methods a case study of pydibhim...
Flood related disasters concerned to urban flooding in bangalore, india
Enhancing post disaster recovery by optimal infrastructure capacity building
Effect of lintel and lintel band on the global performance of reinforced conc...
Wind damage to trees in the gitam university campus at visakhapatnam by cyclo...
Wind damage to buildings, infrastrucuture and landscape elements along the be...
Shear strength of rc deep beam panels – a review
Role of voluntary teams of professional engineers in dissater management – ex...
Risk analysis and environmental hazard management
Review study on performance of seismically tested repaired shear walls
Monitoring and assessment of air quality with reference to dust particles (pm...
Low cost wireless sensor networks and smartphone applications for disaster ma...
Coastal zones – seismic vulnerability an analysis from east coast of india
Can fracture mechanics predict damage due disaster of structures
Assessment of seismic susceptibility of rc buildings
A geophysical insight of earthquake occurred on 21 st may 2014 off paradip, b...
Effect of hudhud cyclone on the development of visakhapatnam as smart and gre...

Recently uploaded (20)

PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PPTX
web development for engineering and engineering
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PPT
Project quality management in manufacturing
PPTX
Lecture Notes Electrical Wiring System Components
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PPTX
OOP with Java - Java Introduction (Basics)
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPTX
Strings in CPP - Strings in C++ are sequences of characters used to store and...
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PPTX
Welding lecture in detail for understanding
PPTX
additive manufacturing of ss316l using mig welding
PPTX
Sustainable Sites - Green Building Construction
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PDF
composite construction of structures.pdf
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
Operating System & Kernel Study Guide-1 - converted.pdf
web development for engineering and engineering
Foundation to blockchain - A guide to Blockchain Tech
Project quality management in manufacturing
Lecture Notes Electrical Wiring System Components
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
OOP with Java - Java Introduction (Basics)
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
Strings in CPP - Strings in C++ are sequences of characters used to store and...
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
Embodied AI: Ushering in the Next Era of Intelligent Systems
Welding lecture in detail for understanding
additive manufacturing of ss316l using mig welding
Sustainable Sites - Green Building Construction
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
composite construction of structures.pdf
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
Mitigating Risks through Effective Management for Enhancing Organizational Pe...

Comparative study of various supervisedclassification methodsforanalysing deforestation factors

  • 1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Special Issue: 07 |May-2014, Available @ http://www.ijret.org 816 COMPARATIVE STUDY OF VARIOUS SUPERVISEDCLASSIFICATION METHODSFORANALYSING DEFORESTATION FACTORS S.P.Rajagopalan1 , C.Lalitha2 1 Professor, G.K.M College of Engineering &Technology, Chennai, TamilNadu 2 Research Scholar, Vels University, Chennai, TamilNadu Abstract In this paper, various supervised classification techniques are compared and the results are demonstrated. Here the classification techniques like Decision tree, Bayesian method, Neural Networks and Rule Based method are discussed with regards to the data sets given. The population, built up and agriculture are the factors that play a vital role in the development of country which directly affects economic condition. In this paper, factors such as road , population, built up development ,agriculture and industry are considered as drivers of deforestation in the study area which is located in the Erode District of TamilNadu, India. Keywords: Supervised Classification, Decision tree, Bayesian method, Neural Network. --------------------------------------------------------------------***------------------------------------------------------------------ 1. INTRODUCTION There are two types of classification, supervised and unsupervised. Supervised methods classify the data which is known and observed by the user specifically. Unsupervised methods are classified unknowably. The results are obtained with the given data sets by using the WEKA. The aim of this paper is to compare the various supervised methods by using the factors such as demographic, built up, road and agriculture. The primary methods used in Data mining are Data selection, data reduction and filtration. 
Data mining examines and discovers various algorithms under several computational efficiency. It integrates machine learning, pattern recognition, statistics, databases, and visualization techniques into one so that the information can be extracted from the large databases. The tasks of data mining are association rules mining, classification, prediction and cluster analysis. Generally speaking, association rule mining and classification rule mining are the most effective and efficient techniques in data mining. Classification rule mining is used for the prediction future objects whose class label is not known. Recently it has been determined that primary factor for the degradation of ecosystem is deforestation of forests. Classification results are basis for interpretation, analysis and modelling for various environmental and socio- economic applications. Data mining techniques can be applied for generating the class association rules for analysing the deforestation. In this paper, we applied various supervised classification techniques with our data sets. 2. WEKA The Waikato Environment for Knowledge Analysis (WEKA) is a tool for machine learning algorithms which can be used for classification and clustering. In this paper, we use decision tree methods like J48 and Random forest. Bayes algorithm such as Naive Bayes and Neural Network methods like Multilayer perceptron are implemented in WEKA. We divided our data set into 10 cross validation folds and all the methods are tested and compared according to the data. 3. CLASSIFICATION METHODS 3.1 Decision Tree Decision Tree (DT) worksaccording to the processing and deciding upon attributable data. Here attributes in DT are considered as nodes and each leaf node as a class. J48 and Random Forest were used in our experiments. It follows a recursivemethod for a given set of data.It searches the attributes as Depth-first strategy. 
It divides the class into several nodes and tests each node that gives the best result.It classifies the datasets invariably .This method is not suitable for finding anddid not show good results to the given datasets.Accuracy value of J48 methods is compared and shown in Table 1 and Figure1.
  • 2. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Special Issue: 07 |May-2014, Available @ http://www.ijret.org 817 Random forest is also called regression trees that induces the data from bootstrap samples of the training data.It uses random feature selection by induction process. Itis comparatively gives better result than CART and C4.5. It shows a better performance, after modelling the result. The disadvantages of DT are focus on continues attributes, computational efficiently with growing tree size. According to comparison provided for different classification methods in emotion recognition, Random Forest is the best classifier method on that group with 5 attributes and the results are compared and shown in Table 2 and Figure 2. 3.2 Artificial Neural Networks 3.2.1 MLP Artificial Neural Network (ANN) is the common classification methods in data mining. Neural Network based classifiers, Multi Layer Perceptron (MLP) and Radial Base Function (RBF)were used in this work. MLP is a feed forward network that makes a model to map input data to output data. Hidden layer in MLP can include various layers between input and output. It classifies 3 factors gradually. The accuracy result of MLP is shown in Table 3 and Figure 3. 3.2.2RBF RBF is another type on ANN. The input is linear and the output is nonlinear. Here it hides several non sequential values because of random input. At first it shows a good result for the year 1990 and 2010, but not for 2000.So that this algorithm is not suitable for the given datasets. The RBF networks are divided in two feed-forward layer is shown in Table 4 and Figure 4.
3.3 Bayesian Methods
Bayesian methods are one solution to the classification problem in data mining. In our work, Naive Bayes is implemented for classification. It follows an independent feature model with strong independence assumptions. This method is applicable to statistical data. Classification is done appropriately for the attributes Crop and Built up. It shows better performance, up to 10%, for all the years. This method shows good results for two attributes, as displayed in Table 5 and Figure 5.
3.4 Rule Based Classification
Decision trees can be translated into a set of rules by creating a separate rule for each path from the root to a leaf in the tree. However, rules can also be directly induced from training data using a variety of rule-based algorithms. The classification accuracy of rule learning algorithms can be improved by combining features using the background knowledge of the user or automatic feature construction algorithms.
3.4.1 1R Algorithm
OneR, short for One Rule, is a simple yet accurate classification algorithm that generates one rule for each predictor in the data and then selects the rule with the smallest total error as its "one rule". It is the simplest rule-based classification learning algorithm for discrete attributes. It shows a gradual increase in the years 1990 and 2000 for all the attributes but suddenly no change for the year 2010, because OneR produces rules only slightly less accurate. The result of this method is shown in Table 6 and Figure 6.
3.4.2 Prism
Prism is a rule-induction algorithm that follows a greedy covering strategy. For each class, it builds a rule by repeatedly adding the attribute-value test that most increases the proportion of covered instances belonging to that class, until the rule covers instances of that class only. The covered instances are then removed and the process is repeated until every instance of the class is covered. It reflects the same result for all three decades, as shown in Table 7 and Figure 7.
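The 1R procedure described in Section 3.4.1 can be sketched as follows, assuming nominal attributes and a plain training-error count. This is an illustrative simplification rather than WEKA's OneR implementation; the function name `one_r`, the attribute values and the class labels are hypothetical.

```python
from collections import Counter, defaultdict


def one_r(rows, labels):
    """Return (attribute, rule, errors) for the one-attribute rule with the fewest training errors."""
    best = None
    for attr in rows[0]:
        # Count class labels for every value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        # The rule maps each attribute value to its majority class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Every instance outside the majority class of its value is an error.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical toy data using two of the paper's attribute names.
rows = [
    {"crop": "increase", "built_up": "increase"},
    {"crop": "increase", "built_up": "stable"},
    {"crop": "stable", "built_up": "increase"},
    {"crop": "stable", "built_up": "stable"},
    {"crop": "increase", "built_up": "increase"},
]
labels = ["loss", "loss", "no_loss", "no_loss", "loss"]

attribute, rule, errors = one_r(rows, labels)
```

On this toy data the rule built on `crop` makes no training errors while the rule built on `built_up` makes two, so `crop` becomes the "one rule".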
4. COMPARATIVE RESULTS
Machine learning techniques such as Naïve Bayes, Bayes network, J48, Random Forest, Multi Layer Perceptron (MLP), Radial Basis Function (RBF), OneR and PRISM were used for simulation. We split our original dataset of 405 samples into 66% for training and the remaining 34% for testing. WEKA incorporates k-fold cross-validation, in which the original sample is randomly partitioned into k subsamples. The cross-validation process is then repeated k times, with each subsample used once as the test data. Here, we have used 10-fold cross-validation. The kappa statistic gives a numerical rating on a scale from -1 to 1, where 1 is perfect agreement, 0 is the agreement expected by chance, and negative values indicate agreement lower than chance. Comparatively, PRISM gives the best result and shows the degradation of forest. The correctly classified instances for the various years are shown in Tables 8, 9 and 10, and the accuracy rates are displayed in Figures 8, 9 and 10.
Fig. 8: 1990
Fig. 9: 2000
Fig. 10: 2010
5. CONCLUSIONS AND FUTURE WORK
In this paper, four classifiers, namely Neural Network, Naïve Bayesian, Rule Based and Decision Tree, were tested to determine deforestation from the dataset of demographic factors. All the results were classified as 1, -1 or 0. There are many different mining and classification algorithms, and many parameter settings in each algorithm. The experimental results in this paper are based on the default settings. Extensive experiments with different settings are possible in WEKA. J48 is a very simple classifier for building a decision tree, but it gives an invariable result in the experiment. The Naïve Bayesian classifier showed good results only for two attributes (Crop and Built up), whereas RBF classified another two attributes (Industry and Road) properly. Rule Based classifiers such as OneR and PRISM showed better results than the J48 or Naïve Bayesian classifiers. From this experiment, we find that a simple Random Forest classifier can provide the best classification result for deforestation (except for one attribute). It is planned to incorporate other techniques, such as different ways of feature selection and classification using ontology.
REFERENCES
[1] Peiman Mamani Barnaghi, Vahid Alizadeh Sahzabi and Azuraliza Abu Bakar, "A Comparative Study for Various Methods of Classification", 2012 International Conference on Information and Computer Networks (ICICN 2012).
[2] Linlin Xu, Jonathan Li and Alexander Brenning, "A comparative study of different classification techniques for marine oil spill identification using RADARSAT-1 imagery", Remote Sensing of Environment 141 (2014) 14–23.
[3] Elaine Astrand, Pierre Enel, Guilhem Ibos, Peter Ford Dominey, Pierre Baraduc and Suliann Ben Hamed, "Comparison of Classifiers for Decoding Sensory and Cognitive Information from Prefrontal Neuronal Populations", PLOS ONE, www.plosone.org.
[4] S. B. Kotsiantis, "Supervised Machine Learning: A Review of Classification Techniques", Informatica 31 (2007) 249-268.
[5] Archana Chaudhary, Savita Kolhe and Raj Kamal, "Machine Learning Classification Techniques: A Comparative Study", International Journal on Advanced Computer Theory and Engineering (IJACTE), ISSN (Print): 2319 – 2526, Volume-2, Issue-4, 2013.
[6] Haowen You and George Rumbe, "Comparative Study of Classification Techniques on Breast Cancer FNA Biopsy Data", International Journal of Artificial Intelligence and Interactive Multimedia, Vol. 1, Nº 3.
[7] Peiman Mamani Barnaghi, Vahid Alizadeh Sahzabi and Azuraliza Abu Bakar, "A Comparative Study for Various Methods of Classification", 2012 International Conference on Information and Computer Networks (ICICN 2012), IPCSIT vol. 27 (2012), © (2012) IACSIT Press, Singapore.
[8] Gaya Buddhinath and Damien Derry, "A Simple Enhancement to One Rule Classification".
[9] Seongwook Youn and Dennis McLeod, "A Comparative Study for Email Classification".
[10] K. Saritha, S. Jyothi and K. R. Manjula, "Class association rule mining for analyzing deforestation factors", International Journal of Civil, Structural, Environmental and Infrastructure Engineering Research and Development (IJCSEIERD), ISSN(P): 2249-6866; ISSN(E): 2249-7978, Vol. 3, Issue 5, Dec 2013, 237-248.
[11] K. R. Manjula, Prof. S. Jyothi and S. Anand Kumar Varma, "Analysing the factors of deforestation using GIS", www.geospatialworld.net/paper/application/ArticleView.aspx?aid=30273