SlideShare a Scribd company logo
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 429
IMPLEMENTATION OF REDUCING FEATURES TO IMPROVE CODE
CHANGE BASED BUG PREDICTION BY USING COS-TRIAGE
ALGORITHM
Veena Jadhav1
, Vandana Gaikwad2
, Netra Patil3
1
Student, Computer Engineering Department, BVDUCOE Pune-43, Maharashtra, India
2
Assistant Professor, Computer Engineering Department, BVDUCOE Pune-43, Maharashtra, India
3
Assistant Professor, Computer Engineering Department, BVDUCOE Pune-43, Maharashtra, India
Abstract
Today, we are getting plenty of bugs in the software because of variations in the software and hardware technologies. Bugs are
nothing but Software faults, existing a severe challenge for system reliability and dependability. To identify the bugs from the
software bug prediction is convenient approach. To visualize the presence of a bug in a source code file, recently, Machine
learning classifiers approach is developed. Because of a huge number of machine learned features current classifier-based bug
prediction have two major problems i) inadequate precision for practical usage ii) measured prediction time. In this paper we
used two techniques first, cos-triage algorithm which have a go to enhance the accuracy and also lower the price of bug
prediction and second, feature selection methods which eliminate less significant features. Reducing features get better the quality
of knowledge extracted and also boost the speed of computation.
Keywords: Efficiency, Bug Prediction, Classification, Feature Selection, Accuracy.
--------------------------------------------------------------------***----------------------------------------------------------------------
1. INTRODUCTION
Classifiers such as SVM and navies byes are accomplished
on previous software project information, also it can be
expended to estimate the presence of a bug in an specific
record in software, as complete in previous work[2].
To fix bug cos-triage algorithm is used. Afterward that data
is added in historical data or in log record. Afterward,
classifier is accomplished on data originate in previous log
record. Then, it can be expended to categorize a renewed
variation is either buggy or clean.
Now days, The Eclipse IDE contains a prototype displaying
server which is used to compute bug predictions [4]. It can
aslo offer accurate predictions. Bug prediction system must
offer a lesser number of incorrect variations that are
predicted to be pram but which are very new, if software
developers are to belief [3]. Suppose, if enormous numbers
of improved variations are incorrectly estimated to be pram,
engineers won’t have belief in the bug prediction.
Previous bug prediction methodology and identical work
offered by scientist which employ the removal of “features”
from the prior of variations created to a software system[2].
These features comprise completely distributed by blank
space in the code and that all were included or rejected in a
variation. Henceforth, to prepare the SVM classifier entirely
variables, comment words, numerical operatives, name of
methods, and development language keywords are used as
features which is existing in this paper.
Cost of large feature set is tremendously high. Because of
multiple interactions and noise classifiers cannot handle
such an enormous feature set. As well as number of features
increases time also increases and expanding to a number of
seconds per categorization for tens of thousands of features,
and minutes for enormous project data accounts.
Henceforth, this will influences the availability of a bug
prediction scheme.
There are several classification approaches could be
working. But in this paper we used cos-triage algorithm and
SVM.
1) This paper contributes three aspects:
2) Study of various feature selection methods to
categorize bugs in software program variations.
3) Use of cos-triage algorithm to increase accuracy and
reduce cost of bug prediction.
The rest of this paper is prepared as follows: In Section 2
primary steps involved in performing change classification
is presented. Also this section discuss about feature selection
techniques in more detail. Section 3 discuss prior work.
Section 4 contains system overview for proposed system.
Finally a conclusion is made in section 5.
2. CHANGE CLASSIFICATION PROCESS
Following steps are concerned in executing Various
categorization on distinct project. Creating a Corpus:
1) Change deltas are mining from the log records of a
project, as kept in its SCM repository.
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 430
2) For each file bug fix variations are known by observing
keywords in SCM variation log messages.
3) At the record level without bug and buggy variations are
known by finding reverse in the review log record from bug
resolve.
4) Features contain both buggy and clean and that mined
from all variations. In Complete source code all expressions
are considered as features. Features are nothing but the lines
fitted in each variation, meta-data such as author, time etc.
we can calculate complexity metrics at this stage.
5) In this paper to compute a reduced set of features
combination of wrapper and filter approaches are used. In
this paper four filter based rankers are used. i) Relief-F. ii)
Chi-Squared iii) Gain Ratio iv) Significance. The wrapper
methods are depends on the SVM classifiers.
6) Classification model is accomplished by using reduced
set.
7) Skilled classifier is set to use. Whether a original
variation is more analogous to a buggy variation or a clean
variation is proven by classifier.
2.1 Detection of Buggy and Clean Changes
For bug prediction training data is fixed and used. Mining
variation log records is used to determine bug introducing
variations and to identify bug fixes. First is Examining for
keywords in old revision such as “Set” “Bug” or other
keywords possible to develop in a bug repair.second,
Examining for another situations to bug.
2.2 Feature Extraction
With the help of support vector machine algorithm, using
buggy and clean variations a classification model must be
accomplished which is used to consolidate software
variations.
The entire word in the source code programme distributed
with the help of blank space or a semicolon which is used as
a article .Example of that is method name, keyword,
variable name, comment word, function name and operator.
2.3 Feature Selection Techniques
Huge feature groups need extended working out and
estimation periods, also need lot of memory these are the
requirement to execute classification.
Feature selection is common solution to this problem In
which only the subcategory of structures that are generally
suitable for creating grouping outcomes are truly used.
2.4 Feature Selection Process
A repetitive process of choosing increasingly fewer
important group of features is finished by using Filter and
wrapper approaches.
This process starts by splitting the primary feature set in
partial which reduces processing requirements and memory
for the remaining process.
A classification model is accomplished by using the reduced
feature set. Whether a new variation is more associated to a
buggy variation or a clean variation is decided by classifier.
3. RELATED WORK
A model proposed by scientist Khoshgoftaar and Allen to
group modules corresponding to software quality factors
such as upcoming defect intensity using different phases of
multi-regression[5][6][7]. Ostrand et al. discovered two
model i) The top 20% of problematic records in a project
[11] using upcoming defect predictors ii) A linear regression
model
Totally Ordered Program Units could be transformed into a
half ordered suite list, e.g. by giving the topmost N% of
units as offered by Ostrand et al. Hassan and A caching
algorithm to compute the set of fault-prone modules, called
the topmost-10 list offered by halt [9]. The bug cache
algorithm to envisage upcoming faults built on preceding
fault areas offered by Kim et al.
Fault sessions of the Mozilla project through numerous
releases offered by Gyimothy et al. [11]. With the help of
decision trees and neural networks that use object-oriented
metrics as features. Six different feature selection techniques
when using the Naive Bayes and the C4.5 classifier
considered by Hall and Holmes [12] .There are about 100
features in each dataset. Hall and Holmes launched many of
the feature selection methods.
A general defect prediction framework which encompass a
data pre-processor, feature selection methods, and machine
learning algorithms elaborated by Song et al. [14]. They also
consider that slight variations to data illustration can have a
vast influence on the outcomes of article selection and bug
prediction. Numerous feature selection algorithms to
envisage buggy software modules for a hugeheritage
communications software offered by Gao et al.
Numerous feature selection algorithms to envisage buggy
software units for a huge legacy communications software
offered by Gao et al. There are two methods used. First,
filter based techniques and second, subset selection search
algorithm.
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 431
4. SYSTEM OVERVIEW
Fig -1: System Overview of bug prediction system
Above diagram fig 1. Shows general overview of the
system. Originally, user login/register into the system. It
also encompasses training data/raw data which retains the
logs or history. Then it checks for bug. If bug present in the
system verify and report bug. Then, cos-triage algorithm is
applied on that bug. Cos-triage algorithm supports us for
fixing bug or defects. Then at the back end that output with
bug is used as input to support vector machine (SVM)
classifier .SVM classifier uses different feature selection
methods which is given above. After that by reducing the
features we gets final output without bug. SVM classifier
operate on skilled data/raw data which is kept in log record.
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 432
5. SCREENSHOTS
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 433
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 434
6. GRAPHICAL REPRESENTATION
Fig 2.Graphical representation of bug prediction on projects
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 435
Fig.3: Cos-triage algorithm graph
7. CONCLUSION
This paper has explored algorithm called Cos-triage
algorithm which helps to reduce cost and increase accuracy
of bug fixing or bug prediction. In this paper we used
feature selection method which reduces the quantity of
structures used by a support vector device classifier and
naive byes for bug prediction.
Henceforward, the usage of classifiers with feature selection
will approve fast, correct, more accurate bug predictions, if
software designers have highly developed bug prediction
methods rooted into their software development
environment. Similarly, many algorithm will build in future
which will increase accuracy of bug prediction.
REFERENCES
[1] T.L. Graves, A.F. Karr, J.S. Marron, and H. Siy,
“Predicting Fault Incidence Using Software Change
History,” IEEE Trans. Software Eng., vol. 26, no. 7,
pp. 653-661, July 2000.
[2] S. Kim, E. W. Jr., and Y. Zhang, “Classifying
Software Changes: Clean or Buggy?” IEEE Trans.
Software Eng., vol. 34, no. 2, pp. 181–196,2008.
[3] A.Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S.
Hallem, C.Henri-Gros, A. Kamsky, S. McPeak, and
D.R. Engler, “A Few Billion Lines of Code Later:
Using Static Analysis to Find Bugs in the Real
World,” Comm. ACM, vol. 53, no. 2, pp. 66-75,
2010.
[4] J. Madhavan and E. Whitehead Jr., “Predicting
Buggy Changes Inside an Integrated Development
Environment,” Proc. OOPSLA Workshop Eclipse
Technology eXchange, 2007.
[5] T. Khoshgoftaar and E. Allen, “Predicting the Order
of Fault-Prone Modules in Legacy Software,” Proc.
1998 Int’l Symp. on SoftwareReliability Eng., pp.
344–353, 1998.
[6] R. Kumar, S. Rai, and J. Trahan, “Neural-Network
Techniques for Software-Quality Evaluation,”
Reliability and Maintainability Symposium,1998.
[7] T. Ostrand, E. Weyuker, and R. Bell, “Predicting the
Location and Number of Faults in Large Software
Systems,” IEEE Trans. SoftwareEng., vol. 31, no. 4,
pp. 340–355, 2005.
[8] A.Hassan and R. Holt, “The Top Ten List: Dynamic
Fault Prediction,”Proc. ICSM’05, Jan 2005.
[9] R. Moser, W. Pedrycz, and G. Succi, “A Comparative
Analysis ofthe Efficiency of Change Metrics and
Static Code Attributes for Defect Prediction,” Proc.
30th Int’l Conf. Software Eng., pp. 181-190,2008..
[10] L.C. Briand, W.L. Melo, and J. Wu¨ st, “Assessing
the Applicability of Fault-Proneness Models Across
Object-Oriented Software Projects,” IEEE Trans.
Software Eng., vol. 28, no. 7, pp. 706-720,July 2002.
[11] M. Hall and G. Holmes, “Benchmarking Attribute
Selection Techniques for Discrete Class Data
Mining,” IEEE Trans. Knowledge and Data Eng., vol.
15, no. 6, pp. 1437-1447, Nov./Dec. 2003.
[12] J. Quinlan, C4.5: Programs for Machine Learning.
Morgan Kaufmann,1993.
[13] Q. Song, Z. Jia, M. Shepperd, S. Ying, and J. Liu, “A
General Software Defect-Proneness Prediction
Framework,” IEEE Trans. Software Eng., vol. 37, no.
3, pp. 356-370, May/June 2011.
[14] K. Gao, T. Khoshgoftaar, H. Wang, and N. Seliya,
“Choosing Software Metrics for Defect Prediction:
An Investigation on Feature Selection Techniques,”
Software: Practice and Experience,vol. 41, no. 5, pp.
579-606, 2011.
[15] A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S.
Hallem, C.Henri-Gros, A. Kamsky, S. McPeak, and
D.R. Engler, “A FewBillion Lines of Code Later:
Using Static Analysis to Find Bugs in the Real
World,” Comm. ACM, vol. 53, no. 2, pp. 66-75, 2010
IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 436
BIOGRAPHIES
Miss Veena jadhav M.tech (Computer)
BVDUCOE,
Pune
Prof.Vandana Gaikwad, Assistant Professor, Computer
department, BVDUCOE, Pune
Prof.Netra Patil, Assistant Professor, Computer department
BVDUCOE, Pune

More Related Content

PDF
Genetic algorithm based approach for
PDF
Test case prioritization using firefly algorithm for software testing
PDF
The comparison of the text classification methods to be used for the analysis...
PDF
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
PDF
Bd36334337
PDF
Verification of the protection services in antivirus systems by using nusmv m...
PDF
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUES
PDF
USING CATEGORICAL FEATURES IN MINING BUG TRACKING SYSTEMS TO ASSIGN BUG REPORTS
Genetic algorithm based approach for
Test case prioritization using firefly algorithm for software testing
The comparison of the text classification methods to be used for the analysis...
A Model To Compare The Degree Of Refactoring Opportunities Of Three Projects ...
Bd36334337
Verification of the protection services in antivirus systems by using nusmv m...
AUTOMATED BUG TRIAGE USING ADVANCED DATA REDUCTION TECHNIQUES
USING CATEGORICAL FEATURES IN MINING BUG TRACKING SYSTEMS TO ASSIGN BUG REPORTS

What's hot (19)

PDF
A Hierarchical Feature Set optimization for effective code change based Defec...
PDF
Stil test pattern generation enhancement in mixed signal design
PDF
Finding Bad Code Smells with Neural Network Models
PDF
Artificial Intelligence based Pattern Recognition
PDF
IRJET - Survey on Malware Detection using Deep Learning Methods
PDF
Regression test selection model: a comparison between ReTSE and pythia
PDF
Multi step automated refactoring for code smell
PDF
Multi step automated refactoring for code smell
PDF
Comparative Performance Analysis of Machine Learning Techniques for Software ...
PDF
Threshold benchmarking for feature ranking techniques
PDF
Fuzzy Rule Base System for Software Classification
PDF
Testing embedded system through optimal mining technique (OMT) based on multi...
PDF
IRJET - Neural Network based Leaf Disease Detection and Remedy Recommenda...
PDF
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
PDF
76201929
PDF
Improving the performance of Intrusion detection systems
PDF
Code coverage based test case selection and prioritization
PDF
IRJET- Analysis of PV Fed Vector Controlled Induction Motor Drive
A Hierarchical Feature Set optimization for effective code change based Defec...
Stil test pattern generation enhancement in mixed signal design
Finding Bad Code Smells with Neural Network Models
Artificial Intelligence based Pattern Recognition
IRJET - Survey on Malware Detection using Deep Learning Methods
Regression test selection model: a comparison between ReTSE and pythia
Multi step automated refactoring for code smell
Multi step automated refactoring for code smell
Comparative Performance Analysis of Machine Learning Techniques for Software ...
Threshold benchmarking for feature ranking techniques
Fuzzy Rule Base System for Software Classification
Testing embedded system through optimal mining technique (OMT) based on multi...
IRJET - Neural Network based Leaf Disease Detection and Remedy Recommenda...
IJERD (www.ijerd.com) International Journal of Engineering Research and Devel...
76201929
Improving the performance of Intrusion detection systems
Code coverage based test case selection and prioritization
IRJET- Analysis of PV Fed Vector Controlled Induction Motor Drive
Ad

Viewers also liked (17)

DOCX
resume ivy tech
PDF
Ensighten Certified Professional - Mobile
PDF
Bryan slide
PDF
Ensighten Certified Professional - Manage Implementation
PPTX
ปณิดา เฟื่องฟุ้ง ชั้นม4-8 เลขที่ 24
PDF
Boletín n° 002 nuestras deportistas rumbo a la copa élite de baloncesto en a...
PPT
HMS Hurworth
PPT
stromen van levend water
PPTX
Test
DOCX
Jade Kelley Resume 2015
PDF
Media Advisory
PDF
Balade week 6
DOCX
Kevin Smith CV 17.02.2016
PDF
Janeiro 2015
DOCX
Fnatic
PPT
1. rehabilitación vías riobamba
resume ivy tech
Ensighten Certified Professional - Mobile
Bryan slide
Ensighten Certified Professional - Manage Implementation
ปณิดา เฟื่องฟุ้ง ชั้นม4-8 เลขที่ 24
Boletín n° 002 nuestras deportistas rumbo a la copa élite de baloncesto en a...
HMS Hurworth
stromen van levend water
Test
Jade Kelley Resume 2015
Media Advisory
Balade week 6
Kevin Smith CV 17.02.2016
Janeiro 2015
Fnatic
1. rehabilitación vías riobamba
Ad

Similar to Implementation of reducing features to improve code change based bug prediction by using cos triage algorithm (20)

PDF
Towards effective bug triage with software data reduction techniques
PDF
Towards effective bug triage with software data reduction techniques
PDF
Bug Triage: An Automated Process
PDF
IRJET- Data Reduction in Bug Triage using Supervised Machine Learning
PDF
Towards Effective Bug Triage with Software Data Reduction Techniques
PDF
Comparative performance analysis
PDF
Benchmarking machine learning techniques
PDF
A Tale of Experiments on Bug Prediction
DOCX
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
PDF
Predicting Fault-Prone Files using Machine Learning
PDF
A tale of bug prediction in software development
PDF
IRJET-Automatic Bug Triage with Software
PDF
August 2024: Top 10 Read Articles in Software Engineering & Applications Inte...
PDF
May 2024: Top 10 Read Articles in Software Engineering & Applications Interna...
PDF
May 2025: Top 10 Cited Articles in Software Engineering & Applications Intern...
PDF
A tale of experiments on bug prediction
PDF
E018132735
PDF
PPTX
software Engineering ppt containing the research work relevant in the field o...
PDF
A Survey on Bug Tracking System for Effective Bug Clearance
Towards effective bug triage with software data reduction techniques
Towards effective bug triage with software data reduction techniques
Bug Triage: An Automated Process
IRJET- Data Reduction in Bug Triage using Supervised Machine Learning
Towards Effective Bug Triage with Software Data Reduction Techniques
Comparative performance analysis
Benchmarking machine learning techniques
A Tale of Experiments on Bug Prediction
Nature-Based Prediction Model of Bug Reports Based on Ensemble Machine Learni...
Predicting Fault-Prone Files using Machine Learning
A tale of bug prediction in software development
IRJET-Automatic Bug Triage with Software
August 2024: Top 10 Read Articles in Software Engineering & Applications Inte...
May 2024: Top 10 Read Articles in Software Engineering & Applications Interna...
May 2025: Top 10 Cited Articles in Software Engineering & Applications Intern...
A tale of experiments on bug prediction
E018132735
software Engineering ppt containing the research work relevant in the field o...
A Survey on Bug Tracking System for Effective Bug Clearance

More from eSAT Journals (20)

PDF
Mechanical properties of hybrid fiber reinforced concrete for pavements
PDF
Material management in construction – a case study
PDF
Managing drought short term strategies in semi arid regions a case study
PDF
Life cycle cost analysis of overlay for an urban road in bangalore
PDF
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materials
PDF
Laboratory investigation of expansive soil stabilized with natural inorganic ...
PDF
Influence of reinforcement on the behavior of hollow concrete block masonry p...
PDF
Influence of compaction energy on soil stabilized with chemical stabilizer
PDF
Geographical information system (gis) for water resources management
PDF
Forest type mapping of bidar forest division, karnataka using geoinformatics ...
PDF
Factors influencing compressive strength of geopolymer concrete
PDF
Experimental investigation on circular hollow steel columns in filled with li...
PDF
Experimental behavior of circular hsscfrc filled steel tubular columns under ...
PDF
Evaluation of punching shear in flat slabs
PDF
Evaluation of performance of intake tower dam for recent earthquake in india
PDF
Evaluation of operational efficiency of urban road network using travel time ...
PDF
Estimation of surface runoff in nallur amanikere watershed using scs cn method
PDF
Estimation of morphometric parameters and runoff using rs & gis techniques
PDF
Effect of variation of plastic hinge length on the results of non linear anal...
PDF
Effect of use of recycled materials on indirect tensile strength of asphalt c...
Mechanical properties of hybrid fiber reinforced concrete for pavements
Material management in construction – a case study
Managing drought short term strategies in semi arid regions a case study
Life cycle cost analysis of overlay for an urban road in bangalore
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materials
Laboratory investigation of expansive soil stabilized with natural inorganic ...
Influence of reinforcement on the behavior of hollow concrete block masonry p...
Influence of compaction energy on soil stabilized with chemical stabilizer
Geographical information system (gis) for water resources management
Forest type mapping of bidar forest division, karnataka using geoinformatics ...
Factors influencing compressive strength of geopolymer concrete
Experimental investigation on circular hollow steel columns in filled with li...
Experimental behavior of circular hsscfrc filled steel tubular columns under ...
Evaluation of punching shear in flat slabs
Evaluation of performance of intake tower dam for recent earthquake in india
Evaluation of operational efficiency of urban road network using travel time ...
Estimation of surface runoff in nallur amanikere watershed using scs cn method
Estimation of morphometric parameters and runoff using rs & gis techniques
Effect of variation of plastic hinge length on the results of non linear anal...
Effect of use of recycled materials on indirect tensile strength of asphalt c...

Recently uploaded (20)

PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PPTX
additive manufacturing of ss316l using mig welding
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PDF
Arduino robotics embedded978-1-4302-3184-4.pdf
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PPTX
Lesson 3_Tessellation.pptx finite Mathematics
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPTX
Construction Project Organization Group 2.pptx
PPTX
CH1 Production IntroductoryConcepts.pptx
PPTX
Lecture Notes Electrical Wiring System Components
PPTX
web development for engineering and engineering
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PDF
Well-logging-methods_new................
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
PPT on Performance Review to get promotions
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PPT
Project quality management in manufacturing
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
additive manufacturing of ss316l using mig welding
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Arduino robotics embedded978-1-4302-3184-4.pdf
Embodied AI: Ushering in the Next Era of Intelligent Systems
Lesson 3_Tessellation.pptx finite Mathematics
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
Construction Project Organization Group 2.pptx
CH1 Production IntroductoryConcepts.pptx
Lecture Notes Electrical Wiring System Components
web development for engineering and engineering
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
CARTOGRAPHY AND GEOINFORMATION VISUALIZATION chapter1 NPTE (2).pptx
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Well-logging-methods_new................
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PPT on Performance Review to get promotions
Foundation to blockchain - A guide to Blockchain Tech
Project quality management in manufacturing

Implementation of reducing features to improve code change based bug prediction by using cos triage algorithm

  • 1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 429 IMPLEMENTATION OF REDUCING FEATURES TO IMPROVE CODE CHANGE BASED BUG PREDICTION BY USING COS-TRIAGE ALGORITHM Veena Jadhav1 , Vandana Gaikwad2 , Netra Patil3 1 Student, Computer Engineering Department, BVDUCOE Pune-43, Maharashtra, India 2 Assistant Professor, Computer Engineering Department, BVDUCOE Pune-43, Maharashtra, India 3 Assistant Professor, Computer Engineering Department, BVDUCOE Pune-43, Maharashtra, India Abstract Today, we are getting plenty of bugs in the software because of variations in the software and hardware technologies. Bugs are nothing but Software faults, existing a severe challenge for system reliability and dependability. To identify the bugs from the software bug prediction is convenient approach. To visualize the presence of a bug in a source code file, recently, Machine learning classifiers approach is developed. Because of a huge number of machine learned features current classifier-based bug prediction have two major problems i) inadequate precision for practical usage ii) measured prediction time. In this paper we used two techniques first, cos-triage algorithm which have a go to enhance the accuracy and also lower the price of bug prediction and second, feature selection methods which eliminate less significant features. Reducing features get better the quality of knowledge extracted and also boost the speed of computation. Keywords: Efficiency, Bug Prediction, Classification, Feature Selection, Accuracy. --------------------------------------------------------------------***---------------------------------------------------------------------- 1. INTRODUCTION Classifiers such as SVM and navies byes are accomplished on previous software project information, also it can be expended to estimate the presence of a bug in an specific record in software, as complete in previous work[2]. To fix bug cos-triage algorithm is used. Afterward that data is added in historical data or in log record. Afterward, classifier is accomplished on data originate in previous log record. Then, it can be expended to categorize a renewed variation is either buggy or clean. Now days, The Eclipse IDE contains a prototype displaying server which is used to compute bug predictions [4]. It can aslo offer accurate predictions. Bug prediction system must offer a lesser number of incorrect variations that are predicted to be pram but which are very new, if software developers are to belief [3]. Suppose, if enormous numbers of improved variations are incorrectly estimated to be pram, engineers won’t have belief in the bug prediction. Previous bug prediction methodology and identical work offered by scientist which employ the removal of “features” from the prior of variations created to a software system[2]. These features comprise completely distributed by blank space in the code and that all were included or rejected in a variation. Henceforth, to prepare the SVM classifier entirely variables, comment words, numerical operatives, name of methods, and development language keywords are used as features which is existing in this paper. Cost of large feature set is tremendously high. Because of multiple interactions and noise classifiers cannot handle such an enormous feature set. As well as number of features increases time also increases and expanding to a number of seconds per categorization for tens of thousands of features, and minutes for enormous project data accounts. Henceforth, this will influences the availability of a bug prediction scheme. There are several classification approaches could be working. But in this paper we used cos-triage algorithm and SVM. 1) This paper contributes three aspects: 2) Study of various feature selection methods to categorize bugs in software program variations. 3) Use of cos-triage algorithm to increase accuracy and reduce cost of bug prediction. The rest of this paper is prepared as follows: In Section 2 primary steps involved in performing change classification is presented. Also this section discuss about feature selection techniques in more detail. Section 3 discuss prior work. Section 4 contains system overview for proposed system. Finally a conclusion is made in section 5. 2. CHANGE CLASSIFICATION PROCESS Following steps are concerned in executing Various categorization on distinct project. Creating a Corpus: 1) Change deltas are mining from the log records of a project, as kept in its SCM repository.
  • 2. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 430 2) For each file bug fix variations are known by observing keywords in SCM variation log messages. 3) At the record level without bug and buggy variations are known by finding reverse in the review log record from bug resolve. 4) Features contain both buggy and clean and that mined from all variations. In Complete source code all expressions are considered as features. Features are nothing but the lines fitted in each variation, meta-data such as author, time etc. we can calculate complexity metrics at this stage. 5) In this paper to compute a reduced set of features combination of wrapper and filter approaches are used. In this paper four filter based rankers are used. i) Relief-F. ii) Chi-Squared iii) Gain Ratio iv) Significance. The wrapper methods are depends on the SVM classifiers. 6) Classification model is accomplished by using reduced set. 7) Skilled classifier is set to use. Whether a original variation is more analogous to a buggy variation or a clean variation is proven by classifier. 2.1 Detection of Buggy and Clean Changes For bug prediction training data is fixed and used. Mining variation log records is used to determine bug introducing variations and to identify bug fixes. First is Examining for keywords in old revision such as “Set” “Bug” or other keywords possible to develop in a bug repair.second, Examining for another situations to bug. 2.2 Feature Extraction With the help of support vector machine algorithm, using buggy and clean variations a classification model must be accomplished which is used to consolidate software variations. The entire word in the source code programme distributed with the help of blank space or a semicolon which is used as a article .Example of that is method name, keyword, variable name, comment word, function name and operator. 2.3 Feature Selection Techniques Huge feature groups need extended working out and estimation periods, also need lot of memory these are the requirement to execute classification. Feature selection is common solution to this problem In which only the subcategory of structures that are generally suitable for creating grouping outcomes are truly used. 2.4 Feature Selection Process A repetitive process of choosing increasingly fewer important group of features is finished by using Filter and wrapper approaches. This process starts by splitting the primary feature set in partial which reduces processing requirements and memory for the remaining process. A classification model is accomplished by using the reduced feature set. Whether a new variation is more associated to a buggy variation or a clean variation is decided by classifier. 3. RELATED WORK A model proposed by scientist Khoshgoftaar and Allen to group modules corresponding to software quality factors such as upcoming defect intensity using different phases of multi-regression[5][6][7]. Ostrand et al. discovered two model i) The top 20% of problematic records in a project [11] using upcoming defect predictors ii) A linear regression model Totally Ordered Program Units could be transformed into a half ordered suite list, e.g. by giving the topmost N% of units as offered by Ostrand et al. Hassan and A caching algorithm to compute the set of fault-prone modules, called the topmost-10 list offered by halt [9]. The bug cache algorithm to envisage upcoming faults built on preceding fault areas offered by Kim et al. Fault sessions of the Mozilla project through numerous releases offered by Gyimothy et al. [11]. With the help of decision trees and neural networks that use object-oriented metrics as features. Six different feature selection techniques when using the Naive Bayes and the C4.5 classifier considered by Hall and Holmes [12] .There are about 100 features in each dataset. Hall and Holmes launched many of the feature selection methods. A general defect prediction framework which encompass a data pre-processor, feature selection methods, and machine learning algorithms elaborated by Song et al. [14]. They also consider that slight variations to data illustration can have a vast influence on the outcomes of article selection and bug prediction. Numerous feature selection algorithms to envisage buggy software modules for a hugeheritage communications software offered by Gao et al. Numerous feature selection algorithms to envisage buggy software units for a huge legacy communications software offered by Gao et al. There are two methods used. First, filter based techniques and second, subset selection search algorithm.
  • 3. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 431 4. SYSTEM OVERVIEW Fig -1: System Overview of bug prediction system Above diagram fig 1. Shows general overview of the system. Originally, user login/register into the system. It also encompasses training data/raw data which retains the logs or history. Then it checks for bug. If bug present in the system verify and report bug. Then, cos-triage algorithm is applied on that bug. Cos-triage algorithm supports us for fixing bug or defects. Then at the back end that output with bug is used as input to support vector machine (SVM) classifier .SVM classifier uses different feature selection methods which is given above. After that by reducing the features we gets final output without bug. SVM classifier operate on skilled data/raw data which is kept in log record.
  • 4. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 432 5. SCREENSHOTS
  • 5. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 433
  • 6. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 434 6. GRAPHICAL REPRESENTATION Fig 2.Graphical representation of bug prediction on projects
  • 7. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 435 Fig.3: Cos-triage algorithm graph 7. CONCLUSION This paper has explored algorithm called Cos-triage algorithm which helps to reduce cost and increase accuracy of bug fixing or bug prediction. In this paper we used feature selection method which reduces the quantity of structures used by a support vector device classifier and naive byes for bug prediction. Henceforward, the usage of classifiers with feature selection will approve fast, correct, more accurate bug predictions, if software designers have highly developed bug prediction methods rooted into their software development environment. Similarly, many algorithm will build in future which will increase accuracy of bug prediction. REFERENCES [1] T.L. Graves, A.F. Karr, J.S. Marron, and H. Siy, “Predicting Fault Incidence Using Software Change History,” IEEE Trans. Software Eng., vol. 26, no. 7, pp. 653-661, July 2000. [2] S. Kim, E. W. Jr., and Y. Zhang, “Classifying Software Changes: Clean or Buggy?” IEEE Trans. Software Eng., vol. 34, no. 2, pp. 181–196,2008. [3] A.Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C.Henri-Gros, A. Kamsky, S. McPeak, and D.R. Engler, “A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World,” Comm. ACM, vol. 53, no. 2, pp. 66-75, 2010. [4] J. Madhavan and E. Whitehead Jr., “Predicting Buggy Changes Inside an Integrated Development Environment,” Proc. OOPSLA Workshop Eclipse Technology eXchange, 2007. [5] T. Khoshgoftaar and E. Allen, “Predicting the Order of Fault-Prone Modules in Legacy Software,” Proc. 1998 Int’l Symp. on SoftwareReliability Eng., pp. 344–353, 1998. [6] R. Kumar, S. Rai, and J. Trahan, “Neural-Network Techniques for Software-Quality Evaluation,” Reliability and Maintainability Symposium,1998. [7] T. Ostrand, E. Weyuker, and R. Bell, “Predicting the Location and Number of Faults in Large Software Systems,” IEEE Trans. SoftwareEng., vol. 31, no. 4, pp. 340–355, 2005. [8] A.Hassan and R. Holt, “The Top Ten List: Dynamic Fault Prediction,”Proc. ICSM’05, Jan 2005. [9] R. Moser, W. Pedrycz, and G. Succi, “A Comparative Analysis ofthe Efficiency of Change Metrics and Static Code Attributes for Defect Prediction,” Proc. 30th Int’l Conf. Software Eng., pp. 181-190,2008.. [10] L.C. Briand, W.L. Melo, and J. Wu¨ st, “Assessing the Applicability of Fault-Proneness Models Across Object-Oriented Software Projects,” IEEE Trans. Software Eng., vol. 28, no. 7, pp. 706-720,July 2002. [11] M. Hall and G. Holmes, “Benchmarking Attribute Selection Techniques for Discrete Class Data Mining,” IEEE Trans. Knowledge and Data Eng., vol. 15, no. 6, pp. 1437-1447, Nov./Dec. 2003. [12] J. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann,1993. [13] Q. Song, Z. Jia, M. Shepperd, S. Ying, and J. Liu, “A General Software Defect-Proneness Prediction Framework,” IEEE Trans. Software Eng., vol. 37, no. 3, pp. 356-370, May/June 2011. [14] K. Gao, T. Khoshgoftaar, H. Wang, and N. Seliya, “Choosing Software Metrics for Defect Prediction: An Investigation on Feature Selection Techniques,” Software: Practice and Experience,vol. 41, no. 5, pp. 579-606, 2011. [15] A. Bessey, K. Block, B. Chelf, A. Chou, B. Fulton, S. Hallem, C.Henri-Gros, A. Kamsky, S. McPeak, and D.R. Engler, “A FewBillion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World,” Comm. ACM, vol. 53, no. 2, pp. 66-75, 2010
  • 8. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308 _______________________________________________________________________________________ Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 436 BIOGRAPHIES Miss Veena jadhav M.tech (Computer) BVDUCOE, Pune Prof.Vandana Gaikwad, Assistant Professor, Computer department, BVDUCOE, Pune Prof.Netra Patil, Assistant Professor, Computer department BVDUCOE, Pune