International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1552
Employee Performance Prediction System using Data Mining
Tejas Raut1, Priya Kale2, Rashmi Sonkusare3, A. K. Gaikwad4
1,2,3Student, Dept. of Computer Science & Engineering, DES’S COET, Dhamangaon Rly
4Professor, Dept. of Computer Science & Engineering, DES’S COET, Dhamangaon Rly
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - Human capital is a major concern for company
management, whose chief interest is in hiring highly
qualified personnel who are expected to perform well.
Human Resources Management (HRM) has become one of
the essential interests of managers and decision makers in
almost all types of businesses, who adopt plans for correctly
identifying highly qualified employees. Accordingly,
management becomes interested in the performance of
these employees. Results show that professional skill
development programs are needed in order to prepare
employees to perform their tasks more efficiently. The
knowledge flow model of the open source tool is also used to
frame the elements. To obtain a highly accurate model,
several experiments were executed based on the previous
techniques, which are implemented as a tool for enabling
decision makers and human resources professionals to
predict and enhance the performance of their employees.
Key Words: HRM, Employee Performance Prediction,
Data Mining, KNN, Dataset.
1. INTRODUCTION
HRM has a leading role in determining an organization's
competitiveness and effectiveness, and thus its continued
survival. Organizations consider HRM as “people practices”.
Therefore, it becomes the responsibility of HRM to allocate
the best employees to the appropriate job at the right time,
train and qualify them, build evaluation systems to monitor
their performance, and attempt to preserve the potential
talents of employees. A young yet promising approach of this
kind is data mining. It stands out due to its wide array of
techniques drawn from different domains such as statistics,
artificial intelligence, machine learning, algorithms, database
systems and visualization. These influences served as
groundwork for its applications to business, a category
under which the academic institution is also classified.
Generally, regardless of discipline, data mining has gained
popularity because its tools can identify trends within data
and turn them into knowledge, mostly with predictive
attributes, that could provide a better and stronger basis for
decision making, and because a wide range of open source
tools is available.
Data mining is a young and promising field of information
and knowledge discovery. It became a target of interest for
the information industry because of the existence of huge
volumes of data containing large amounts of hidden
knowledge. With data mining techniques, such knowledge
can be extracted and accessed, transforming database tasks
from storage and retrieval to learning and knowledge
extraction. The decision tree is one of the most used
techniques, since it builds the tree from the given data using
simple equations that depend mainly on calculating the gain
ratio. The gain ratio automatically assigns a kind of weight to
each attribute used, so the researcher can implicitly
recognize the attributes that most affect the predicted
target. As a result of this technique, a decision tree is built
and classification rules are generated from it.
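The gain ratio mentioned above can be illustrated with a short Python sketch. It assumes the usual C4.5-style definition (information gain divided by split information); the paper does not give its equations, and the function names and toy employee records here are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of `attribute` with respect to `target` (C4.5-style)."""
    total = len(rows)
    # Group the target labels by the value of the attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records; not data from the study.
employees = [
    {"degree": "MSc", "experience": "high", "performance": "good"},
    {"degree": "MSc", "experience": "low", "performance": "good"},
    {"degree": "BSc", "experience": "high", "performance": "poor"},
    {"degree": "BSc", "experience": "low", "performance": "poor"},
]

for name in ("degree", "experience"):
    print(name, round(gain_ratio(employees, name, "performance"), 3))
```

In this toy set the degree attribute separates the two classes perfectly and scores 1.0, while experience carries no information about the target and scores 0.0, so a tree built this way would split on degree first.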
Previous studies specified several attributes affecting
employee performance. Some of these attributes are
personal characteristics, others are educational, and
professional attributes were also considered. Chein and Chen
(2006) used several attributes to predict employee
performance. They specified age, gender, marital status,
experience, education, major subjects and school tiers as
potential factors that might affect performance. They then
excluded age, gender and marital status so that no
discrimination would exist in the process of personnel
selection. As a result of their study, they found that
employee performance is highly affected by education
degree, school tier, and job experience. Most recently,
the prevalence of intelligent machine learning algorithms in
the field of computer science has led to the development of
robust quantitative methods to derive insights from industry
data.
Supervised machine learning methods, wherein
computers learn from analyses of large-scale, historical,
labelled datasets, have been shown to yield insights in
various fields, such as biology and medical sciences,
transportation, and political science, among many others.
Owing to the advancements in information technology,
researchers have also studied numerous machine learning
approaches to improve the outcomes of human resource
(HR) management. A detailed listing of recent studies using
supervised machine learning on employee turnover is
given in Table 1, which lists the data included and the
machine learning algorithms used therein,
including decision tree (DT) methods, random forest (RF)
methods, gradient boosting trees (GBT) methods, extreme
gradient boosting (XGB), logistic regression (LR), support
vector machines (SVM), neural networks (NN), linear
discriminant analysis (LDA), Naïve Bayes (NB) methods,
K-nearest neighbor (KNN), Bayesian networks (BN) and
induction rule methods (IND).
2. RELATED WORK
In general, classification proceeds in several steps. The
first step is called the learning step, in which a model of the
predefined classes is built by analyzing the variables of a
training dataset. Each variable is assumed to be related to a
predefined class. The second step estimates the accuracy of
the model or classifier (validating the model) by testing it
on a different dataset. If the classifier’s accuracy is
considered acceptable, the model or classifier can be
applied to new, unseen data to predict the unknown class
label; this is considered the third step. The model therefore
acts as a classifier in the process of decision-making.
Various classification techniques have been used in the
prediction process, such as DT, Naïve Bayes, SVM, etc.
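The three steps described above can be sketched with a small K-nearest-neighbor (KNN) classifier, one of the paper's keywords. This is a minimal illustration in plain Python, not the authors' implementation: the feature names, the toy records and the acceptance threshold are all assumptions made for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k=3):
    """Fraction of held-out records whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Hypothetical records: (years of experience, training hours) -> rating.
train = [
    ((1.0, 5.0), "low"), ((2.0, 8.0), "low"), ((1.5, 6.0), "low"),
    ((8.0, 40.0), "high"), ((9.0, 35.0), "high"), ((7.5, 42.0), "high"),
]
test = [((2.5, 7.0), "low"), ((8.5, 38.0), "high")]

# Step 1 (learning): KNN is a lazy learner, so "training" is storing `train`.
# Step 2 (validation): estimate accuracy on records not used for learning.
score = accuracy(train, test)

# Step 3 (prediction): classify an unseen record only if the score is acceptable.
ACCEPTABLE = 0.8  # assumed threshold, not taken from the paper
if score >= ACCEPTABLE:
    print(knn_predict(train, (8.0, 36.0)))
```

In practice the features would usually be scaled before computing distances, since training hours would otherwise dominate years of experience; that step is left out here to keep the sketch short.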
Human capital is a major concern for company
management, whose chief interest is in hiring highly
qualified personnel who are expected to perform well.
Foremost, Human Resource (HR) management
plays the role of ensuring this by closely adhering to the
standards set by higher management or to heuristic
requirements for applicants with distinctive qualifications
and potential. However, oft-quoted factors that may affect
employee performance are attributed to educational
backgrounds, working experiences, as well as personal
qualities. Taken together, these provide a picture of how
well an employee performs his or her tasks. Assessment of
human resource performance is a sensitive task. To avoid
partiality, an efficient tool that deals with various data and
helps managers make decisions and plans is of great help. In
data mining, historical data such as the attributes that
influence performance can be exploited as learning
experiences. These can be used to predict future
circumstances and serve as a rich resource of knowledge and
decision support.
In general, data classification is a two-step process. In
the first step, which is called the learning step, a model that
describes a predetermined set of classes or concepts is built
by analyzing a set of training database instances. Each
instance is assumed to belong to a predefined class. In the
second step, the model is tested using a different data set
that is used to estimate the classification accuracy of the
model. If the accuracy of the model is considered acceptable,
the model can be used to classify future data instances for
which the class label is not known. In the end, the model acts
as a classifier in the decision-making process. There are
several techniques that can be used for classification, such as
decision trees, Bayesian methods, rule-based algorithms, and
neural networks.
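As an illustration of the Bayesian methods listed above, the following sketch implements a small categorical Naïve Bayes classifier with Laplace smoothing and evaluates it on held-out instances. It is a hedged example only: the attribute names, the toy instances and the helper function names are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(rows, target):
    """Learning step: count class and per-class attribute-value frequencies."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    values_seen = defaultdict(set)       # attribute -> distinct values
    for row in rows:
        for attr, value in row.items():
            if attr != target:
                value_counts[(attr, row[target])][value] += 1
                values_seen[attr].add(value)
    return class_counts, value_counts, values_seen


def classify(model, record):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = log(count / total)
        for attr, value in record.items():
            seen = value_counts[(attr, label)][value]
            score += log((seen + 1) / (count + len(values_seen[attr])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training instances; not data from the study.
training = [
    {"degree": "MSc", "experience": "high", "performance": "good"},
    {"degree": "MSc", "experience": "low", "performance": "good"},
    {"degree": "BSc", "experience": "high", "performance": "good"},
    {"degree": "BSc", "experience": "low", "performance": "poor"},
    {"degree": "Diploma", "experience": "low", "performance": "poor"},
    {"degree": "Diploma", "experience": "high", "performance": "poor"},
]
# Held-out instances with known labels, used only to estimate accuracy.
held_out = [
    ({"degree": "MSc", "experience": "high"}, "good"),
    ({"degree": "Diploma", "experience": "low"}, "poor"),
]

model = train_naive_bayes(training, "performance")
correct = sum(classify(model, record) == label for record, label in held_out)
print("hold-out accuracy:", correct / len(held_out))
```

The smoothing term keeps an attribute value that never occurred with a class from forcing that class's probability to zero, which matters on small HR datasets like this toy one.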
3. IMPLEMENTATION
Feature selection is one of the main concepts of DM
and machine learning. It is the process of selecting the
useful variables in a dataset in order to improve the results
of machine learning and make them more accurate.
Using too many variables in a dataset reduces
predictive performance. The dataset may contain too many
features; some of them do not improve prediction
accuracy and thus make the predictive model excessively
complicated.
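One simple way to prune such features is a filter that scores each variable against the target and keeps only the strongest ones. The sketch below uses the absolute Pearson correlation as the score. This is an assumed technique chosen for illustration, not the method reported by the authors, and the column names, values and threshold are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.5):
    """Keep features whose |correlation| with the target reaches threshold."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return [name for name, score in scores.items() if score >= threshold]


# Hypothetical columns; 1 = rated as a high performer, 0 = otherwise.
columns = {
    "experience_years": [1, 2, 3, 7, 8, 9],
    "training_hours": [5, 8, 6, 40, 35, 42],
    "desk_number": [4, 4, 4, 4, 4, 4],   # constant, carries no information
    "badge_parity": [0, 1, 0, 1, 0, 1],  # only weakly related to the target
}
rating = [0, 0, 0, 1, 1, 1]

print(select_features(columns, rating))
```

Here the two informative columns are kept and the constant and weakly related columns are dropped. A correlation filter only captures linear relationships with a numeric target, so categorical attributes would need a different score, such as information gain.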
Therefore, unnecessary variables must be
removed to make the model work efficiently. Deciding which
variables to remove can be done manually,
using domain knowledge, or it can be done
automatically. In the business understanding phase, with
proper endorsement and approval of some academic
administrators, questions as to how data mining (DM)
functionalities are best applied in any technological institute
have been identified. Recent studies attribute the oft-quoted
factors that may affect employee performance to
educational backgrounds, working experiences, as well as
personal qualities. Taken together, these provide a picture of
how well an employee performs his or her tasks. Assessment of
human resource performance is a sensitive task. To avoid
partiality, an efficient tool that deals with various data and
helps managers make decisions and plans is of great help. In
data mining, historical data such as the attributes that
influence performance can be exploited as learning experiences.
4. CONCLUSION
This paper has concentrated on the possibility of building a
classification model for predicting employees’
performance. Applying DM techniques to the different
problem domains in the HRM field is considered an
important and urgent issue, especially in the public sector in
Egypt. In addition, it broadens the horizons of academic and
practical research on DM in HR, with the aim of reaching a
high-performing government sector. This study further looks
towards applying the results to analyze enhancement programs
for senior employees of any organization and to identify
patterns affecting both teacher and student performance
using other data mining techniques such as association rules.
REFERENCES
[1] Islam, M. J., Wu, Q. M. J., Ahmadi, M., and Sid-Ahmed,
M. A., (2010), "Investigating the Performance of
Naive- Bayes Classifiers and K- Nearest Neighbor
Classifiers" Journal of Convergence Information
Technology Volume 5, Number 2, April 2010.
[2] Al-Radaideh, Q.A., Al-Nagi, E., (2012). “Using Data
Mining Techniques to Build a Classification Model
for Predicting Employees Performance”,
International Journal of Advanced Computer
Science and Applications, 3(2), pp 144 – 151.
[3] S.Yasodha and P. S.Prakash, (2012), "Data Mining
Classification Technique for Talent Management
using SVM," the International Conference on
Computing, Electronics and Electrical Technologies,
2012.
[4] Hua Hu, Jing Ye, and Chunlai Chai, (2009), “A Talent
Classification Method Based on SVM”, in
International Symposium on Intelligent Ubiquitous
Computing and Education, Chengdu, China, 2009,
pp. 160-163.
[5] Kirimi JM, Motur CA (2016), “Application of Data
Mining Classification in Employee Performance
Prediction”. International Journal of Computer
Applications, Volume 146 – No.7, July 2016.
[6] Desouki M. S., Al-Daher J (2015),“UsingData Mining
Tools to Improve the Performance Appraisal
Procedure, HIAST Case”. International Journal of
Advanced Information in Arts, Science &
Management Vol.2, No.1, February 2015.
[7] ZHAO Xin (2008). A Study of Performance
Evaluation of HRM: Based on Data mining.
International Seminar on Future Information
Technology and Management Engineering
[8] Yan Huang (2009). Study of College Human
Resources Data Mining Based on the SOM
Algorithm. 2009 Asia Pacific Conference on
Information Processing.
[9] ChenXiaofan,and WangFengbin(2010).Application
of Data Mining on Enterprise Human Resource
Performance Management. 3rdInternational
Conference on Information Management,
Innovation Managementand Industrial Engineering
[10] Honglei Zhang (2009). Fuzzy Evaluation on the
Performance of Human Resources Management of
Commercial Banks Based on Improved Algorithm.
2009 2nd International Conference on Power
Electronics and Intelligent Transportation System.

More Related Content

PDF
IRJET-Performance Enhancement in Machine Learning System using Hybrid Bee Col...
PDF
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
PDF
IRJET- Student Placement Prediction using Machine Learning
PDF
REVIEWING PROCESS MINING APPLICATIONS AND TECHNIQUES IN EDUCATION
PDF
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
PDF
Correlation of artificial neural network classification and nfrs attribute fi...
PDF
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
PDF
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...
IRJET-Performance Enhancement in Machine Learning System using Hybrid Bee Col...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
IRJET- Student Placement Prediction using Machine Learning
REVIEWING PROCESS MINING APPLICATIONS AND TECHNIQUES IN EDUCATION
IRJET - Comparative Analysis of GUI based Prediction of Parkinson Disease usi...
Correlation of artificial neural network classification and nfrs attribute fi...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
DATA AUGMENTATION TECHNIQUES AND TRANSFER LEARNING APPROACHES APPLIED TO FACI...

What's hot (18)

PDF
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...
PDF
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR ML
PDF
DEEP-LEARNING-BASED HUMAN INTENTION PREDICTION WITH DATA AUGMENTATION
PDF
IRJET- Predictive Analytics for Placement of Student- A Comparative Study
PDF
Software Bug Detection Algorithm using Data mining Techniques
PDF
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
PDF
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
PDF
Identification of important features and data mining classification technique...
PDF
Biometric Identification and Authentication Providence using Fingerprint for ...
PDF
An Architecture for Simplified and Automated Machine Learning
PDF
A Defect Prediction Model for Software Product based on ANFIS
PDF
IRJET- Medical Data Mining
PDF
Advanced Question Paper Generator using Fuzzy Logic
PDF
4113ijaia09
PDF
Performance analysis of binary and multiclass models using azure machine lear...
PDF
IRJET- E-MORES: Efficient Multiple Output Regression for Streaming Data
PDF
Clustering Prediction Techniques in Defining and Predicting Customers Defecti...
PDF
Feature Selection : A Novel Approach for the Prediction of Learning Disabilit...
MOVIE SUCCESS PREDICTION AND PERFORMANCE COMPARISON USING VARIOUS STATISTICAL...
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR ML
DEEP-LEARNING-BASED HUMAN INTENTION PREDICTION WITH DATA AUGMENTATION
IRJET- Predictive Analytics for Placement of Student- A Comparative Study
Software Bug Detection Algorithm using Data mining Techniques
COMPARATIVE ANALYSIS OF DIFFERENT MACHINE LEARNING ALGORITHMS FOR PLANT DISEA...
PREDICTION OF MALIGNANCY IN SUSPECTED THYROID TUMOUR PATIENTS BY THREE DIFFER...
Identification of important features and data mining classification technique...
Biometric Identification and Authentication Providence using Fingerprint for ...
An Architecture for Simplified and Automated Machine Learning
A Defect Prediction Model for Software Product based on ANFIS
IRJET- Medical Data Mining
Advanced Question Paper Generator using Fuzzy Logic
4113ijaia09
Performance analysis of binary and multiclass models using azure machine lear...
IRJET- E-MORES: Efficient Multiple Output Regression for Streaming Data
Clustering Prediction Techniques in Defining and Predicting Customers Defecti...
Feature Selection : A Novel Approach for the Prediction of Learning Disabilit...
Ad

Similar to IRJET - Employee Performance Prediction System using Data Mining (20)

PDF
Mantle Of Ml In Human Resource Management - Phdassistance
PDF
Paper id 29201413
PDF
EMPLOYEE ATTRITION PREDICTION IN INDUSTRY USING MACHINE LEARNING TECHNIQUES
PPTX
Mantle Of Ml In Human Resource Management - Phdassistance
PPTX
AI impact on HR evaluation ppt for ADA.pptx
PDF
Predicting Employee Attrition using various techniques of Machine Learning
PDF
1130 track1 stevens
PPTX
A proposed model ppt
PDF
Using Naive Bayesian Classifier for Predicting Performance of a Student
PDF
Data mining for prediction of human
PDF
Top Cited Articles - October 2024 - Top Cited Articles in Data Mining
PDF
Top Cited Articles - September 2024 - Top Cited Articles in Data Mining
PPTX
Employee Retention Prediction: A Data Science Project by Devangi Shukla
PDF
1707.01377
PDF
Machine Learning Approach for Employee Attrition Analysis
PDF
IRJET- Performance for Student Higher Education using Decision Tree to Predic...
PDF
17 Machine Learning Approach for Employee Attrition Analysis.pdf
PDF
Developing a framework for
PDF
Analysis on Data Mining Techniques for Heart Disease Dataset
Mantle Of Ml In Human Resource Management - Phdassistance
Paper id 29201413
EMPLOYEE ATTRITION PREDICTION IN INDUSTRY USING MACHINE LEARNING TECHNIQUES
Mantle Of Ml In Human Resource Management - Phdassistance
AI impact on HR evaluation ppt for ADA.pptx
Predicting Employee Attrition using various techniques of Machine Learning
1130 track1 stevens
A proposed model ppt
Using Naive Bayesian Classifier for Predicting Performance of a Student
Data mining for prediction of human
Top Cited Articles - October 2024 - Top Cited Articles in Data Mining
Top Cited Articles - September 2024 - Top Cited Articles in Data Mining
Employee Retention Prediction: A Data Science Project by Devangi Shukla
1707.01377
Machine Learning Approach for Employee Attrition Analysis
IRJET- Performance for Student Higher Education using Decision Tree to Predic...
17 Machine Learning Approach for Employee Attrition Analysis.pdf
Developing a framework for
Analysis on Data Mining Techniques for Heart Disease Dataset
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PDF
composite construction of structures.pdf
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPT
Mechanical Engineering MATERIALS Selection
PPT
Project quality management in manufacturing
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PPTX
OOP with Java - Java Introduction (Basics)
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PPTX
Internet of Things (IOT) - A guide to understanding
PPTX
Strings in CPP - Strings in C++ are sequences of characters used to store and...
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PPTX
Construction Project Organization Group 2.pptx
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PPTX
web development for engineering and engineering
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
composite construction of structures.pdf
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Mechanical Engineering MATERIALS Selection
Project quality management in manufacturing
Operating System & Kernel Study Guide-1 - converted.pdf
OOP with Java - Java Introduction (Basics)
UNIT-1 - COAL BASED THERMAL POWER PLANTS
Internet of Things (IOT) - A guide to understanding
Strings in CPP - Strings in C++ are sequences of characters used to store and...
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
Construction Project Organization Group 2.pptx
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
web development for engineering and engineering
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...

IRJET - Employee Performance Prediction System using Data Mining

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1552 Employee Performance Prediction System using Data Mining Tejas Raut1, Priya Kale2, Rashmi Sonkusare3, A. K. Gaikwad4 1,2,3Student, Dept. of Computer Science & Engineering, DES’S COET, Dhamangaon Rly 4Professor, Dept. of Computer Science & Engineering, DES’S COET, Dhamangaon Rly ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - Human capital is of a high concern for companies’ management where their most interest is in hiring the highly qualified personnel which are expected to perform highly as well. Human Resources Management (HRM) has become one of theessential interestsofmanagers and decision makers in almost all types of businesses to adopt plans for correctly discovering highly qualified employees. Accordingly, managements become interested about the performance of these employees. Results show that professional skill development programs are needed in order to prepare employees to perform their tasks more efficiently. The knowledge flow model of the Open source tool is also used to frame the elements. To get a highly accurate model, several experiments were executed based on the previous techniques that are implemented tool for enabling decision makers and human resources professionals to predict and enhance the performance of their employees. Key Words: HRM, Employee Performance Prediction, Data Mining, KNN, Dataset. 1. INTRODUCTION HRM has a leading role in deciding the competitiveness and effectiveness for better continuation. Organizations consider HRM as “people practices”. 
Therefore, it becomes the responsibility of the HRM to allocate the best employees to the appropriate job at the right time, train and qualify them, and build evaluation systems to monitor their performance and an attempt to preserve the potential talents of employees. A young yet promising of this kind is data mining. It standouts due to its wide-array of techniques from the different domains such as statistics, artificial intelligence, machine learning,algorithms,databasesystems and visualization. These influences served as groundwork for its applications to business for which the academic institutionisunexceptionallyclassified.Generally,regardless of discipline, data mining has gained popularity due to its tools with potentials to identify trends within data and turn them out into knowledge mostly with predictive attributes that could significantly lead to better and strong bases for decision making with a wide range of Open source tools availability. Data mining is a young and promising field of information and knowledge discovery. It started to be an interest target for information industry, because of the existence of huge data containing large amounts of hidden knowledge. With data mining techniques, such knowledge can be extracted and accessed transforming the databases tasks from storing and retrieval to learning and extractingknowledge.Decision tree is one of the most used techniques, since it creates the decision tree from the data given using simple equations depending mainly on calculation of the gain ratio, which gives automatically some sort of weights to attributes used, and the researcher can implicitly recognize the most effective attributes on the predicted target. Asa resultofthis technique, a decision tree would be built with classification rules generated from it. Previous studies specified several attributes affecting the employee performance. 
Some of these attributes are personal characteristics, others are educational and finally professional attributeswerealsoconsidered.CheinandChen (2006) used several attributes to predict the employee performance. They specified age, gender, marital status, experience, education, major subjects and school tires as potential factors that might affect the performance. Then they excluded age, gender and marital status, so that no discrimination would exist in the process of personal selection. As a result for their study, they found that employee performance is highly affected by education degree, the school tire, and thejob experience.Most recently, the prevalence of intelligent machine learning algorithms in the field of computer science has led to the development of robust quantitative methodstoderiveinsightsfromindustry data. Supervised machine learning methods—wherein computers learn from analyses of large-scale, historical, labelled datasets—have been shown to garner insights in various fields, like biology and medical sciences, transportation, political science as well asmanyotherfields. Owing to the advancements in information technology, researchers have also studied numerous machine learning approaches to improve the outcomes of human resource (HR) management. A detailed listing of recent studies in using supervised machine learning on employee turnover is described in Table 1, and lists the data included and related machine learning algorithms that were used therein, including decision tree (DT) methods, random forest (RF) methods, gradient boosting trees (GBT) methods, extreme gradient boosting (XGB), logistic regression (LR), support vector machines (SVM), neural networks (NN), linear discriminant analysis (LDA), Naïve Bayes (NB) methods, K - nearest neighbor (KNN), Bayesian networks (BN) and induction rule methods (IND)
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 07 Issue: 02 | Feb 2020 www.irjet.net p-ISSN: 2395-0072 © 2020, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 1553 2. RELATED WORK In general, Classification contains some steps to complete its process. The first step is called thelearningstep where in the model; predefinedclassesarebuiltbyanalyzing a set of training dataset variables. Each variable is assumed that has a relation and regards to a predefined class. The second step is responsible for estimating the accuracy of model or classifier (validatingthemodel)throughtesting the model using a different dataset. If the classifier’s accuracy was considered acceptable, the model or classifier can be used to apply to new unseen data to give prediction about specific unknown label class and this is considered the third step. Therefore, the model acts as a classifier in the process of decision-making. There are various classification techniques have been used in the prediction process suchas DT, Naïve Bayes, SVM, etc. Human capital is of a high concern for companies’ management where their most interestisinhiringthe highly qualified personnel which are expected to perform highlyas well. Foremost, the Human Resource (HR). Management plays the role of ensuring this by closely adhering to the standards set by the higher management or by some heuristic needs of applicants with distinctive qualifications and potentials. However, oft-quoted factors that may affect employee performance are attributed to educational backgrounds, working experiences, as well as personal qualities. These when converged provide a picture of how well an employee performs his tasks. Assessment of human resource performance is a sensitive task. To avoid partiality, an efficient tool to deal with various data and assist managers to make decisions and plans is of great help. 
In data mining, historical data, such as the attributes that influence performance, can be exploited as learning experience. Such data can be used to predict future circumstances and serve as a rich resource for knowledge and decision support. In general, data classification is a two-step process. In the first step, which is called the learning step, a model that describes a predetermined set of classes or concepts is built by analyzing a set of training database instances. Each instance is assumed to belong to a predefined class. In the second step, the model is tested using a different data set that is used to estimate the classification accuracy of the model. If the accuracy of the model is considered acceptable, the model can be used to classify future data instances for which the class label is not known. In the end, the model acts as a classifier in the decision-making process. Several techniques can be used for classification, such as decision trees, Bayesian methods, rule-based algorithms and neural networks.
3. IMPLEMENTATION
Feature selection is one of the main concepts of DM and machine learning. It is the process of selecting the necessary, useful variables in a dataset in order to improve the results of machine learning and make them more accurate. Using too many variables in a dataset reduces predictive performance. The data set may contain too many features, some of which do not improve prediction accuracy and thus make the predictive model excessively complicated. Unnecessary variables must therefore be excluded so that the model works efficiently. Deciding which variables to exclude can be done manually, using domain knowledge, or automatically.
In the business understanding phase, with the proper endorsement and approval of some academic administrators, questions as to how data mining (DM) functionalities are best applied in any technological institute have been identified.
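The automatic variant of feature selection mentioned in this section can be illustrated with a simple filter method. The paper does not state which selection method was used; ranking features by the absolute Pearson correlation with the class label is one common choice and is assumed here. The column names, values and the 0.3 cut-off are hypothetical and serve only to make the sketch runnable.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no information about the label
    return cov / (spread_x * spread_y)


def select_features(columns, target, min_abs_corr=0.3):
    """Keep features whose |correlation| with the target reaches the cut-off.

    Returns the kept names (strongest first) and the score of every feature.
    """
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    kept = [name for name, score in ranked if score >= min_abs_corr]
    return kept, scores


# Made-up employee table: one list per attribute, one entry per employee.
columns = {
    "experience_years": [1, 2, 3, 4, 5, 6, 7, 8],
    "education_level": [1, 1, 2, 2, 3, 3, 4, 4],
    "desk_number": [5, 3, 5, 3, 5, 3, 5, 3],   # unrelated to performance
    "office_floor": [2, 2, 2, 2, 2, 2, 2, 2],  # constant, hence uninformative
}
performance = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = high performer (hypothetical)

kept, scores = select_features(columns, performance)
print("kept features:", kept)
```

On this toy table the two uninformative columns score zero and are dropped, while education_level and experience_years are kept. A filter of this kind only detects linear, one-at-a-time relationships, so it complements rather than replaces the manual, domain-knowledge route.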
4. CONCLUSION
This paper has concentrated on the possibility of building a classification model for predicting employees’ performance. Applying DM techniques to the different problem domains in the HRM field is considered an important and urgent issue, especially in the public sector in Egypt. In addition, the horizons of academic and practical research on DM in HR should be broadened in order to reach a high-performing government sector. This study further looks toward applying the results to analyze enhancement programs for senior employees of any organization and to identify patterns affecting both teacher and student performance using other data mining techniques such as association rules.
REFERENCES
[1] Islam, M. J., Wu, Q. M. J., Ahmadi, M., and Sid-Ahmed, M. A. (2010), "Investigating the Performance of Naive-Bayes Classifiers and K-Nearest Neighbor Classifiers", Journal of Convergence Information Technology, Volume 5, Number 2, April 2010.
[2] Al-Radaideh, Q. A., Al-Nagi, E. (2012), "Using Data Mining Techniques to Build a Classification Model for Predicting Employees Performance", International Journal of Advanced Computer Science and Applications, 3(2), pp. 144-151.
[3] S. Yasodha and P. S. Prakash (2012), "Data Mining Classification Technique for Talent Management using SVM", International Conference on Computing, Electronics and Electrical Technologies, 2012.
[4] Hua Hu, Jing Ye, and Chunlai Chai (2009), "A Talent Classification Method Based on SVM", in International Symposium on Intelligent Ubiquitous Computing and Education, Chengdu, China, 2009, pp. 160-163.
[5] Kirimi J. M., Motur C. A. (2016), "Application of Data Mining Classification in Employee Performance Prediction", International Journal of Computer Applications, Volume 146, No. 7, July 2016.
[6] Desouki M. S., Al-Daher J. (2015), "Using Data Mining Tools to Improve the Performance Appraisal Procedure, HIAST Case", International Journal of Advanced Information in Arts, Science & Management, Vol. 2, No. 1, February 2015.
[7] Zhao Xin (2008), "A Study of Performance Evaluation of HRM: Based on Data Mining", International Seminar on Future Information Technology and Management Engineering.
[8] Yan Huang (2009), "Study of College Human Resources Data Mining Based on the SOM Algorithm", 2009 Asia Pacific Conference on Information Processing.
[9] Chen Xiaofan and Wang Fengbin (2010), "Application of Data Mining on Enterprise Human Resource Performance Management", 3rd International Conference on Information Management, Innovation Management and Industrial Engineering.
[10] Honglei Zhang (2009), "Fuzzy Evaluation on the Performance of Human Resources Management of Commercial Banks Based on Improved Algorithm", 2009 2nd International Conference on Power Electronics and Intelligent Transportation System.