International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 06 | June -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 606
Educational Data Mining to Analyze Students Performance – Concept Plan
Prajakta Akerkar1, Yashoda Bhat2, Mona Deshmukh3
1,2Student, VES Institute of Technology
3 Professor, Dept. MCA, VES Institute of Technology, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract – Data mining is of great significance in the
business world because it aids decision making and yields
insight into the data that organizations store.
Educational institutions store a large amount of student
data, which the management retrieves as and when required.
Such retrieval by itself provides no insight into the data,
and it is extremely tedious for a person to analyse the
records and derive decisions from them. The purpose of this
paper is therefore to suggest techniques that can be used to
analyse the existing data and derive insights from it.
Key Words: Educational data mining (EDM), Student’s
Performance, K-means, Naïve Bayesian, Clustering,
Classification
1. INTRODUCTION
Computers can store, process and retrieve many types of
information, such as text, numbers and images. Data mining is
a knowledge discovery process; applied to the field of
education, it is known as Educational Data Mining (EDM).
A student's attendance in class, family income, mother's
qualification, current knowledge and motivation greatly
influence performance in exams. [2]
Understanding and analyzing the reasons for poor
performance is necessary so that institutions can address
them and maintain a good reputation. The reasons for good
performance should be known as well, so that it can be
repeated.
Many data mining algorithms can be applied to the raw data
to get the necessary results. The objective of this paper is
to suggest, on a theoretical basis, algorithms that can be
applied to such data to get these results. Some of the
classic EDM problems are as follows [7]:
- Classification
- Clustering
- Frequency pattern mining
- Emerging pattern mining
- Visual analytics
- Predictive modeling
Major applications of EDM are:
- Developing concept maps
- Social network analysis
- Analysis and visualization of data
- Predicting student performance
- Recommendations for students, teachers and educational institutions
- Grouping students
Fig -1: EDM Flow
2. METHODOLOGY
1. Identifying stakeholders
Considering all the people involved in education, the
stakeholders can be divided into three groups as follows:
1. Primary: These people are directly involved in teaching
and learning.
Example – Teachers and students
2. Secondary: These people are indirectly involved in the
process of teaching and learning; they mostly contribute to
the growth of the educational institution.
Example – Alumni, parents, trustees
3. Hybrid: These people are involved in the administrative
process. They mostly make decisions for the institution.
Example – Non-teaching staff, educational planners
2. Collecting data
Student data can be collected from the following sources:
- Learning management systems (LMS): Several LMS track when a
student accessed a learning object, how many times it has
been accessed, and the time spent on it at each access. [5]
- Intelligent tutoring systems (ITS): An ITS records data
every time a student submits a solution to a problem,
capturing the time of submission and whether the answer is
correct.
- Offline data: This data is typically captured from
classroom interaction, student attendance, course
information and grades.
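As a rough illustration of the LMS source described above, the sketch below aggregates raw access events into the two quantities the text mentions: how many times a learning object was accessed and the total time spent on it. The record layout (`AccessEvent` and its field names) is a hypothetical one chosen for this example; real LMS exports will differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AccessEvent:
    """One hypothetical LMS log row (field names are assumptions)."""
    student_id: str
    learning_object: str
    accessed_at: str      # ISO timestamp of the access
    seconds_spent: int    # time spent during this access


def summarize_access(events):
    """Return {(student, object): (access_count, total_seconds)}."""
    totals = defaultdict(lambda: [0, 0])
    for event in events:
        entry = totals[(event.student_id, event.learning_object)]
        entry[0] += 1                      # one more access
        entry[1] += event.seconds_spent    # accumulate time on the object
    return {key: tuple(value) for key, value in totals.items()}


events = [
    AccessEvent("S01", "unit1.pdf", "2017-06-01T09:00:00", 300),
    AccessEvent("S01", "unit1.pdf", "2017-06-02T21:15:00", 120),
    AccessEvent("S02", "unit1.pdf", "2017-06-01T10:30:00", 60),
]
summary = summarize_access(events)
```

Per-student totals of this kind could then be joined with the offline data (attendance, grades) before mining.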
Fig -2 : Integrated Learning Management Systems [5]
In this step, gather only those fields that are required for
the EDM process; these fields are called student related
variables. The judgement parameters give an idea not only of
the marks/grades gained by the student but also of the
student's overall personality.
The following student related variables can be considered:
Table -1: EDM Judgement Parameters
Sr No.   Judgement Parameters
1        Gender
2        Xth grade/%
3        XIIth grade/%
4        Family income
5        Father's qualification
6        Father's occupation
7        Mother's qualification
8        Mother's occupation
9        Siblings (if any)
10       Siblings qualification
11       Siblings occupation
12       Preferred time to study (Morning/Afternoon/Night)
13       Interested in higher studies
14       Caste
15       School type (Girls/Boys/Co-ed)
16       Do you have a cell phone
17       Are you active on social network?
18       Hobbies
19       Favorite subject
20       Where do you stay? (area)
21       Time spent on travelling everyday
22       Do you submit assignments on time
23       Attendance in class
24       Attentiveness in class
25       Do you make notes in class?
26       How many reference books do you refer for a subject?
A survey based on the above variables can thus be conducted
to assess each student comprehensively.
To classify the students, there has to be a predefined set
of classes, as listed below:
- Poor
- Satisfactory
- Average
- Good
- Excellent
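Before any algorithm can be applied, survey answers have to be turned into numbers. The sketch below shows one possible encoding for a handful of the judgement parameters; the field names, the scaling to the 0 to 1 range and the one-hot encoding are assumptions made for illustration, since the paper does not prescribe an encoding.

```python
# Category lists taken from the survey options and classes in the text.
STUDY_TIME = ["Morning", "Afternoon", "Night"]
CLASSES = ["Poor", "Satisfactory", "Average", "Good", "Excellent"]


def one_hot(value, categories):
    """Encode a categorical answer as a 0/1 vector, one slot per category."""
    return [1.0 if value == category else 0.0 for category in categories]


def encode_student(record):
    """Map a survey record (hypothetical field names) to a numeric vector."""
    features = [
        record["xth_percent"] / 100.0,          # Xth grade/%
        record["xiith_percent"] / 100.0,        # XIIth grade/%
        record["attendance_percent"] / 100.0,   # Attendance in class
        1.0 if record["submits_on_time"] else 0.0,
    ]
    features += one_hot(record["study_time"], STUDY_TIME)
    return features


student = {
    "xth_percent": 82,
    "xiith_percent": 76,
    "attendance_percent": 90,
    "submits_on_time": True,
    "study_time": "Night",
}
vector = encode_student(student)
label = CLASSES.index("Good")   # class labels stored as integer indices
```

Scaling every feature to a similar range matters for distance-based methods such as K-means, where an unscaled variable would otherwise dominate the Euclidean distance.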
3. TECHNIQUES OF IMPLEMENTATION
A. Clustering Algorithm
Clustering means grouping data into groups of similar
objects. It is significant in information retrieval, text
mining, scientific data exploration, web analysis and many
other areas.
It is an unsupervised, statistical data analysis technique.
Cluster analysis is used to break down a large data set into
small subsets called clusters. Each cluster is a collection
of similar objects.
Application of clustering to Educational Data Mining:
K-means is one of the most widely used clustering algorithms
in data mining. It partitions n objects into k clusters. The
objective of this technique is to minimize the squared error
function, i.e. the total intra-cluster variance.
Let X = {x1, x2, x3, ..., xn} be the set of data points and
V = {v1, v2, v3, ..., vc} be the set of cluster centers,
where c = k is the number of clusters.
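For reference, the squared error function mentioned above is usually written as follows. This is the standard textbook form rather than a formula taken from the paper; the symbols c (number of clusters), c_i (number of points in cluster i) and x_j^{(i)} (the j-th point assigned to cluster i) are notation assumed here.

```latex
J(V) = \sum_{i=1}^{c} \sum_{j=1}^{c_i} \left\lVert x_j^{(i)} - v_i \right\rVert^2
```

A smaller J(V) means that the data points lie closer to the centers of the clusters they are assigned to.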
1) Randomly select c cluster centers.
2) Calculate the distance between each data point and each
cluster center using the Euclidean formula.
3) Assign each data point to the cluster center that is
nearest to it.
4) Recalculate each cluster center as the mean of the data
points assigned to it:
\[ v_i = \frac{1}{c_i} \sum_{j=1}^{c_i} x_j \]
where c_i is the number of data points in the i-th cluster
and the sum runs over those points.
5) Recalculate the distance between each data point and the
newly obtained cluster centers.
6) If no data point was reassigned then stop, otherwise
repeat from step (3). [4,7]
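The steps above can be sketched in plain Python as follows. This is a minimal illustration, not an implementation from the paper: the function name `kmeans`, the `seed` and `max_iter` parameters, the handling of empty clusters and the sample (marks %, attendance %) points are all assumptions.

```python
import math
import random


def kmeans(points, c, seed=0, max_iter=100):
    """Cluster `points` (tuples of floats) into `c` groups."""
    rng = random.Random(seed)
    centers = rng.sample(points, c)               # step 1: random centers
    assignment = [None] * len(points)
    for _ in range(max_iter):
        # steps 2, 3 and 5: Euclidean distance, assign to nearest center
        new_assignment = [
            min(range(c), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        if new_assignment == assignment:          # step 6: nothing moved
            break
        assignment = new_assignment
        # step 4: each center becomes the mean of its assigned points
        for i in range(c):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:  # assumption: an empty cluster keeps its old center
                centers[i] = tuple(
                    sum(column) / len(members) for column in zip(*members)
                )
    return centers, assignment


# Hypothetical (marks %, attendance %) pairs for six students.
points = [
    (35.0, 40.0), (38.0, 45.0), (40.0, 42.0),
    (85.0, 90.0), (88.0, 92.0), (90.0, 95.0),
]
centers, assignment = kmeans(points, c=2)
```

With two well separated groups like these, the first three students end up in one cluster and the last three in the other. On real data the result depends on the random initial centers, so running the algorithm several times with different seeds is a common precaution.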
B. Classification Algorithm
Classification is the form of data mining that builds models
describing important data classes. Decision tree
classification is commonly employed for this. Test data sets
are used to estimate the accuracy of the classification
rules. If the accuracy is acceptable, the rules can be
applied to new data tuples. [7]
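The accuracy check described above can be sketched as follows. The rule, the test tuples and the acceptance threshold are all hypothetical; the paper does not state which rules or which threshold it would use.

```python
def accuracy(predict, test_rows, test_labels):
    """Fraction of test tuples whose predicted class matches the known class."""
    hits = sum(
        1 for row, label in zip(test_rows, test_labels) if predict(row) == label
    )
    return hits / len(test_labels)


def attendance_rule(row):
    # Hypothetical one-line rule standing in for a learned classifier.
    return "Good" if row["attendance_percent"] >= 75 else "Poor"


# Held-out tuples with known classes (made-up values).
test_rows = [
    {"attendance_percent": 92},
    {"attendance_percent": 60},
    {"attendance_percent": 80},
    {"attendance_percent": 70},
]
test_labels = ["Good", "Poor", "Poor", "Poor"]

ACCEPTABLE = 0.7  # assumed threshold for "acceptable" accuracy
score = accuracy(attendance_rule, test_rows, test_labels)
rule_is_usable = score >= ACCEPTABLE
```

Here the rule misclassifies one of the four test tuples, so the score is 0.75 and the rule passes the assumed threshold.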
Naïve Bayesian Algorithm is a classification technique based
on Bayes Theorem with an assumption of independence
among the predictors.
That means it assumes that the presence of a particular
feature in a class is unrelated to the presence of any other
feature.
This algorithm is easy to build and very useful in classifying
large data sets.
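Bayes' theorem provides a way of calculating the posterior
probability P(c|x) from P(c), P(x) and P(x|c):
\[ P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)} \]
Under the independence assumption, for x = (x_1, ..., x_n)
the posterior is proportional to the product of the
per-feature likelihoods and the class prior:
\[ P(c \mid x) \propto P(x_1 \mid c) \times P(x_2 \mid c) \times \dots \times P(x_n \mid c) \times P(c) \]
This technique can be applied to a large student database to
evaluate the results. [1]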
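To make the calculation concrete, the sketch below trains a small Naïve Bayesian classifier on categorical features by counting, then computes the posterior for one student. The data are made up, only two of the five classes are used to keep the example short, and the Laplace (add-one) smoothing is an addition not mentioned in the paper.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    values = defaultdict(set)               # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            feature_counts[(j, label)][value] += 1
            values[j].add(value)
    return class_counts, feature_counts, values


def posterior(row, model):
    """Return {class: P(class | row)} under the independence assumption."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                                   # prior P(c)
        for j, value in enumerate(row):
            # P(x_j | c) with add-one smoothing (an assumption of this sketch)
            score *= (feature_counts[(j, c)][value] + 1) / (n_c + len(values[j]))
        scores[c] = score
    evidence = sum(scores.values())                           # plays the role of P(x)
    return {c: s / evidence for c, s in scores.items()}


# Hypothetical survey rows: (attendance level, submits assignments on time)
rows = [
    ("High", "Yes"), ("High", "Yes"), ("High", "No"),
    ("Low", "No"), ("Low", "No"), ("Low", "Yes"),
]
labels = ["Good", "Good", "Good", "Poor", "Poor", "Poor"]

model = train_naive_bayes(rows, labels)
probs = posterior(("High", "Yes"), model)
predicted = max(probs, key=probs.get)
```

For the query student the unnormalized scores are 0.5 × 0.8 × 0.6 = 0.24 for Good and 0.5 × 0.2 × 0.4 = 0.04 for Poor, so the normalized posterior for Good is 0.24 / 0.28, roughly 0.857.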
4. CONCLUSION
This paper presents the idea of creating a concept plan for
every student and analyzing each student thoroughly through
the judgement parameters.
It gives a theoretical concept of educational data mining:
collect the data by conducting a survey, apply K-means
clustering and the Naïve Bayesian algorithm, and derive
insights for every student. After the students are
classified, further action can be taken for each class of
students to enhance their performance and provide more
detailed guidance.
5. FUTURE WORK
Future enhancements include implementing this work in the
XL Miner tool or KEEL.
ACKNOWLEDGEMENT
This acknowledgement is a small effort to express our
gratitude to all those who assisted us during the course of
preparing this paper.
We are greatly indebted to our guide and mentor, Prof. Mona
Deshmukh, and express our gratitude for her constant support
and valuable encouragement.
We express our heartfelt gratitude to Mr. M. Prabhanath
Nair for his timely inputs.
REFERENCES
[1] Tismy Devasia, Vinushree T P, Vinayak Hegde, "Prediction
of Students Performance using Educational Data Mining,"
Department of Computer Science, Amrita Vishwa Vidyapeeth
University, Mysuru Campus.
[2] Sreenath K., Jeyakumar G., "Evolutionary Algorithm Based
Rule(s) Generation for Personalized Courseware Construction
in Educational Data Mining," Department of Computer Science
and Engineering, Amrita School of Engineering, Coimbatore,
Amrita Vishwa Vidyapeetham, Amrita University, India.
[3] Diego Buenaño Fernández, Sergio Luján-Mora, "Comparison
of applications for educational data mining in Engineering
Education," Facultad de Ingenierías y Ciencias
Agropecuarias, Universidad de las Américas, Quito, Ecuador;
Department of Software and Computing Systems, University of
Alicante.
[4] Ashish Dutt, Maizatul Akmar Ismail, Tutut Herawan, "A
Systematic Review on Educational Data Mining."
[5] Prakash Kumar Udupi, Nisha Sharma, S K Jha, "Educational
Data Mining and Big Data Framework for e-Learning
Environment," Department of Computing, Middle East College,
Muscat; NIMT, Kurukshetra, Haryana, India; AIIT, Amity
University, Noida, India.
[6] Oswaldo Moscoso-Zea, Andres-Sampedro, Sergio Luján-Mora,
"Data ware house design for Educational Data Mining,"
Equinoctial Technological University, Faculty of
Engineering, Quito, Ecuador; University of Alicante,
Department of Software and Computing Systems, Alicante,
Spain.
[7] B. Namratha, Niteesha Sharma, "Educational Data Mining –
Applications and Techniques," Department of Information
Technology, Anurag Group of Institutions, RR Dist.,
Hyderabad, Telangana, India.