CEIT 720: Learning Analytics in Education
Predictive Methods
Beste Ulus
Next-Term Student Performance Prediction:
A Recommender Systems Approach
Sweeney, M., Rangwala, H., Lester, J., Johri, A. (2016)
Context of the Study
The purpose of the study presented in this paper was to apply state-of-the-art recommender systems
techniques to the task of student performance prediction.
Research Questions
• In the present study, the authors compare more models on a larger dataset and conduct a more detailed analysis of feature importance, performance errors, and implications for educational applications.
Research Method of the Study
• Predictive Analytical Method
Data Collection - Data
• Students from a public university
• Instructor (classification, rank, tenure status)
• Student info (transfer or not)
• Courses
• Disciplines
• Letter grades
• Demographic data, such as age, race, sex, high school CEEB code and GPA, zip code, and 1600-scale SAT scores.
Data Collection - Methods
• Simple baselines
• MF-based methods
• Common regression models
Data Collection - Methods
• Uniform Random (UR): Randomly predict grades from a uniform distribution over the range [0, 4].
• Global Mean (GM): Predict grades using the mean of all previously observed grades.
• Mean of Means (MoM): Predict grades using an average of the global mean, the per-student mean, and the per-course mean.
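The three baselines above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the `(student, course, grade)` record layout, and the toy grades are all hypothetical. One further assumption is labeled in the code: when a per-student or per-course mean is missing, MoM averages only the means that are available, so it reduces to the global mean when neither exists.

```python
import random
from collections import defaultdict


def fit_means(records):
    """Compute global, per-student, and per-course mean grades.

    `records` is a list of (student, course, grade) tuples on a 0-4 scale.
    """
    by_student = defaultdict(list)
    by_course = defaultdict(list)
    for student, course, grade in records:
        by_student[student].append(grade)
        by_course[course].append(grade)
    global_mean = sum(g for _, _, g in records) / len(records)
    student_mean = {s: sum(g) / len(g) for s, g in by_student.items()}
    course_mean = {c: sum(g) / len(g) for c, g in by_course.items()}
    return global_mean, student_mean, course_mean


def predict_ur(rng):
    """Uniform Random (UR): draw a grade uniformly from [0, 4]."""
    return rng.uniform(0.0, 4.0)


def predict_mom(student, course, global_mean, student_mean, course_mean):
    """Mean of Means (MoM): average the global, per-student, and per-course means.

    Assumption: means that do not exist (unseen student or course) are skipped.
    """
    parts = [global_mean]
    if student in student_mean:
        parts.append(student_mean[student])
    if course in course_mean:
        parts.append(course_mean[course])
    return sum(parts) / len(parts)


# Toy training data (made up for illustration).
records = [("s1", "c1", 4.0), ("s1", "c2", 3.0), ("s2", "c1", 2.0)]
global_mean, student_mean, course_mean = fit_means(records)

gm_prediction = global_mean  # Global Mean (GM) predicts this for every pair
mom_prediction = predict_mom("s2", "c2", global_mean, student_mean, course_mean)
ur_prediction = predict_ur(random.Random(0))
```

Here GM predicts 3.0 for every student-course pair, while MoM for student `s2` in course `c2` averages 3.0, 2.0, and 3.0.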
Data Collection - Methods
Three methods based on Matrix Factorization (MF):
• Singular Value Decomposition (SVD)
• SVD-kNN: SVD post-processed with kNN
• Factorization Machine (FM)
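To make the difference between plain matrix factorization and a factorization machine concrete, the sketch below computes both predictions in pure Python. It uses the standard second-order FM form (global bias, linear terms, and factorized pairwise interactions); the parameter values are made up for illustration rather than learned, and the feature layout (one-hot student and course indicators) is an assumption, not the feature set used in the paper.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mf_predict(student_vec, course_vec):
    """MF-style prediction: dot product of latent student and course vectors."""
    return dot(student_vec, course_vec)


def fm_predict(x, w0, w, V):
    """Second-order factorization machine prediction.

    x  : feature values
    w0 : global bias
    w  : one linear weight per feature
    V  : one latent vector per feature
    """
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    pairwise = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            pairwise += dot(V[i], V[j]) * x[i] * x[j]
    return w0 + linear + pairwise


# Hypothetical features: [student_A, student_B, course_X, course_Y], one-hot.
x = [1.0, 0.0, 1.0, 0.0]  # student_A taking course_X
w0 = 2.5
w = [0.3, -0.2, 0.1, -0.4]
V = [[0.5, 1.0], [0.2, 0.1], [0.4, 0.2], [0.3, 0.3]]

fm_grade = fm_predict(x, w0, w, V)
```

With only one-hot student and course features, the FM prediction equals the global bias plus the student and course biases plus the MF dot product of their latent vectors. Adding further features (for example instructor or demographic attributes) adds more linear and pairwise terms.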
Data Collection - Methods
Four different regression models:
• Random Forest (RF)
• Stochastic Gradient Descent (SGD) Regression
• k-Nearest Neighbors (kNN)
• Personalized Linear Multiple Regression (PLMR)
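As an illustration of the SGD regression entry in the list above, here is a minimal sketch of stochastic gradient descent for a one-feature linear model. It only shows the idea of updating the parameters after each individual sample; the function name, learning rate, epoch count, and toy data are assumptions for illustration and say nothing about the configuration used in the paper.

```python
import random


def sgd_linear_regression(xs, ys, lr=0.05, epochs=500, seed=0):
    """Fit y ~ w * x + b with one parameter update per training sample."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of 0.5 * error**2 with respect to w and b.
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = sgd_linear_regression(xs, ys)
```

On this noise-free toy data the fitted slope and intercept approach 2 and 1.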
Data Collection - Methods
Evaluations are performed in terms of two common regression metrics:
• Root Mean Squared Error (RMSE)
• Mean Absolute Error (MAE)
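Both metrics follow directly from their definitions. The sketch below is a plain-Python illustration with made-up grades; the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute differences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Toy grades on a 0-4 scale (made up for illustration).
actual = [4.0, 2.0, 3.0]
predicted = [3.0, 2.0, 1.0]

rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
```

Because RMSE squares each error before averaging, the single two-point miss in this example pulls RMSE above MAE.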
Data Collection - Tools
• The paper does not report which tools were used to analyze the data.
Results
• The FM model outperforms all the others by a wide margin, indicating that 2-way feature interactions play an important role in predicting performance.
• The FM-RF hybrid is an effective method of overcoming the cold-start limitations of FMs and is the most effective method in general.
• When the test distribution differs from the training distribution, the FM model is liable to learn overconfident 2-way interactions that reduce its ability to generalize.
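One way to picture an FM-RF hybrid is as a dispatcher: use one model for student-course pairs seen in training and the other for cold-start pairs. The sketch below assumes exactly that rule (RF for cold-start pairs, FM otherwise); the slides do not spell out how the hybrid combines the two models, so treat the rule, the function names, and the stub models as hypothetical.

```python
def is_cold_start(student, course, seen_students, seen_courses):
    """A pair is cold-start if the student, the course, or both are new."""
    return student not in seen_students or course not in seen_courses


def hybrid_predict(student, course, seen_students, seen_courses, fm_model, rf_model):
    """Assumed dispatch rule: RF for cold-start pairs, FM for all others."""
    if is_cold_start(student, course, seen_students, seen_courses):
        return rf_model(student, course)
    return fm_model(student, course)


# Stub models standing in for trained FM and RF predictors (hypothetical).
def fm_stub(student, course):
    return 3.2


def rf_stub(student, course):
    return 2.7


seen_students = {"s1", "s2"}
seen_courses = {"c1", "c2"}

known_pair = hybrid_predict("s1", "c2", seen_students, seen_courses, fm_stub, rf_stub)
new_student = hybrid_predict("s9", "c1", seen_students, seen_courses, fm_stub, rf_stub)
```

In this sketch the known pair is routed to the FM stub and the pair with the unseen student is routed to the RF stub.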
Implications
• For students, predicted grades can be incorporated into a degree planning system.
• For educators, knowledge of which students have the lowest expected grades could provide opportunities to increase detection of at-risk students.
• For advisors, any additional information that helps them personalize their advice to each student could potentially help thousands of students.
Thank You

Editor's Notes

  • #4: Is the study part of a project? If so, what is the name of the project? Is the study descriptive, diagnostic, predictive, or prescriptive?
  • #7: Cold-start dyads have either a new student, a new course, or both: the student or course appears in the prediction phase but not in the training data from previous terms.
  • #9: (1) random guessing, (2) overall central tendency, (3) row and column averages. For cold-start dyads: if neither the student nor the course has been seen, use GM; if either one has been seen, use MoM.
  • #10: With SVD, each grade is simply predicted as the dot product of the latent student and course feature vectors. This is called the factorized 2-way interaction of the student with the course. Post-processing with k-nearest neighbors (kNN) yields improved predictive performance. Pure collaborative filtering (CF) methods such as SVD and SVD-kNN are unable to make predictions for cold-start records. The FM model is able to capture all of the information captured by the simple baselines as well as the information captured by SVD. In general, it captures the global central tendency, 1-way (linear) relationships between the predictors and the grade (bias terms), and 2-way factorized interactions between each predictor and the grade.
  • #11: Once built, a decision tree can be used for regression on new data samples. The Random Forest then combines many of these trees in a weighted averaging approach to make decisions regarding unseen data. SGD is a gradient-based optimization technique that updates the model parameters incrementally, rather than on the entire training set at once (as standard gradient descent does). This can reduce overfitting and significantly improves training time. The k-Nearest Neighbors (kNN) algorithm is a classic method that predicts the value of a sample from its most similar training samples.
  • #12: RMSE is the metric we use to compare methods. MAE allows us to understand the range of grades we might actually be predicting.