Improving Predictive Analytics for Student Dropout: A Comprehensive Analysis and Model Evaluation | IEEE Conference Publication | IEEE Xplore


Abstract:

This study combines careful data preparation with machine learning model assessment to analyze a dataset of college and university students. An initial exploratory analysis examines target value distributions, economic variables, and student counts by gender. Subsequent preprocessing addresses outlier handling, feature selection, and class imbalance. The study then evaluates several classifiers, including XGBoost, Random Forest, K-Nearest Neighbors (KNN), and Decision Tree, using ROC curves to compare classification strength. Random Forest achieves the highest AUC at 0.99, closely followed by XGBoost at 0.98. XGBoost performs exceptionally well on both the training and testing datasets. The findings offer insight into predictive modeling of student outcomes and highlight its potential to enhance educational support systems. By combining exploratory data analysis with machine learning techniques, this integrated approach provides a framework for future research in educational data mining and predictive analytics.
Date of Conference: 28 February 2024 - 01 March 2024
Date Added to IEEE Xplore: 18 April 2024
Conference Location: New Delhi, India

I. Introduction

Education is essential for navigating an increasingly complicated world and is one of the pillars of social and personal development. However, persistent student dropout threatens the efficacy of educational systems globally, and student failure and dropout rates are a major concern in the education and policy-making communities [1]. Many countries experience high dropout rates in higher education, including Spain [2], [3], the US [4], Germany [5], [6], and Latvia. Liga et al. [7] find that gender, competition marks, total marks, faculty, and university program are connected to student dropout rates; they report a first-year dropout rate of 26%, with engineering science faculties peaking at 47.6% in 2012-2014. Mahbub Hasan et al. [8] observe that around 0.6 million students pass the S.S.C. examination in Bangladesh annually.

Predictive analytics is gaining traction in data science for the early identification of at-risk students, enabling timely support and interventions to reduce dropout rates. This paper evaluates the performance of several predictive methods on the problem of student dropout. Specifically, we focus on five methods: the Synthetic Minority Over-sampling Technique (SMOTE), which addresses class imbalance, and four classifiers, namely XGBoost, k-Nearest Neighbors (KNN), Decision Tree, and Random Forest. These methods were chosen for their relevance and performance across a variety of predictive tasks, which makes them promising candidates for the complex problem of student dropout. Our study is motivated by the need to improve the accuracy and effectiveness of dropout prediction.
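
To make two of the ingredients named above concrete, the following is a minimal sketch, in plain Python, of the idea behind SMOTE (synthesizing minority-class samples by interpolating between neighbouring minority points) and of AUC computed as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. It is not the authors' implementation: the function names `smote_sketch` and `roc_auc_sketch`, the toy data, and the parameter defaults are illustrative assumptions. In practice one would more likely rely on established libraries such as imbalanced-learn and scikit-learn.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Return n_new synthetic samples built from the minority class.

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Assumes `minority` holds at least two samples (hypothetical helper).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def roc_auc_sketch(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy minority class: four points at the corners of the unit square.
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote_sketch(minority, n_new=3))

    # Toy scores: three of the four positive/negative pairs are ordered correctly.
    print(roc_auc_sketch([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```

The pairwise AUC is quadratic in the number of samples, which is fine for illustration but slower than the rank-based computation used by standard libraries. Synthetic samples should be generated from the training split only, so that the test set still reflects the original class distribution.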
