Manuscript received August 21, 2024; revised September 18, 2024; accepted October 14, 2024; published January 20, 2025
Abstract—This paper examines the use of Educational Data Mining (EDM) to predict the academic performance of elementary students in Mathematics. It evaluates ten Machine Learning classifiers in the MATLAB environment: eight base learners (Linear SVM, Logistic Regression, Medium KNN, Wide NN, Fine Decision Trees, Bilayered NN, Fine KNN, and Medium NN) and two ensemble learners (Ensemble Subspace Discriminant and Ensemble Boosted Trees). The analysis uses a dataset of 280 students described by 33 academic and demographic features. To mitigate the imbalanced class distribution, resampling techniques such as Random Under-Sampling Boost (RUSBoost), Synthetic Minority Oversampling Technique (SMOTE), and hybrid combinations of both are employed. The experimental results show that the hybrid-sampling SMOTERUSBoosted Trees algorithm achieves the highest accuracy, 75% on the test data, indicating the efficacy of combining oversampling and under-sampling techniques for modeling imbalanced datasets. This finding underscores the potential of EDM in elementary education to support data-driven interventions and improve students’ Mathematics achievement.
Keywords—educational data mining, mathematics achievement, ensemble learning, imbalanced class, resampling methods
Cite: Hendra Tjahyadi and Krismon N. L. Tude, "The Implementation of Educational Data Mining in Predicting Students’ Academic Achievement in Mathematics at a Private Elementary School," International Journal of Information and Education Technology, vol. 15, no. 1, pp. 154-163, 2025.