Abstract—The objective of this research is to apply data
mining tools and techniques to student enrollment data to
predict retention among freshman students.
In particular, the goal is to identify freshman students who are
more likely to drop out of school so that preemptive actions can
be taken by the university. Through data analysis, we identify
the most relevant enrollment, performance, and financial
variables to construct learning models for retention prediction.
The experiments have been conducted using Decision Trees,
Naïve Bayes, Neural Networks, and Rule Induction models.
These models have been compared and evaluated extensively.
Our findings show that each model has its advantages and
disadvantages and that, among all the input variables, students’
GPA and financial status have a greater impact on retention
than the other variables.
Index Terms—Classification, feature selection, freshman
retention, prediction.
The authors are with the Computer Science Department, Eastern
Washington University, Cheney, WA 99004 USA (e-mail:
adjulovic@eagles.ewu.edu, danl@ewu.edu).
Cite: Admir Djulovic and Dan Li, "Towards Freshman Retention Prediction: A Comparative Study," International Journal of Information and Education Technology, vol. 3, no. 5, pp. 494-500, 2013.