Date Published: February 14, 2017
Publisher: Public Library of Science
Author(s): Sergi Rovira, Eloi Puertas, Laura Igual, J. Alberto Conejero.
Nowadays, the role of a tutor is more important than ever in preventing student dropout and improving academic performance. This work proposes a data-driven system that extracts relevant information hidden in student academic data and thus helps tutors offer their pupils more proactive personal guidance. In particular, our system, based on machine learning techniques, predicts students' dropout intention and course grades, and makes personalized course recommendations. Moreover, we present different visualizations that help in the interpretation of the results. In the experimental validation, we show that the system obtains promising results with data from the degree studies in Law, Computer Science and Mathematics of the Universitat de Barcelona.
Since the Bologna Process was introduced, most European universities have developed a tutoring system to provide their students with mentorship. The responsibilities of the tutor may differ between institutions, but the tutor's main role is to offer personal guidance and advice to the students. Several recent works point out that personalized mentorship is crucial to prevent dropout and improve students' academic performance (see Section 1.1). Indeed, reducing dropout is one of the main current goals of the European Higher Education Area (EHEA). The European dropout rate is 30% according to the publication Education at a Glance (EAG). In Spain, the reported dropout rate stands between 25% and 29%, and at the University of Barcelona the dropout rate from 2009 to 2014 was reported to be around 20%.
In this section, we describe the experiments performed to determine which of the 5 models described in Section 2.2 best predict dropout. We also describe the experiments performed to assess the performance of our recommender system in predicting grades, and compare it with two standard methods described in Section 2.3. Moreover, we provide data visualizations for a better understanding and interpretation of the results.
In this paper, we have presented a data-driven system to help tutors in the early detection of dropout and in the prediction of course grades and course rankings. We have compared different machine learning methods and selected the ones with the best performance. Our classification system's results in dropout prediction are promising, obtaining an F1 score of 82%, 76% and 61% for the degrees in Computer Science, Law and Mathematics, respectively. Regarding final grade prediction, our recommender system predicts the grade with a mean absolute error of 1.21, 1.32 and 1.34 for the degrees in Law, Computer Science and Mathematics, respectively. Moreover, to complete the evaluation of performance, we have developed visualization tools to better understand the obtained results. In particular, these visualizations make it possible to interpret where the system makes errors in dropout prediction for each degree, how the errors of the predicted grades are distributed for each degree, and how correct the ranking is for each degree.
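For readers less familiar with the two metrics reported above, the following is a minimal sketch of how an F1 score and a mean absolute error can be computed. It is not the authors' evaluation code. The toy labels and grades are invented for illustration, treating dropout as the positive class (label 1) is an assumption, and the function names are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class.

    Assumption: `positive` marks the dropout class; the source does not
    state which class is treated as positive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(true_grades, predicted_grades):
    """Average absolute difference between true and predicted grades."""
    if not true_grades or len(true_grades) != len(predicted_grades):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_grades, predicted_grades))
    return total / len(true_grades)


# Toy data (invented): 1 = student dropped out, 0 = student continued.
dropout_true = [1, 1, 1, 0, 0, 0, 1, 0]
dropout_pred = [1, 1, 0, 0, 1, 0, 1, 0]
toy_f1 = f1_score(dropout_true, dropout_pred)

# Toy grades (invented), assumed to be on a 0-10 scale.
grades_true = [5.0, 7.5, 6.0, 9.0]
grades_pred = [6.0, 7.0, 4.5, 9.0]
toy_mae = mean_absolute_error(grades_true, grades_pred)
```

Under this reading, a mean absolute error of 1.21 would mean that predicted grades deviate from the true grades by 1.21 grade points on average, and an F1 score of 82% reflects a balance between how many predicted dropouts were real and how many real dropouts were detected.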