SENTIDROP: A MULTI-MODAL MACHINE LEARNING MODEL FOR PREDICTING DROPOUT IN DISTANCE
M. Mihoubi, M. Zerkouk, B. Chikhaoui
University of Téluq, Institute of Applied Artificial Intelligence (I2A Institute) (CANADA)
School dropout is a serious problem in distance learning, where early detection is crucial for effective intervention and student perseverance. Predicting student dropout from available educational data is a widely researched topic in learning analytics. The case of our partner's distance learning platform highlights the importance of integrating diverse data sources, including socio-demographic data, behavioral data, and sentiment analysis, to predict dropout risk accurately. In this paper, we introduce a novel model that combines sentiment analysis of student comments, performed with the Bidirectional Encoder Representations from Transformers (BERT) model, with socio-demographic and behavioral data analyzed through Extreme Gradient Boosting (XGBoost). We fine-tuned BERT on student comments to capture nuanced sentiments, then merged the resulting sentiment outputs with key features selected using XGBoost feature importance techniques. The model was tested on unseen data from the next academic year, achieving an accuracy of 84%, compared to 82% for the baseline model. It also outperformed the baseline on other metrics, such as precision and F1-score. The proposed method could be a vital tool in developing personalized strategies to reduce dropout rates and encourage student perseverance.
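
The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify how the sentiment outputs are merged with the tabular features, so simple concatenation is assumed here. The feature names, the importance scores, and the three-class sentiment probabilities are all hypothetical placeholders; in practice the importance scores might come from a trained XGBoost model and the probabilities from a fine-tuned BERT classifier.

```python
from typing import Dict, List, Sequence


def select_top_features(importances: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def fuse_features(
    tabular_row: Dict[str, float],
    selected: Sequence[str],
    sentiment_probs: Sequence[float],
) -> List[float]:
    """Concatenate the selected tabular values with the sentiment probabilities.

    Concatenation is an assumption; the source does not state the merge rule.
    """
    tabular_part = [float(tabular_row[name]) for name in selected]
    sentiment_part = [float(p) for p in sentiment_probs]
    return tabular_part + sentiment_part


# Hypothetical importance scores (stand-ins for scores from a trained model).
importances = {
    "age": 0.05,
    "logins_per_week": 0.40,
    "assignments_submitted": 0.35,
    "region_code": 0.02,
}

# Keep only the two most important tabular features.
selected = select_top_features(importances, k=2)

# One hypothetical student record and a hypothetical sentiment output
# (negative, neutral, positive) for that student's comments.
row = {"age": 31, "logins_per_week": 1.5, "assignments_submitted": 2, "region_code": 7}
sentiment_probs = [0.70, 0.20, 0.10]

# The fused vector is what a downstream dropout classifier would consume.
fused = fuse_features(row, selected, sentiment_probs)
print(selected, fused)
```

Under these assumptions, each student is represented by one fused vector, which a final classifier could then use to estimate dropout risk.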

Keywords: School dropout, Machine learning (ML), Sentiment analysis, Behavioral data, Socio-demographic data, Prediction.

Event: EDULEARN25
Track: Digital & Distance Learning
Session: e-Learning Experiences
Session type: VIRTUAL