Methodological Considerations for Predicting At-risk Students

Access rights

openAccess
acceptedVersion

A4 Article in conference proceedings

Language

en

Pages

9

Series

ACE '22: Australasian Computing Education Conference, pp. 105-113

Abstract

Educational researchers have long sought to increase student retention. One stream of this research seeks to automatically identify students who are at risk of dropping out. Studies tend to agree that earlier identification of at-risk students is better, as it leaves more room for targeted interventions. We study the interplay between data and the predictive power of machine learning models used to identify at-risk students. We critically examine the often-used approach in which data collected from weeks 1, 2, ..., n is used to predict whether a student becomes inactive in a subsequent week w, w ≥ n + 1, and point out issues with this approach that may inflate models’ predictive power. Specifically, our empirical analysis highlights that including students who became inactive in week n or earlier, where n > 1, in the data used to identify students who are inactive in the following weeks is a significant cause of bias. Including students who dropped out during the first week makes the problem significantly easier, since they have no data in the subsequent weeks. Based on our results, we recommend including only students who are active until week n when building and evaluating models for predicting dropouts in subsequent weeks, and evaluating and reporting the particularities of the respective course contexts.
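
The recommendation in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical data layout (one list of weekly activity counts per student) and a hypothetical definition of inactivity (a student is inactive from the week after their last recorded activity). The names last_active_week, build_dataset and only_active_until_n are introduced here purely for illustration.

```python
from typing import Dict, List, Tuple


def last_active_week(activity: List[int]) -> int:
    """Return the 1-based index of the last week with any activity, or 0 if none."""
    last = 0
    for week, count in enumerate(activity, start=1):
        if count > 0:
            last = week
    return last


def build_dataset(
    weekly_activity: Dict[str, List[int]],
    n: int,
    only_active_until_n: bool = True,
) -> Dict[str, Tuple[List[int], int]]:
    """Map each student to (features from weeks 1..n, label).

    Label 1 means the student shows no activity in any week w >= n + 1.
    Assumption: "active until week n" means the last active week is >= n.
    """
    dataset = {}
    for student, activity in weekly_activity.items():
        last = last_active_week(activity)
        # Skip students who already became inactive before week n; keeping
        # them is the source of bias described in the abstract.
        if only_active_until_n and last < n:
            continue
        features = activity[:n]
        label = 1 if last <= n else 0
        dataset[student] = (features, label)
    return dataset


# Toy data (fabricated for illustration only): activity counts for weeks 1..4.
toy = {
    "s1": [5, 0, 0, 0],  # stopped after week 1
    "s2": [4, 3, 0, 0],  # stopped after week 2
    "s3": [6, 2, 4, 1],  # active throughout
    "s4": [3, 1, 2, 0],  # stopped after week 3
}

biased = build_dataset(toy, n=2, only_active_until_n=False)
filtered = build_dataset(toy, n=2)
```

In the unfiltered variant, student s1 is a positive example whose week 2 feature is already zero, so even a trivial rule can flag it. The filtered variant keeps only s2, s3 and s4, which is the harder and more realistic task of predicting dropout among students who are still active in week n.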

Citation

Koutcheme, C, Sarsa, S, Hellas, A, Haaranen, L & Leinonen, J 2022, Methodological Considerations for Predicting At-risk Students. in ACE '22: Australasian Computing Education Conference. ACM, pp. 105-113, Australasian Computing Education Conference, Virtual, Online, Australia, 14/02/2022. https://doi.org/10.1145/3511861.3511873