G is for Generalisation: Predicting Student Success from Keystrokes
Access rights
openAccess
acceptedVersion
A4 Article in conference proceedings
This publication is imported from Aalto University research portal.
Date
2023-03-02
Language
en
Pages
7
Series
SIGCSE 2023 - Proceedings of the 54th ACM Technical Symposium on Computer Science Education, pp. 1028-1034
Abstract
Student performance prediction aims to build models to help educators identify struggling students so they can be better supported. However, prior work in the space frequently evaluates features and models on data collected from a single semester, of a single course, taught at a single university. Without evaluating these methods in a broader context there is an open question of whether or not performance prediction methods are capable of generalising to new data. We test three methods for evaluating student performance models on data from introductory programming courses from two universities with a total of 3,323 students. Our results suggest that using cross-validation on one semester is insufficient for gauging model performance in the real world. Instead, we suggest that where possible future work in student performance prediction collects data from multiple semesters and uses one or more as a distinct hold-out set. Failing this, bootstrapped cross-validation should be used to improve confidence in models' performance. By recommending stronger methods for evaluating performance prediction models, we hope to bring them closer to practical use and assist teachers to understand struggling students in novice programming courses.
Description
Publisher Copyright: © 2023 ACM.
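The three evaluation styles named in the abstract can be contrasted with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single feature stands in for a hypothetical keystroke statistic, and the threshold classifier is not the authors' model. The abstract does not spell out the bootstrapped procedure, so out-of-bag bootstrap evaluation is used here as a stand-in.

```python
import random
import statistics

def make_semester(n, pass_mean, fail_mean, seed):
    """Synthetic (feature, passed) pairs; the feature is a hypothetical keystroke statistic."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        passed = rng.random() < 0.5
        mean = pass_mean if passed else fail_mean
        data.append((rng.gauss(mean, 1.0), passed))
    return data

def fit_threshold(train):
    """Toy model: midpoint between the class means; predict 'pass' above it."""
    pass_vals = [x for x, y in train if y]
    fail_vals = [x for x, y in train if not y]
    return (statistics.mean(pass_vals) + statistics.mean(fail_vals)) / 2

def accuracy(threshold, test):
    """Fraction of students whose predicted outcome matches the true outcome."""
    return sum((x > threshold) == y for x, y in test) / len(test)

def k_fold_accuracy(data, k, seed):
    """Mean accuracy over k folds drawn from a single semester."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(fit_threshold(train), folds[i]))
    return statistics.mean(scores)

def bootstrap_oob_accuracies(data, replicates, seed):
    """Train on a bootstrap resample, score on the students left out of it."""
    rng = random.Random(seed)
    n = len(data)
    scores = []
    for _ in range(replicates):
        chosen = [rng.randrange(n) for _ in range(n)]
        chosen_set = set(chosen)
        train = [data[i] for i in chosen]
        oob = [data[i] for i in range(n) if i not in chosen_set]
        scores.append(accuracy(fit_threshold(train), oob))
    return scores

# Semester B is deliberately shifted to mimic a cohort that differs from semester A.
semester_a = make_semester(400, pass_mean=2.0, fail_mean=0.0, seed=1)
semester_b = make_semester(400, pass_mean=3.5, fail_mean=1.5, seed=2)

cv_acc = k_fold_accuracy(semester_a, k=5, seed=0)
holdout_acc = accuracy(fit_threshold(semester_a), semester_b)
boot = bootstrap_oob_accuracies(semester_a, replicates=200, seed=0)

print(f"5-fold CV on semester A:     {cv_acc:.3f}")
print(f"Hold-out on semester B:      {holdout_acc:.3f}")
print(f"Bootstrap OOB on semester A: {statistics.mean(boot):.3f} +/- {statistics.stdev(boot):.3f}")
```

In this toy setup the within-semester score is optimistic because the shifted semester moves the best decision threshold. The bootstrap spread shows how much the single-semester estimate varies, but it cannot reveal the shift on its own.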
Keywords
computing education, educational data mining, learning analytics, predicting performance, programming process data
Citation
Pullar-Strecker, Z, Pereira, F D, Denny, P, Luxton-Reilly, A & Leinonen, J 2023, G is for Generalisation: Predicting Student Success from Keystrokes. in SIGCSE 2023 - Proceedings of the 54th ACM Technical Symposium on Computer Science Education. ACM, pp. 1028-1034, ACM Technical Symposium on Computer Science Education, Toronto, Canada, 15/03/2023. https://doi.org/10.1145/3545945.3569824