Direct Repair Optimization: Training Small Language Models For Educational Program Repair Improves Feedback


Access rights

openAccess
CC BY
publishedVersion

A4 Article in conference proceedings

Language

en

Series

Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), pp. 564–581

Abstract

Locally deployed Small Language Models (SLMs) offer a promising solution for providing timely and effective programming feedback to students learning to code. However, SLMs often produce misleading or hallucinated feedback, limiting their reliability in educational settings. Current approaches for improving SLM feedback rely on existing human annotations or LLM-generated feedback. This paper addresses a fundamental challenge: Can we improve SLMs’ feedback capabilities without relying on human or LLM-generated annotations? We demonstrate that training SLMs on the proxy task of program repair is sufficient to enhance their ability to generate high-quality feedback. To this end, we introduce Direct Repair Optimization (DRO), a self-supervised online reinforcement learning strategy that trains language models to reason about how to efficiently fix students’ programs. Our experiments, using DRO to fine-tune LLaMA-3.1-3B and Qwen-2.5-3B on a large-scale dataset of Python submissions from real students, show substantial improvements on downstream feedback tasks. We release our code to support further research in educational feedback and highlight promising directions for future work.
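The self-supervised training signal the abstract describes, rewarding a model for efficiently fixing a student's program, could be sketched as a reward function that scores a candidate repair by the unit tests it passes, discounted by how far it strays from the original submission. The function names, the edit-distance penalty, and the `alpha` weight below are illustrative assumptions for this sketch, not the paper's actual DRO formulation.

```python
import difflib
import subprocess
import sys
import tempfile


def run_tests(program: str, tests: list[tuple[str, str]]) -> float:
    """Fraction of (stdin, expected_stdout) test cases the program passes."""
    passed = 0
    for stdin_data, expected in tests:
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(program)
            path = f.name
        try:
            result = subprocess.run(
                [sys.executable, path],
                input=stdin_data,
                capture_output=True,
                text=True,
                timeout=5,
            )
            if result.stdout.strip() == expected.strip():
                passed += 1
        except subprocess.TimeoutExpired:
            pass  # a hanging repair earns no credit for this case
    return passed / len(tests)


def repair_reward(buggy: str, repaired: str,
                  tests: list[tuple[str, str]], alpha: float = 0.5) -> float:
    """Correctness minus an edit-size penalty, so minimal fixes score highest.

    `alpha` (assumed here) trades off passing tests against rewriting the
    student's code wholesale; similarity 1.0 means no penalty at all.
    """
    correctness = run_tests(repaired, tests)
    similarity = difflib.SequenceMatcher(None, buggy, repaired).ratio()
    return correctness - alpha * (1.0 - similarity)
```

Under this kind of reward, a one-token fix that makes all tests pass (e.g. changing `+` to `*` in a doubling exercise) outscores both the unrepaired submission and a correct-but-unrelated reference solution, which matches the paper's stated goal of teaching models to fix programs *efficiently* rather than replace them.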

Citation

Koutcheme, C, Dainese, N & Hellas, A 2025, Direct Repair Optimization: Training Small Language Models For Educational Program Repair Improves Feedback. in E Kochmar, B Alhafni, M Bexte, J Burstein, A Horbach, R Laarmann-Quante, A Tack, V Yaneva & Z Yuan (eds), Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025). Association for Computational Linguistics, pp. 564–581, Workshop on Innovative Use of NLP for Building Educational Applications, Vienna, Austria, 31/07/2025. <https://aclanthology.org/2025.bea-1.41/>