F. Gomez-Donoso, F. Escalona, M. Cazorla, G. Gonzalez-Serrano, D. Viejo-Hernando, B. Dominguez-Dager, F. Morillas-Espejo, C. Zambrana-Navajas, S. Suescun-Ferrandiz
With the increasing demand for efficient and accurate code evaluation systems, this study introduces a novel approach that leverages large language models (LLMs) to assess the correctness of source code and provide targeted improvement suggestions. Traditional code review processes, particularly in educational settings, often require significant human intervention, which is time-consuming and can lead to inconsistencies arising from subjective interpretation. The proposed system addresses these challenges by using LLMs trained on extensive code datasets to automatically analyze code snippets, verify their functionality, and offer optimization recommendations in real time.
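The pipeline described above — analyze a submission, verify its functionality, then pass the findings to an LLM for feedback — can be sketched minimally as follows. This is an illustrative assumption, not the authors' implementation: the function names (`static_check`, `functional_check`, `build_review_prompt`) are hypothetical, and the actual LLM call is left as a labeled placeholder.

```python
import ast

def static_check(code: str) -> list[str]:
    """Return a list of syntax errors found in the submission."""
    try:
        ast.parse(code)
        return []
    except SyntaxError as exc:
        return [f"line {exc.lineno}: {exc.msg}"]

def functional_check(code: str, tests: list[tuple[str, object]]) -> list[str]:
    """Run the submission and compare test expressions against expected values."""
    namespace: dict = {}
    exec(code, namespace)  # sandboxing omitted for brevity
    failures = []
    for expr, expected in tests:
        result = eval(expr, namespace)
        if result != expected:
            failures.append(f"{expr} -> {result!r}, expected {expected!r}")
    return failures

def build_review_prompt(code: str, errors: list[str]) -> str:
    """Assemble the prompt that would be sent to the LLM reviewer (placeholder)."""
    report = "\n".join(errors) or "All checks passed."
    return (
        "Review this student submission and suggest improvements.\n"
        f"Checker report:\n{report}\n\nCode:\n{code}"
    )

submission = "def square(n):\n    return n * n\n"
errors = static_check(submission) + functional_check(submission, [("square(4)", 16)])
prompt = build_review_prompt(submission, errors)
# The prompt would then be dispatched to an LLM to generate targeted feedback.
```

The key design point is that deterministic checks (syntax, tests) ground the LLM's feedback in verified facts about the submission rather than relying on the model alone to judge correctness.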
For students, this system presents a valuable learning tool, providing instant feedback on coding assignments and helping them understand common errors and best practices without needing immediate instructor intervention. By receiving detailed, actionable feedback, students can iteratively improve their code, thereby deepening their programming skills and fostering self-directed learning. Instructors, on the other hand, benefit from a scalable solution that reduces the manual workload of grading and reviewing code submissions. The system's consistent and data-driven approach also helps ensure a uniform assessment standard, which is particularly advantageous in large classrooms or online courses where personalized feedback can be challenging to deliver.
The system was evaluated across multiple programming tasks and languages to ensure adaptability and robustness, with experimental results indicating high accuracy in identifying syntactical and logical errors, as well as in generating insightful improvement suggestions. This research not only advances automated code evaluation technology but also underscores the potential of intelligent programming assistants to enhance educational practices in computer science and software engineering courses. The proposed LLM-based system offers an innovative step towards bridging the gap between human expertise and automated feedback, supporting both students' learning experiences and educators' teaching efficiency.
Keywords: LLM, technology, education, code development, code evaluation.