AUTOMATED ASSESSMENT OF ARTEFACTS USING AI
C. Del Gobbo, M. Birkenkrahe, N. Yonts, W. Beal
Lyon College (UNITED STATES)
This study, conducted in the summer of 2024, tests whether generative AI can serve as an efficient non-expert grader and feedback provider in school-wide curriculum assessment, in which non-expert educators are asked to evaluate student artefacts from a wide spectrum of disciplines at undergraduate level.

To generate the content, we used the conversational interface to OpenAI's ChatGPT-4o to create diverse student personas and assignments based on exam questions from two undergraduate courses on religion and philosophy. A total of 40 artefacts were created, 20 for each question. A custom GPT model ("grader-GPT") was developed for grading and providing feedback according to an existing rubric.

Three educators without specific expertise in the content area graded and provided feedback on the synthetic artefacts using the same rubrics as the grader-GPT. Both the educators and the grader-GPT provided numerical grades and feedback in a specified format to ensure consistency, and recorded the time required to grade each artefact.

The artefacts, along with their respective grades and feedback from both the educators and the grader-GPT, were evaluated by a separate group of educators using a questionnaire designed to assess the accuracy and usefulness of the grades and the feedback.

Findings include the degree of alignment between the grades given by the grader-GPT and those given by the educators, a comparison of the quality of the grades and feedback, and potential time savings. The benefits of this approach are weighed against the potential drawbacks to lay a foundation for a possible integration of generative AI in the assessment process.
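
The abstract does not state which measures were used to quantify alignment or time savings. As a minimal sketch of how such a comparison could be set up, the Python below computes a mean absolute grade difference, a Pearson correlation, and a fractional time saving for paired grades. The function names and all numbers are hypothetical placeholders chosen for illustration; they are not data or results from the study.

```python
from statistics import mean


def mean_absolute_difference(human, ai):
    """Average absolute gap between paired human and AI grades."""
    if len(human) != len(ai) or not human:
        raise ValueError("grade lists must be non-empty and of equal length")
    return mean(abs(h - a) for h, a in zip(human, ai))


def pearson_r(human, ai):
    """Pearson correlation between paired human and AI grades."""
    if len(human) != len(ai) or len(human) < 2:
        raise ValueError("need at least two paired grades")
    mean_h, mean_a = mean(human), mean(ai)
    cov = sum((h - mean_h) * (a - mean_a) for h, a in zip(human, ai))
    var_h = sum((h - mean_h) ** 2 for h in human)
    var_a = sum((a - mean_a) ** 2 for a in ai)
    if var_h == 0 or var_a == 0:
        raise ValueError("correlation is undefined when all grades are equal")
    return cov / (var_h * var_a) ** 0.5


def time_saving_fraction(human_seconds, ai_seconds):
    """Share of total human grading time saved by the AI grader."""
    total_human = sum(human_seconds)
    if total_human <= 0:
        raise ValueError("total human grading time must be positive")
    return 1 - sum(ai_seconds) / total_human


if __name__ == "__main__":
    # Placeholder values for four artefacts (NOT taken from the study).
    human_grades = [80, 70, 90, 60]
    ai_grades = [85, 70, 85, 65]
    human_times = [300, 240, 360, 300]  # seconds per artefact
    ai_times = [30, 30, 30, 30]         # seconds per artefact

    print("Mean absolute difference:", mean_absolute_difference(human_grades, ai_grades))
    print("Pearson r:", round(pearson_r(human_grades, ai_grades), 3))
    print("Time saved:", round(time_saving_fraction(human_times, ai_times), 3))
```

With three human graders per artefact, one option would be to compare the grader-GPT against the mean human grade for each artefact; other agreement statistics may suit a rubric with ordinal levels better.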

Attendees of this presentation will gain insights into the practical applications of generative AI in educational settings, specifically in grading and feedback. They will learn about the methodology used in creating and evaluating synthetic student artefacts, the effectiveness of AI compared to human graders, and the potential benefits of AI augmentation in curriculum assessment.

Attendees will also learn how to implement AI tools in their own assessment processes, what the advantages and limitations of doing so are, and how to achieve a balanced approach that leverages the strengths of both AI and human oversight.

Keywords: Generative AI, educational assessment, automated grading, custom GPT-4o, AI in education.