EVALUATING THE DISCRIMINATION INDEX OF AI-GENERATED VS. HUMAN-GENERATED MULTIPLE-CHOICE QUESTIONS: ACTION RESEARCH
S. Zakareya, N. Alsaleem, A. Alnaghmaish, N. Alnaim, F. Alojail
Imam Abdulrahman Bin Faisal University (SAUDI ARABIA)
Multiple-choice questions (MCQs) are a widely used assessment format in education, but they can be challenging to write effectively. One key metric for evaluating the quality of MCQs is the discrimination index, which measures how well a question differentiates between high- and low-performing students. This action research project explored whether artificial intelligence (AI) could increase the discrimination indices of MCQs. The researchers hypothesized that AI-generated MCQs would outperform those written by human instructors in terms of discrimination. To test this hypothesis, they conducted a pilot study in a research methodology course, recruiting 24 students and randomly assigning them to one of two groups: one group answered a set of MCQs written by human instructors, while the other answered AI-generated MCQs. The results of the pilot study were mixed but included some promising findings for AI-based MCQ generation, suggesting that while AI has the potential to improve MCQ quality, further refinement and integration with human expertise may be needed to achieve consistently high-quality results. The researchers concluded that using AI for MCQ generation is a promising approach that warrants further investigation, and they proposed a second cycle of research that would integrate AI generation with human instructor input.

Keywords: Discrimination index, MCQs, Artificial Intelligence.
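The discrimination index referred to in the abstract is commonly computed from an upper/lower group split of examinees ranked by total test score. The sketch below illustrates that standard calculation only; it is not the authors' actual analysis, and the 27% group split, function name, and sample data are assumptions made for illustration.

# Minimal sketch of a common discrimination index calculation
# (upper/lower group method). The 27% split, names, and sample data
# are illustrative assumptions, not taken from the study.

def discrimination_index(item_correct, total_scores, group_fraction=0.27):
    """Return D = (correct in upper group - correct in lower group) / group size.

    item_correct: list of 0/1 flags, 1 if the student answered the item correctly.
    total_scores: list of each student's total test score (same order).
    """
    n = len(total_scores)
    group_size = max(1, round(n * group_fraction))

    # Rank students by total score, highest first.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]

    upper_correct = sum(item_correct[i] for i in upper)
    lower_correct = sum(item_correct[i] for i in lower)
    return (upper_correct - lower_correct) / group_size


# Example: 10 students, one MCQ item.
scores = [18, 15, 14, 13, 12, 11, 10, 9, 7, 5]   # total test scores
correct = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]          # responses to the item
print(discrimination_index(correct, scores))       # -> 1.0 for this toy data

Discrimination index values range from -1 to +1, and items scoring above roughly 0.3 are conventionally regarded as discriminating well; this is the quality dimension on which the study compares AI-generated and instructor-written items.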