CONSISTENCY AND ORIGINALITY IN GENERATIVE ARTIFICIAL INTELLIGENCE (GEN AI) OUTPUT IN HIGHER EDUCATION INSTITUTIONS (HEIS): CAN WE TRUST IT?
L. Zizka
EHL Hospitality Business School / HES-SO University of Applied Sciences and Arts Western Switzerland (SWITZERLAND)
The term Artificial Intelligence (AI) has been used for decades to describe ‘human’ tasks completed by a computer or robot. One type of AI is the Large Language Model (LLM), which operates by leveraging deep learning techniques to complete a multitude of daily tasks. LLMs are trained on articles, books, and Internet resources to produce human-like responses to natural language queries. More recently, the term Generative AI (Gen AI) was coined to describe the creation of content in response to designated prompts, replicating human conversation. While there are many examples of Gen AI, this paper will focus on one of the most popular and free versions, ChatGPT. Its name combines Chat (a chatbot created by OpenAI) and GPT (generative pre-trained transformer).

This study addresses the following questions:
RQ1: When responding to the same prompt, is ChatGPT consistent in the quality of the output?
RQ2: When responding to the same prompt, is ChatGPT original in the output it provides?
RQ3: When responding to the same prompt, is ChatGPT’s output trustworthy enough to submit for evaluation?

According to previous research, consistency can improve over time: the more prompts are refined and the more often Gen AI is used, the better its output should become. This paper will summarize several types of consistency, such as semantic, symmetric, and transitive consistency, and will also address the topic of negative inconsistency. Prior studies have analyzed the coherence, consistency, accuracy, and reliability of LLM outputs by incorporating a self-consistency approach. For RQ2, studies have examined Gen AI’s inability to generate original text, music, and image content. To date, while many convincing examples have been touted on the Internet, Gen AI is not yet capable of creating original pieces of work; it must follow existing examples and replicate output from them. Finally, the topic of trustworthiness (RQ3) will be linked first and foremost to the ethics and transparency of using Gen AI, and then connected to consistency and originality when considering assignments submitted for evaluation.

To examine this topic, the researcher engineered one prompt that was run in free ChatGPT 100 times in one session. When patterns appeared in the output, suggesting a potential limitation of having the same person run the same prompt repeatedly, the researcher then included students enrolled in an Academic Writing course. Sixty-one students responded by sharing the output they received when they entered the same prompt.
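
For readers who wish to approximate this repeated-prompt comparison programmatically, the following is a minimal sketch only. The study itself used the free ChatGPT web interface, not an API; this sketch assumes access to the OpenAI Python client (the openai package, v1 or later, with an API key set in the environment), a hypothetical stand-in prompt, and a rough lexical similarity measure (difflib.SequenceMatcher), which is only a surface proxy for the semantic consistency discussed above.

# Illustrative sketch only: the study used the free ChatGPT web interface, not the API.
# Assumptions: the `openai` package (v1+) is installed, OPENAI_API_KEY is set in the
# environment, and PROMPT below is a hypothetical stand-in for the engineered prompt.
from itertools import combinations
from difflib import SequenceMatcher

from openai import OpenAI

PROMPT = "Write a 200-word essay on sustainability in the hospitality industry."  # hypothetical
N_RUNS = 10  # the study used 100 runs; kept small here to limit API cost

client = OpenAI()

def get_response(prompt: str) -> str:
    """Send the same prompt in a fresh request and return the model's reply."""
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model would work here
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content

# Collect repeated outputs for one identical prompt.
outputs = [get_response(PROMPT) for _ in range(N_RUNS)]

# Pairwise surface similarity (0 = entirely different, 1 = identical).
# This is a lexical proxy, not a measure of semantic consistency.
scores = [SequenceMatcher(None, a, b).ratio() for a, b in combinations(outputs, 2)]

print(f"Runs: {N_RUNS}")
print(f"Mean pairwise similarity: {sum(scores) / len(scores):.2f}")
print(f"Min / max similarity: {min(scores):.2f} / {max(scores):.2f}")

Under these assumptions, a high mean pairwise similarity across runs would point toward output that is consistent but not original, which is the pattern the early results below describe.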

Based on early results, the output is consistent, if not overly similar. Responses often follow the same pattern and structure in answering the same prompt. Interestingly, this finding also supports a negative answer to RQ2, as it confirms that the output of a prompt run many times is not original. Finally, in response to RQ3, the question of trustworthiness can only be discussed within a learning environment that provides guidelines and best practices for evaluating the output received.

The novelty of this study resides in a student’s ability to analyze the trustworthiness of ChatGPT content through the lens of its potential to be submitted for a grade. We intend to create guidelines or best practices to help students decide whether to submit such work.

Keywords: Higher Education Institutions (HEIs), Generative Artificial Intelligence (Gen AI), Academic Writing, hospitality management studies, Switzerland, consistency, originality.

Event: INTED2025
Track: Innovative Educational Technologies
Session: Generative AI in Education
Session type: VIRTUAL