T. Karunaratne1, A. Jalali2, L. Assom2
Methodological attempts to accommodate data from educational technology (EdTech) systems focus, to a significant extent, on optimizing the quality of the extracted information rather than the quality of the raw data. Popular data-driven methodologies, such as learning analytics and educational data mining, rely on data collected from EdTech systems as their point of departure. However, contemporary research argues that high-quality analytics systems for educational decision-making should begin by assessing the quality of raw data, including efforts to identify - or “datafy” - previously unquantified aspects of the learners and their environments. This paper argues for the need to datafy educational processes in order to optimise the reliance on data for enhancing education. Assuming that high-quality data leads to high-quality analytics, and that the quality of raw educational data can be improved through careful datafication, this scoping review explores three research questions: How is datafication defined in the education domain? What is its purpose? How has datafication been implemented in the literature?
The methodological approach is a systematic literature review using the PRISMA framework, yielding a selection of 160 articles for analysis. Three researchers conducted the selection process: two independently screened articles, categorizing them as “yes,” “no,” or “maybe,” while the third resolved conflicts. The articles were analyzed using a qualitative coding structure focused on definition, purpose, method, and conclusion.
Findings reveal that the articles define, or more often merely present, datafication without necessarily sharing the same underlying meaning. Although the terminological point of reference for datafication is Big Data by Schoenberger et al., most articles presented the concept as something closer to the autonomous harbouring of data in EdTech systems. An interesting outcome was how specifically the broad concept of datafication has been positioned in the selected literature. Thus, rather than proposing a unified definition, this study summarises the different interpretations (or points of view), because there was little to no evidence that any use of the term qualifies as a concrete definition.
Regarding the purpose of datafication, most of the literature focused on evaluating analytics approaches and the quality of analytics outcomes, including the usefulness of those outcomes for learning within their units of analysis. We noted a heavy inclination towards performance analysis of learning platforms or statistical algorithms, rather than discussion of the quality implications of raw data. Moreover, we observed rudimentary and often inconsistent use of the term “datafication,” indicating a weak link between data and educational enhancement. This loose dependency could limit the scalability of those proofs-of-concept, preventing their broader application beyond the scope of the individual studies.
Based on these outcomes, this study highlights a gap in the literature regarding the quality of raw data and the role of datafication in building a sustainable, data-driven educational environment. It underscores the potential of datafication to enhance insights into teaching, learning, and education more broadly. Additionally, the outcomes call for deeper investigation into datafication’s impact on user-centricity and into the ethical considerations involved in preserving data subjects’ privacy when optimising processes through datafication.
Keywords: Datafication, Education, Data-driven, Education Technology.