A PRELIMINARY STUDY ON USING SVM TO ANALYZE CHANGES IN TEACHER VOICE STYLES FOR KEY DURATION DETECTION IN LECTURE ARCHIVES
L. Xiaoting1, G. Wen1, P. Siritanawan2, H. Shinobu1
1 Japan Advanced Institute of Science and Technology (JAPAN)
2 Shinshu University (JAPAN)
In the post-pandemic era, distance learning continues to play a pivotal role in education. As its adoption has grown, the volume of educational video archives has expanded significantly, making it increasingly difficult for students to locate important information efficiently. Helping students accurately identify important information within online classroom sessions has therefore become a critical issue in distance learning. Teachers are known to alter their voice style when delivering important information in a lecture in order to emphasize that content and convey it more effectively. This study analyzes teachers’ voice styles through AI-driven classification to automatically identify important instructional regions in online lectures. Since existing databases do not adequately capture the diversity of teaching voice styles, a new voice style database was created. Lecture audio was recorded with a ceiling microphone, which captured both teacher voices and ambient noise, and with a lavalier microphone, which provided cleaner teacher audio. The database consists of audio collected from 20 student participants in actual lecture environments, resulting in over 4,098 sentences and 1,024.5 minutes of recordings. Given the limited research on classifying instructional speaking styles in classroom environments, this study employs a Support Vector Machine (SVM), a widely used and reliable model, as a baseline. As an initial exploration of this area, SVM provides a robust foundation owing to its well-documented performance and interpretability in similar classification tasks. The initial SVM classification of voice styles achieved an accuracy of 76% for audio recorded by ceiling microphones and 83% for audio recorded by lavalier microphones.
These findings suggest that audio from lavalier microphones yields higher classification accuracy, likely because it is clearer, whereas ceiling microphones pick up environmental noise and reverberation that may reduce accuracy. Future work will focus on improving the classification accuracy for ceiling-microphone audio. The findings are expected to help students understand and review important information in distance learning more effectively, thus enhancing their learning experience and efficiency.
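
To make the SVM baseline idea concrete, the following is a minimal sketch only, not the authors' pipeline. The abstract does not state which acoustic features, kernel, or toolkit were used, so everything here is an assumption: a hypothetical two-dimensional feature vector per sentence (standardized pitch and energy), synthetic data in place of the lecture recordings, and a linear SVM trained by stochastic subgradient descent on the hinge loss. It does not attempt to reproduce the reported 76% and 83% accuracies.

```python
import random


def make_synthetic_features(n, seed):
    """Generate hypothetical (pitch_z, energy_z) pairs; NOT real lecture data.

    Label +1 stands for an 'emphasized' voice style, -1 for a 'neutral' one.
    """
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        center = 1.5 * label  # assumed class separation, for illustration only
        X.append([rng.gauss(center, 0.7), rng.gauss(center, 0.7)])
        y.append(label)
    return X, y


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=20, seed=0):
    """Linear SVM: SGD on the L2-regularized hinge loss, labels in {-1, +1}."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Gradient step for the regularizer: shrink the weights slightly.
            w = [(1.0 - lr * lam) * wj for wj in w]
            # Hinge-loss subgradient step, only when the margin is violated.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


def accuracy(w, b, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for xi, yi in zip(X, y) if predict(w, b, xi) == yi)
    return correct / len(y)


if __name__ == "__main__":
    X_train, y_train = make_synthetic_features(200, seed=1)
    X_test, y_test = make_synthetic_features(100, seed=2)
    w, b = train_linear_svm(X_train, y_train)
    print("held-out accuracy on synthetic data:", accuracy(w, b, X_test, y_test))
```

In a setup like the one described, one such classifier could be trained per recording condition (ceiling versus lavalier) and the held-out accuracies compared. The synthetic clusters here are far cleaner than real classroom audio, so the number printed says nothing about the study's results.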

Keywords: Distance Learning, Voice Style Recognition, Sound Processing.

Event: INTED2025
Session: Online & Mobile Learning
Session time: Tuesday, 4th of March from 15:00 to 16:45
Session type: ORAL