C. Luna Jimenez, A. Heimerl, F. Hellmann, B. Mahesh, E. André
In the context of contemporary education, traditional pedagogical approaches based on passively attending lectures often prove inadequate for engaging learners who are increasingly immersed in a dynamic, technology-driven environment. To address the needs of this generation effectively, it is essential to develop instructional strategies that are tailored to diverse learning styles and incorporate state-of-the-art technological advancements. The need to acquire basic knowledge of the subject matter and the need for experiential learning can both be addressed by combining the “flipped classroom” methodology with projects that emphasize the “learning by doing” approach.
In the Interactive Machine Learning course, online lectures are combined with guided exercises and projects in the classroom, allowing students to apply the concepts they have learned as they see fit. The course focuses on implementing the latest generation of deep learning models, such as Transformers, in addition to traditional machine learning strategies. Each project integrates several stages, including dataset discovery and/or acquisition, data and signal processing, model training and classification, and deployment, facilitating an interactive study of the applications and the discovery of their limitations. Working through each stage of the project encourages students to develop their coding skills and prepares them for industry or research after their studies, providing robust hands-on experience. Additionally, at the end of the course, a demo-based presentation allows students to practice their communication skills by articulating their ideas verbally and in writing, culminating in a final report on their project in the format of a conference publication.
For illustration, one of the projects addressed the high-impact topic of Sign Language Recognition. In this project, fully developed in Python, students were required to preprocess two different video-based datasets. The poses of the hands and face were then extracted from each frame of the videos. After preprocessing and feature extraction, this pose information (the landmarks) was fed into a state-of-the-art deep learning architecture, the Transformer, which was trained on the preprocessed videos. After the training stage, the model could recognize the meaning of up to 100 different signs (glosses). Finally, the model was deployed in a client-server application to enable the recording and recognition of new videos. This project primarily utilized four libraries: OpenCV (image processing), MediaPipe (pose extraction), PyTorch (deep learning models), and the open-source web framework Litestar (client-server application). The other projects likewise address topics of significant societal relevance, ranging from Sign Language to Disease Identification.
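The landmark-to-gloss pipeline described above can be sketched in PyTorch as follows. This is a minimal illustration, not the course's actual implementation: the landmark count, sequence length, and all hyperparameters are placeholders, and per-frame landmarks are assumed to have already been extracted (e.g. with MediaPipe) and flattened into one feature vector per frame.

```python
import torch
import torch.nn as nn


class SignTransformer(nn.Module):
    """Classify a sequence of per-frame pose landmarks into gloss labels.

    Hypothetical sketch: n_landmarks and the model sizes are placeholders,
    not the values used in the course project.
    """

    def __init__(self, n_landmarks=510, d_model=128, n_heads=4,
                 n_layers=2, n_glosses=100):
        super().__init__()
        # Each frame contributes (x, y, z) per landmark, flattened.
        self.embed = nn.Linear(n_landmarks * 3, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_glosses)

    def forward(self, x):
        # x: (batch, frames, n_landmarks * 3)
        h = self.encoder(self.embed(x))
        # Mean-pool over the time dimension, then map to gloss logits.
        return self.head(h.mean(dim=1))


model = SignTransformer()
clips = torch.randn(2, 30, 510 * 3)  # 2 dummy clips of 30 frames each
logits = model(clips)                # shape: (2, 100), one logit per gloss
```

In a full pipeline, these logits would be trained with a cross-entropy loss against gloss labels, and the trained model would be wrapped behind a web endpoint (e.g. with Litestar) that accepts newly recorded videos.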
During the development of the projects, students had the option of attending a weekly meeting with their assigned tutors to receive assistance along their learning path and support when unexpected problems arose. By bridging the gap between theoretical knowledge and practical application, this educational proposal provided tailored education to university students, resulting in a high degree of satisfaction: the students rated the practical component of the course highly, with an average of 4.63 on a 5-point Likert scale.
Keywords: Machine Learning, Deep Learning, Active Learning Methodology, Learning-by-Doing, Human-Centred Applications.