ABSTRACT VIEW
CROWD-SOURCED EXEMPLARS FOR DEVELOPING RESEARCH SOFTWARE SKILLS IN STEM
J. DesLauriers, C. Cooling, J. Pinney, L. Gao, K. Michalickova
Imperial College London (UNITED KINGDOM)
As scientific research becomes more data-intensive, STEM students supporting research projects or working on a master's project or doctoral thesis have had to learn to approach and solve problems computationally. These students come from varied backgrounds, however, and few have a formal education in computer science or software engineering. As a consequence, research software training has become an important topic in academia and research, with organisations like The Carpentries and the Software Sustainability Institute leading the way in teaching best practices in software and data.

The Research Computing and Data Science (RCDS) team, situated in the Early Career Researcher Institute (ECRI) at Imperial College London, delivers short courses for postgraduate Master's and PhD students. Our portfolio of courses is varied, ranging from programming languages, to the use of specific data science packages within a language, to more general topics like version control, documentation, and best practices in software engineering. Courses are typically delivered in the style of a workshop, with tutors teaching by example and frequent breaks for practice with exercises.

Short courses like the ones offered by RCDS or The Carpentries work well to equip students with the fundamental skills to get started with a programming language or software library. As learners move beyond the basics, however, such short courses become less effective and should eventually be replaced by other approaches. In both research software training and general software engineering training, one favoured approach for supporting the transition from novice to proficient, inspired by the cognitive apprenticeship model, is the use of real-life projects and code examples.

This paper will introduce the ReCoDE (Research Computing and Data Science Exemplars) project at Imperial College London, led by the RCDS team and supported by the central Research Software Engineering (RSE) team. ReCoDE is a collection of user-contributed code examples based on the actual research of PhD students at Imperial, open to the public for independent study. PhD students pitch their project ideas in a rolling open call, and those selected undertake paid part-time work, collaborating with an RCDS and RSE team member over three months to develop an exemplar. The output of this collaboration is an online resource to guide students through a research software project, demonstrating best practices in data science and software engineering.

We describe our process for selecting, developing, publishing and evaluating exemplars, which could be applied more generally to curating user-generated content in other educational settings. We discuss the free, open-source tools and platforms we use to build and host the documentation, exemplar catalogue, and general information that make ReCoDE available to the public. Finally, we conclude with our plans for the future of the project.

Keywords: Research software, programming, education, training.

Event: EDULEARN25
Session: Computational Thinking
Session time: Monday, 30th of June from 11:00 to 12:15
Session type: ORAL