T. Baumeister, P. Grave, D. Fey
This paper presents an empirical analysis of quiz integration in "CPU Architect," a serious game we developed for teaching instruction-level parallelism through CPU pipeline construction. We hypothesized that interspersed quiz levels would create testing pressure, leading students to focus more intently on subsequent construction levels (where students build functional CPU pipelines) and thereby improve their performance.
We conducted an A/B test with 51 computer science students during 45-minute practice sessions. Upon first launch, the game randomly assigned each player to either the quiz-enabled version (n=27) or the control version without quizzes (n=24), ensuring unbiased group allocation. The quiz version included four assessment levels, each placed after the related pipeline construction tasks. These quizzes were designed to verify theoretical understanding beyond mere game completion, testing whether students grasped the underlying concepts they had just applied in practice.
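The assignment step described above can be pictured with a minimal sketch. It assumes a fair coin flip on first launch and a saved player profile that stores the result; the names `assign_group` and `saved_profile` are hypothetical and are not taken from the game's code.

```python
import random

GROUPS = ("quiz", "control")  # hypothetical labels for the two game versions


def assign_group(saved_profile, rng=random):
    """Return the player's group, drawing it at random on first launch only.

    `saved_profile` is a hypothetical dict standing in for the game's
    persistent player data; the actual storage mechanism is not described.
    """
    if "group" not in saved_profile:
        # First launch: assign with equal probability (an assumption).
        saved_profile["group"] = rng.choice(GROUPS)
    # Later launches reuse the stored assignment, so a player never switches.
    return saved_profile["group"]
```

Simple per-player randomization of this kind does not force equal group sizes, which is compatible with the uneven split of 27 and 24 players.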
Statistical analysis (t-tests, α=0.05, df=49) revealed no significant difference between groups on any performance metric. Star ratings (measuring solution quality) differed by less than 0.12 stars on most levels, with only marginal advantages for the quiz group on the final two levels (2.89 vs. 2.62 and 2.60 vs. 2.12 stars). Time spent per level showed no consistent differences between groups. Attempt counts were similar except for one outlier level, where the quiz group averaged 14.5 attempts versus 11.1 for the control group.
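The reported df=49 is consistent with a pooled-variance (Student's) two-sample t-test, since 27 + 24 - 2 = 49. The following is a minimal sketch of that test under this assumption; the function name `pooled_t` and the sample values in the usage note are illustrative only and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, df), where df = len(a) + len(b) - 2.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    # Standard error of the difference between the two group means.
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / se, df
```

For example, `pooled_t([1, 2, 3], [2, 3, 4])` gives a t statistic of about -1.22 with df = 4. For a two-sided test at α=0.05 with df=49, the difference would count as significant only if |t| exceeded a critical value of approximately 2.01.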
Our hypothesis that testing pressure would enhance performance was not supported by the data. However, the complete absence of negative effects is noteworthy. While objective performance metrics showed no improvement, limited subjective feedback suggested potential motivational benefits that warrant further investigation with appropriate qualitative methods.
This null result contributes valuable evidence that theoretically justified gamification elements may not produce measurable performance improvements in short-term contexts. Future research should explore whether longer exposure periods or different quiz implementations might reveal performance benefits, and should systematically investigate the suggested motivational effects through validated instruments.
Keywords: Game-based learning, quiz integration, serious games, null results, performance assessment, computer architecture education.