A (Mis)Match Analysis: Examining the Alignment between Test Taker Performance in Conventional and Game-Based Assessments

Authors

  • B. Lehman, Educational Testing Service, 660 Rosedale Road, Princeton, NJ 08540, USA
  • G. T. Jackson, Educational Testing Service, 660 Rosedale Road, Princeton, NJ 08540, USA
  • C. Forsyth, Educational Testing Service, 660 Rosedale Road, Princeton, NJ 08540, USA

Keywords:

Argumentation Skills, Emotion, Game-Based Assessment, User Experience

Abstract

The primary goal of assessments is to accurately measure knowledge, skills, and abilities. However, to obtain a valid test score, test takers must be properly motivated to perform to the best of their abilities. Game-based assessments (GBAs) have been proposed as one method to improve test taker motivation. It may not be the case, however, that GBAs serve as a one-size-fits-all solution for low motivation. In the present work, across two studies, we compared test takers’ performance on a conventional assessment (e.g., multiple-choice items) and a GBA to determine whether test takers perform similarly or differently on the two assessment formats. We then investigated differences in experience during the GBA (e.g., engagement, boredom). The findings revealed that test takers who performed better on the GBA than on the conventional assessment had a more positive experience with the GBA, and those who performed more poorly on the GBA had a more negative experience. Test takers’ experiences with the conventional assessment and implications for the assignment of test takers to assessment format are also discussed.

Published

2019-05-09

How to Cite

Lehman, B., Jackson, G. T., & Forsyth, C. (2019). A (Mis)Match Analysis: Examining the Alignment between Test Taker Performance in Conventional and Game-Based Assessments. Journal of Applied Testing Technology, 20(S1), 17–34. Retrieved from http://www.jattjournal.net/index.php/atp/article/view/142699
