The Use of Data Imputation when Investigating Dimensionality in Sparse Data from Computerized Adaptive Tests



  • Centre for Research in Applied Measurement and Evaluation, University of Alberta, 6-110 Education Centre North, 11210 87 Ave NW, Edmonton, AB T6G 2G5
  • National Council of State Boards of Nursing, Chicago, Illinois


Computerized Adaptive Testing, CART, Imputation, MICE, Sparseness


The development of a Computerized Adaptive Test (CAT) for operational use begins with several important steps, such as creating a large item bank, piloting the items on a sizable and representative sample of examinees, assessing the dimensionality of the item bank, and estimating item parameters. Among these steps, testing the dimensionality of the item bank is particularly important because the subsequent analyses depend on confirming the hypothesized factor structure (e.g., unidimensionality). After the CAT becomes operational, it remains important to assess the dimensionality of the item bank periodically because both the examinee population and the item bank may change over time. However, the extreme sparseness of the response data returned from the CAT makes dimensionality testing very difficult. This study investigated whether data imputation can be a feasible solution to the sparseness problem when examining test dimensionality in sparse data returned from CATs. Sparse data with unidimensional, multidimensional, and bi-factor test structures were simulated based on real data from a large-scale, operational CAT. Two-way imputation and Multivariate Imputation by Chained Equations (MICE) methods were used to replace missing responses in the data. Using confirmatory factor analysis, the imputed datasets were analyzed to examine whether the true test structure was retained after imputation. Results indicated that MICE with classification and regression trees (MICE-CART) produced highly accurate results in retaining the true structure, whereas the other imputation methods performed poorly. Data imputation with MICE-CART appears to be a promising solution to data sparsity when examining test dimensionality for CATs.
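As a rough illustration of the simpler of the two imputation approaches named above, the sketch below applies two-way imputation to a small dichotomous response matrix. Each missing cell is estimated as the person mean plus the item mean minus the overall mean of the observed responses. The function name `two_way_impute`, the toy matrix, and the deterministic rounding at 0.5 are assumptions made for this example; the study's own implementation, including any random error component, is not described in the abstract.

```python
from statistics import mean


def two_way_impute(responses):
    """Fill None cells of a persons-by-items 0/1 matrix using two-way imputation.

    Assumption: estimates are rounded deterministically at 0.5 to stay dichotomous.
    """
    observed = [x for row in responses for x in row if x is not None]
    overall_mean = mean(observed)

    # Mean of each person's observed responses (overall mean if nothing observed).
    person_means = []
    for row in responses:
        seen = [x for x in row if x is not None]
        person_means.append(mean(seen) if seen else overall_mean)

    # Mean of each item's observed responses (overall mean if nothing observed).
    item_means = []
    for j in range(len(responses[0])):
        seen = [row[j] for row in responses if row[j] is not None]
        item_means.append(mean(seen) if seen else overall_mean)

    imputed = []
    for i, row in enumerate(responses):
        new_row = []
        for j, x in enumerate(row):
            if x is None:
                # Two-way estimate: person mean + item mean - overall mean.
                estimate = person_means[i] + item_means[j] - overall_mean
                x = 1 if estimate >= 0.5 else 0
            new_row.append(x)
        imputed.append(new_row)
    return imputed


# Hypothetical sparse CAT-like data: None marks items a person never saw.
sparse = [
    [1, None, 1, None],
    [0, 0, None, None],
    [None, 1, 1, 0],
    [1, None, None, 0],
]
completed = two_way_impute(sparse)
print(completed)
```

Observed responses are left untouched; only the unseen cells are filled. With the extreme sparseness typical of CAT data, person and item means rest on very few observations, which is one plausible reason a method of this kind could struggle to preserve the factor structure.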






How to Cite

Bulut, O., & Kim, D. (2021). The Use of Data Imputation when Investigating Dimensionality in Sparse Data from Computerized Adaptive Tests. Journal of Applied Testing Technology. Retrieved from




