Using Machine Learning to Predict Bloom’s Taxonomy Level for Certification Exam Items



  • Chief Psychometrician, Certiverse, 4803 N. Milwaukee Avenue, Suite B, Unit 103, Chicago, IL 60630
  • Psychometrician, Certiverse, 4803 N. Milwaukee Avenue, Suite B, Unit 103, Chicago, IL 60630


Naïve Bayesian classifier, Bloom's Taxonomy, Machine Learning


This study fit a Naïve Bayesian classifier to the words of exam items to predict the Bloom’s taxonomy level of the items. We addressed five research questions and showed that reasonably good prediction of Bloom’s level was possible, but that accuracy varied across levels. In our study, performance for Level 2 was poor (Level 2 items were misclassified, and items from other levels were classified as Level 2), but the model distinguished Level 1 from all other levels quite well. Applying a model developed on an IT certification exam domain to a more diverse set of items yielded poor performance, suggesting that such models may generalize poorly. Finally, we showed which features of the items the classifier was using. Examples and implications for practice are discussed.
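As a rough illustration of the general technique named in the abstract, the sketch below fits a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing to the words of a few exam items. It is not the authors' implementation: the four items, the two labels ("Level 1" versus "Higher"), the tokenizer, and all function names are hypothetical, and the study's actual preprocessing and model settings are not given here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on non-letters (a stand-in for real preprocessing)."""
    return "".join(c if c.isalpha() else " " for c in text.lower()).split()


def train_naive_bayes(items, labels):
    """Collect the counts needed for a multinomial Naive Bayes model."""
    level_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, level in zip(items, labels):
        word_counts[level].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    log_prior = {lvl: math.log(n / len(labels)) for lvl, n in level_counts.items()}
    totals = {lvl: sum(word_counts[lvl].values()) for lvl in level_counts}
    return log_prior, word_counts, totals, vocab


def predict(model, text):
    """Return the level with the highest posterior log score for the item text."""
    log_prior, word_counts, totals, vocab = model
    scores = {}
    for lvl in log_prior:
        score = log_prior[lvl]
        for w in tokenize(text):
            if w not in vocab:
                continue  # words never seen in training carry no information
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[lvl][w] + 1) / (totals[lvl] + len(vocab)))
        scores[lvl] = score
    return max(scores, key=scores.get)


# Hypothetical toy items; real training data would be far larger.
items = [
    "What is the default port for HTTPS?",
    "Which command lists files in a directory?",
    "A server fails after an update. Analyze the logs and determine the most likely cause.",
    "Given the scenario, evaluate which design best meets the requirements.",
]
labels = ["Level 1", "Level 1", "Higher", "Higher"]

model = train_naive_bayes(items, labels)
recall_item = predict(model, "Which port is the default for SSH?")
analysis_item = predict(model, "Analyze the scenario and determine the cause.")
print(recall_item, analysis_item)
```

With only four training items the predictions simply reflect word overlap with each class, which is also how such a model exposes the item features it relies on; a model trained on one exam domain may therefore transfer poorly to items that use different vocabulary.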






How to Cite

Mead, A. D., & Zhou, C. (2022). Using Machine Learning to Predict Bloom’s Taxonomy Level for Certification Exam Items. Journal of Applied Testing Technology, 23, 53–71. Retrieved from




