Bridging the Standard Setting Gap via Assessment Engineering

Authors

  • R. T. Furter, American Board of Pediatrics

Keywords

Assessment Engineering, Principled Assessment, Standard Setting

Abstract

Standard setting is the process of identifying the point(s) on a scale that serve to differentiate between individuals at distinct proficiency levels. While standard setting is ultimately a policy decision, most of the process is carried out by subject matter experts who are tasked with reconciling item-level or examinee-level information (e.g., item content, examinee data, performance data) with performance level descriptors, borderline candidate conceptualizations, and similar reference points. The ability of subject matter experts to recommend a cut score that accurately differentiates between levels of proficiency depends directly on how well the conceptualization of the construct aligns with the more concrete test or examinee content on which they base their judgments. Assessment engineering is a principled assessment framework that, in addition to infusing manufacturing and quality control principles into the test development cycle, builds in a high level of connectivity between the claims we wish to make about examinees and the items they are administered. This paper focuses on the role assessment engineering infrastructure can play in helping standard-setting panelists bridge the gap between the claims testing bodies wish to make about examinees and the item-level information panelists are typically given, and, in doing so, strengthen the validity argument for an instrument's score interpretations.
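As a concrete illustration of the cut score concept described above (this sketch is not from the paper; it assumes a Rasch model, a hypothetical RP67 response probability criterion, and made-up item difficulties, in the spirit of the bookmark procedure cited in the references), the following Python snippet shows how a single point on the theta scale translates into item-level response probabilities, which is exactly the kind of item-level information panelists must reconcile with performance level descriptors:

    import math

    def rasch_prob(theta, b):
        """Probability of a correct response to an item of
        difficulty b under the Rasch model."""
        return 1.0 / (1.0 + math.exp(-(theta - b)))

    def rp_location(b, rp=0.67):
        """Theta at which an item of difficulty b is answered
        correctly with probability rp (the response probability
        criterion used to order items in a bookmark-style
        ordered item booklet)."""
        return b + math.log(rp / (1.0 - rp))

    # Hypothetical Rasch difficulties for five items.
    difficulties = [-1.2, -0.4, 0.3, 0.9, 1.6]

    # Order items by their RP67 locations; a panelist's bookmark
    # placed between two items implies a cut score between those
    # two locations on the theta scale.
    locations = sorted(rp_location(b) for b in difficulties)
    cut = (locations[2] + locations[3]) / 2.0  # bookmark after item 3

    for b in difficulties:
        print(f"b = {b:+.1f}: P(correct | theta = cut) = "
              f"{rasch_prob(cut, b):.2f}")

Under these assumptions, items whose locations fall below the cut are ones a borderline candidate would likely answer correctly, giving panelists concrete item behavior to anchor their judgments.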

Published

2019-05-29

How to Cite

Furter, R. T. (2019). Bridging the Standard Setting Gap via Assessment Engineering. Journal of Applied Testing Technology, 20(S2), 60–70. Retrieved from http://www.jattjournal.net/index.php/atp/article/view/143676

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association.

Bejar, I. I., Braun, H. I., & Tannenbaum, R. J. (2007). A prospective, predictive, and progressive approach to standard setting. In R. Lissitz (Ed.), Assessing and Modeling Cognitive Development in School: Intellectual Growth and Standard Setting. Maple Grove, MN: JAM Press.

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–472). Reading, MA: Addison-Wesley.

Cizek, G. J. (2013). An introduction to contemporary standard setting: Concepts, characteristics, and contexts. In G. J. Cizek (Ed.), Setting Performance Standards: Foundations, Methods, and Innovations (pp. 3–14). New York: Routledge.

Cizek, G. J., & Earnest, D. S. (2016). Setting performance standards on tests. In S. Lane, M. R. Raymond, & T. M. Haladyna (Eds.), The Handbook of Test Development (pp. 221–237). New York: Routledge.

Davis-Becker, S. (2013). Construct maps: Do they make the unclear clear? Measurement: Interdisciplinary Research and Perspectives, 11(4), 174–176.

Ebel, R. L. (1962). Content standard test scores. Educational and Psychological Measurement, 22(1), 15–25.

Ferrara, S., Lai, E., Reilly, A., & Nichols, P. D. (2017). Principled approaches to assessment design, development, and implementation. In A. A. Rupp & J. P. Leighton (Eds.), The Handbook of Cognition and Assessment (1st ed., pp. 41–74). West Sussex, UK: Wiley.

Hambleton, R. K., & Pitoniak, M. (2006). Setting performance standards. In R. L. Brennan (Ed.), Educational Measurement (4th ed., pp. 433–470). Westport, CT: Praeger.

Hendrickson, A., Huff, K., & Luecht, R. (2010). Claims, evidence, and achievement level descriptors as a foundation for item design and test specifications. Applied Measurement in Education, 23, 358–377.

Huff, K., & Plake, B. (Eds.). (2010). Evidence-centered assessment design in practice [Special issue]. Applied Measurement in Education, 23(4).

Kane, M. (1994). Validating the performance standards associated with passing scores. Review of Educational Research, 64(3), 425–461.

Kane, M. (2001). So much remains the same: Conception and status of validation in setting standards. In G. J. Cizek (Ed.), Standard Setting: Concepts, Methods, and Perspectives (pp. 53–88). Mahwah, NJ: Erlbaum.

Kolen, M. J., & Brennan, R. L. (2014). Test equating, scaling, and linking: Methods and practices. New York, NY: Springer.

Lai, H., & Gierl, M. J. (2013). Generating items under the assessment engineering framework. In M. J. Gierl & T. M. Haladyna (Eds.), Automatic Item Generation: Theory and Practice (pp. 77–101). New York, NY: Routledge.

Lane, S., Raymond, M. R., Haladyna, T. M., & Downing, S. M. (2016). Test development process. In S. Lane, M. R. Raymond, & T. M. Haladyna (Eds.), The Handbook of Test Development (pp. 3–18). New York: Routledge.

Leighton, J. P., & Gierl, M. J. (2007). Defining and evaluating models of cognition used in educational measurement to make inferences about examinees’ thinking processes. Educational Measurement: Issues and Practice.

Lewis, D. M., Mitzel, H. C., & Green, D. R. (1996, June). Standard setting: A bookmark approach. In D. R. Green (Chair), IRT-based standard-setting procedures using behavioral anchoring. Symposium conducted at the Council of Chief State School Officers National Conference on Large Scale Assessment, Phoenix, AZ.

Luecht, R. M. (2006). Engineering the test: From principled item design to automated test assembly. Invited paper presented at the annual meeting of the Society for Industrial and Organizational Psychology, Dallas, TX.

Luecht, R. M. (2013). Assessment engineering task model maps, task models and templates as a new way to develop and implement test specifications. Journal of Applied Testing Technology, 14.

Luecht, R. M. (2017). Professional certification and licensure examinations. In A. A. Rupp & J. P. Leighton (Eds.), The Handbook of Cognition and Assessment: Frameworks, Methodologies, and Applications (1st ed., pp. 446–471). West Sussex, UK: Wiley.

Mislevy, R. J. (1994). Evidence and inference in educational assessment. Psychometrika, 59, 439-83.

Mislevy, R. J., & Riconscente, M. M. (2006). Evidence-centered assessment design. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of Test Development (pp. 61–90). Mahwah, NJ: Lawrence Erlbaum.

Pant, H. A., Rupp, A. A., Tiffin-Richards, S. P., & Köller, O. (2009). Validity issues in standard-setting studies. Studies in Educational Evaluation, 35, 95–101.

Papageorgiou, S., & Tannenbaum, R. J. (2016). Situating standard setting within argument-based validity. Language Assessment Quarterly, 13(2), 109–123.

Plake, B. S., Huff, K., & Reshetar, R. (2010). Evidence-centered assessment design as a foundation for achievement-level descriptor development and for standard setting. Applied Measurement in Education, 23(4), 342–357.

Rasch, G. (1960/1980). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research. (Expanded edition, with foreword and afterword by B. D. Wright, 1980. Chicago: The University of Chicago Press.)

Wiliam, D. (1996). Meanings and consequences in standard setting. Assessment in Education: Principles, Policy and Practice, 3(3).

Wilson, M. (2005). Constructing Measures: An Item Response Modeling Approach. Mahwah, NJ: Erlbaum.

Wyse, A. E. (2013). Construct maps as a foundation for standard setting. Measurement: Interdisciplinary Research and Perspectives, 11(4), 139–170.
