Document Type: Research article


1 Department of English Language and Literature, Faculty of Literature, Alzahra University, Tehran, Iran

2 M.A. graduate, Department of English Language and Literature, Faculty of Literature, Alzahra University, Tehran, Iran


Treating validity as a unitary concept, this study investigated the construct validity of the Iranian Ministry of Health Language Exam (MHLE). We first conducted item analysis and reliability analysis and examined KR20-if-item-deleted indices on the scores of 987 MHLE test takers. Although the test proved highly reliable, item analysis and the KR20-if-item-deleted indices flagged 28 problematic items. Next, we ran factor analysis on the data screened through item analysis, using Horn’s parallel analysis and Velicer’s minimum average partial (MAP) test to decide how many factors to retain. Parallel analysis resulted in overfactoring, whereas the MAP test produced solutions with two to seven factors. Although the four-factor MAP solution seemed the most logical at first glance, the overall results were disappointing: nineteen items did not load significantly on any factor, and many items showed no clear loading pattern. These findings can be viewed as evidence detracting from the validity of the MHLE.
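As an illustration of the reliability screening step, the following is a minimal sketch of how KR20 (Kuder-Richardson Formula 20) and KR20-if-item-deleted indices can be computed for dichotomously scored items. It is not the authors' procedure or software: the function names, the six-examinee toy data set, and the flagging rule (an item is flagged when deleting it raises KR20 above the full-test value) are assumptions made for the example, and population variances are used throughout.

```python
from statistics import pvariance


def kr20(responses):
    """KR20 for rows of 0/1 item scores (one row per examinee)."""
    n_people = len(responses)
    n_items = len(responses[0])
    # Variance of total scores (population variance, an assumption of this sketch).
    total_var = pvariance([sum(row) for row in responses])
    # Sum of item variances p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


def kr20_if_item_deleted(responses):
    """Recompute KR20 once per item, each time leaving that item out."""
    n_items = len(responses[0])
    return [
        kr20([row[:j] + row[j + 1:] for row in responses])
        for j in range(n_items)
    ]


def flag_items(responses):
    """Hypothetical rule: flag items whose removal raises KR20."""
    overall = kr20(responses)
    deleted = kr20_if_item_deleted(responses)
    return [j for j, value in enumerate(deleted) if value > overall]


# Toy data (invented for illustration): 6 examinees, 4 items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

print(round(kr20(responses), 4))
print([round(v, 4) for v in kr20_if_item_deleted(responses)])
print(flag_items(responses))
```

In this toy data set, the last item is answered almost independently of the other three, so removing it raises KR20 and the item is flagged. An operational analysis would combine such indices with item difficulty and discrimination statistics, as the abstract describes, and would need to guard against a total-score variance of zero.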


Alavi, M. (1997). An investigation of the construct validity of reading comprehension test in an academic context: Using rhetorical structure theory. Paper presented at the BALEAP Conference, University of Wales, Swansea, UK.
Alavi, S. M., & Ghaemi, H. (2011). Application of structural equation modeling in EFL testing: A report of two Iranian studies. Language Testing in Asia, 1(3), 22-35.
Alavi, S. M., Kaivanpanah, S., & Masjedlou, A. P. (2018). Validity of the listening module of international English language testing system: Multiple sources of evidence. Language Testing in Asia, 8.
Alderson, C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge University Press.  
Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Brooks/Cole.
Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37(1), 1-15.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary test. Language Testing, 16(2), 131-162.
Brindley, G. (1998). Assessing listening abilities. Annual Review of Applied Linguistics, 18, 171-191.
Brooks, G. P., & Johanson, G. A. (2003). TAP: Test analysis program. Applied Psychological Measurement, 27(4), 303-304.
Brown, J. D. (2005). Testing in language programs. McGraw Hill.
Brown, T. (2010). Construct validity: A unitary concept for occupational therapy assessment and measurement. Hong Kong Journal of Occupational Therapy, 20(1), 30-42.
Bryman, A., & Cramer, D. (1990). Quantitative data analysis for social scientists. Routledge.
Buck, G. (2001). Assessing listening. Cambridge University Press.
Colliver, J. A., Conlee, M. J., & Verhulst, S. J. (2012). From test validity to construct validity… and back? Medical Education, 46(4), 366-371.
Cox, T. L., & Malone, M. E. (2018). A validity argument to support the ACTFL Assessment of Performance Toward Proficiency in Languages (AAPPL). Foreign Language Annals, 51(3), 548-574.
Crocker, L. M., & Algina, J. (1986). Introduction to classical and modern test theory. Holt, Rinehart, and Winston.
Downing, S. M., & Haladyna, T. M. (Eds.) (2006). Handbook of test development. Lawrence Erlbaum Associates Publishers.
Falvey, P., Holbrook, J., & Coniam, D. (1994). Assessing students. Longman.
Field, A. (2009). Discovering statistics using SPSS. Sage Publications.
Freedle, R., & Kostin, I. (1999). Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing, 16(1), 2-32.
Fulcher, G., & Davidson, F. (2007). Language testing and assessment: An advanced resource book. Routledge.
Hale, G. A., Rock, D. A., & Jirele, T. (1989). Confirmatory factor analysis of the Test of English as a Foreign Language (TOEFL Research Report No. 32). Educational Testing Service.
Hale, G. A., Stansfield, C. W., Rock, D. A., Hicks, M. M., Butler, F. A., & Oller, J. W., Jr. (1988). The relation of multiple-choice cloze items to the Test of English as a Foreign Language. Language Testing, 6(1), 47-76.
Hatch, E., & Farhady, H. (1982). Research design and statistics for applied linguistics. Newbury House.
Hinton, P. R., Brownlow, C., McMurray, I., & Cozens, B. (2004). SPSS explained. Taylor & Francis.
Hughes, A. (2003). Testing for language teachers. Cambridge University Press.
In'nami, Y., & Koizumi, R. (2011). Structural equation modelling in language testing and learning research: A review. Language Assessment Quarterly, 8(3), 250-273.
Jackson, T. R., Draugalis, J. R., Slack, M. K., Zachry, W. M., & D'Agostino, J. (2002). Validation of authentic performance assessment: A process suited for Rasch modeling. American Journal of Pharmaceutical Education, 66(3), 233-243.
Kerlinger, F. N. (1979). Behavioral research: A conceptual approach. Holt, Rinehart and Winston.
Khine, M. S. (2013). Application of structural equation modeling in educational research and practice. Sense Publishers.
Kim, Y. M., & Kim, M. (2017). Validations of an English placement test for a general English language program at the tertiary level. JLTA (Japan Language Testing Association) Journal, 20, 17-34.
Kyle, K., Crossley, S. A., & McNamara, D. S. (2016). Construct validity in TOEFL iBT speaking tasks: Insights from natural language processing. Language Testing, 33(3), 319-340.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35(10), 12-27.
Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18(2), 5-11.
Messick, S. (1990). Validity of test interpretation and use (Research Report No. 90.11). Educational Testing Service.
Messick, S. (2005). Standards of validity and the validity of standards in performance assessment. Educational Measurement: Issues and Practice, 14(4), 5-8.
Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied multivariate research: Design and interpretation. Sage Publications.
Moses, M. S., & Nanna, M. J. (2007). The testing culture and the persistence of high stakes testing reforms. Education and Culture, 23(1), 55-72.
Ockey, G., & Choi, I. (2015). Structural equation modeling reporting practices for language assessment. Language Assessment Quarterly, 12(3), 305-319.
Reyment, R., & Joreskog, K. G. (1993). Applied factor analysis in the natural sciences. Cambridge University Press.
Roever, C. (2001). Web-based language testing. Language Learning and Technology, 5(2), 84-94.
Saito, K. (2019). To what extent does long-term foreign language education improve spoken second language lexical proficiency? TESOL Quarterly, 53(1), 82-101.
Salehi, M. (2011). On the construct validity of the reading section of the University of Tehran English Proficiency Test. Journal of English Language Teaching and Learning, 222, 129-159.
Sawaki, Y. (2012). Factor analysis. In The encyclopedia of applied linguistics. Blackwell Publishing Ltd.
Sawaki, Y., Stricker, L. J., & Oranje, A. H. (2009). Factor structure of the TOEFL Internet-based test. Language Testing, 26(1), 5-30.
Schmitt, T. (2011). Current methodological considerations in exploratory and confirmatory factor analysis. Journal of Psychoeducational Assessment, 29(4), 304-321.
Shaw, S. D., & Weir, C. J. (2007). Examining writing: Research and practice in assessing second language writing. Cambridge University Press.
Shepard, L. A. (1993). Evaluating test validity. Review of Research in Education, 19, 405-450.
Spolsky, B. (1995). Measured words: The development of objective language testing. Oxford University Press.
Stapleton, C. D. (1997, January). Basic concepts in exploratory factor analysis (EFA) as a tool to evaluate score validity: A right-brained approach [Paper presentation]. The annual meeting of the Southwest Educational Research Association, Austin.
Stoker, H. W., & Impara, J. C. (1995). Basic psychometric issues in licensure testing. In J. C. Impara (Ed.), Licensure testing: Purposes, procedures, and practices (pp. 167-186). Nebraska Series on Measurement and Testing.
Stricker, L. J., Rock, D. A., & Lee, Y.-W. (2005). Factor structure of the language test across language groups (TOEFL Monograph Series MS-32). Educational Testing Service.
Tabachnick, B. G., & Fidell, L. S. (2008). Using multivariate statistics (5th ed.). Pearson.
Van der Walt, J. L., & Steyn, F. (2008). The validation of language tests. Stellenbosch Papers in Linguistics, 38, 191-204.
Weir, C. J. (2005). Language testing and validation. Palgrave Macmillan.