Alavi, M. (1997). An investigation of the construct validity of reading comprehension test in an academic context: Using rhetorical structure theory. Paper presented at the BALEAP Conference, University of Wales, Swansea, UK.
Alavi, S. M., & Ghaemi, H. (2011). Application of structural equation modeling in EFL testing: A report of two Iranian studies. Language Testing in Asia, 1(3), 22-35. https://doi.org/10.1186/2229-0443-1-3-22
Alavi, S. M., Kaivanpanah, S., & Masjedlou, A. P. (2018). Validity of the listening module of International English Language Testing System: Multiple sources of evidence. Language Testing in Asia, 8. https://doi.org/10.1186/s40468-018-0057-4
Alderson, C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation. Cambridge University Press.
Allen, M. J., & Yen, W. M. (1979). Introduction to measurement theory. Brooks/Cole.
Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37(1), 1-15.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary test. Language Testing, 16(2), 131-162.
Brindley, G. (1998). Assessing listening abilities. Annual Review of Applied Linguistics, 18, 171-191.
Brooks, G. P., & Johanson, G. A. (2003). TAP: Test analysis program. Applied Psychological Measurement, 27(4), 303-304.
Brown, J. D. (2005). Testing in language programs. McGraw Hill.
Brown, T. (2010). Construct validity: A unitary concept for occupational therapy assessment and measurement. Hong Kong Journal of Occupational Therapy, 20(1), 30-42.
Bryman, A., & Cramer, D. (1990). Quantitative data analysis for social scientists. Routledge.
Buck, G. (2001). Assessing listening. Cambridge University Press.
Colliver, J. A., Conlee, M. J., & Verhulst, S. J. (2012). From test validity to construct validity… and back? Medical Education, 46(4), 366-371.
Cox, T. L., & Malone, M. E. (2018). A validity argument to support the ACTFL Assessment of Performance Toward Proficiency in Languages (AAPPL). Foreign Language Annals, 51(3), 548-574.
Crocker, L. M., & Algina, J. (1986). Introduction to classical and modern test theory. Holt, Rinehart and Winston.
Downing, S. M., & Haladyna, T. M. (Eds.). (2006). Handbook of test development. Lawrence Erlbaum Associates Publishers.
Falvey, P., Holbrook, J., & Coniam, D. (1994). Assessing students. Longman.
Field, A. (2009). Discovering statistics using SPSS. Sage Publications.
Freedle, R., & Kostin, I. (1999). Does the text matter in a multiple-choice test of comprehension? The case for the construct validity of TOEFL's minitalks. Language Testing, 16(1), 2-32.
Fulcher, G., & Davidson, F. (2007). Language testing and assessment: An advanced resource book. Routledge.
Hale, G. A., Rock, D. A., & Jirele, T. (1989). Confirmatory factor analysis of the Test of English as a Foreign Language (TOEFL Research Report No. 32). Educational Testing Service.
Hale, G. A., Stansfield, C. W., Rock, D. A., Hicks, M. M., Butler, F. A., & Oller, J. W., Jr. (1988). The relation of multiple-choice cloze items to the Test of English as a Foreign Language. Language Testing, 6(1), 47-76.
Hatch, E., & Farhady, H. (1982). Research design and statistics for applied linguistics. Newbury House.
Hinton, P. R., Brownlow, C., McMurray, I., & Cozens, B. (2004). SPSS explained. Taylor & Francis.
Hughes, A. (2003). Testing for language teachers. Cambridge University Press.
Jackson, T. R., Draugalis, J. R., Slack, M. K., Zachry, W. M., & D'Agostino, J. (2002). Validation of authentic performance assessment: A process suited for Rasch modeling. American Journal of Pharmaceutical Education, 66(3), 233-243.
Kerlinger, F. N. (1979). Behavioral research: A conceptual approach. Holt, Rinehart and Winston.
Khine, M. S. (2013). Application of structural equation modeling in educational research and practice. Sense Publishers.
Kim, Y. M., & Kim, M. (2017). Validations of an English placement test for a general English language program at the tertiary level. JLTA (Japan Language Testing Association) Journal, 20, 17-34.
Kyle, K., Crossley, S. A., & McNamara, D. S. (2016). Construct validity in TOEFL iBT speaking tasks: Insights from natural language processing. Language Testing, 33(3), 319-340.
Messick, S. (1980). Test validity and the ethics of assessment. American Psychologist, 35(11), 1012-1027.
Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18(2), 5-11.
Messick, S. (1990). Validity of test interpretation and use (Research Report No. 90.11). Educational Testing Service.
Messick, S. (2005). Standards of validity and the validity of standards in performance assessment. Educational Measurement: Issues and Practice, 14(4), 5-8.
Meyers, L. S., Gamst, G., & Guarino, A. J. (2006). Applied multivariate research: Design and interpretation. Sage Publications.
Moses, M. S., & Nanna, M. J. (2007). The testing culture and the persistence of high stakes testing reforms. Education and Culture, 23(1), 55-72.
Reyment, R., & Joreskog, K. G. (1993). Applied factor analysis in the natural sciences. Cambridge University Press.
Roever, C. (2001). Web-based language testing. Language Learning and Technology, 5(2), 84-94.
Saito, K. (2019). To what extent does long-term foreign language education improve spoken second language lexical proficiency? TESOL Quarterly, 53(1), 82-101.
Salehi, M. (2011). On the construct validity of the reading section of the University of Tehran English Proficiency Test. Journal of English Language Teaching and Learning, 222, 129-159.
Sawaki, Y., Stricker, L. J., & Oranje, A. H. (2009). Factor structure of the TOEFL Internet-based test. Language Testing, 26(1), 5-30.
Schmitt, T. (2011). Current methodological considerations in exploratory and confirmatory factor analysis. Journal of Psychoeducational Assessment, 29(4), 304-321. https://doi.org/10.1177/0734282911406653
Shaw, S. D., & Weir, C. J. (2007). Examining writing: Research and practice in assessing second language writing. Cambridge University Press.
Shepard, L. A. (1993). Evaluating test validity. Review of Research in Education, 19, 405-450.
Spolsky, B. (1995). Measured words: The development of objective language testing. Oxford University Press.
Stapleton, C. D. (1997, January). Basic concepts in exploratory factor analysis (EFA) as a tool to evaluate score validity: A right-brained approach [Paper presentation]. The annual meeting of the Southwest Educational Research Association, Austin.
Stoker, H. W., & Impara, J. C. (1995). Basic psychometric issues in licensure testing. In J. C. Impara (Ed.), Licensure testing: Purposes, procedures, and practices (pp. 167-186). Nebraska Series on Measurement and Testing.
Stricker, L. J., Rock, D. A., & Lee, Y.-W. (2005). Factor structure of the language test across language groups (TOEFL Monograph Series MS-32). Educational Testing Service.
Tabachnick, B. G., & Fidell, L. S. (2008). Using multivariate statistics (5th ed.). Pearson.
Van der Walt, J. L., & Steyn, F. (2008). The validation of language tests. Stellenbosch Papers in Linguistics, 38, 191-204.
Weir, C. J. (2005). Language testing and validation. Palgrave Macmillan.