The Impact of Test Length on Raters’ Mental Processes While Scoring Test-Takers’ Writing Performance

Document Type: Research Article

Authors

1 PhD Candidate, Department of ELT, Faculty of Literature and Foreign Languages, Karaj Branch, Islamic Azad University, Karaj, Iran

2 Assistant Professor, Department of ELT, Faculty of Literature and Foreign Languages, Karaj Branch, Islamic Azad University, Karaj, Iran

Abstract

Different factors, such as the writing genre, the writing prompt, and the test length, can influence raters’ mental processes while they score writing tests. Accordingly, this study was motivated by the question of whether an increase or a decrease in test length has any impact on how raters evaluate test-takers’ writing performance. For this purpose, 12 EFL students who scored between 5.5 and 7.5 on the writing section of a mock IELTS test were selected through availability sampling. The participants wrote three argumentative essays (an original, a longer, and a shorter version), and the three versions from each test-taker were then scored by three raters using the IELTS Task 2 writing band descriptors. While scoring, the raters provided verbal protocols explaining in detail the reasons underlying the scores they assigned to each essay. The verbal protocols were then transcribed and content-analyzed using NVivo version 11 to extract the themes the raters mentioned in scoring each writing test. The results showed that the raters paid more attention to certain factors in the band descriptors and ignored others; nevertheless, a similar pattern emerged across the raters in scoring the three versions. The results revealed no significant differences in the raters’ mental processes across the three writing tests. The conclusion was that test length is not a determining factor influencing raters’ mental processes in writing tests; therefore, raters and test developers need not be concerned that test length will affect raters’ scoring.

Keywords


  1. Ackerman, P. L., & Kanfer, R. (2009). Test length and cognitive fatigue: An empirical examination of effects on performance and test-taker reactions. Journal of Experimental Psychology: Applied, 15(2), 163-181. https://doi.org/10.1037/a0015719
  2. Ahmad, B. A. D. (2021). Effect of time allotment on test scores for academic writing of Indonesian learners of English. Multicultural Education, 7(1), 134-141.
  3. Alderson, J. C., & Banerjee, J. (2002). Language testing and assessment (Part 2). Language Teaching, 35(2), 79-113. https://doi.org/10.1017/S0261444802001751
  4. Asadi Vahdat, Z., & Tavassoli, K. (2019). A comparison of the effects of task repetition and elicitation techniques on EFL learners’ expository and descriptive writing. Language Horizons, 3(1), 243-267. https://doi.org/10.22051/lghor.2019.27890.1173
  5. Attali, Y. (2016). A comparison of newly-trained and experienced raters on a standardized writing assessment. Language Testing, 33(1), 1-17. https://doi.org/10.1177/0265532215582283
  6. Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
  7. Barkaoui, K. (2019). Examining sources of variability in repeaters’ L2 writing scores: The case of the PTE Academic writing section. Language Testing, 36(1), 3-25. https://doi.org/10.1177/0265532217750692
  8. Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45(1), 5-35. https://doi.org/10.5054/tq.2011.244483
  9. Bijani, H. (2018). The investigation of rater expertise in oral language proficiency assessment: A multifaceted Rasch analysis. Language Horizons, 2(2), 103-124. https://doi.org/10.22051/lghor.2019.26072.1123
  10. Brown, J. D. (2005). Testing in language programs (2nd ed.). McGraw-Hill College.
  11. Coe, K., & Scacco, J. M. (2017). Content analysis, quantitative. In C. S. Davis & R. F. Potter (Eds.), The international encyclopedia of communication research methods (pp. 1-11). John Wiley & Sons, Inc.
  12. Cohen, A. D. (2003). Learner strategy training in the development of pragmatic ability. In A. Martinez Flor, E. Usó Juan, & A. Fernández Guerra (Eds.), Pragmatic competence and foreign language teaching (pp. 93-108). Publicacions de la Universitat Jaume I.
  13. Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative, and mixed methodologies. Oxford University Press.
  14. Ducasse, A. M. (2010). Interaction in paired oral proficiency assessment in Spanish: Rater and candidate input into evidence-based scale development and construct definition. Peter Lang.
  15. Duijm, K., Schoonen, R., & Hulstijn, J. H. (2018). Professional and non-professional raters’ responsiveness to fluency and accuracy in L2 speech: An experimental approach. Language Testing, 35(4), 501-527. https://doi.org/10.1177/0265532217712553
  16. Eckes, T. (2012). Operational rater types in writing assessment: Linking rater cognition to rater behavior. Language Assessment Quarterly, 9(3), 270-292. https://doi.org/10.1080/15434303.2011.649381
  17. Esfandiari, R., & Noor, P. (2019). Iranian EFL raters’ cognitive processes in rating IELTS speaking tasks: The effect of expertise. Journal of Modern Research in English Language Studies, 5(2), 41-76. https://doi.org/10.30479/jmrels.2019.9383.1248
  18. Fazilatfar, A., Kasiri, F., & Nowbakht, M. (2020). The comparative effects of planning time and task conditions on the complexity, accuracy, and fluency of L2 writing by EFL learners. Iranian Journal of Language Teaching Research, 8(1), 93-110.
  19. Han, Q. (2016). Rater cognition in L2 speaking assessment: A review of the literature. Teachers College, Columbia University Working Papers in TESOL & Applied Linguistics, 16(1), 1-24. https://doi.org/10.7916/D82R53MF
  20. Hoora, E. (2019). The effect of the duration of IELTS speaking test on examiners’ evaluation of candidates’ performance [Unpublished Master’s thesis]. Karaj Islamic Azad University, Iran.
  21. Huang, J., & Foote, C. J. (2010). Grading between the lines: What really impacts professors’ holistic evaluation of ESL graduate student writing? Language Assessment Quarterly, 7(3), 219-233. https://doi.org/10.1080/15434300903540894
  22. Humphrey-Murto, S., Shaw, T., Touchie, C., Pugh, D., Cowley, L., & Wood, T.J. (2021). Are raters influenced by prior information about a learner? A review of assimilation and contrast effects in assessment. Advances in Health Sciences Education, 25(2), 4-24. https://doi.org/10.1007/s10459-021-10032-3
  23. Kang, H. S., & Veitch, H. (2017). Mainstream teacher candidates’ perspectives on ESL writing: The effects of writer identity and rater background. TESOL Quarterly, 51(2), 249-274. https://doi.org/10.1002/tesq.289
  24. Llosa, L., & Malone, M. E. (2019). Comparability of students’ writing performance on TOEFL iBT and in required university writing courses. Language Testing, 36(2), 235 263.
  25. Lumley, T. (2005). Assessing second language writing: The rater’s perspective. Peter Lang.
  26. Marcoulides, G. A., & Ing, M. (2014). The use of generalizability theory in language assessment. In A. J. Kunnan (Ed.), The companion to language assessment: Evaluation, methodology and interdisciplinary themes (Vol. 3, pp. 124-141). John Wiley & Sons, Inc.
  27. May, L. (2011). Interactional competence in a paired speaking test: Features salient to raters. Language Assessment Quarterly, 8(2), 127-145. https://doi.org/10.1080/15434303.2011.565845
  28. Oh, S. (2020). Second language learners’ use of writing resources in writing assessment. Language Assessment Quarterly, 17(1), 60-84. https://doi.org/10.1080/15434303.2019.1674854
  29. Plakans, L. (2014). Written discourse. In A. J. Kunnan (Ed.), The companion to language assessment: Evaluation, methodology and interdisciplinary themes (Vol. 3, pp. 305317). John Wiley & Sons, Inc.
  30. Purpura, J. E. (2004). Assessing grammar. Cambridge University Press.
  31. Purpura, J. E. (2014). Cognition and language assessment. In A. J. Kunnan (Ed.), The companion to language assessment (Vol. 3, pp.1452-1476). John Wiley & Sons, Inc.
  32. Sahin, A., & Anil, D. (2017). The effects of test length and sample size on item parameters in 1452item response theory. Educational Sciences: Theory & Practice, 17(1), 321-335.  https://doi.org/10.12738/estp.2017.1.0270
  33. Sahragard, R., & Mallahi, O. (2014). Relationship between Iranian EFL learners’ language learning styles, writing proficiency, and self-assessment. Social and Behavioral Sciences, 98(1), 1611-1620.   https://doi.org/10.1016/j.sbspro.2014.03.585
  34. Sawaki, Y. (2007). Construct validation of analytic rating scales in a speaking assessment: Reporting a score profile and a composite. Language Testing, 24(3), 355-390.  https://doi.org/10.1177/0265532207077205
  35. Shirai, Y., & Vercellotti, M. L. (2014). Language acquisition and language assessment. In A. J. Kunnan (Ed.), The companion to language assessment: Evaluation, methodology and interdisciplinary themes (Vol. 3, pp. 300-314). John Wiley & Sons, Inc.
  36. University of Cambridge ESOL examinations. (2015). Cambridge English IELTS 10 with answers: Authentic examination papers from Cambridge English language assessment. Cambridge University Press. ‏
  37. Van Batenburg, E. S., Oostdam, R. J., Van Gelderen, A. J., & De Jong, N. H. (2018). Measuring L2 speakers’ interactional ability using interactive speech tasks. Language Testing, 35(1), 75-100. https://doi.org/10.1177/0265532216679452
  38. Weigle, S. C. (2002). Assessing writing. Cambridge University Press.
  39. Wind, S. A. (2019). A nonparametric procedure for exploring differences in rating quality across test-taker subgroups in rater-mediated writing assessments. Language Testing, 36(4), 595-616. https://doi.org/10.1177/0265532219838014