Turkish-to-English short story translation by DeepL: Human evaluation by trainees and translation professionals vs. automatic evaluation

Authors

Halise Gülmüş Sırkıntı, Marmara University

DOI:

https://doi.org/10.33919/esnbu.25.1.2

Keywords:

literary translation, machine translation evaluation, human evaluation, automatic evaluation, BLEU

Abstract

This mixed-methods study evaluates the quality of Turkish-to-English literary machine translation (MT) by DeepL, combining human and automatic evaluation and involving both translation trainees and professional translators. The raw MT output of two short stories, Mendil Altında and Kabak Çekirdekçi, was evaluated by both groups via the TAUS DQF tool, and the evaluators wrote reports on the errors they detected. Additionally, BLEU was employed for automatic evaluation. The results indicate a consensus between trainees and professionals in assessing MT accuracy and fluency. Accuracy rates were 80.59% and 80.50% for Mendil Altında, and 73.08% and 82.35% for Kabak Çekirdekçi. Fluency rates were similarly close: 71.96% and 72.32% for Mendil Altında, and 66.81% and 62.09% for Kabak Çekirdekçi. BLEU scores, particularly the 1-gram results, align with the human evaluators’ results. Furthermore, the reports show that trainees provided more detailed analyses, frequently using meta-language, suggesting that increased exposure to metrics enhances trainees’ ability to identify fine-grained MT errors.
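For readers unfamiliar with the 1-gram BLEU results mentioned in the abstract, the following is a minimal sketch of how such a score can be computed for one MT sentence against one human reference. It is not the tool or configuration used in the study. It assumes whitespace tokenization, a single reference, and no smoothing, and the function names and both example sentences are hypothetical illustrations, not excerpts from the stories.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=1):
    """Single-reference BLEU with uniform weights up to max_n.
    max_n=1 gives the 1-gram score. No smoothing is applied, so any
    zero precision yields a score of 0.0 (an assumption of this sketch)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(log_mean)


# Hypothetical sentences for illustration only.
mt_output = "the old man sold pumpkin seeds".split()
human_ref = "the old man was selling pumpkin seeds".split()

print(f"1-gram BLEU: {bleu(mt_output, human_ref, max_n=1):.4f}")
print(f"2-gram BLEU: {bleu(mt_output, human_ref, max_n=2):.4f}")
```

In this sketch the 1-gram score rewards word choice alone, while higher orders also reward word order, which may be one reason 1-gram results track human accuracy judgments more closely than higher-order scores do.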

Author Biography

Halise Gülmüş Sırkıntı, Marmara University

Halise Gülmüş Sırkıntı is an Assistant Professor in the Department of Translation and Interpreting at Marmara University, Türkiye. She holds a BA, MA, and PhD in Translation and Interpreting Studies from Fatih Sultan Mehmet University, Türkiye. Her research focuses on literary translation, travel writing, and translation criticism.

Published

2025-06-18

How to Cite

Gülmüş Sırkıntı, H. (2025). Turkish-to-English short story translation by DeepL: Human evaluation by trainees and translation professionals vs. automatic evaluation. English Studies at NBU, 11(1), 17–42. https://doi.org/10.33919/esnbu.25.1.2

Issue

Vol. 11 No. 1 (2025)

Section

Articles