Extraction of anglicisms from a corpus of Macedonian magazine texts
DOI:
https://doi.org/10.33919/esnbu.25.1.8
Keywords:
Anglicisms, anglicisms extraction, corpus linguistics, corpus analysis tools
Abstract
This article describes the stages involved in compiling a specialized corpus of Macedonian magazine texts and the software tools used to extract anglicisms from it. The texts were collected from the magazine Kapital and cover two distinct periods: the years 2000 and 2020. The corpus contains about 2 million tokens and 141,852 types. The software produced word lists which, combined with other statistical techniques, yielded a refined Anglicism headword list from which new anglicisms were extracted. In addition to the software tools, careful manual inspection was necessary in both the extraction and the analysis stages. The research identified a total of 220 new anglicisms, most of which are not yet included in existing Macedonian dictionaries.
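The token and type counts and the word lists mentioned above can be illustrated with a minimal Python sketch. It is not the article's actual pipeline: the tokenizer, the helper names (`tokenize`, `word_list`, `corpus_stats`, `new_in_later_period`), the idea of comparing the two periods by set difference, and the two sample sentences are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Assumption: a token is any run of letters (Cyrillic or Latin);
# digits, underscores, and punctuation are skipped.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def word_list(text):
    """Build a frequency word list (type -> count) for a text."""
    return Counter(tokenize(text))


def corpus_stats(freqs):
    """Return the number of tokens (running words) and types (distinct words)."""
    return {"tokens": sum(freqs.values()), "types": len(freqs)}


def new_in_later_period(earlier, later, min_freq=1):
    """List types attested in the later word list but absent from the earlier one.

    Hypothetical filter: such types are only candidates and would still
    need manual inspection before being treated as anglicisms.
    """
    return sorted(w for w, n in later.items() if n >= min_freq and w not in earlier)


# Invented sample sentences standing in for the 2000 and 2020 subcorpora.
sample_2000 = "Компанијата објави нов извештај за пазарот."
sample_2020 = "Стартап компанијата објави нов онлајн извештај за пазарот."

freqs_2000 = word_list(sample_2000)
freqs_2020 = word_list(sample_2020)

print(corpus_stats(freqs_2000))
print(new_in_later_period(freqs_2000, freqs_2020))
```

A set difference of this kind only narrows the search space. As the abstract notes, manual inspection remains necessary, because many types that appear only in the later period are not anglicisms at all.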
License
Copyright (c) 2025 Lina Miloshevska

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
All published articles in the ESNBU are licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). This license lets others remix, tweak, and build upon your work non-commercially, and although their new works must also acknowledge you and be non-commercial, they don't have to license their derivative works on the same terms.
In other words, under the CC BY-NC 4.0 license users are free to:
Share - copy and redistribute the material in any medium or format
Adapt - remix, transform, and build upon the material
Under the following terms:
Attribution (by) - All CC licenses require that others who use your work in any way must give you credit the way you request, but not in a way that suggests you endorse them or their use. If they want to use your work without giving you credit or for endorsement purposes, they must get your permission first.
NonCommercial (nc) - You let others copy, distribute, display, perform, modify, and use your work for any purpose other than commercial use, unless they get your permission first.
If the article is to be used for commercial purposes, we suggest authors be contacted by email.
If the law requires that the article be published in the public domain, authors will notify ESNBU at the time of submission, and in such cases the article shall be released under the Creative Commons Public Domain Dedication waiver (CC0 1.0 Universal).
Copyright
Copyright for articles published in ESNBU is retained by the authors, with first publication rights granted to the journal. Authors retain full publishing rights and are encouraged to upload their work to institutional repositories, social academic networking sites, etc. ESNBU is not responsible for subsequent uses of the work. It is the author's responsibility to bring an infringement action, should they wish to do so.
Exceptions to copyright policy
Occasionally ESNBU may co-publish articles jointly with other publishers, and different licensing conditions may then apply.