Extraction of anglicisms from a corpus of Macedonian magazine texts
DOI:
https://doi.org/10.33919/esnbu.25.1.8
Keywords:
Anglicisms, anglicisms extraction, corpus linguistics, corpus analysis tools
Abstract
This article describes the stages involved in compiling a specialized corpus of Macedonian magazine texts and the software tools used to extract anglicisms from it. The texts were collected from the magazine Kapital and cover two distinct periods: the years 2000 and 2020. The corpus contains about 2 million tokens and 141,852 types. The software produced word lists which, in combination with other statistical techniques, yielded a refined anglicism headword list from which new anglicisms were extracted. In addition to the software tools, careful manual inspection was necessary at both the extraction and analysis stages. The research identified a total of 220 completely new anglicisms, most of which are not yet included in existing Macedonian dictionaries.
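As a rough illustration of the word-list step described in the abstract, the following minimal Python sketch counts tokens and types and builds a frequency word list for two sub-corpora. It is not the procedure used in the article: the regex tokenizer, the function names (`tokenize`, `word_list`, `corpus_size`, `new_in_later_period`), the sample sentences and the idea of flagging forms attested only in the later period are all assumptions made for illustration. Any candidates produced this way would still require the manual inspection the abstract mentions.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of Unicode letters (Cyrillic or Latin),
# optionally joined by a hyphen or an apostrophe.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def tokenize(text):
    """Return the lowercased word tokens found in text."""
    return [m.group(0).lower() for m in TOKEN_RE.finditer(text)]


def word_list(texts):
    """Build a frequency word list (type -> count) from an iterable of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


def corpus_size(counts):
    """Return (number of tokens, number of types) for a word list."""
    return sum(counts.values()), len(counts)


def new_in_later_period(earlier, later, min_freq=1):
    """Hypothetical filter: types in the later word list that are absent
    from the earlier one and occur at least min_freq times."""
    return sorted(w for w, c in later.items()
                  if c >= min_freq and w not in earlier)


# Invented sample sentences standing in for the two periods of the corpus.
texts_2000 = ["Менаџерот го отвори состанокот."]
texts_2020 = [
    "Стартап компанијата има нов менаџер и онлајн маркетинг.",
    "Онлајн продажбата расте.",
]

wl_2000 = word_list(texts_2000)
wl_2020 = word_list(texts_2020)

print(corpus_size(wl_2000))
print(corpus_size(wl_2020))
print(new_in_later_period(wl_2000, wl_2020, min_freq=2))
```

Because the sketch compares surface forms only, inflected variants such as менаџер and менаџерот count as different types; grouping them under one headword would require lemmatization, which is not shown here.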
License
Copyright (c) 2025 Lina Miloshevska

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.