Comparative Analysis of Anglicism Distribution in Russian Social Network Texts

Fenogenova Alena; Karpov Ilia; Kazorin Viktor; Lebedev I.

P. 65–74.

As globalization proceeds, the number of English words in other languages has rapidly increased. In automatic speech recognition, spell-checking, tagging, and other natural language processing software, loanwords are not easily recognized and should be treated separately. In this paper we present a corpus-based approach to the automatic detection of anglicisms in Russian social network texts. The proposed method rests on the idea that the original Latin-script word and its Cyrillic analogue are simultaneously similar in script, phonetics, and semantics. We used a set of transliteration, phonetic transcription, and morphological analysis methods to generate candidate hypotheses, and distributional semantic models to filter them. The resulting list of borrowings, gathered from approximately 20 million LiveJournal texts, shows good overlap with a manually collected dictionary. The proposed method is fully automated and can be applied to any domain-specific area.
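The generate-then-filter idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The letter map, the toy vectors, the 0.7 threshold, and the function names (`transliterate`, `cosine`, `detect_anglicisms`) are all assumptions made for illustration. It also assumes that English and Russian word vectors live in one shared space, which the abstract does not specify. Phonetic transcription and morphological analysis are omitted.

```python
import math

# Toy Latin-to-Cyrillic letter map. This is an illustrative assumption and is
# far simpler than real transliteration or phonetic transcription rules.
LATIN_TO_CYRILLIC = {
    "a": "а", "b": "б", "c": "к", "d": "д", "e": "е", "g": "г",
    "i": "и", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о",
    "p": "п", "r": "р", "s": "с", "t": "т", "u": "у",
}


def transliterate(word):
    """Map each Latin letter to one Cyrillic letter; unknown characters pass through."""
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in word.lower())


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def detect_anglicisms(en_vectors, ru_vectors, threshold=0.7):
    """Return (english, cyrillic) pairs that pass both steps.

    Step 1 (hypothesis): the transliterated English word occurs in the
    Russian vocabulary. Step 2 (filter): the two words are semantically
    close, measured by cosine similarity in an assumed shared vector space.
    """
    hits = []
    for en_word, en_vec in en_vectors.items():
        candidate = transliterate(en_word)
        ru_vec = ru_vectors.get(candidate)
        if ru_vec is None:
            continue  # no such Cyrillic word in the corpus vocabulary
        if cosine(en_vec, ru_vec) >= threshold:
            hits.append((en_word, candidate))
    return hits


# Hypothetical toy vectors, invented for this example only.
EN_VECTORS = {"blog": [0.9, 0.1, 0.0], "list": [0.1, 0.9, 0.0], "cloud": [0.5, 0.5, 0.1]}
RU_VECTORS = {"блог": [0.8, 0.2, 0.1], "лист": [0.0, 0.1, 0.9]}

hits = detect_anglicisms(EN_VECTORS, RU_VECTORS)
print(hits)
```

In this toy data, "list" transliterates to an existing Russian word but is rejected by the semantic filter, while "cloud" yields a candidate that is absent from the Russian vocabulary. Only a pair that is similar in both script and meaning survives.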

Language: English

Keywords: matrix-vector representation, anglicisms, borrowed anglicisms, distributive semantics, social media texts, vector representation, network text

Publication based on the results of:

Applied network research with big data and new technological advances (2018)

In book

Computational Linguistics and Intellectual Technologies. International Conference "Dialogue 2017" Proceedings

Vol. 1. Issue 16 (23). Moscow, 2017.

Historical Semantics of the Concept of “Class” in Academic Literature: An Attempt at Quantitative Analysis

Korotaev S., Economic Sociology 2024 Vol. 25 No. 3 P. 13–50

The paper presents a quantitative analysis of semantic change in the concept of “class” in sociological texts from the second third of the twentieth century to the present. The study is motivated by the concept's lack of clarity and its multiple meanings. This ...

Added: June 12, 2024

Parameter-Efficient Tuning of Transformer Models for Anglicism Detection and Substitution in Russian

Daniil Lukichev, Kryanina D., Anastasia Bystrova et al., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue". Issue 22. [no publisher], 2023. P. 295–306.

This article addresses anglicisms in Russian texts: detecting them and automatically rewriting the text by substituting their Russian-language equivalents. We present a parallel corpus of anglicisms and models that identify anglicisms in the text and replace ...

Added: September 22, 2023

The Language of Positive Mental Health: Findings From a Sample of Russian Facebook Users

Bogolyubova O., Panicheva P., Ledovaya Y. et al., Sage Open 2020 Vol. 10 No. 2 P. 1–8

*The Facebook social network is banned in Russia on the grounds of extremist activity. Positive mental health is considered to be a significant predictor of health and longevity; however, our understanding of the ways in which this important characteristic is represented in users’ behavior on social networking sites is limited. The goal of this study was to ...

Added: May 23, 2020

Analysis of neural networks efficiency for determining positions of corrupted bytes

Slastnikov S., Lupanov V. E., Journal of Physics: Conference Series 2019 Vol. 1163 No. 12048 P. 1–6

Large volumes of files and other data are transferred across networks. The data may be corrupted by intrusions or packet loss, so executable files may be marked as non-executable and violate the local network policy. It is therefore necessary to detect such files. In this paper, we present a novel method for ...

Added: October 19, 2018

Automatic morphological analysis on the material of Russian social media texts

Fenogenova A., Kazorin V., Karpov I. et al., in: Proceedings of Third Workshop "Computational linguistics and language science". Issue 4. Manchester: EasyChair, 2019. P. 11–17.

Automatic morphological analysis is one of the fundamental tasks of NLP (Natural Language Processing). Because Internet texts range from normative texts (news, fiction, nonfiction) to less formal ones (such as blogs and texts from social networks), morphological tagging has become a non-trivial and pressing task. ...

Added: October 5, 2018

A General Method Applicable to the Search for Anglicisms in Russian Social Network Texts

Fenogenova A., Karpov I., Kazorin V., in: Proceedings of the Artificial Intelligence and Natural Language AINL FRUCT 2016 Conference, Saint-Petersburg, Russia, 10-12 November 2016. FRUCT Oy, 2016. P. 31–36.

With globalization, the number of borrowings from English has rapidly increased in languages all over the world. In automatic speech recognition, spell-checking, tagging, and other natural language processing tasks, loanwords frequently cause problems and should be treated separately. In this paper we present a ...

Added: October 19, 2016

What Is Being Anglicized: The Russian Language or Russian Linguistic Consciousness? (A Translator-Psycholinguist's View of Borrowed Anglicisms from the Standpoint of Professional Communication)

Vlasenko S. V., in: The System of Language and Linguistic Thinking: Collection of Scholarly Articles. Moscow: Knizhny Dom "LIBROKOM", 2009. P. 35–51.

The article addresses various aspects of the anglicization of the Russian language, which has proceeded dynamically since Russia's declaration of independence, and the resulting dynamics in the linguistic consciousness of Russian speakers. Borrowings and calques from foreign languages, chiefly English and its American variety, are words, terms, terminological combinations, or turns of speech built on a foreign-language model using the resources of Russian. Excessive use ...

Added: February 3, 2015

The Problem of Graphic Assimilation of Borrowed Vocabulary

Lebedeva N. M., Krylova L. K., Bulletin of Novosibirsk State University. Series: History, Philology 2014 Vol. 13 No. 9 P. 76–82

This article considers the graphic assimilation of English words in Russian, a great number of which have entered the language since the end of the 20th century. Many loanwords are still at the stage of graphic fluctuation, and this article treats the main tendencies in assimilating them. The words ...

Added: October 24, 2014

English Borrowings in the Russian and German Languages of the Internet

Balakina Y. V., Nizhny Novgorod: R. E. Alekseev Nizhny Novgorod State Technical University, 2014.

The monograph addresses the integration of loanwords in Russian and German, using Internet vocabulary as an example. Because of the enormous number of borrowings in the virtual environment, anglicisms in some cases predominate over native-language units. The book examines existing ways of integrating anglicisms into Russian and German in terms of graphic, grammatical, semantic, and stylistic adaptation, ...

Added: October 18, 2014

Anglicisms in the Russian Language

Baibikova T., in: Current Problems of Speech Development and Intercultural Communication. Proceedings of the VI Cyril and Methodius Readings at the International Humanitarian-Linguistic Institute. May 14, 2013. Moscow: MFUA, 2013. P. 161–171.

The article examines the borrowing of English vocabulary into Russian. It discusses the reasons for borrowing and the spheres of human activity to which the borrowed vocabulary belongs. It also considers some word-formation units of the borrowed vocabulary and the possibilities for its adaptation in Russian. ...

Added: September 24, 2013

On the Main Techniques of Contemporary English-Russian Language Play

Rivlina A. A., in: Homo Loquens: Current Issues in Linguistics and Foreign Language Teaching Methodology (2011). Issue 3. St. Petersburg: NRU HSE - St. Petersburg, 2011. P. 86–96.

The globalization of English enables speakers of languages in contact with it, Russian speakers in particular, to use anglicisms as an additional tool for expressing a variety of aesthetic and emotionally expressive meanings, for example in playful wordplay. The article describes and systematizes the main techniques of English-Russian language play, such as playful borrowing and code mixing/switching ("code intertextuality"), English-Russian word-formation hybridization, graphic hybridization ...

Added: April 15, 2013

Dominant, Weakly Stable, Uncovered Sets: Properties and Extensions

Subochev A. / NRU Higher School of Economics. Series WP7 "Mathematical Methods of Decision Analysis in Economics, Business and Politics". 2008. No. 3.

Twelve sets, proposed as social choice solution concepts, are compared: the core, five versions of the uncovered set, two versions of the minimal weakly stable sets, the uncaptured set, the untrapped set, the minimal undominated set (strong top cycle) and the minimal dominant set (weak top cycle). The main results presented are the following. A ...

Added: December 26, 2012

Matrix-vector representation of various solution concepts

Aleskerov F. T., Subochev A. / NRU Higher School of Economics. Series WP7 "Mathematical Methods of Decision Analysis in Economics, Business and Politics". 2009. No. 3.

A unified matrix-vector representation is developed of such solution concepts as the core, the uncovered, the uncaptured, the minimal weakly stable, the minimal undominated, the minimal dominant and the untrapped sets. We also propose several new versions of solution sets. ...

Added: December 26, 2012

Modeling optimal social choice: matrix-vector representation of various solution concepts based on majority rule

Aleskerov F. T., Subochev A., Journal of Global Optimization 2013 Vol. 56 No. 2 P. 737–756

Various Condorcet consistent social choice functions based on majority rule (tournament solutions) are considered in the general case, when ties are allowed: the core, the weak and strong top cycle sets, versions of the uncovered and minimal weakly stable sets, the uncaptured set, the untrapped set, classes of k-stable alternatives and k-stable sets. The main ...

Added: October 25, 2012