Automatic detection of grammatical aspect of Russian verbs based on their morphological properties
Petrunina U., Filip H.
In book
Dubrovnik: Croatian Language Technology Society, 2023.
Anna Moskvina, Margarita Kirina, in: 27th International Conference, IMS 2024, St. Petersburg, Russia, June 24–26, 2024, Selected Papers. Internet and Modern Society. Human-Computer Communication. CCIS, Vol. 2534. Springer, 2025. P. 113–129.
The paper presents an investigation of the emotional aspect of the Russian short story of the 20th century. Our study is two-fold: firstly, we delve into emotional representation at the lexical level, building upon previous work on utilizing vector models to quantify emotional content. In this study, we introduce an annotated corpus where words are ...
Added: November 29, 2024
Where Is Happily Ever After? A Study of Emotions and Locations in Russian Short Stories of 1900–1930
Moskvina A., Kirina M., in: Digital Geography: Proceedings of the International Conference on Internet and Modern Society (IMS 2023). Springer, 2023. P. 123–135.
The paper tackles the problem of the automatic detection of emotions in literary texts using distributional semantics techniques. The experiment was carried out on the material of Russian short stories from the 1900-1930s. We investigated the emotional lexis distribution across different locations in narratives. At first, we calculated the semantic association score between each word ...
Added: December 9, 2023
Sherstinova T., Moskvina A., Kirina M. et al., in: Proceedings of the International Conference "Corpus Linguistics 2023". St. Petersburg: St. Petersburg State University Publishing House, 2024. P. 232–240.
In the experimental study, the results of three different approaches to evaluating the sentiment (tonality) of literary texts are compared: dictionary-based, machine learning, and distributional semantics. The material for analysis was a selection of 210 stories by Russian writers from the first three decades of the 20th century. The research showed that the correlation ...
Added: December 9, 2023
Moskvina A., Kirina M., in: Proceedings of the International Conference "Corpus Linguistics 2023". St. Petersburg: St. Petersburg State University Publishing House, 2024. P. 156–166.
The paper presents the results of experiments investigating the distribution of emotional vocabulary in Russian short stories of the beginning of the 20th century. The emotionality of words and texts is determined automatically using the methods of distributional semantics, which do not require the use of dictionaries or preliminary data annotation. The results include data ...
Added: December 9, 2023
Razzhigaev A., Nikolay Arefyev, Panchenko A., in: Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics, 2021. P. 157–162.
In this paper, we present a system for the solution of the cross-lingual and multilingual word-in-context disambiguation task. Task organizers provided monolingual data in several languages, but no cross-lingual training data were available. To address the lack of the officially provided cross-lingual training data, we decided to generate such data ourselves. We describe a simple ...
Added: September 23, 2021
Davletov A., Nikolay Arefyev, Gordeev D. et al., in: Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics, 2021. P. 780–786.
This paper presents our approaches to SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation task. The first approach attempted to reformulate the task as a question answering problem, while the second one framed it as a binary classification problem. Our best system, which is an ensemble of XLM-R based binary classifiers trained with data augmentation, ...
Added: September 23, 2021
Rachinskiy Maxim, Arefyev Nikolay, in: Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics, 2021. P. 756–762.
Added: September 23, 2021
Bakarov A., Gureenkova O., Lecture Notes in Computer Science 2018 P. 16–21
This study considers the problem of automated detection of non-relevant posts on Web forums and discusses the approach of resolving this problem by approximating it with the task of detecting semantic relatedness between the given post and the opening post of the forum discussion thread. The approximated task could be resolved through learning the ...
Added: December 12, 2020
Bakarov A., Proceedings of the Third International Conference Computational Linguistics in Bulgaria (CLIB '18) 2018 P. 153–161
Swivel (Submatrix-WIse Vector Embedding Learner) is a distributional semantic model based on counting point-wise mutual information values, capable of capturing word-context co-occurrences in the PMI matrix that were not noted in the training corpus. This model outperforms mainstream word embedding training algorithms such as Continuous Bag-of-Words, GloVe and Skip-Gram in word similarity and word analogy ...
Added: December 12, 2020
Panicheva P., Litvinova T., in: The Fifth Saint Petersburg Winter Workshop on Experimental Studies of Speech and Language (Night Whites 2019). St. Petersburg: Center for Scientific and Information Technologies "Asterion", 2019. P. 81.
Added: October 29, 2020
Panicheva P., Litvinova T., in: Proceedings of the 25th Conference of Open Innovations Association FRUCT, University of Helsinki, Helsinki, Finland. Helsinki: IEEE, 2019. P. 241–249.
Schizophrenia is widely known to manifest in language disturbance. Namely, speech incoherence, tangentiality, derailment are indicative of thought disorder characteristic of schizophrenia. Recent advances in distributional semantics have made it possible to measure coherence in text in a unified and objective manner. It has been shown that semantic coherence measures based on distributional semantic models ...
Added: October 29, 2020
Badryzlova Y., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, June 17–20, 2020). Issue 19(26). Moscow: RSUH Publishing House, 2020. P. 33–47.
The paper presents a method for computing indexes of semantic concreteness and abstractness in two languages (Russian and English). These indexes are used in metaphor identification experiments in both languages; the results are either comparable to or surpass previous work and the baselines. We analyze the obtained indexes of concreteness and abstractness to see how ...
Added: August 24, 2020
Ryzhova D., St. Petersburg: Aletheia, 2020.
Lexical typology, the field of linguistics concerned with the comparative analysis of word meanings across languages, has by now achieved considerable success: methods for collecting and analyzing data have been developed, and a whole range of semantic fields has been described. However, some methodological limitations have still not been overcome: data collection is very labor-intensive, which affects either the size and representativeness of language samples or ...
Added: June 2, 2020
Panicheva P., Litvinova T., in: Statistical Language and Speech Processing. 7th International Conference, SLSP 2019, Ljubljana, Slovenia, October 14–16, 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 11816 LNAI. Springer Publishing Company, 2019. P. 299–310.
Recent demands in authorship attribution, specifically, cross-topic authorship attribution with small numbers of training samples and very short texts, impose new challenges on corpora design, feature and algorithm development. In the current work we address these challenges by performing authorship attribution on a specifically designed dataset in Russian. We present a dataset of short written ...
Added: October 28, 2019
Badryzlova Y., Lyashevskaya O., Panicheva P., in: Cognitive Studies of Language. Issue XXXVII: Integrative Processes in Cognitive Linguistics: Proceedings of the International Congress on Cognitive Linguistics. Dekom, 2019. Ch. IV P. 609–615.
The paper provides linguistic explanations for the results of supervised machine learning experiments on the identification of verbal metaphor in Russian texts. We look at the classification accuracy of models based on different features (distributional semantics and lexical and morphosyntactic co-occurrence, etc.) and explore the behavior of verb constructions and wider context in order to investigate the reasons behind the ...
Added: October 23, 2019
Fedotov M., Voprosy Jazykoznanija (Вопросы языкознания) 2019 No. 3 P. 7–44
The paper discusses two related aspectological topics. The first section examines the ‘completive’ meaning, i.e. ‘attainment of the internal limit’ (together with its counterpart ‘incompletive’, i.e. ‘non-attainment of the internal limit’). Its localization in the semantic structure of the utterance is determined: between aspect proper and actionality proper. Also, ‘completive’ can be included under ...
Added: September 28, 2019
Paperno D., Ryzhova D., in: Methodological Tools for Linguistic Description and Typology. Issue 16. University of Hawaii Press, 2019. Ch. 5 P. 45–61.
Questionnaires constitute a crucial tool in linguistic typology and language description. By nature, a Questionnaire is both an instrument and a result of typological work: its purpose is to help the study of a particular phenomenon cross-linguistically or in a particular language, but the creation of a Questionnaire is in its turn based on the ...
Added: August 30, 2019
Panicheva P., Protopopova E., Bukia G. et al., in: Analysis of Images, Social Networks and Texts. 5th International Conference, AIST 2016, Yekaterinburg, Russia, April 7–9, 2016, Revised Selected Papers. Communications in Computer and Information Science, Vol. 661. Switzerland: Springer, 2017. P. 236–247.
In the paper, vector-space semantic models based on the Word2Vec word embedding algorithm and a count-based association-oriented algorithm are evaluated and compared by measuring association strength between Russian nouns and adjectives. A dataset of nouns and associated adjectives is used as the test set for a pseudo-disambiguation task. Models are trained with corpora of Russian fiction. A ...
Added: February 18, 2019
Panicheva P., Mirzagitova A., Ledovaya Y., in: Artificial Intelligence and Natural Language, 6th Conference, AINL 2017, St. Petersburg, Russia, September 20–23, 2017, Revised Selected Papers. Issue 789. Switzerland: Springer, 2018. Ch. 1 P. 3–15.
*The social network Facebook is banned in Russia on the grounds of carrying out extremist activity.
The goal of the current work is to evaluate semantic feature aggregation techniques in a task of gender classification of public social media texts in Russian. We collect Facebook posts of Russian-speaking users and apply them as a dataset for two topic modelling ...
Added: February 18, 2019
Bogolyubova O., Panicheva P., Tikhonov R. et al., Computers in Human Behavior 2018 Vol. 78 P. 151–159
The goal of this paper was to assess the connection between dark personality traits and engagement in harmful online behaviors in a sample of Russian Facebook users, and to describe the language they use in online communication. A total of 6724 individuals participated ...
Added: February 18, 2019