RUSSE2018: a Shared Task on Word Sense Induction for the Russian Language

Panchenko A.; Lopukhina A.; Ustalov D.; Lopukhin K.; Arefyev N.; Leontyev A.; Loukachevitch N.

Компьютерная лингвистика и интеллектуальные технологии. 2018. No. 17. P. 547–564.

The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language. While similar shared tasks were conducted in the past for some Romance and Germanic languages, we explore the performance of sense induction and disambiguation methods for a Slavic language that shares many features with other Slavic languages, such as rich morphology and virtually free word order. The participants were asked to group the contexts of a given word according to its senses, which were not provided beforehand. For instance, given the word “bank” and a set of contexts for this word, e.g. “bank is a financial institution that accepts deposits” and “river bank is a slope beside a body of water”, a participant was asked to cluster these contexts into a number of clusters not known in advance, corresponding in this case to the “company” and the “area” senses of the word “bank”. For this evaluation campaign, we developed three new evaluation datasets based on sense inventories of different sense granularity. The contexts in these datasets were sampled from Wikipedia texts, the academic corpus of Russian, and an explanatory dictionary of Russian. Overall, 18 teams participated in the competition, submitting 383 models. Multiple teams substantially outperformed competitive state-of-the-art baselines from previous years based on sense embeddings.
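The task format described above can be illustrated with a minimal sketch. It is not one of the participating systems or the baselines. It assumes English toy contexts, bag-of-words vectors, and a greedy clustering rule with a hypothetical similarity threshold, so that the number of clusters is not fixed in advance. The names `bag_of_words`, `cosine`, `induce_senses`, the stopword list, and the threshold value are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, just enough for the toy contexts below.
STOPWORDS = {"a", "an", "the", "is", "of", "that", "and", "we"}


def bag_of_words(context, target):
    """Count content words in a context, leaving out the target word itself."""
    tokens = re.findall(r"[a-z]+", context.lower())
    return Counter(t for t in tokens if t != target and t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[token] for token, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def induce_senses(contexts, target, threshold=0.2):
    """Greedily assign each context to the most similar cluster.

    A new cluster is opened whenever no existing cluster centroid is more
    similar than `threshold` (an assumed value), so the number of induced
    senses is decided by the data rather than given beforehand.
    """
    centroids = []  # one summed count vector per induced sense
    labels = []
    for context in contexts:
        vector = bag_of_words(context, target)
        best_label, best_sim = None, threshold
        for label, centroid in enumerate(centroids):
            sim = cosine(vector, centroid)
            if sim > best_sim:
                best_label, best_sim = label, sim
        if best_label is None:
            centroids.append(Counter(vector))
            best_label = len(centroids) - 1
        else:
            centroids[best_label].update(vector)
        labels.append(best_label)
    return labels


contexts = [
    "bank is a financial institution that accepts deposits",
    "river bank is a slope beside a body of water",
    "the bank accepts deposits and issues loans",
    "we walked along the river bank beside the water",
]
print(induce_senses(contexts, "bank"))  # [0, 1, 0, 1]
```

Here the two induced clusters correspond to the “company” and “area” senses. Word overlap is a deliberately crude signal; for Russian, with its rich morphology, a realistic system would at least need lemmatization or dense word representations.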

Research target: Computer Science; Philology and Linguistics

Priority areas: humanitarian; IT and mathematics

Language: English

Keywords: lexical semantics, polysemy, homonymy, word sense induction

No ‘iota’ type-shifter in Kazym Khanty

Tiutiunnikova V., Mikhailov Stiopa, Golosov F., Proceedings of Sinn und Bedeutung (Germany) 2025 No. 29 P. 1593–1608

In this paper, we present new challenging data from Kazym Khanty (a Uralic language spoken in Western Siberia, Russia): in this articleless language, bare singular and bare dual NPs in argument positions can receive indefinite readings on par with definite ones, contradicting the predictions of the classic neo-Carlsonian approach (Chierchia, 1998; Dayal, 2004). We argue ...

Added: January 30, 2026

The Use of Ordinal Numerals in Different Semantic Contexts (Based on Parallel Translations of the New Testament)

Nasledskova P., Известия РАН. Серия литературы и языка 2025 Vol. 84 No. 6 P. 88–102

The paper compares the use of ordinal constructions in different semantic contexts across five languages: Russian, English, Spanish, Indonesian, and Rutul. The comparison draws on parallel translations of the New Testament. From six books of the New Testament (the canonical Gospels, the Acts of the Apostles, and the Revelation of John the Theologian), verses were selected in which ordinal numerals are used in at least one of the sampled languages. ...

Added: January 29, 2026

A Speech Signal Transformation Method for Improving Speech Intelligibility

Savchenko L., Савченко В. В., Радиотехника и электроника 2025 Vol. 70 No. 8 P. 753–760

The problem of improving speech intelligibility in voice communication systems is considered. The acute issue of speaker recognition when applying known methods for solving this problem is highlighted. To overcome the specified problem, a new method for transforming the speech signal based on an autoregressive model of the vocal tract and the principle of frequency-selective ...

Added: January 29, 2026

Specification Tests for Jump-Diffusion Models Based on the Characteristic Function

Belomestny D., Grobler G. L., Meintanis S. G. et al., International Statistical Review 2026 P. 1–31

Goodness-of-fit tests are suggested for several popular jump-diffusion processes. The suggested test statistics utilise the marginal characteristic function of the model and its L2-type discrepancy from an empirical counterpart. Model parameters are estimated either by minimising the aforementioned L2-type discrepancy or by maximum likelihood. A hybrid estimation method that uses moment estimation is also proposed ...

Added: January 29, 2026

Applying AI Technologies in Teaching Students within the Course “Academic Writing in English”

Gabrielova E., Магия ИННО 2025 Vol. 7 No. 1 P. 165–172

Artificial intelligence (AI) technologies are rapidly developing and are being widely applied in various fields, including education. The use of AI carries certain risks; however, one cannot completely reject it in student education. The article presents the experience of using AI in teaching English to 34 fourth-year students and 26 post-graduate students within the discipline ...

Added: January 29, 2026

Explorations in Applied Ethnolinguistics: Words, Cultures, and Global Perspectives

Palgrave Macmillan, 2025.

This volume contributes to the growing body of cutting-edge research into the Natural Semantic Metalanguage (NSM) approach in linguistics. It explores the broad range of possible applications enabled by the NSM approach, from linguistic studies of semantics and culture to cross-cultural studies, psychology and childhood education. The volume builds on previous studies, bringing a diversity ...

Added: January 28, 2026

The Epic of Gilgamesh. Translated by Nikolay Gumilev. Foreword by E. Markina. Introduction by V. Shileiko.

Markina E., Манн, Иванов и Фербер, 2025.

Publisher's annotation: The Epic of Gilgamesh is the oldest monument of world literature, which has come down to us from the depths of the Sumerian and Akkadian civilizations. The poem tells of the adventures of the mighty king of the city of Uruk and his friend Enkidu. It is a story of strength and friendship, pride and humility, the fear of death and the thirst for immortality. The poem is published in the translation by the Acmeist poet Nikolay Gumilev, with an explanatory article by the Assyriologist and the poet's contemporary Vladimir Shileiko, ...

Added: January 28, 2026

An Analysis of Sequential Patterns in Datasets for Evaluation of Sequential Recommendations

Klenitskiy A., Володкевич А. А., Pembek A. et al., ACM Transactions on Recommender Systems 2026

Sequential recommender systems are an important and in-demand area of research. These systems aim to use the order of interactions in a user’s history to predict future interactions. The premise is that the order of interactions and sequential patterns play an essential role. Therefore, it is crucial to use datasets that exhibit a sequential structure ...

Added: January 28, 2026

Semi-fake indexicals in Russian

Тискин Д. Б., Типология морфосинтаксических параметров 2025 Vol. 8 No. 1 P. 112–129

There are several rival theories of fake indexicals, i.e. bound indexicals (prominently pronouns) whose φ-features do not semantically contribute to focus alternatives (e.g. Only Mary did her homework, John didn’t do his). According to Minimal Pronoun theories (such as Kratzer’s or Wurmbrand’s), bound pronouns are Merged without φ-features and acquire them under binding via agreement-like ...

Added: January 26, 2026

Some Modifications to I. Bassi's Theory of Bound Uses of Indexical Expressions

Тискин Д. Б., Типология морфосинтаксических параметров 2024 Vol. 7 No. 1 P. 107–123

Fake indexicals (FIs), or bound-variable uses of e.g. 1st- and 2nd-person pronouns, have been analysed by Bassi (2021) as arising from a post-syntactic process of inspecting the features of the referent. This leads to a peculiar analysis of the syntax and semantics of relative clauses containing FIs. I argue for a more ...

Added: January 26, 2026

Autoregressive generation strategies for Top-K sequential recommendations

Klenitskiy A., Гусак Д. И., Володкевич А. А. et al., User Modelling and User-Adapted Interaction 2025 No. 35 Article 13

The goal of modern sequential recommender systems is often formulated in terms of next-item prediction. In this paper, we explore the applicability of transformer-based generative models for the Top-K sequential recommendation task, where the goal is to predict items that a user is likely to interact with in the “near future.” This goal aligns with ...

Added: January 26, 2026

The Art of (Not) Simple Legal Writing. A Textbook

Knutov A., Chaplinskiy A., Мищенко П. А. et al., Moscow: Проспект, 2026.

The textbook offers recommendations on legal writing style that help make such writing clearer to readers. The first chapter systematizes accumulated knowledge about the general stylistic features of legal language and its place in the speech system of Russian. The subsequent chapters address specific types of legal documents: the language of laws, of procedural documents, of contracts, and of legal analytical documents. ...

Added: January 26, 2026

Marchenko–Pastur Law for Spectra of Random Weighted Bipartite Graphs

Nadutkina A., Tikhomirov A., Timushev D., Siberian Advances in Mathematics 2025 Vol. 34 P. 146–153

We study the spectra of random weighted bipartite graphs. We establish that, under specific assumptions on the edge probabilities, the symmetrized empirical spectral distribution function of the graph’s adjacency matrix converges to the symmetrized Marchenko-Pastur distribution function. ...

Added: January 26, 2026

From the Correspondence of E. A. Millior with Ya. M. Borovsky (1946–1960)

Ermakova L., Вестник Удмуртского университета. Серия История и филология 2025 Vol. 35 No. 6 P. 1403–1422

The article publishes and analyzes the correspondence between the historian of antiquity Elena A. Millior (1900–1978) and the classical philologist Yakov M. Borovsky (1896–1994), covering the years 1946–1960 and preserved in the archives of the Institute of Russian Literature (Pushkin House) of the Russian Academy of Sciences and the Bibliotheca Classica Petropolitana in St. Petersburg. ...

Added: January 26, 2026

The Works of D. N. Mamin-Sibiryak and the Modern World

Moscow, Yekaterinburg: Кабинетный ученый, 2024.

The monograph examines the works of D. N. Mamin-Sibiryak, a classic of nineteenth-century Ural and Russian literature. It investigates and describes various aspects of his artistic world: axiological and ethical issues of both universal and national character, questions of geo- and ethnopoetics, features of the narrative organization of the texts and of the writer's literary language, Mamin's genealogy, and applied aspects of his work, including the presentation of the writer's legacy to a contemporary audience. The edition includes an index of Mamin-Sibiryak's works. The book is intended for ...

Added: January 26, 2026

Hegel's “Philosophy of Right” and the Kotzebue Affair: The Cultural and Political Context

Lagutina I., Философические письма. Русско-европейский диалог 2025 Vol. 8 No. 4 P. 165–201

This article examines the assassination of the playwright August von Kotzebue by the theology student K. L. Sand as an event reflecting the ideological and philosophical tensions of early nineteenth-century Germany. It analyzes G. W. F. Hegel’s response to this historical episode in the context of his “Philosophy of Right”, which criticizes ethical and religious ...

Added: January 25, 2026

Iterative Ricci-Foster Curvature Flow with GMM-Based Edge Pruning: A Novel Approach to Community Detection

Sorokin K., Beketov M., Онучин А. et al., / arxiv.org. Series cs.SI "Social and Information Networks". 2025.

Community detection in complex networks is a fundamental problem, open to new approaches in various scientific settings. We introduce a novel community detection method, based on Ricci flow on graphs. Our technique iteratively updates edge weights (their metric lengths) according to their (combinatorial) Foster version of Ricci curvature computed from effective resistance distance between the ...

Added: January 15, 2026

Implementing Transport Coding in OMNeT++ for Message Delay Reduction

Petrovanov I., Sergeev A., / Series Computer Science "arxiv.org". 2025. No. 2512.18332.

Transport coding reduces message delay in packet-switched networks by introducing controlled redundancy at the transport layer: k original packets are encoded into n coded packets, and the message is reconstructed after the first k successful deliveries, effectively shifting latency from the maximum packet delay to the k-th order statistic. We present a concise, reproducible discrete-event implementation of transport coding in OMNeT++, including ...

Added: December 24, 2025

Hessian-based lightweight neural network for brain vessel segmentation on a minimal training dataset

Меньшиков И. А., Бернадотт А. К., Elvimov N. S., / Series arXiv "Statistical mechanics". 2025.

Accurate segmentation of blood vessels in brain magnetic resonance angiography (MRA) is essential for successful surgical procedures, such as aneurysm repair or bypass surgery. Currently, annotation is primarily performed through manual segmentation or classical methods, such as the Frangi filter, which often lack sufficient accuracy. Neural networks have emerged as powerful tools for medical image ...

Added: December 1, 2025

Compound and Phrase: A Corpus-Based Approach (the Case of “bad blood”)

Филатов А. С., Когнитивные исследования языка 2025 Vol. 1-2 No. 25 P. 302–305

The article demonstrates the productivity of corpus-based linguistic analysis regarding the problem of distinguishing phrases from compounds. The object of the research is “bad blood” in the American English language, the morphological status of which is approached in close connection with its real-life usage and the polysemies of its constituents. ...

Added: November 24, 2025

Determining the boundary of dynamical chaos in the generalized Chirikov map via machine learning

Чернышов Д. П., Satanin A., Shchur L., / Series arXiv "math". 2025.

We investigate the boundary separating regular and chaotic dynamics in the generalized Chirikov map, an extension of the standard map with phase-shifted secondary kicks. Lyapunov maps were computed across the parameter space (K,K(α, τ)) and used to train a convolutional neural network (ResNet18) for binary classification of dynamical regimes. The model reproduces the known critical ...

Added: November 21, 2025

An Efficient Stock Market Trading Algorithm: A Retrospective Analysis Based on S&P-500 Data.

Rubchinskiy A., Chubarova D., / Series WP7 "Математические методы анализа решений в экономике, бизнесе и политике". 2025. No. WP7/2025/01.

The article examines one of the most famous examples of socio-economic systems, characterized by significant uncertainty – the S&P-500 stock market, where shares of 500 largest US companies are traded. No assumptions are made about the probabilistic characteristics of the stock market. A flexible algorithm for daily trading has been developed, based on both known fixed data ...

Added: November 9, 2025

Diffusion on language model embeddings for protein sequence generation

Meshchaninov V., Strashnov P., Shevtsov A. et al., / Cornell University. Series CoRR, arXiv:2403.03726 "Computing Research Repository". 2025.

Protein design requires a deep understanding of the inherent complexities of the protein universe. While many efforts lean towards conditional generation or focus on specific families of proteins, the foundational task of unconditional generation remains underexplored and undervalued. Here, we explore this pivotal domain, introducing DiMA, a model that leverages continuous diffusion on embeddings derived ...

Added: October 5, 2025

Smoothie: Smoothing Diffusion on Token Embeddings for Text Generation

Shabalin A., Meshchaninov V., Vetrov D., / Series cs.CL, arXiv:2505.18853 "Computation and Language". 2025.

Diffusion models have achieved state-of-the-art performance in generating images, audio, and video, but their adaptation to text remains challenging due to its discrete nature. Prior approaches either apply Gaussian diffusion in continuous latent spaces, which inherits semantic structure but struggles with token decoding, or operate in categorical simplex space, which respects discreteness but disregards semantic ...

Added: October 5, 2025