Automated Detection of Non-Relevant Posts on the Russian Imageboard "2ch": Importance of the Choice of Word Representations

Lecture Notes in Computer Science. 2018. P. 16–21.

Bakarov A., Gureenkova O.

This study considers the problem of automated detection of non-relevant posts on Web forums. We approach it by approximating it with the task of detecting semantic relatedness between a given post and the opening post of the forum discussion thread. The approximated task can be solved by training a supervised classifier on composed word embeddings of the two posts. Since success in this task may be quite sensitive to the choice of word representations, we compare the performance of different word embedding models. We train 7 models (Word2Vec, GloVe, Word2Vec-f, Wang2Vec, AdaGram, FastText, Swivel), evaluate the embeddings they produce on a dataset of human judgements, and compare their performance on the task of non-relevant post detection. To make the comparison, we propose a dataset of semantic relatedness built from posts on one of the most popular Russian Web forums, the imageboard "2ch", which has challenging lexical and grammatical features.
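
The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 2-dimensional vectors, the composition (element-wise product plus absolute difference of averaged word vectors), the logistic-regression classifier, and every name below (`TOY_EMBEDDINGS`, `compose`, `train_logreg`, `relevance_probability`, `TRAIN_PAIRS`) are assumptions made for illustration. In practice the vectors would come from one of the trained embedding models.

```python
import math

# Hypothetical 2-d toy embeddings (assumption); real vectors would be
# loaded from a trained model such as Word2Vec or FastText.
TOY_EMBEDDINGS = {
    "cats": [1.0, 0.0], "kitten": [0.9, 0.1], "purr": [0.8, 0.0],
    "gpu": [0.0, 1.0], "driver": [0.1, 0.9], "kernel": [0.0, 0.8],
}

# (opening post, reply, label): 1 = related to the thread, 0 = non-relevant.
TRAIN_PAIRS = [
    ("cats kitten", "kitten purr", 1),
    ("gpu driver", "kernel driver", 1),
    ("cats kitten", "gpu driver", 0),
    ("gpu kernel", "cats purr", 0),
]


def mean_vector(text, embeddings, dim=2):
    """Average the vectors of known tokens; zero vector if none are known."""
    vectors = [embeddings[t] for t in text.lower().split() if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def compose(opening_post, reply, embeddings):
    """Compose two post vectors into one feature vector (assumed scheme)."""
    a = mean_vector(opening_post, embeddings)
    b = mean_vector(reply, embeddings)
    product = [x * y for x, y in zip(a, b)]
    abs_diff = [abs(x - y) for x, y in zip(a, b)]
    return product + abs_diff


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with plain batch gradient descent."""
    w, b, n = [0.0] * len(features[0]), 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def relevance_probability(opening_post, reply, embeddings, w, b):
    """Estimated probability that the reply is related to the opening post."""
    x = compose(opening_post, reply, embeddings)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


if __name__ == "__main__":
    X = [compose(o, r, TOY_EMBEDDINGS) for o, r, _ in TRAIN_PAIRS]
    y = [label for _, _, label in TRAIN_PAIRS]
    w, b = train_logreg(X, y)
    print(relevance_probability("kitten purr", "cats", TOY_EMBEDDINGS, w, b))
    print(relevance_probability("driver", "purr", TOY_EMBEDDINGS, w, b))
```

Swapping `TOY_EMBEDDINGS` for vectors from different models, while keeping the composition and classifier fixed, is one way to see how the choice of word representation alone affects the result.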

Research target: Philology and Linguistics; Computer Science

Priority areas: IT and mathematics

A Speech Signal Transformation Method for Improving Speech Intelligibility

Savchenko L., Савченко В. В., Радиотехника и электроника 2025 Vol. 70 No. 8 P. 753–760

The problem of improving speech intelligibility in voice communication systems is considered. The acute issue of speaker recognition when applying known methods for solving this problem is highlighted. To overcome the specified problem, a new method for transforming the speech signal based on an autoregressive model of the vocal tract and the principle of frequency-selective ...

Added: January 29, 2026

Specification Tests for Jump-Diffusion Models Based on the Characteristic Function

Belomestny D., Grobler G. L., Meintanis S. G. et al., International Statistical Review 2026 P. 1–31

Goodness-of-fit tests are suggested for several popular jump-diffusion processes. The suggested test statistics utilise the marginal characteristic function of the model and its L2-type discrepancy from an empirical counterpart. Model parameters are estimated either by minimising the aforementioned L2-type discrepancy or by maximum likelihood. A hybrid estimation method that uses moment estimation is also proposed ...

Added: January 29, 2026

Applying AI Technologies in Teaching Students within the Course "Academic Writing in English"

Gabrielova E., Магия ИННО 2025 Vol. 7 No. 1 P. 165–172

Artificial intelligence (AI) technologies are rapidly developing and are being widely applied in various fields, including education. The use of AI carries certain risks; however, one cannot completely reject it in student education. The article presents the experience of using AI in teaching English to 34 fourth-year students and 26 post-graduate students within the discipline ...

Added: January 29, 2026

Explorations in Applied Ethnolinguistics: Words, Cultures, and Global Perspectives

Palgrave Macmillan, 2025.

This volume contributes to the growing body of cutting-edge research into the Natural Semantic Metalanguage (NSM) approach in linguistics. It explores the broad range of possible applications enabled by the NSM approach, from linguistic studies of semantics and culture to cross-cultural studies, psychology and childhood education. The volume builds on previous studies, bringing a diversity ...

Added: January 28, 2026

The Epic of Gilgamesh. Translated by Nikolay Gumilev. Foreword by E. Markina. Introduction by V. Shileiko.

Markina E., Манн, Иванов и Фербер, 2025.

Publisher's annotation: The Epic of Gilgamesh is the oldest monument of world literature, which has come down to us from the depths of the Sumerian and Akkadian civilizations. The poem tells of the adventures of the mighty king of the city of Uruk and his friend Enkidu. It is a story of strength and friendship, pride and humility, the fear of death and the thirst for immortality. The poem is published in the translation by the Acmeist poet Nikolay Gumilev, with an explanatory essay by the Assyriologist and the poet's contemporary Vladimir Shileiko, ...

Added: January 28, 2026

An Analysis of Sequential Patterns in Datasets for Evaluation of Sequential Recommendations

Klenitskiy A., Володкевич А. А., Pembek A. et al., ACM Transactions on Recommender Systems 2026

Sequential recommender systems are an important and in-demand area of research. These systems aim to use the order of interactions in a user’s history to predict future interactions. The premise is that the order of interactions and sequential patterns play an essential role. Therefore, it is crucial to use datasets that exhibit a sequential structure ...

Added: January 28, 2026

Semi-fake indexicals in Russian

Тискин Д. Б., Типология морфосинтаксических параметров 2025 Vol. 8 No. 1 P. 112–129

There are several rival theories of fake indexicals, i.e. bound indexicals (prominently pronouns) whose φ-features do not semantically contribute to focus alternatives (e.g. Only Mary did her homework, John didn’t do his). According to Minimal Pronoun theories (such as Kratzer’s or Wurmbrand’s), bound pronouns are Merged without φ-features and acquire them under binding via agreement-like ...

Added: January 26, 2026

Some Modifications to I. Bassi's Theory of Bound Uses of Indexical Expressions

Тискин Д. Б., Типология морфосинтаксических параметров 2024 Vol. 7 No. 1 P. 107–123

Fake indexicals (FIs), or bound-variable uses of e.g. 1st- and 2nd-person pronouns, have been analysed by Bassi (2021) as arising from a post-syntactic process of inspecting the features of the referent. This leads to a peculiar analysis of the syntax and semantics of relative clauses containing FIs. I argue for a more ...

Added: January 26, 2026

Autoregressive generation strategies for Top-K sequential recommendations

Klenitskiy A., Гусак Д. И., Володкевич А. А. et al., User Modelling and User-Adapted Interaction 2025 No. 35 Article 13

The goal of modern sequential recommender systems is often formulated in terms of next-item prediction. In this paper, we explore the applicability of transformer-based generative models for the Top-K sequential recommendation task, where the goal is to predict items that a user is likely to interact with in the “near future.” This goal aligns with ...

Added: January 26, 2026

The Art of (Not So) Simple Legal Writing. A Textbook

Knutov A., Chaplinskiy A., Мищенко П. А. et al., Moscow: Проспект, 2026.

The textbook offers recommendations on the style of legal writing that help make it clearer to readers. The first chapter systematizes the accumulated knowledge about the general stylistic features of the language of law and its place in the speech system of the Russian language. The subsequent chapters are devoted to particular kinds of legal documents: the language of statutes, of procedural documents, of contracts, and of legal analytical documents. ...

Added: January 26, 2026

Marchenko–Pastur Law for Spectra of Random Weighted Bipartite Graphs

Nadutkina A., Tikhomirov A., Timushev D., Siberian Advances in Mathematics 2025 Vol. 34 P. 146–153

We study the spectra of random weighted bipartite graphs. We establish that, under specific assumptions on the edge probabilities, the symmetrized empirical spectral distribution function of the graph’s adjacency matrix converges to the symmetrized Marchenko-Pastur distribution function. ...

Added: January 26, 2026

From the Correspondence of E. A. Millior with Ya. M. Borovsky (1946–1960)

Ermakova L., Вестник Удмуртского университета. Серия История и филология 2025 Vol. 35 No. 6 P. 1403–1422

The article publishes and analyzes the correspondence between the historian of antiquity Elena A. Millior (1900–1978) and the classical philologist Yakov M. Borovsky (1896–1994), covering the years 1946–1960 and preserved in the archives of the Institute of Russian Literature (Pushkin House) of the Russian Academy of Sciences and the Bibliotheca Classica Petropolitana in St. Petersburg. ...

Added: January 26, 2026

The Works of D. N. Mamin-Sibiryak and the Modern World

Moscow, Yekaterinburg: Кабинетный ученый, 2024.

The monograph examines the work of D. N. Mamin-Sibiryak, a classic of Ural and all-Russian literature of the 19th century. It investigates and describes various aspects of his artistic world: axiological and ethical issues of both universal and national character, questions of geo- and ethnopoetics, features of the narrative organization of the texts and of the writer's literary language, Mamin's genealogy, and applied aspects of his work, including the presentation of the writer's legacy to a contemporary audience. The edition includes an index of Mamin-Sibiryak's works. The book is intended for ...

Added: January 26, 2026

Hegel's "Philosophy of Right" and the Kotzebue Affair: The Cultural and Political Context

Lagutina I., Философические письма. Русско-европейский диалог 2025 Vol. 8 No. 4 P. 165–201

This article examines the assassination of the playwright August von Kotzebue by the theology student K. L. Sand as an event reflecting the ideological and philosophical tensions of early nineteenth-century Germany. It analyzes G. W. F. Hegel’s response to this historical episode in the context of his “Philosophy of Right”, which criticizes ethical and religious ...

Added: January 25, 2026

Conceptual Knowledge Structures First International Joint Conference, CONCEPTS 2024, Cádiz, Spain, September 9–13, 2024, Proceedings

Obiedkov S., Switzerland: Springer, 2024.

This book constitutes the proceedings of the First International Joint Conference on Conceptual Knowledge Structures, CONCEPTS 2024, which took place in Cádiz, Spain, during September 9-13, 2024. The conference is an amalgamation of the 18th International Conference on Formal Concept Analysis (ICFCA); the 17th International Conference on Concept Lattices and Their Applications (CLA); and the 28th ...

Added: January 23, 2026

Cooperative games with fuzzy characteristic functions on concept lattices

Kemgne M. W., Njionou B. B., Ignatov D. I. et al., International Journal of Approximate Reasoning 2025 Vol. 186 P. 1–18

This paper introduces cooperative games with transferable utilities and fuzzy characteristic functions on concept lattices. While previous works have independently addressed games with fuzzy payoffs and games restricted to structured coalition systems such as lattices, our approach combines both perspectives. We consider cooperative settings where coalition formation is constrained by a concept lattice structure, and ...

Added: January 23, 2026

Chinese: Second Foreign Language: Grade 5: Textbook (8th ed.)

Sizova A., Чэнь Ф., Чжу Ч., Moscow: Просвещение, 2025.

The textbook "Chinese. Second Foreign Language. Grade 5" from the series "Time to Learn Chinese!" was created jointly with the publisher People's Education Press (People's Republic of China) and is intended for students of general education institutions who begin studying Chinese as a second foreign language in grade 5. This textbook was prepared in accordance with the requirements of the Federal State Educational Standard for Basic General Education, approved by Order of the Ministry of Education of the Russian Federation No. ...

Added: January 23, 2026

Run time dynamic digital twins and dynamic digital twins networks

Vodyaho A., Delhibabu R., Ignatov D. I. et al., Future Generation Computer Systems 2025 Vol. 172 P. 1–18

Digital twins are widely used for building various types of cyber–physical systems. There are a huge number of publications devoted to the use of digital twins in production systems. Much less attention is paid to the issues of building runtime digital twins. The article describes an approach to building complex distributed cyber–physical systems with a ...

Added: January 23, 2026

Iterative Ricci-Foster Curvature Flow with GMM-Based Edge Pruning: A Novel Approach to Community Detection

Sorokin K., Beketov M., Онучин А. et al., / arxiv.org. Series cs.SI "Social and Information Networks". 2025.

Community detection in complex networks is a fundamental problem, open to new approaches in various scientific settings. We introduce a novel community detection method, based on Ricci flow on graphs. Our technique iteratively updates edge weights (their metric lengths) according to their (combinatorial) Foster version of Ricci curvature computed from effective resistance distance between the ...

Added: January 15, 2026

Implementing Transport Coding in OMNeT++ for Message Delay Reduction

Petrovanov I., Sergeev A., / Series Computer Science "arxiv.org". 2025. No. 2512.18332.

Transport coding reduces message delay in packet-switched networks by introducing controlled redundancy at the transport layer: original packets are encoded into coded packets, and the message is reconstructed after the first successful deliveries, effectively shifting latency from the maximum packet delay to the -th order statistic. We present a concise, reproducible discrete-event implementation of transport coding in OMNeT++, including ...

Added: December 24, 2025

Hessian-based lightweight neural network for brain vessel segmentation on a minimal training dataset

Меньшиков И. А., Бернадотт А. К., Elvimov N. S., / Series arXiv "Statistical mechanics". 2025.

Accurate segmentation of blood vessels in brain magnetic resonance angiography (MRA) is essential for successful surgical procedures, such as aneurysm repair or bypass surgery. Currently, annotation is primarily performed through manual segmentation or classical methods, such as the Frangi filter, which often lack sufficient accuracy. Neural networks have emerged as powerful tools for medical image ...

Added: December 1, 2025

Determining the boundary of dynamical chaos in the generalized Chirikov map via machine learning

Чернышов Д. П., Satanin A., Shchur L., / Series arXiv "math". 2025.

We investigate the boundary separating regular and chaotic dynamics in the generalized Chirikov map, an extension of the standard map with phase-shifted secondary kicks. Lyapunov maps were computed across the parameter space (K,K(α, τ)) and used to train a convolutional neural network (ResNet18) for binary classification of dynamical regimes. The model reproduces the known critical ...

Added: November 21, 2025

An Efficient Algorithm for Stock Market Trading: A Retrospective Analysis Based on S&P-500 Data.

Rubchinskiy A., Chubarova D., / Series WP7 "Mathematical Methods for Decision Analysis in Economics, Business and Politics". 2025. No. WP7/2025/01.

The article examines one of the most famous examples of socio-economic systems, characterized by significant uncertainty – the S&P-500 stock market, where shares of 500 largest US companies are traded. No assumptions are made about the probabilistic characteristics of the stock market. A flexible algorithm for daily trading has been developed, based on both known fixed data ...

Added: November 9, 2025

Building a Clean Bartangi Language Corpus and Training Word Embeddings for Low-Resource Language Modeling

Shumen: INCOMA Ltd, 2025.

This paper introduces a rule-based lemmatization and word embedding pipeline for the endangered Bartangi language, part of the Pamiri language group. The system combines a manually constructed lemma dictionary with morphological suffix rules to improve linguistic consistency in low-resource settings. The results demonstrate enhanced lemmatization accuracy and higher-quality embeddings for downstream NLP tasks. The work ...

Added: October 20, 2025