Multilingual Facilitation
Helsinki: University of Helsinki, 2021.
Edited by: M. Hämäläinen, N. Partanen, K. Alnajjar
This is the Festschrift of Dr. Jack Rueter. This book presents peer-reviewed scientific work from Dr. Rueter's colleagues on the latest advances in natural language processing, digital resources and endangered languages, covering a variety of languages such as historical English, Chukchi, Mansi, Erzya, Komi, Finnish, Apurina, Sign Languages, Sami languages, and Japanese. Most of the papers present work on endangered languages or on domains with a limited number of resources available for NLP. This book collects original and insightful papers from well-established researchers in NLP, linguistics, philology and digital humanities.
Chapters
Swanson D., in: Multilingual Facilitation. Helsinki: University of Helsinki, 2021. P. 133–146.
This paper presents lexd, a lexicon compiler for languages with non-suffixational morphology, which is intended to be faster and easier to use than existing solutions while also being compatible with other tools. We perform a case study on Chukchi, comparing against a hand-optimised analyser written in lexc, and find that while lexd is easier to use, ...
Added: April 20, 2021
Biryukova K., Chelnokova D., Erkenova J. et al., Communications in Computer and Information Science 2024 Vol. 2364 CCIS P. 109–121
Added: February 25, 2026
Chepikov I., Karpov I., in: 26th International Conference, AIED 2025, Palermo, Italy, July 22–26, 2025, Proceedings, Part I. Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium, Blue Sky, and WideAIED. Springer, 2025. P. 352–358.
Modern LLMs such as BERT, ChatGPT, and DeepSeek have shown great potential in solving various tasks, including text classification, text generation, and document analysis and summarization. In this paper, we show that these models are close to classical ML approaches based on decision trees not only in text processing, but also in processing classical tabular data ...
Added: September 4, 2025
Morozov D., Garipov T., Lyashevskaya O. et al., Journal of Language and Education 2024 Vol. 10 No. 4 P. 71–84
Introduction: Numerous algorithms have been proposed for the task of automatic morpheme segmentation of Russian words. Due to the differences in task formulation and datasets utilized, comparing the quality of these algorithms is challenging. It is unclear whether the errors in the models are due to the ineffectiveness of algorithms themselves or to errors and inconsistencies ...
Added: January 7, 2025
Russo M., Pavone P., Meissner D. et al., Quality and Quantity 2025 Vol. 59 No. Suppl 1 P. S343–S367
In OECD countries, Science, Technology and Innovation (STI) policies were seen as key aspects of coping with the Covid-19 pandemic. Now that the pandemic is over, identifying which policy mix portfolios characterised countries in terms of their non-Covid-19 related and Covid-19 specific STI policies fills a knowledge gap on changes in STI policies induced by ...
Added: September 27, 2024
Parameter-Efficient Tuning of Transformer Models for Anglicism Detection and Substitution in Russian
Daniil Lukichev, Kryanina Darya, Anastasia Bystrova et al., in: Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог». Issue 22. [s.n.], 2023. P. 295–306.
Added: April 25, 2024
Sergei O. Kuznetsov, Parakal E. G., Lecture Notes in Networks and Systems 2023 Vol. 776 P. 423–434
Inherently explainable Machine Learning (ML) models are able to provide explanations for their predictions by virtue of their construction. The explanations of an ML model are more comprehensible if they are expressed in terms of its input features. Our paper proposes an inherently explainable pipeline for document classification using pattern structures and Abstract Meaning Representation ...
Added: February 5, 2024
Switzerland: Springer, 2024.
This book constitutes revised papers from the International Workshops held at the 21st International Conference on Business Process Management, BPM 2023, in Utrecht, The Netherlands, during September 2023.
Papers from the following workshops are included:
• 7th International Workshop on Artificial Intelligence for Business Process Management (AI4BPM 2023)
• 7th International Workshop on Business Processes Meet Internet-of-Things (BP-Meet-IoT ...
Added: January 17, 2024
Северина Е. М., Ларионова М. Ч., Litera 2023 No. 10 P. 211–222
The article considers a model for preparing machine-readable (semantic) markup of texts for the Chekhov Digital project, using the example of a philological interpretation of individual significant elements of A. P. Chekhov's story "Death of an Official" and the explicit presentation of this information according to the Text Encoding Initiative (TEI/XML) standards for digital publication. Based ...
Added: January 12, 2024
Кругликова В. Г., in: Анализ речи: теоретические и прикладные аспекты: сборник научных статей. [s.n.], 2023.
The article presents a comparative analysis of various language models used to generate texts and evaluates their effectiveness for the task of generating conversational speech. The comparison covers models such as GPT-3, BERT, and LSTM. This study is part of a project to develop a system for generating dialogues in Russian. The ...
Added: December 10, 2023
Baklanova V., Kurkin A., Teplova T., China Finance Review International 2024 Vol. 14 No. 3 P. 522–548
Purpose – The primary objective of this research is to provide a precise interpretation of the constructed machine learning model and produce definitive summaries that can evaluate the influence of investor sentiment on the overall sales of non-fungible token (NFT) assets. To achieve this objective, the NFT hype index was constructed as well as several approaches of ...
Added: December 10, 2023
Kirina M., Человек: образ и сущность. Гуманитарные аспекты 2024 No. 2(58) P. 176–204
The article focuses on the application of opinion mining techniques to evaluate user experience on the Hyperskill educational platform, using Python, Java, and Kotlin programming projects as the basis of analysis. The study utilizes sentiment analysis and keyword extraction methods to gauge users' attitudes towards the platform, learning process, and topics covered. To achieve this, ...
Added: December 9, 2023
Bolshakova E. I., Семак В. В., Интеллектуальные системы. Теория и приложения 2021 Vol. 25 No. 4 P. 239–242
An approach to automatic extraction of terms from an individual scientific text is reported, which combines known methods: linguistic patterns, statistical terminological measures, and graph ranking methods. The combined methods and the stages of term extraction, selection, and ranking are described, which are implemented for processing documents in Russian. The results of experiments on extracting ...
Added: November 23, 2023
Galitsky B., Ilvovsky D., Goncharova E., in: Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог». Issue 22. [s.n.], 2023.
We extend the concept of a discourse tree (DT) in the discourse representation of text towards data of various forms and natures. The communicative DT, which incorporates speech act theory, the extended DT, which ascends to the level of multiple documents, and the entity DT, which tracks how discourse covers various entities, were defined previously in computational linguistics; we now proceed ...
Added: November 10, 2023
I. K. Kusakin, Fedorets O. V., A. Y. Romanov, Scientific and Technical Information Processing 2023 Vol. 50 No. 3 P. 176–183
This paper discusses modern approaches to natural language processing and the application of machine learning models to the task of classifying short scientific texts in Russian. This study is devoted to the analysis of methods for vectorization of textual information, selection of a model for scientific paper classification, and training of the BERT language model ...
Added: November 4, 2023
Parameter-Efficient Tuning of Transformer Models for Anglicism Detection and Substitution in Russian
Daniil Lukichev, Kryanina D., Anastasia Bystrova et al., in: Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог». Issue 22. [s.n.], 2023. P. 295–306.
This article is devoted to the problem of Anglicisms in texts in Russian: the tasks of detection and automatic rewriting of the text with the substitution of Anglicisms by their Russian-language equivalents. Within the framework of the study, we present a parallel corpus of Anglicisms and models that identify Anglicisms in the text and replace ...
Added: September 22, 2023
Parakal E. G., Kuznetsov S., in: Proceedings of the 10th International Workshop "What can FCA do for Artificial Intelligence?" Vol. 3233. CEUR Workshop Proceedings, 2022. Ch. 2 P. 9–22.
Explanations for the predictions made by Machine Learning (ML) models are best framed in terms of abstract, high-level concepts that are easily comprehensible to human beings. The use of such concepts constitutes a subfield of interpretability methods known as concept-based explanations. This work uses concept-based explanations to build an intrinsically interpretable document classifier using a combination of Formal Concept ...
Added: May 17, 2023
Alimova I., Tutubalina E., Nikolenko S. I., IEEE Access 2022 Vol. 10 P. 1432–1439
Relation extraction (RE) aims to extract relational facts from plain text, which is essential to the biomedical research field with the rapid growth of biomedical literature and generally large volumes of biomedicine-related text coming from various sources. Numerous annotated corpora and state-of-the-art models have been introduced in the past five years. However, there are no ...
Added: April 10, 2023
Sakhovskiy A., Tutubalina E., Journal of Biomedical Informatics 2022 Vol. 135 Article 104182
In this paper, we focus on the classification of tweets as sources of potential signals for adverse drug effects (ADEs) or drug reactions (ADRs). Following the intuition that text and drug structure representations are complementary, we introduce a multimodal model with two components. These components are state-of-the-art BERT-based models for language understanding and molecular property ...
Added: April 10, 2023