Building a Dictionary-Based Lemmatizer for Old Irish
P. 12-17.
Dereza O.
This paper explores the problem of developing NLP tools for morphologically rich and orthographically inconsistent classical languages. It is a case study of building a lemmatizer for Old Irish using only a dictionary and an unlabeled corpus as sources of data. At the current stage, the lemmatizer achieves an average recall of 76.31% on a corpus of ca. 100,000 tokens and can predict lemmas for out-of-vocabulary words.
In book
Vol. 6: Celtic Language Technology Workshop. , P. : [б.и.], 2016
Lyashevskaya O., , in : Computational Linguistics and Intellectual Technologies. Issue 18.: M. : Russian State University for the Humanities, 2019. P. 422-434.
The paper discusses efforts to create a morphological standard for the Middle Russian corpus, which is part of the historical collection of the Russian National Corpus (RNC). To meet the needs of different categories of corpus researchers as well as NLP developers, we consider two styles of morphological annotation (RNC schema and ...
Added: June 12, 2019
Chepovskiy A., М. : Национальный открытый университет «ИНТУИТ», 2015
The monograph examines various mathematical models for solving practical problems in natural language text processing. It proposes solutions to problems that arise in organizing the indexing and subsequent retrieval of data. Methods of computational linguistics are applied to applied research. Intended for developers of information systems and specialists in computational linguistics. ...
Added: May 23, 2015
Lyashevskaya O., Afanasev I., Stefan Rebrikov et al., , in : Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной международной конференции «Диалог». Вып. 22.: [б.и.], 2023. P. 307-318.
An updated annotation of the Main, Media, and some other corpora of the Russian National Corpus (RNC) features part-of-speech and other morphological information, lemmas, dependency structures, and constituency types. Transformer-based architectures are used to resolve homonymy in context according to a schema based on the manually disambiguated subcorpus of the Main corpus (morphology ...
Added: September 15, 2023
Лаврентьев А. М., Смирнов И. В., Соловьев Ф. Н. et al., Системы высокой доступности 2018 Vol. 14 No. 3 P. 76-81
An extension of the TXM platform for case analysis is considered. It is proposed to extract pseudo-words from the words of a text using the method of structural schemes, and to identify nominal groups in the structure of the text, for selecting subcorpora by parameters. The results of the ...
Added: September 20, 2018
Durandin O. V., Strebkov D. Y., Hilal N. R., , in : Computational Linguistics and Intellectual Technologies: Proceedings of the Annual International Conference “Dialogue” (2016). : М. : Изд-во РГГУ, 2016. P. 1-13.
The paper presents work on automatic Arabic dialect classification and proposes a machine learning classification method whose training dataset consists of two corpora. The first is a small corpus of manually dialect-annotated instances. The second contains a large number of instances collected from the Web automatically using word-marks, the most unique and frequent dialectal words ...
Added: January 18, 2017
M. : Russian State University for the Humanities, 2015
Added: April 28, 2015
Smetanin S., IEEE Access 2020 Vol. 8 P. 110693-110719
Sentiment analysis has become a powerful tool in processing and analysing expressed opinions on a large scale. While the application of sentiment analysis to English-language content has been widely examined, its application to Russian-language content remains less well studied. In this survey, we comprehensively reviewed the applications of sentiment analysis of Russian-language content and ...
Added: June 24, 2020
Toldova S., Lyashevskaya O., Вопросы языкознания 2014 No. 1 P. 120-145
This paper is an overview of current issues and tendencies in computational linguistics. The overview is based on the materials of the conference on computational linguistics COLING’2012. Modern approaches to traditional NLP domains such as POS tagging, syntactic parsing, and machine translation are discussed. The highlights of automated information extraction, such as fact extraction, ...
Added: October 15, 2013
Smetanin S., PeerJ Computer Science 2022 Vol. 8 Article e1164
Prior research suggests that weather conditions may substantively impact people’s emotional state and mood. In Russia, the relationship between weather and mood has been studied for certain regions—usually with severe or extreme climatic and weather conditions—but with quite limited samples of up to 1,000 people. Over the past decade, partly due to the proliferation of ...
Added: November 21, 2022
Кругликова В. Г., in : Анализ речи: теоретические и прикладные аспекты: сборник научных статей. : [б.и.], 2023.
The article presents a comparative analysis of various language models used to generate texts and evaluates their effectiveness for the task of generating conversational speech. The comparison covers models such as GPT-3, BERT, and LSTM. This study is part of a project to develop a system for generating dialogues in Russian. The ...
Added: December 10, 2023
Skorinkin D.A., Budnikov E. A., Stepanova M. E. et al., Компьютерная лингвистика и интеллектуальные технологии 2016 No. 15 P. 721-733
This paper presents a rule-based approach to the Information Extraction (IE) task within the FactRuEval-2016 competition. Our system is based on ABBYY Compreno technology. The technology uses the results of deep syntactic-semantic analysis, which significantly reduces the number of necessary rules and makes them concise. The evaluation was conducted on the FactRuEval dataset. FactRuEval is ...
Added: August 28, 2016
M. : ., 2014
The Conference Proceedings contain 64 papers from the international conference on Computational Linguistics and Intellectual Technologies “Dialogue 2014”, representing a wide range of theoretical and applied research in the areas of natural language description, language process description, and the creation of applied computer-linguistic technologies.
For specialists in the field of theoretical and applied linguistics and intellectual technologies. ...
Added: July 7, 2014
М. : Издательский центр «Российский государственный гуманитарный университет», 2019
The book includes 64 papers submitted to the International conference in computer linguistics and intellectual technologies Dialogue 2019 and presents a broad spectrum of theoretical and applied research of natural language description, language simulation, and creation of applied computer technologies. ...
Added: October 16, 2019
Kirina M., Человек: образ и сущность. Гуманитарные аспекты 2023
The article focuses on the application of opinion mining techniques to evaluate user experience on the Hyperskill educational platform, using Python, Java, and Kotlin programming projects as the basis of analysis. The study utilizes sentiment analysis and keyword extraction methods to gauge users' attitudes towards the platform, learning process, and topics covered. To achieve this, ...
Added: December 9, 2023
Wales : University of Wales Centre for Advanced Welsh and Celtic Studies, 2015
Added: October 5, 2017
Денис Турдаков, Астраханцев Н. А., Недумов Я. Р. et al., Труды Института системного программирования РАН 2014 Vol. 26 P. 421-438
The paper presents a framework for fast text analytics developed during the Texterra project. Texterra is a technology for multilingual text mining based on novel text processing methods that exploit knowledge extracted from user-generated content. It delivers a fast, scalable solution for text mining without expensive customization. Depending on the use case, Texterra could be utilized ...
Added: November 6, 2017
Селегей В., -, 2020
The collection includes 60 papers from the international conference on computational linguistics and intellectual technologies “Dialogue 2020”, representing a broad spectrum of theoretical and applied research in natural language description, the modelling of language processes, and the creation of practically applicable computational linguistic technologies.
For specialists in the field of theoretical and applied linguistics and intellectual technologies. ...
Added: June 21, 2020
M. : Russian State University for the Humanities, 2019
The book includes 64 papers submitted to the International conference in computer linguistics and intellectual technologies Dialogue 2019 and presents a broad spectrum of theoretical and applied research of natural language description, language simulation, and creation of applied computer technologies. ...
Added: October 16, 2019
Kuzmina A., Лифшиц М. А., Kostenko V., Современная зарубежная психология 2022 Vol. 11 No. 1 P. 104-115
The use of modern methods of computational linguistics in psychological research opens up new possibilities both for the study of personality and language and for the development of psychodiagnostic methods. This article discusses the main possible directions of such research, as well as non-obvious nuances that are important in their planning. Maximum use of the ...
Added: April 18, 2022
Klyshinskiy E., Жеребцова Ю., Чижик А., Системный администратор 2019 No. 10 P. 82-91
The field of dialogue systems and conversational agents is currently one of the most rapidly growing research areas in artificial intelligence applications. Business and industry are showing increasing interest in implementing intelligent conversational agents in their products. Many recent studies have tended to focus on the possibility of developing task-oriented systems which are able to have long ...
Added: October 26, 2019
Afanasev I., / НИУ ВШЭ. Series WP BRP "Linguistics". 2021.
The article presents a lemmatiser developed specifically for Old Church Slavonic (OCS). The introduction highlights the lack of lemmatisers that can handle different OCS datasets. The review gives a short description of previous attempts and current trends in lemmatisation. The lemmatiser is hybrid-based and uses the advantages ...
Added: December 28, 2021
Фирсанова В. И., International Journal of Open Information Technologies 2021 Vol. 9 No. 12 P. 53-59
The paper presents a study on the evaluation of question answering systems. The purpose of the study is to determine whether human evaluation is indeed necessary to qualitatively measure the performance of a sociomedical dialogue system. The study is based on the data from several natural language processing experiments conducted with a question answering dataset for inclusion of people with autism spectrum disorder and state-of-the-art ...
Added: September 25, 2023
М. : Изд-во РГГУ, 2015
The collection contains the proceedings of the 21st International Conference on Computational Linguistics. ...
Added: May 20, 2015