Comparative analysis of classification methods for text in UDC code generation problem for scientific articles
P. 359-363.
This research studies the applicability of the most relevant modern classification methods to the automatic generation of Universal Decimal Classification (UDC) codes for arbitrary scientific articles. The following methods are considered as classifiers: artificial neural network, logistic regression, naive Bayes classifier and metrical
In book
M. : Association of graduates and employees of AFEA named after prof. Zhukovsky, 2017
I. K. Kusakin, Fedorets O. V., A. Y. Romanov, Scientific and Technical Information Processing 2023 Vol. 50 No. 3 P. 176-183
This paper discusses modern approaches to natural language processing and the application of machine learning models to the task of classifying short scientific texts in Russian. This study is devoted to the analysis of methods for vectorization of textual information, selection of a model for scientific paper classification, and training of linguistic model BERT ...
Added: November 4, 2023
Pimonova E., Durandin O., Malafeev A., in : Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832.: Cham : Springer, 2019. P. 193-204.
This work tackles the problem of modeling author style in Russian. In particular, we solve the task of authorship attribution using a collected dataset of 30 authors and 1506 texts written between the 18th and 21st centuries. We apply various approaches to solving the attribution problem: Random Forest, Logistic Regression, SVM Classifier. In terms ...
Added: November 7, 2019
Romanov A., Lomotin K. E., Kozlova E. S., Information Technologies 2017 Vol. 23 No. 6 P. 418-423
The paper examines the applicability of modern machine learning methods to the automatic generation of UDC codes for scientific articles. Artificial neural networks, logistic regression and boosting are considered as classifiers. Graph algorithms and a prototype software module for generating UDC codes are designed. ...
Added: July 30, 2017
Berlin : Association for Computational Linguistics, 2016
The 2016 Conference on Computational Natural Language Learning is the twentieth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL 2016 will be held on August 11-12, 2016, and is co-located with the 54th annual meeting of the Association for Computational Linguistics (ACL) in Berlin, ...
Added: November 12, 2016
Ekaterinburg : CEUR Workshop Proceedings, 2014
AIST'2014 is an international data science conference on Analysis of Images, Social Networks, and Texts. Traditionally, the conference is held annually in Yekaterinburg, Russia. The conference is intended for computer scientists and practitioners whose research interests involve Internet mathematics and other related fields of data science.
LIST OF TOPICS (NON EXHAUSTIVE)
Applications of Data Mining and Machine ...
Added: August 28, 2014
Kusakin I. K., Tsurupa A. M., Almakaev A. V. et al., in : NTI-2022. Scientific Information in the Modern World: Global Challenges and National Priorities : proceedings of the 10th scientific conference with international participation dedicated to the 70th anniversary of VINITI RAS, Moscow, October 25–26, 2022. : M. : VINITI RAS, 2022. P. 103-109.
This work is devoted to the study of approaches for training BERT-based classifiers of scientific articles to implement the application with the adoption of the best models for use in the infrastructure of the VINITI RAS. For this purpose, the BERT linguistic model was trained on a specialized corpus of scientific texts for subsequent use ...
Added: January 31, 2023
S.D. Kuznetsov, D.Yu. Turdakov, Astrakhantsev N. A. et al., Programming and Computer Software 2014 Vol. 40 No. 5 P. 288-295
A framework for fast text analysis, which is developed as a part of the Texterra project, is described. Texterra provides a scalable solution for the fast text processing on the basis of novel methods that exploit knowledge extracted from the Web and text documents. For the developed tools, details of the project, use cases, and ...
Added: November 26, 2017
Sergey Smetanin, Mathematics 2022 Vol. 10 No. 16 Article 2947
Policymakers and researchers worldwide are interested in measuring the subjective well-being (SWB) of populations. In recent years, new approaches to measuring SWB have begun to appear, using digital traces as the main source of information, and show potential to overcome the shortcomings of traditional survey-based methods. In this paper, we propose the formal model for ...
Added: August 15, 2022
Romanov A., Lomotin K.E., Kozlova E.S. et al., in : 2016 International Siberian Conference on Control and Communications (SIBCON). Proceedings. : M. : HSE, 2016. Ch. 543fu4t.
This work presents an implementation of automatic classification of scientific articles according to the Universal Decimal Classification. The efficiency of applying neural network technologies to this task is studied, and an optimal neural network structure and parameters are proposed ...
Added: June 11, 2016
A. V. Belov, E. A. Egorova, Bulletin D. Serikbayev East Kazakhstan Technical University 2023 No. 4 P. 92-102
When conducting scientific and technical expertise, it is necessary to analyze the texts of reports on scientific research work. The analysis is carried out in order to determine whether the research being conducted belongs to the class of scientific research and development work in the field of IT. This article discusses the tasks of binary ...
Added: March 9, 2024
Toldova S., Lyashevskaya O., Voprosy Jazykoznanija 2014 No. 1 P. 120-145
This paper is an overview of the current issues and tendencies in Computational linguistics. The overview is based on the materials of the conference on computational linguistics COLING’2012. The modern approaches to the traditional NLP domains such as pos-tagging, syntactic parsing, machine translation are discussed. The highlights of automated information extraction, such as fact extraction, ...
Added: October 15, 2013
Fenogenova A., Karpov I., Kazorin V., in : Proceedings of the Artificial Intelligence and Natural Language AINL FRUCT 2016 Conference, Saint-Petersburg, Russia, 10-12 November 2016. : FRUCT Oy, 2016. P. 31-36.
With the process of globalization, the number of borrowings from English has rapidly increased in languages all over the world. In systems of automatic speech recognition, spell-checking, tagging and other tasks in the field of natural language processing, loan words frequently cause problems and should be treated separately. In this paper we present a ...
Added: October 19, 2016
Dmitry Romanov, Kazantsev N., Edgeeva E., in : Business Process Management: Blockchain and Central and Eastern Europe Forum. BPM 2019. Vol. 361.: Springer, 2019. P. 337-341.
This paper studies ‘the order effect’ in decision making based on classification results of 120 000 citizen claims to Moscow Government. We use machine learning methods and derive that with 60% probability the first out of two consequent claims is prioritized. We conclude that this impact must be considered whilst developing artificial intelligence units. ...
Added: October 26, 2020
Alimova I., Tutubalina E., Journal of Biomedical Informatics 2020 Vol. 103 P. 1-9
Relation extraction aims to discover relational facts about entity mentions from plain texts. In this work, we focus on clinical relation extraction; namely, given a medical record with mentions of drugs and their attributes, we identify relations between these entities. We propose a machine learning model with a novel set of knowledge-based and BioSentVec embedding ...
Added: October 28, 2020
Berlin : Springer, 2014
This book constitutes the proceedings of the Third International Conference on Analysis of Images, Social Networks and Texts, AIST 2014, held in Yekaterinburg, Russia, in April 2014. The 11 full and 10 short papers were carefully reviewed and selected from 74 submissions. They are presented together with 3 short industrial papers, 4 invited papers and ...
Added: November 13, 2014
Malafeev A., Nikolaev K., in : Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Kazan, Russia, July 17–19, 2019, Revised Selected Papers. Communications in Computer and Information Science. Vol. 1086.: Springer, 2020. P. 154-159.
In this paper, a deep learning method study is conducted to solve a new multiclass text classification problem, identifying user interests by text messages. We used an original dataset of almost 90 thousand forum text messages, labeled for ten interests. We experimented with different modern neural network architectures: recurrent and convolutional, as well as simpler ...
Added: November 7, 2019
Gerasimenko Ekaterina, Puzhaeva Svetlana, Zakharova Elena et al., in : Proceedings of Third Workshop "Computational linguistics and language science". Issue 4.: Manchester : EasyChair, 2019. P. 61-69.
In this paper, we address the problem of automatic extraction of discourse formulae. By discourse formulae (DF) we mean a special type of constructions at the discourse level, which have a fixed form and serve as a typical response in the dialogue. Unlike traditional constructions [4, 5, 6], they do not contain variables within the ...
Added: October 31, 2019
Springer, 2021
This book constitutes the proceedings of the 19th Russian Conference on Artificial Intelligence, RCAI 2021, held in Moscow, Russia, in October 2021.
The 19 full papers and 7 short papers presented in this volume were carefully reviewed and selected from 80 submissions. The conference deals with a wide range of topics, categorized into the following topical ...
Added: October 28, 2021
Durandin O., Hilal N., Strebkov D. et al., in : Proceedings of the ISMW-FRUCT 2016. : [s.n.], 2016. P. 90-93.
The paper contains a take on the classification problem variation featuring class noise where each object in the training set is associated with a probability distribution over the class label set instead of a particular class label. That type of task was illustrated on the complex natural language processing problem – automatic Arabic dialect classification. ...
Added: January 17, 2017
Springer, 2021
This book constitutes revised selected papers from the 9th International Conference on Analysis of Images, Social Networks and Texts, AIST 2020, held during October 15-16, 2020. The conference was planned to take place in Moscow, Russia, but changed to an online format due to the COVID-19 pandemic.
The 27 full papers and 4 short papers presented ...
Added: October 7, 2020
Tikhonova M., Elina Telesheva, Mirzoev S. et al., in : 2021 International Conference Engineering and Telecommunication (En&T). : IEEE, 2022. P. 1-6.
Style transfer is an important and rapidly developing area of Natural Language Processing. These days more and more methods and models are proposed that allow us to generate text in a predefined style. In this paper we propose a framework for style transfer of the “Friends” TV series. The trained models are able to mimic one of ...
Added: May 21, 2022
Denis Turdakov, Astrakhantsev N. A., Nedumov Ya. R. et al., Proceedings of the Institute for System Programming of the RAS 2014 Vol. 26 P. 421-438
The paper presents a framework for fast text analytics developed during the Texterra project. Texterra is a technology for multilingual text mining based on novel text processing methods that exploit knowledge extracted from user-generated content. It delivers a fast scalable solution for text mining without expensive customization. Depending on use-cases Texterra could be utilized ...
Added: November 6, 2017
Bartunov S., Kondrashkin D. A., Osokin A. et al., / Arxiv.org. Series arXiv:1502.07257 "Computation and language". 2015.
Recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram as well as most prior work on learning word representations does not take into account word ambiguity and maintain only single representation per word. Although a number of Skip-gram modifications were proposed to ...
Added: November 5, 2015
Smetanin S., IEEE Access 2020 Vol. 8 P. 110693-110719
Sentiment analysis has become a powerful tool in processing and analysing expressed opinions on a large scale. While the application of sentiment analysis to English-language content has been widely examined, its application to the Russian language remains less well studied. In this survey, we comprehensively reviewed the applications of sentiment analysis of Russian-language content and ...
Added: June 24, 2020