Using the Information Theory of Speech Perception to Analyze Speech Quality
P. 264-266.
Karpov N.
In book
Issue 17. Voronezh : Научная книга, 2012
Arkhangelskiy T., Гильмуллин Р. А., Невзорова О. А. et al., Научно-техническая информация. Серия 2: Информационные процессы и системы 2013
The article describes an electronic corpus of the Tatar language, created under the "Corpus Linguistics" fundamental research program of the Presidium of the Russian Academy of Sciences, and the methods the authors used to build it. In particular, it describes the corpus's text composition and genre structure, the authors' decisions on which morphological features to distinguish, automatic morphological annotation of texts using a two-level morphology model and the PC-KIMMO analyzer, and the placement ...
Added: October 25, 2013
Kutuzov A. B., Velldal E., Øvrelid L., in : Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning. Berlin : Association for Computational Linguistics, 2016. P. 115-125.
This paper studies how word embeddings trained on the British National Corpus interact with part of speech boundaries. Our work targets the Universal PoS tag set, which is currently actively being used for annotation of a range of languages. We experiment with training classifiers for predicting PoS tags for words based on their embeddings. The ...
Added: November 12, 2016
Karpov N., in : Satellite Workshops & Doctoral Consortium. Nizhny Novgorod : Higher School of Economics in Nizhny Novgorod, 2012. P. 175-182.
A technology for developing electronic distance-learning courses was created. It is useful for developing foreign-language tutorials. First, it can be integrated into an LMS and used online as a web service. Second, it can be used as a standalone desktop tutorial. A helpful tooltip was added. The tip contains ...
Added: October 6, 2012
Pimonova E., Durandin O., Malafeev A., in : Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832. Cham : Springer, 2019. P. 193-204.
This work tackles the problem of modeling author style in Russian. In particular, we solve the task of authorship attribution using a collected dataset of 1506 texts by 30 authors, written between the 18th and 21st centuries. We apply various approaches to solving the attribution problem: Random Forest, Logistic Regression, SVM Classifier. In terms ...
Added: November 7, 2019
Kutuzov A. B., Козлова О. С., in : Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, 1–4 July 2016). Issue 15. Moscow : RSUH Publishing House, 2016. P. 288-300.
In natural language processing, distributional semantic models are known as an efficient data driven approach to word and text representation, which allows computing meaning directly from large text corpora into word embeddings in a vector space. This paper addresses the role of linguistic preprocessing in enhancing performance of distributional models, and particularly studies pronominal anaphora ...
Added: November 12, 2016
Malafeev A., International Journal of Conceptual Structures and Smart Applications (IJCSSA) 2014 Vol. 2 No. 2 P. 20-35
This article presents an approach to the automatic generation of open cloze exercises based on arbitrary English text. The exercise format is similar to the open cloze test used in Cambridge English certificate exams (FCE, CAE, CPE). The presented method also makes it possible to adjust the difficulty of the resulting exercises to better suit ...
Added: November 29, 2014
Kirill Maslinsky, in : TALN-RECITAL 2014 Workshop TALAf 2014 : Traitement Automatique des Langues Africaines (TALAf 2014: African Language Processing). Marseille : Association pour le Traitement Automatique des Langues, 2014. P. 114-122.
This article provides a brief overview of the Daba software package, created in the course of building corpora for Manding languages. Key software features are motivated by the tasks and problems characteristic of many African languages. The corpus-building model proposed here was initially developed for the Bambara Reference Corpus, which is freely accessible online. ...
Added: March 26, 2015
Sibirtseva V., Karpov N. / HSE Publishing House. Series WP "Working Papers of Humanities". 2012. No. 2012-6.
The paper considers the features of selecting the teaching illustrative material for the theoretical part of a multimedia textbook on Russian as a foreign language, and describes the peculiarities of compiling a set of exercises on the basis of the National Corpus of the Russian Language. The author(s) analysed in detail the difficulties caused by ...
Added: November 8, 2012
Berlin : Association for Computational Linguistics, 2016
The 2016 Conference on Computational Natural Language Learning is the twentieth in the series of annual meetings organized by SIGNLL, the ACL special interest group on natural language learning. CoNLL 2016 will be held on August 11-12, 2016, and is co-located with the 54th annual meeting of the Association for Computational Linguistics (ACL) in Berlin, ...
Added: November 12, 2016
Osaka : [s.n.], 2016
Language resources are increasingly used not only in Language Technology (LT), but also in other subject fields, such as the digital humanities (DH) and in the field of education. Applying LT tools and data for such fields implies new perspectives on these resources regarding domain adaptation, interoperability, technical requirements, documentation, and usability of user interfaces. ...
Added: November 12, 2016
Braslavski P., Karpov Nikolay, Worring M. et al., ACM SIGIR Forum 2014 Vol. 48 No. 2 P. 105-110
The 8th Russian Summer School in Information Retrieval (RuSSIR 2014) was held on August 18-22, 2014 in Nizhniy Novgorod, Russia. The school was co-organized by the National Research University Higher School of Economics and the Russian Information Retrieval Evaluation Seminar (ROMIP) ...
Added: August 22, 2015
Суворова М. И., Кобозева М. В., Toldova S. et al., Искусственный интеллект и принятие решений 2020 No. 1 P. 17-26
The article discusses the importance of automatic script analysis for understanding natural-language texts. It gives a broad overview of methods and approaches to describing and extracting scripts. Theoretical approaches to formalizing scripts are considered. A list of tasks that rely on information about the script structure of a text is provided. Popular approaches to automatically extracting scripts from texts and methods for evaluating their ...
Added: April 22, 2020
Karpov N., in : Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Vancouver : Association for Computational Linguistics, 2017. P. 683-688.
In many areas, such as social science, politics or market research, people need to deal with datasets that shift over time. The distribution drift phenomenon usually appears in the field of sentiment analysis, when the proportions of instances change over time. In this case, the task is to correctly estimate proportions of each sentiment expressed in the ...
Added: November 14, 2017
Karpov N., Babkina T. S., Babkin E., in : CEUR Workshop Proceedings. 2nd International Workshop on Ontologies and Information Systems, WOIS 2014; Lund; Sweden; 22 September 2014 through 24 September 2014. Vol. 1230. Lund : CEUR Workshop Proceedings, 2014. P. 43-54.
The paper proposes a new method for facilitating knowledge exchange by finding relevant university experts to comment on current information events arising in the open environment of a modern economic cluster. This method is based on a new mathematical model of ontology concept matching. We propose to use in the formal core of our method a ...
Added: January 20, 2015
Kutuzov A. B., Kuzmenko E., Marakasova A., in : Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH). Osaka : [s.n.], 2016. P. 26-34.
We present an approach to detect differences in lexical semantics across English language registers, using word embedding models from the distributional semantics paradigm. Models trained on register-specific subcorpora of the BNC corpus are employed to compare lists of nearest associates for particular words and draw conclusions about their semantic shifts depending on the register in which they ...
Added: November 12, 2016
Marseille : European Language Resources Association (ELRA), 2022
The proceedings are organised on the basis of the 22 Tracks of the Conference on Language Resources and Evaluation (LREC) held in Marseille, France, from 20 to 25 June 2022. Major topics include corpora and annotation (including tools, systems, treebanks), information extraction and information retrieval (including NER, QA, text mining, document classification, text categorisation), applications involving LRs and evaluation (including ...
Added: February 22, 2023
Kirina M., Человек: образ и сущность. Гуманитарные аспекты 2023
The article focuses on the application of opinion mining techniques to evaluate user experience on the Hyperskill educational platform, using Python, Java, and Kotlin programming projects as the basis of analysis. The study utilizes sentiment analysis and keyword extraction methods to gauge users' attitudes towards the platform, learning process, and topics covered. To achieve this, ...
Added: December 9, 2023
NY : Springer, 2014
This book constitutes the proceedings of the Third International Conference on Analysis of Images, Social Networks and Texts, AIST 2014, held in Yekaterinburg, Russia, in April 2014. The 11 full and 10 short papers were carefully reviewed and selected from 74 submissions. They are presented together with 3 short industrial papers, 4 invited papers and ...
Added: November 4, 2014
Marseille : Association pour le Traitement Automatique des Langues, 2014
Following the first TALAf workshop, held on 8 June 2012 in Grenoble during the JEP-TALN-RECITAL 2012 conference (see the proceedings: http://aclweb.org/anthology//W/W12/#1300), we propose a new edition of this workshop at the TALN 2014 conference on 1 July in Marseille.
This second edition shows the value of a French-language workshop on the processing ...
Added: March 26, 2015
Lyashevskaya O., Droganova K., Zeman D. et al. / NRU HSE. Series WP BRP "Linguistics". 2016. No. 44.
This paper presents the Universal Dependencies tagset (UD v1) as a new annotation scheme for Russian treebanks. The universal list of dependency relations was adopted and extended to comply with certain language-specific syntactic constructions. The tagset was validated, converting two Russian treebanks into the UD format, UD-Russian-SynTagRus and UD-Russian-Google. ...
Added: December 14, 2016
I. K. Kusakin, Fedorets O. V., A. Y. Romanov, Scientific and Technical Information Processing 2023 Vol. 50 No. 3 P. 176-183
This paper discusses modern approaches to natural language processing and the application of machine learning models to the task of classifying short scientific texts in Russian. This study is devoted to the analysis of methods for vectorization of textual information, selection of a model for scientific paper classification, and training of linguistic model BERT ...
Added: November 4, 2023
Switzerland : Springer, 2015
This book constitutes the refereed proceedings of the 6th Conference on Knowledge Engineering and the Semantic Web, KESW 2015, held in Moscow, Russia, in September/October 2015. The 17 revised full papers presented together with 6 short system descriptions were carefully reviewed and selected from 35 submissions. The papers address research issues related to semantic web, ...
Added: September 16, 2015
Springer, 2021
This book constitutes the proceedings of the 19th Russian Conference on Artificial Intelligence, RCAI 2021, held in Moscow, Russia, in October 2021.
The 19 full papers and 7 short papers presented in this volume were carefully reviewed and selected from 80 submissions. The conference deals with a wide range of topics, categorized into the following topical ...
Added: October 28, 2021
Ivan P. Yamshchikov, Shibaev V., Nagaev A. et al., in : Proceedings of the 3rd Workshop on Neural Generation and Translation. Association for Computational Linguistics, 2019. P. 128-137.
This paper focuses on latent representations that could effectively decompose different aspects of textual information. Using a framework of style transfer for texts, we propose several empirical methods to assess information decomposition quality. We validate these methods with several state-of-the-art textual style transfer methods. Higher quality of information decomposition corresponds to higher performance in terms ...
Added: January 7, 2021