Amateur Prose On The Web: Verb Construction As A Feature Of Genre Classification
Builova N.
In our research we studied the dependency structure of texts in different genres (love stories, detective stories, science fiction and fantasy). Novel characteristics that can serve as additional machine learning parameters were identified: syntactic attributes such as verb constructions and constructions of a specific cumulative threshold. We conducted an experiment with the novel features and showed that these characteristics can be useful for recognizing closely related genres.
In book
Wohlgenannt G., von Waldenfels R., Toldova S., Rakhilina E. V., Lyashevskaya O., Loukachevitch N. V., Artemova E. Issue 4. Manchester: EasyChair, 2019.
Omrani H., Emrouznejad A., Teplova T. et al., Engineering Applications of Artificial Intelligence 2024 Vol. 133 No. F Article 108636
Evaluating the efficiency of electricity distribution companies (EDCs) accurately is one of the most important issues for regulators and policy makers. This research combines the results of data envelopment analysis (DEA) and corrected ordinary least squares (COLS) with machine learning techniques to evaluate a set of EDCs in the period 2011–2020. We propose a three-stage ...
Added: May 31, 2024
Koshevoy A., Panova A., Makarchuk I., in: Proceedings of the Sixth Workshop on Universal Dependencies (UDW, GURT/SyntaxFest 2023). Washington: Association for Computational Linguistics, 2023. P. 1–6.
In this paper, we discuss the challenges that we faced during the construction of a Universal Dependencies treebank for Abaza, a polysynthetic Northwest Caucasian language. We propose an alternative to the morpheme-level annotation of polysynthetic languages introduced in Park et al. (2021). Our approach aims at reducing the number of morphological features, yet providing all ...
Added: March 20, 2023
Washington: Association for Computational Linguistics, 2023.
Added: March 20, 2023
Yana Shishkina, Lyashevskaya O., in: Analysis of Images, Social Networks and Texts. 10th International Conference, AIST 2021, Tbilisi, Georgia, December 16–18, 2021, Revised Selected Papers. Cham: Springer, 2022. P. 137–147.
Enhanced Universal Dependencies (EUD) are enhanced graphs expressed on top of basic dependency trees. EUD support representation of deeper syntactic relations in constructions such as coordination, gapping, relative clauses, and argument sharing through control and raising. The paper presents experiments on the EUD parsing of the low-resource Belarusian language, for which no corpora ...
Added: January 4, 2022
Lyashevskaya O., Afanasev I., Jazykovedny Casopis 2021 Vol. 72 No. 2 P. 556–567
We present a hybrid HMM-based PoS tagger for Old Church Slavonic. The training corpus is a portion of one text, Codex Marianus (40k) annotated with the Universal Dependencies UPOS tags in the UD-PROIEL treebank. We perform a number of experiments in within-domain and out-of-domain settings, in which the remaining part of Codex Marianus serves as ...
Added: October 21, 2021
Moroz G., in: Дурхъаси хазна. Сборник статей к 60-летию Р. О. Муталова. М.: Буки Веди, 2021. P. 258–282.
In this article I present a connection between frequency and length of person-number indexes via two independent studies: token frequency obtained from the Universal Dependencies treebanks and type frequency gathered within a typological study. After introducing the results of those two studies, I will present East Caucasian data. I show that the unusual history of ...
Added: May 23, 2021
IEEE, 2020.
The 2020 43rd International Conference on Telecommunications and Signal Processing (TSP), IEEE Conference Record #49548, is held on July 7-9, 2020, in Milan, Italy (virtual event). We would like to welcome you to this event and thank you for your personal support and for your papers, which are published in the proceedings. In this respect, ...
Added: October 8, 2020
Durandin O., Malafeev A., in: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Kazan, Russia, July 17–19, 2019, Revised Selected Papers. Communications in Computer and Information Science, Vol. 1086. Springer, 2020. P. 120–131.
In recent works on learning representations for graph structures, methods have been proposed both for the representation of nodes and edges for large graphs, and for representation of graphs as a whole. This paper considers the popular graph2vec approach, which shows quite good results for ordinary graphs. In the field of natural language processing, however, ...
Added: November 16, 2019
Иваненков Я., Жаворонков А., Ямиданов Р. et al., Frontiers in Pharmacology 2019 Vol. 10 No. 913 P. 1–15
Many pharmaceutical companies are avoiding the development of novel antibacterials due to a range of rational reasons and the high risk of failure. However, there is an urgent need for novel antibiotics especially against resistant bacterial strains. Available in silico models suffer from many drawbacks and, therefore, are not applicable for scoring novel molecules with high structural ...
Added: October 1, 2019
Lyashevskaya O., in: Computational Linguistics and Intellectual Technologies, Issue 18. M.: Russian State University for the Humanities, 2019. P. 422–434.
The paper discusses the standardization efforts to create a morphological standard for the Middle Russian corpus, which is part of the historical collection of the Russian National Corpus (RNC). To meet the needs of different categories of corpus researchers as well as NLP developers, we consider two styles of the morphological annotation (RNC schema and ...
Added: June 12, 2019
Lyashevskaya O., Пантелеева И. М., in: Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories (TLT 16). Association for Computational Linguistics, 2017. P. 80–87.
The paper presents a Universal Dependencies (UD) annotation scheme for a learner English corpus. The REALEC dataset consists of essays written in English by Russian-speaking university students in the course of general English. The original corpus is manually annotated for learners’ errors and gives information on the error span, error type, and the possible correction ...
Added: December 11, 2018
Дроганова К. А., Lyashevskaya O., Zeman D., in: Proceedings of TLT 2018 International Workshop on Treebanks and Linguistic Theories, 13-14 November 2018, Oslo, Norway. NEALT Proceedings Series. Linköping University Electronic Press, 2018. P. 52–65.
In this paper we focus on syntactic annotation consistency within Universal Dependencies (UD) treebanks for Russian: UD_Russian-SynTagRus, UD_Russian-GSD, UD_Russian-Taiga, and UD_Russian-PUD. We describe the four treebanks, their distinctive features and development. In order to test and improve consistency within the treebanks, we reconsidered the experiments by Martinez Alonso and Zeman; our parsing experiments were conducted ...
Added: November 6, 2018
Дроганова К. А., Lyashevskaya O., in: Digital Transformation and Global Society. Third International Conference, DTGS 2018, St. Petersburg, Russia, May 30 – June 2, 2018, Revised Selected Papers, Part I, Issue 858. Cham: Springer, 2018. Ch. 31 P. 380–390.
Cross-tagset parsing is based on the substitution of one annotation layer for another while processing data within one language. As often as not, either the native tagger or the dependency parser used in (pre-)annotation of the Gold treebank is not available. The cross-tagset approach allows one to annotate new texts using freely available tools or ...
Added: October 10, 2018
Sorokin A., Shavrina T., Lyashevskaya O. et al., in: Computational Linguistics and Intellectual Technologies. International Conference "Dialogue 2017" Proceedings, Vol. 1. Issue 16 (23). M.: -, 2017. P. 297–313.
MorphoRuEval-2017 is an evaluation campaign designed to stimulate the development of the automatic morphological processing technologies for Russian, both for normative texts (news, fiction, nonfiction) and those of less formal nature (blogs and other social media). This article compares the methods participants used to solve the task of morphological analysis. It also discusses the problem ...
Added: October 9, 2018
Fenogenova A., Kazorin V., Karpov I. et al., in: Proceedings of Third Workshop "Computational linguistics and language science", Issue 4. Manchester: EasyChair, 2019. P. 11–17.
Automatic morphological analysis is one of the fundamental and significant tasks of NLP (Natural Language Processing). Because Internet texts include both normative texts (news, fiction, nonfiction) and less formal texts (such as blogs and texts from social networks), morphological tagging has become a non-trivial and topical task. ...
Added: October 5, 2018
Toldova S., Ionov M., in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 17th International Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2016, Vol. 1. Issue 9623. Springer Publishing Company, 2018. P. 648–662.
This paper concerns discourse-new mention detection in Russian. Detecting the mention of an entity newly introduced into discourse might be helpful for different NLP applications such as coreference resolution, protagonist identification, summarization and different tasks of information extraction. In our work, we are dealing with Russian, where there are no grammatical devices, ...
Added: September 1, 2018
Lyukina E. V., Вестник Новосибирского государственного университета. Серия: Лингвистика и межкультурная коммуникация 2018 Vol. 16 No. 2 P. 19–33
The paper is dedicated to the Universal Dependencies (UD) initiative, which aims to develop a cross-linguistically consistent annotation scheme for grammatical analysis. The purpose of this initiative is to simplify cross-language research, unify interlanguage linguistic typology, and build a foundation for automated multilingual systems and a universal cross-language text parser.
In the first part ...
Added: April 21, 2018
Lyashevskaya O., Bocharov V., Sorokin A. et al., Jazykovedny Casopis 2017 Vol. 68 No. 2 P. 258–267
The paper describes the preparation and development of the text collections within the framework of MorphoRuEval-2017 shared task, an evaluation campaign designed to stimulate development of the automatic morphological processing technologies for Russian. The main challenge for the organizers was to standardize all available Russian corpora with the manually verified high-quality tagging to a single ...
Added: January 30, 2018
Lyashevskaya O., Пантелеева И. М. / NRU HSE. Series WP BRP "Linguistics". 2017.
The paper presents a Universal Dependencies (UD) annotation scheme for a learner English corpus. The REALEC dataset consists of essays written in English by Russian-speaking university students in the course of general English. The essays are a part of students' preparation for the independent final examination similar to the international English exam. While adjusting existing ...
Added: December 15, 2017
Ryzhikov A., Ustyuzhanin A., Journal of Physics: Conference Series 2018 Vol. 1085 P. 1–6
In this research, a new approach for finding rare events in high-energy physics was tested. As an example physics channel, the decay τ → 3μ is taken, which was published on Kaggle within an LHCb-supported challenge. The training sample consists of simulated signal and real background, so the challenge is to train ...
Added: December 11, 2017
Cham: Springer, 2017.
This book constitutes the refereed proceedings of the 4th International Conference on Mining Intelligence and Knowledge Exploration, MIKE 2016, held in Mexico City, Mexico, in November 2016.
The 18 full papers presented were carefully reviewed and selected from 56 submissions.
Accepted papers were grouped into various subtopics including information retrieval, machine learning, pattern recognition, knowledge discovery, classification, clustering, image ...
Added: June 8, 2017