
 


Classification Models for RST Discourse Parsing of Texts in Russian

P. 163–176.
Chistova E., Shelmanov A., Kobozeva M., Pisarevskaya D., Smirnov I., Toldova S.

The paper considers the task of automatic discourse parsing of texts in Russian. Discourse parsing is a well-known approach to capturing text semantics across the boundaries of single sentences. Discourse annotation has been found useful for various tasks, including summarization, sentiment analysis, and question answering. The recent release of the manually annotated Ru-RSTreebank corpus made it possible to apply supervised machine learning techniques to build such parsers for the Russian language. The corpus provides discourse annotation in a widely adopted formalization, Rhetorical Structure Theory. In this work, we develop feature sets for rhetorical relation classification in Russian-language texts, investigate the importance of various types of features, and report the results of the first experimental evaluation of machine learning models trained on the Ru-RSTreebank corpus. We consider various machine learning methods, including gradient boosting, neural networks, and ensembling of several models by soft voting.
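
The soft-voting ensembling mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the relation labels, and the probability values below are all hypothetical, chosen only to show how averaging class probabilities can differ from a majority vote.

```python
from typing import Dict, Optional, Sequence


def soft_vote(
    model_probs: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Return the (weighted) average of per-class probabilities across models."""
    if not model_probs:
        raise ValueError("need at least one model's probabilities")
    if weights is None:
        weights = [1.0] * len(model_probs)  # equal weights by default
    if len(weights) != len(model_probs):
        raise ValueError("one weight per model is required")
    total = float(sum(weights))
    # Collect every label any model knows about; a missing label counts as 0.
    labels = sorted({label for probs in model_probs for label in probs})
    return {
        label: sum(w * p.get(label, 0.0) for w, p in zip(weights, model_probs)) / total
        for label in labels
    }


def predict(
    model_probs: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Pick the label with the highest averaged probability."""
    averaged = soft_vote(model_probs, weights)
    return max(averaged, key=averaged.get)


# Hypothetical class probabilities from three classifiers for one pair of
# discourse units (labels and numbers are illustrative assumptions).
boosting = {"Elaboration": 0.50, "Contrast": 0.30, "Cause": 0.20}
network = {"Elaboration": 0.20, "Contrast": 0.60, "Cause": 0.20}
linear = {"Elaboration": 0.45, "Contrast": 0.40, "Cause": 0.15}

averaged = soft_vote([boosting, network, linear])
label = predict([boosting, network, linear])
```

In this toy input, two of the three models rank Elaboration first, so a majority (hard) vote would choose it, while the averaged probabilities favour Contrast because one model is much more confident about it.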

Language: English
Keywords: RST treebank, feature engineering, discourse parsing

In book

Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue” (2019)
Issue 18. Moscow: Russian State University for the Humanities, 2019.
Similar publications
A Feature Engineering Framework for Computer Vision Based on Topological Data Analysis
Abramov A. S., Chernyshev V. L., Mikhaylets E. et al. / Series "Social Science Research Network". 2025.
Computer vision is one of the most relevant modern research areas with broad practical applications. However, traditional solutions based on deep learning have significant limitations and can be misleading. Topological data analysis, on the other hand, is a modern approach to solving similar problems using mathematically deterministic methods of algebraic topology that reduce the risk ...
Added: September 23, 2025
Entropy-based text feature engineering approach for forecasting financial liquidity changes
Aleksei Riabykh, Suleimanov I., Nagovitcyn I. et al., EPJ Data Science 2025 Vol. 14 Article 17
Changes in individual and institutional financial behavior leading to shifts in liquidity flows often depend on events reflected in news. However, the task of establishing a relationship between financial behavior and news remains challenging and understudied. We propose a news-based feature generation approach that allows accounting for news events in liquidity flow time-series prediction tasks, thereby ...
Added: August 5, 2025
Reducing False Positives in Bank Anti-fraud Systems Based on Rule Induction in Distributed Tree-based Models
Ivan Vorobyev, Krivitskaya A., Computers and Security 2022 Vol. 120 Article 102786
Fraud detection in bank payment transactions suffers from a high number of false positives. To deal with this problem, we introduce a rule-generation framework for a fraud-detection system: automatic rule generation using distributed tree-based ML (machine learning) algorithms such as Decision Tree, Random Forest and Gradient Boosting, where the components of expert ...
Added: June 8, 2022
Automated Metaphor Identification in Russian and Its Implications for Metaphor Studies
Badryzlova Y., Lyashevskaya O., Nikiforova A., in: Distributed Computing and Artificial Intelligence, Volume 2: Special Sessions 18th International Conference (Lecture Notes in Networks and Systems 332). Vol. 2. Springer, 2022. Ch. 8. P. 86–96.
Added: September 17, 2021
Proceedings of the First Workshop on Computational Approaches to Discourse
Association for Computational Linguistics, 2020.
Added: November 18, 2020
Proceedings of DISRPT 2019 - The Workshop on Discourse Relation Parsing and Treebanking. NAACL HLT 2019
Association for Computational Linguistics, 2019.
This book summarizes the main topics at the 2019 workshop on Discourse Relation Parsing and Treebanking (DISRPT 2019). Co-located with NAACL 2019 in Minneapolis, the workshop’s aim was to bring together researchers working on corpus-based and computational approaches to discourse relations. In addition to an invited talk, eighteen papers outlined below were presented, four of which ...
Added: April 22, 2020
A Multi-Feature Classifier for Verbal Metaphor Identification in Russian Texts
Badryzlova Y., Panicheva P., in: Artificial Intelligence and Natural Language, 7th International Conference, AINL 2018, St. Petersburg, Russia, October 17–19, 2018, Proceedings. Issue 930. Switzerland: Springer, 2018. Ch. 3. P. 23–34.
The paper presents a supervised machine learning experiment with multiple features for identification of sentences containing verbal metaphors in raw Russian text. We introduce the custom-created training dataset, describe the feature engineering techniques, and discuss the results. The following set of features is applied: distributional semantic features, lexical and morphosyntactic co-occurrence frequencies, flag words, quotation ...
Added: August 30, 2018
Rhetorical relation markers in Russian RST Treebank
Toldova S., Dina Pisarevskaya, Ananyeva M. et al., in: Proceedings of the 6th Workshop on Recent Advances in RST and Related Formalisms. Stroudsburg, PA: Association for Computational Linguistics, 2017.
The paper deals with the pilot version of the first RST discourse treebank for Russian. The project started in 2016. At present, the treebank consists of sixty news texts annotated for rhetorical relations according to the RST scheme. However, this scheme was slightly modified in order to achieve a higher inter-annotator agreement score. During the annotation procedure, we also registered the discourse connectives ...
Added: November 6, 2017
Proceedings of the 6th Workshop on Recent Advances in RST and Related Formalisms
Stroudsburg, PA: Association for Computational Linguistics, 2017.
Added: November 6, 2017