Higher School of Economics
National Research University
Noisy Text Sequences Aggregation as a Summarization Subtask

Ch. 1. P. 15–20.
Pletenev Sergey

Most speech-driven systems first convert audio to text with an automatic speech recognition (ASR) model and then pass the text to downstream natural language processing (NLP) modules. However, ASR models can cause system failures or undesirable output when exposed to natural language perturbations or variation in practice. In this paper, we introduce a simple yet efficient model that improves understanding of the semantics of input speech and corrects errors by processing the multiple hypotheses produced by ASR systems.

Language: English
Keywords: text summarization, summarization, machine translation of text and speech, Neural Language Processing (NLP), ASR

In book

Crowd Science Workshop: Trust, Ethics, and Excellence in Crowdsourced Data Management at Scale (CSW 2021)
Copenhagen, Denmark: CEUR Workshop Proceedings, 2021.
Similar publications
A hybrid lemmatiser for Old Church Slavonic
Afanasev I. / NRU HSE. Series WP BRP "Linguistics". 2021.
The article considers a lemmatiser that is developed specifically for Old Church Slavonic (OCS). The introduction underlines the problem of the lack of lemmatisers that might deal with different datasets of the OCS. The review gives a short description of previous attempts and current trends in lemmatisation. The lemmatiser is hybrid-based and uses the advantages ...
Added: December 28, 2021
Crowd Science Workshop: Trust, Ethics, and Excellence in Crowdsourced Data Management at Scale (CSW 2021)
Copenhagen, Denmark: CEUR Workshop Proceedings, 2021.
The second workshop on Crowd Science is organized in conjunction with the 47th International Conference on Very Large Data Bases (VLDB 2021). This workshop is the second in a series of events that has the goal of helping crowdsourcing “transition” from art to science, and tackles the research challenges that we face to make crowdsourcing ...
Added: December 13, 2021
Reflections of syntactic structures in non-autoregressive language models
Pletenev S. A., in: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, June 16–19, 2021). Issue 20. Russian State University for the Humanities, 2021.
Added: December 13, 2021
Double-Blind Peer-Reviewing and Inclusiveness in Russian NLP Conferences
Kutuzov A. B., Nikishina I. A., in: Analysis of Images, Social Networks and Texts. 8th International Conference, AIST 2019, Lecture Notes in Computer Science, Revised Selected Papers. Vol. 11832. Cham: Springer, 2019. P. 3–8.
Double-blind peer reviewing has proved to be an effective and fair way of selecting academic work. However, to the best of our knowledge, nobody has yet analysed the effects of its introduction at Russian NLP conferences. We investigate how double-blind peer reviewing influences gender and location (according to authors’ affiliations) ...
Added: January 20, 2020
Proceedings of Third Workshop "Computational linguistics and language science"
Wohlgenannt G., von Waldenfels R., Toldova S. et al., Manchester: EasyChair, 2019.
The EPiC Series in Language and Linguistics publishes high quality collections of papers in language, linguistics and related areas. ...
Added: September 9, 2019
Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing
Ponomareva M. A., Droganova K. A., Smurov I. et al., Florence: Association for Computational Linguistics, 2019.
This paper provides a comprehensive overview of the gapping dataset for Russian that consists of 7.5k sentences with gapping (as well as 15k relevant negative sentences) and comprises data from various genres: news, fiction, social media and technical texts. The dataset was prepared for the Automatic Gapping Resolution Shared Task for Russian (AGRR-2019) - a ...
Added: September 5, 2019
Lemmatization for ancient languages: Rules or neural networks?
Dereza O., in: Artificial Intelligence and Natural Language, 7th International Conference, AINL 2018, St. Petersburg, Russia, October 17–19, 2018, Proceedings. Issue 930. Switzerland: Springer, 2018. P. 35–47.
Lemmatisation, which is one of the most important stages of text preprocessing, consists in grouping the inflected forms of a word together so they can be analysed as a single item. This task is often considered solved for most modern languages regardless of their morphological type, but the situation is dramatically different for ancient languages. Rich inflectional system and ...
Added: November 14, 2018
A system for automatic text summarization using a stochastic model
Voznesenskaya T., Lednov D. A., Machine Learning and Data Analysis. 2018. Vol. 4. No. 4. P. 266–279.
This paper describes the automatic text summarization system developed by the «DC – Systems» company in cooperation with the Faculty of Computer Science at HSE. The summary is a concise description of the text in terms of its content and meaning, i.e. from the point of view of its semantics. The purpose of the ...
Added: October 5, 2018
Bayesian Compression for Natural Language Processing
Chirkova N., Lobacheva E., Vetrov D., in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2018. P. 2910–2915.
In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters. The majority of these parameters are often concentrated in the embedding layer, whose size grows in proportion to the vocabulary size. We propose a Bayesian sparsification technique for RNNs which allows ...
Added: September 5, 2018
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Association for Computational Linguistics, 2018.
Added: September 5, 2018
Investigation and development of the intelligent voice assistant for the Internet of Things using machine learning
Rolich A., Polyakov E. V., Voskov L. et al., in: 2018 Moscow Workshop on Electronic and Networking Technologies (MWENT). Proceedings. Moscow: IEEE, 2018. P. 1–5.
Artificial intelligence technologies are beginning to be actively used in human life; this is facilitated by the appearance and wide dissemination of the Internet of Things (IoT). Autonomous devices are becoming smarter in the way they interact with both humans and one another. New capacities lead to creation of various systems for integration of smart ...
Added: May 3, 2018
Development of an intelligent voice assistant and a study of the learning ability of natural language recognition algorithms
Polyakov E. V., Mazhanov M. S., Kachalova M. V. et al., System Administrator. 2017. No. 12. P. 80–85.
The development of cognitive technologies contributes to the effective introduction of Artificial Intelligence into everyday life. New interfaces for device-human interaction appear. Understanding human natural language is one of the most promising areas of the development of Artificial Intelligence. Voice assistants are a striking example of such systems, they ...
Added: December 10, 2017
23rd International Symposium on Methodologies for Intelligent Systems - Proceedings
Birkhauser/Springer, 2017.
This book constitutes the proceedings of the 23rd International Symposium on Foundations of Intelligent Systems, ISMIS 2017, held in Warsaw, Poland, in June 2017. The 56 regular and 15 short papers presented in this volume were carefully reviewed and selected from 118 submissions. The papers include both theoretical and practical aspects of machine learning, data mining ...
Added: September 18, 2017
Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (Moscow, May 31 to June 3, 2017). Issue 16 (23): In 2 vols.
Moscow: RSUH Publishing House, 2017.
The 16th issue of the annual report “Computational Linguistics and Intellectual Technologies” contains the selected materials of the 23rd international conference “Dialogue”. The presented works reflect the areas of research in computational modelling and analysis of natural language that are traditionally represented at the conference. ...
Added: March 15, 2017
Fitting a nonlinear model of data from expression DNA microarray experiments
Ryabenko E. A., Mathematical Biology and Bioinformatics. 2012. Vol. 7. No. 2. P. 554–566.
A nonlinear model of DNA microarray data is considered, in which the fluorescence intensity of probes is described by the Langmuir function. A method for fitting the model parameters is developed using publicly available data from several thousand experiments, based on minimizing a loss function from the class of AB-divergences; numerical experiments were carried out to select optimal hyperparameter values. The resulting model describes the fluorescence intensities of microarray probes more accurately than the standard linear one, and the ones obtained from its ...
Added: October 14, 2016