Generating Sport Summaries: A Case Study for Russian

Malykh V., Porplenko D., Tutubalina E.

doi:10.1007/978-3-030-72610-2_11

Chapter

P. 149–161.

We present a novel dataset of sports broadcasts covering 8,781 games. The dataset contains 700 thousand comments and 93 thousand related news documents in Russian. We run an extensive series of experiments with modern extractive and abstractive approaches. The results demonstrate that BERT-based models show modest performance, reaching up to 0.26 ROUGE-1 F-measure. In addition, human evaluation shows that neural approaches can generate feasible, although inaccurate, news based on the broadcast text.

Keywords: natural language processing, deep learning, deep neural networks, automatic text processing

In book

Analysis of Images, Social Networks and Texts: 9th International Conference, AIST 2020, Skolkovo, Moscow, Russia, October 15–16, 2020, Revised Selected Papers

Vol. 12602. Springer, 2021.

Generalized approach to sentiment analysis of short text messages in natural language processing

Polyakov E. V., Voskov L., Abramov P. et al., Informatsionno-upravliaiushchie sistemy [Information and Control Systems] 2020 No. 1 P. 2–14.

Introduction: Sentiment analysis is a complex problem whose solution essentially depends on the context, field of study and amount of text data. Analysis of publications shows that the authors often do not use the full range of possible data transformations and their combinations. Only a part of the transformations is used, limiting the ways to ...

Added: February 20, 2020

So What’s the Plan? Mining Strategic Planning Documents

Artemova E., Batura T., Golenkovskaya A. et al., in: Digital Transformation and Global Society. DTGS 2020. Vol. 1242. Springer, 2020. P. 208–222.

In this paper we present a corpus of Russian strategic planning documents, RuREBus. The project is grounded in both language technology and e-government perspectives. Not only are new language sources and tools being developed, but also their applications to e-government research. We demonstrate the pipeline for creating a text corpus from scratch. First, the annotation schema ...

Added: May 10, 2021

Research of heuristic approaches for determining the tonality of text messages in natural language processing problems

Polyakov E. V., Polyakov S. V., Abramov P., in: Proceedings of 2019 XVI International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY). IEEE, 2019. P. 159–164.

Determining the tonality of the text is a difficult task, the solution of which essentially depends on the context, the field of study and the amount of text data. The analysis shows that the authors in their works do not jointly use the full range of possible transformations on the data and their combinations. The ...

Added: September 20, 2020

Intelligent Systems and Applications

Cham: Springer, 2019.

Intelligent Systems Conference (IntelliSys) 2018 is the fourth research conference in the series. This conference is part of the SAI conferences, held since 2013. The conference series has featured keynote talks, special sessions, poster presentations, tutorials, workshops, and contributed papers each year. The conference focuses on areas of intelligent systems and artificial intelligence (AI) and ...

Added: August 29, 2018

Aspect-Based Sentiment Analysis of Russian Hotel Reviews

Rybakov V. V., Malafeev A., in: Supplementary Proceedings of the 7th International Conference on Analysis of Images, Social Networks and Texts (AIST-SUP 2018), Moscow, Russia, July 5-7, 2018. Aachen: CEUR Workshop Proceedings, 2018. Ch. 8 P. 75–84.

The paper presents an attempt to solve the task of aspect-based sentiment analysis in the domain of Russian-language hotel reviews, using distributed representation of words. The authors follow an approach similar to [Blinov, Kotelnikov, 2014], but applied to a different domain and using different parameters. The authors also present a new dataset that is made ...

Added: February 15, 2019

Using Convolutional Neural Networks for Person Re-identification in Urban Environments

Suchkov E. P., Alekseenko G. O., Nalchadzhi K. V., Intellektualnye sistemy. Teoriya i prilozheniya [Intelligent Systems. Theory and Applications] 2022 Vol. 26 No. 1 P. 250–254.

Currently, video surveillance systems are becoming more widespread. One of the main goals of such systems is to control and track a person’s movement. The solution of this problem allows us to solve such applied problems as tracking the occupancy of various premises (whether shopping facilities or educational and cultural institutions), creating a motion heatmap or organizing control of access to ...

Added: January 31, 2023

Intelligent Data Processing 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers

Switzerland: Springer, 2019.

This book constitutes the refereed proceedings of the 11th International Conference on Intelligent Data Processing, IDP 2016, held in Barcelona, Spain, in October 2016. The 11 revised full papers were carefully reviewed and selected from 52 submissions. The papers of this volume are organized in topical sections on machine learning theory with applications; intelligent data processing in life ...

Added: February 8, 2020

Lost in Conversation: A Conversational Agent Based on the Transformer and Transfer Learning

Golovanov S., Tselousov A., Rauf Kurbanov et al., in: The NeurIPS '18 Competition: From Machine Learning to Intelligent Conversations. Springer, 2020. P. 295–315.

Added: February 20, 2021

Quantitative Assessment of Grammatical Ambiguity in Some European Languages

Klyshinskiy E., Logacheva V. K., Karpik O. V. et al., Vestnik Novosibirskogo gosudarstvennogo universiteta. Seriya: Lingvistika i mezhkulturnaya kommunikatsiya [Bulletin of Novosibirsk State University. Series: Linguistics and Intercultural Communication] 2020 Vol. 18 No. 1 P. 5–21.

The grammatical ambiguity (multiple sets of grammatical features for one word form or coinciding surface forms of different words) can be of different types. We describe six classes of grammatical ambiguity: unambiguous, ambiguous by grammatical features, by part of speech, by lemma, by lemma and part of speech, and out-of-vocabulary words. These classes are presented ...

Added: December 11, 2019

Recognition of the Bare Soil Using Deep Machine Learning Methods to Create Maps of Arable Soil Degradation Based on the Analysis of Multi-Temporal Remote Sensing Data

Rukhovich D., Koroleva P., Rukhovich D. et al., Remote Sensing 2022 Vol. 14 No. 9 Article 2224.

The detection of degraded soil distribution areas is an urgent task. It is difficult and very time consuming to solve this problem using ground methods. The modeling of degradation processes based on digital elevation models makes it possible to construct maps of potential degradation, which may differ from the actual spatial distribution of degradation. The ...

Added: November 14, 2022

The Russian Drug Reaction Corpus and Neural Models for Drug Reactions and Effectiveness Detection in User Reviews

Tutubalina E., Alimova I. S., Miftahutdinov Z. et al., Bioinformatics 2021 Vol. 37 No. 2 P. 243–249.

Drugs and diseases play a central role in many areas of biomedical research and healthcare. Aggregating knowledge about these entities across a broader range of domains and languages is critical for information extraction (IE) applications. To facilitate text mining methods for analysis and comparison of patient’s health conditions and adverse drug reactions reported on the ...

Added: January 13, 2021

Information Models in Natural Language Text Processing Problems. Second Edition, Revised

Chepovskiy A., Moscow: National Open University "INTUIT", 2015.

The monograph examines various mathematical models for solving practical problems of natural language text processing. It proposes solutions to problems that arise in organizing indexing and subsequent data retrieval. Methods of computational linguistics are applied to applied research. The book is intended for developers of information systems and specialists in computational linguistics. ...

Added: May 22, 2015

Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Association for Computational Linguistics, 2019.

The 4th Workshop on Representation Learning for NLP (RepL4NLP) will be hosted by ACL 2019 and held on 2 August 2019. The workshop is being organised by Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Alexis Conneau, Johannes Welbl, Xian Ren and Marek Rei; and advised by Kyunghyun Cho, Edward Grefenstette, Karl Moritz ...

Added: October 31, 2019

Distributed Deep Learning In Open Collaborations

Diskin M., Bukhtiyarov A., Ryabinin M. et al., in: Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Curran Associates, Inc., 2021. P. 7879–7897.

Added: November 24, 2021

Automatic Morphemic Analysis of Russian Words

Maltina L. P., Malafeev A., in: Supplementary Proceedings of the 7th International Conference on Analysis of Images, Social Networks and Texts (AIST-SUP 2018), Moscow, Russia, July 5-7, 2018. Aachen: CEUR Workshop Proceedings, 2018. Ch. 9 P. 85–94.

The paper considers the task of the morphemic analysis of Russian words and compares the efficiency of several proposed models. These models can be divided into three groups: derivational and inflectional rule-based, probabilistic, and hybrid models. The latter achieved state-of-the-art results of 0.848 F-score on a test set of 500 Russian words. The models ...

Added: February 15, 2019

Large-scale transfer learning for natural language generation

Golovanov S., Rauf Kurbanov, Sergey Nikolenko et al., in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2019. P. 6053–6058.

Large-scale pretrained language models define state of the art in natural language processing, achieving outstanding performance on a variety of tasks. We study how these architectures can be applied and adapted for natural language generation, comparing a number of architectural and training schemes. We focus in particular on open-domain dialog as a typical high entropy ...

Added: February 20, 2021

Analysis of Images, Social Networks and Texts. 10th International Conference, AIST 2021, Tbilisi, Georgia, December 16–18, 2021, Revised Selected Papers

Cham: Springer, 2022.

This book constitutes revised selected papers from the 10th International Conference on Analysis of Images, Social Networks and Texts, AIST 2021, held during December 16-18, 2021. The world of Data Science changes every year. At AIST, we exchange our understanding of the Science state-of-the-art, as well as how it applies to life and business. AIST ...

Added: January 4, 2022

SocialBERT – Transformers for Online Social Network Language Modelling

Ilia Karpov, Nick Kartashev, in: Analysis of Images, Social Networks and Texts. 10th International Conference, AIST 2021, Tbilisi, Georgia, December 16–18, 2021, Revised Selected Papers. Cham: Springer, 2022. P. 1–10.

The ubiquity of contemporary language understanding tasks gives relevance to the development of generalized yet highly efficient models that utilize all knowledge provided by the data source. In this work, we present SocialBERT, the first model that uses knowledge about the author’s position in the network during text analysis. We investigate possible models ...

Added: October 31, 2021

Russian Q&A Method Study: From Naive Bayes to Convolutional Neural Networks

Nikolaev K., Malafeev A., in: Analysis of Images, Social Networks and Texts. 7th International Conference AIST 2018. Springer, 2018. Ch. 12 P. 121–126.

This paper deals with automatic classification of questions in the Russian language. In contrast to previously used methods, we introduce a convolutional neural network for question classification. We took advantage of an existing corpus of 2008 questions, manually annotated in accordance with a pragmatic 14-class typology. We modified the data by reducing the typology to ...

Added: February 15, 2019

Data-Driven Short-Term Daily Operational Sea Ice Regional Forecasting

Grigoryev T., Verezemskaya P., Krinitskiy M. et al., Remote Sensing 2022 Vol. 14 No. 22 Article 5837.

Global warming has made the Arctic increasingly available for marine operations and created a demand for reliable operational sea ice forecasts to increase safety. Because ocean-ice numerical models are highly computationally intensive, relatively lightweight ML-based methods may be more efficient for sea ice forecasting. Many studies have exploited different deep learning models alongside classical approaches ...

Added: June 19, 2023

International Conference Recent Advances in Natural Language Processing, RANLP 2021

Association for Computational Linguistics, 2021.

Natural Language Processing (NLP) has benefited from promising recent advances including the employment of latest deep learning technology amongst a host of other solutions. The current pandemic has prevented the in-person exchange of ideas and networking of NLP researchers and students, but virtual communication opportunities have enabled continued collaboration and provided alternative communication channels. While ...

Added: September 27, 2021

Social media-based opinion retrieval for product analysis using multi-task deep neural networks

Gozuacik N., Sakar C. O., Ozcan S., Expert Systems with Applications 2021 Vol. 183 No. 30 November 2021 P. 1–13.

Social media platforms are considered one of the most effective intermediaries for companies to interact with consumers. Social media-based decision support systems for the marketing domain are highly developed, but product development and innovation-oriented studies remain limited. This study offers a novel approach which utilises opinion retrieval theme along with sentiment analysis to support the ...

Added: December 12, 2021

Deep Learning for the Russian Language

Artemova E., in: The Palgrave Handbook of Digital Russia Studies. Palgrave Macmillan, 2021. Ch. 26 P. 465–481.

Deep learning is a term used to describe artificial intelligence (AI) technologies. AI deals with how computers can be used to solve complex problems in the same way that humans do. Such technologies as computer vision (CV) and natural language processing (NLP) are distinguished as the largest AI areas. To imitate human vision and the ...

Added: December 20, 2020

DeepADEMiner: a deep learning pharmacovigilance pipeline for extraction and normalization of adverse drug event mentions on Twitter

Magge A., Tutubalina E., Miftahutdinov Z. et al., Journal of the American Medical Informatics Association: JAMIA 2021 Vol. 28 No. 10 P. 2184–2192.

Objective: Research on pharmacovigilance from social media data has focused on mining adverse drug events (ADEs) using annotated datasets, with publications generally focusing on 1 of 3 tasks: ADE classification, named entity recognition for identifying the span of ADE mentions, and ADE mention normalization to standardized terminologies. While the common goal of such systems is to ...

Added: October 1, 2021
