A Hybrid Approach to the Analysis of a Collection of Research Papers

P. 423–433.
Mirkin B., Frolov D., Vlasov A., Nascimento S., Fenner T.

We define and find a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. This generalization lifts the set to a “head subject” in the higher ranks of the taxonomy, which is supposed to “tightly” cover the query set, possibly bringing in some errors, both “gaps” and “offshoots”. Our hybrid method involves two more automated analysis techniques: a fuzzy clustering method, FADDIS, involving both additive and spectral properties, and a purely structural string-to-text relevance measure based on suffix trees annotated by frequencies. We apply this to extract research tendencies from two collections of research papers: (a) about 18,000 research papers on data science published in Springer journals over 20 years, and (b) about 27,000 research papers retrieved from Springer and Elsevier journals in response to data science related queries. We consider a taxonomy of Data Science based on the Association for Computing Machinery Computing Classification System (ACM-CCS 2012). Our findings allow us to make some comments on the tendencies of research that cannot be derived by using more conventional techniques.
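The lifting idea in the abstract can be illustrated with a small Python sketch. This is a simplified illustration and not the authors' algorithm: it assumes unit costs per head subject, gap and offshoot (the parameters head_cost, gap_cost and offshoot_cost are hypothetical), treats memberships as present or absent rather than weighting by fuzzy values, and uses a made-up toy taxonomy. At each internal node it compares two options, making the node a head subject (and paying for its empty subtrees as gaps) or leaving its positive leaves to be handled lower in the tree (possibly as offshoots), and keeps the cheaper one.

```python
from typing import Dict, List, Tuple

Tree = Dict[str, List[str]]  # node -> list of children; leaves have no entry


def subtree_mass(tree: Tree, u: Dict[str, float], node: str) -> float:
    """Total membership of the leaves below (or at) `node`."""
    kids = tree.get(node, [])
    if not kids:
        return u.get(node, 0.0)
    return sum(subtree_mass(tree, u, k) for k in kids)


def maximal_gaps(tree: Tree, u: Dict[str, float], node: str) -> List[str]:
    """Maximal descendants of `node` whose subtrees carry no membership."""
    gaps: List[str] = []
    for k in tree.get(node, []):
        if subtree_mass(tree, u, k) == 0:
            gaps.append(k)
        else:
            gaps.extend(maximal_gaps(tree, u, k))
    return gaps


def lift(tree: Tree, u: Dict[str, float], node: str,
         head_cost: float = 1.0, gap_cost: float = 0.3,
         offshoot_cost: float = 0.9) -> Tuple[float, List[str], List[str], List[str]]:
    """Return (penalty, heads, gaps, offshoots) for the subtree rooted at `node`."""
    kids = tree.get(node, [])
    if not kids:
        # A leaf with positive membership not covered by a head is an offshoot.
        if u.get(node, 0.0) > 0:
            return offshoot_cost, [], [], [node]
        return 0.0, [], [], []

    # Option A: do not put a head subject here; combine the children's results.
    pen_a, heads_a, gaps_a, offs_a = 0.0, [], [], []
    for k in kids:
        p, h, g, o = lift(tree, u, k, head_cost, gap_cost, offshoot_cost)
        pen_a += p
        heads_a += h
        gaps_a += g
        offs_a += o

    # Option B: make this node a head subject; its empty subtrees become gaps.
    gaps_b = maximal_gaps(tree, u, node)
    pen_b = head_cost + gap_cost * len(gaps_b)

    if subtree_mass(tree, u, node) > 0 and pen_b < pen_a:
        return pen_b, [node], gaps_b, []
    return pen_a, heads_a, gaps_a, offs_a


# Toy taxonomy and query set (illustrative only).
tree = {
    "root": ["A", "B"],
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3", "b4", "b5"],
}
u = {"a1": 1.0, "a2": 1.0, "b1": 1.0}

penalty, heads, gaps, offshoots = lift(tree, u, "root")
print(heads, gaps, offshoots)
```

In this toy case the two topics under A are lifted to the head subject A at the price of one gap (a3), while the single topic b1 stays an offshoot because lifting it to B would introduce four gaps. Changing the three cost parameters shifts this balance.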

Language: English
Keywords: hybrid approach, annotated suffix tree, generalization, fuzzy cluster, research tendency
Publication based on the results of:
Development of methods for structuring and conceptualizing text data based on a domain taxonomy (2019)

In book

Intelligent Data Engineering and Automated Learning – IDEAL 2020: 21st International Conference, Guimaraes, Portugal, November 4–6, 2020, Proceedings, Part II
Lecture Notes in Computer Science, Vol. 12490. Cham: Springer, 2020.
Similar publications
The Benefits of Query-Based KGQA Systems for Complex and Temporal Questions in LLM Era
Alekseev A., Chaichuk M., Butko M. et al., in: 30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, Kanazawa, Japan, July 4–6, 2025, Proceedings, Part I. Natural Language Processing and Information Systems (LNCS, Vol. 15836). Springer, 2025. P. 426–441.
Large language models excel in question-answering (QA) but struggle with multi-hop reasoning and temporal questions. Query-based knowledge graph QA (KGQA) offers a modular alternative by generating executable queries instead of direct answers. We explore a multi-stage query-based framework for WikiData QA, proposing a multi-stage approach that enhances performance on challenging multi-hop and temporal benchmarks. Through generalization and ...
Added: February 3, 2026
Modeling Generalization in Domain Taxonomies Using a Maximum Likelihood Criterion
Zhirayr Hayrapetyan, Nascimento S., Trevor F. et al., in: Information Systems and Technologies: WorldCIST 2022, Volume 2. Issue 469. Springer, 2022. P. 141–147.
We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its “head subject” node in the higher ranks of the taxonomy tree. The head subject is supposed to “tightly” cover the query set, possibly involving some ...
Added: November 18, 2022
A hybrid lemmatiser for Old Church Slavonic
Afanasev I. / NRU HSE. Series WP BRP "Linguistics". 2021.
The article considers a lemmatiser that is developed specifically for Old Church Slavonic (OCS). The introduction underlines the problem of the lack of lemmatisers that might deal with different datasets of the OCS. The review gives a short description of previous attempts and current trends in lemmatisation. The lemmatiser is hybrid-based and uses the advantages ...
Added: December 28, 2021
A Hybrid Approach to Interpretable Analysis of Research Paper Collections
Mirkin B., Frolov D., Vlasov A. et al., in: WIMS 2020: Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics. Association for Computing Machinery (ACM), 2020. P. 184–189.
We define and find a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. This generalization lifts the set to a “head subject” in the higher ranks of the taxonomy, that is supposed to “tightly” cover the query set, possibly bringing in some errors, both ...
Added: August 28, 2020
Computational Generalization in Taxonomies Applied to: (1) Analyze Tendencies of Research and (2) Extend User Audiences
Frolov D., Mirkin B., Nascimento S. et al., in: Intelligent Data Engineering and Automated Learning – IDEAL 2019. Vol. 2. Springer, 2019. P. 3–11.
We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its “head subject” node in the higher ranks of the taxonomy tree. The head subject is supposed to “tightly” cover the query set, possibly bringing in some errors referred to ...
Added: December 7, 2019
Intelligent Data Engineering and Automated Learning – IDEAL 2019
Springer, 2019.
Added: December 7, 2019
Using Taxonomy Tree to Generalize a Fuzzy Thematic Cluster
Frolov D., Mirkin B., Nascimento S. et al., in: Fuzzy Systems (FUZZ-IEEE), IEEE International Conference Proceedings. IEEE, 2019. P. 1–6.
This paper presents an algorithm, ParGenFS, for generalizing, or “lifting”, a fuzzy set of topics to higher ranks of a hierarchical taxonomy of a research domain. The algorithm ParGenFS finds a globally optimal generalization of the topic set to minimize a penalty function, by balancing the number of introduced “head subjects” and related errors, the ...
Added: October 30, 2019
Parsimonious Generalization of Fuzzy Thematic Sets in Taxonomies Applied to the Analysis of Tendencies of Research in Data Science
Frolov D., Nascimento S., Fenner T. et al., Information Sciences 2020 Vol. 512 P. 595–615
This paper proposes a novel method, referred to as ParGenFS, for finding a most specific generalization of a query set represented by a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. The query set is generalized by “lifting” it to one or more “head subjects” in the higher ranks ...
Added: October 9, 2019
A Method for Audience Extending in Programmatic Advertising by Using Parsimonious Generalization of User Segments
Frolov D., Taran Z., Mirkin B., in: International Conference on Human Interaction and Emerging Technologies. Springer, 2020. P. 837–841.
We propose a novel method for efficient target audience augmentation in programmatic digital advertising. This method utilizes a novel ParGenFS algorithm for most adequate generalization in taxonomies which was developed by the authors in a joint work. The ParGenFS extends user segments by parsimoniously lifting them off-line as a fuzzy set over IAB content taxonomy ...
Added: July 31, 2019
Globally Optimal Parsimoniously Lifting a Fuzzy Query Set Over a Taxonomy Tree
Frolov D., Mirkin B., Nascimento S. et al., in: Optimization of Complex Systems: Theory, Models, Algorithms and Applications. Switzerland: Springer Publishing Company, 2020. P. 779–789.
This paper presents a relatively rare case of an optimization problem in data analysis that admits a globally optimal solution by a recursive algorithm. We are concerned with finding a most specific generalization of a fuzzy set of topics assigned to leaves of a domain taxonomy represented by a rooted tree. The idea is to “lift” ...
Added: June 25, 2019
Using Domain Taxonomy to Model Generalization of Thematic Fuzzy Clusters
Frolov D., Mirkin B., Nascimento S. et al., in: CONTENT 2019, The Eleventh International Conference on Creative Content Technologies. International Academy, Research, and Industry Association (IARIA), 2019. P. 20–25.
We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its 'head subject' in the higher ranks of the taxonomy tree. The head subject is supposed to 'tightly' cover the query set, possibly bringing in some ...
Added: June 4, 2019
CONTENT 2019, The Eleventh International Conference on Creative Content Technologies
International Academy, Research, and Industry Association (IARIA), 2019.
Added: June 4, 2019
Method for Generalization of Fuzzy Sets
Frolov D., Mirkin B., Nascimento S. et al., in: International Conference on Artificial Intelligence and Soft Computing. 18th International Conference, ICAISC 2019, Zakopane, Poland, June 16–20, 2019, Proceedings. Vol. 1. Issue 11508. Cham: Springer, 2019. P. 273–286.
We define and find a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. This generalization lifts the set to a “head subject” in the higher ranks of the taxonomy, that is supposed to “tightly” cover the query set, possibly bringing in some errors, both ...
Added: June 3, 2019
International Conference on Artificial Intelligence and Soft Computing. 18th International Conference, ICAISC 2019, Zakopane, Poland, June 16–20, 2019, Proceedings
Cham: Springer, 2019.
The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research and teaching - quickly, informally, and at a high level. The two-volume set LNCS ...
Added: June 3, 2019
Annotated Suffix Tree Method for German Compound Splitting
Shishkova A., Artemova E., in: CLLS 2016. Computational Linguistics and Language Science. Proceedings of the Workshop on Computational Linguistics and Language Science. Moscow, Russia, April 26, 2016. Vol. 1886. Aachen: CEUR Workshop Proceedings, 2017. P. 42–47.
The paper presents an unsupervised and knowledge-free approach to compound splitting. Although the research is focused on German compounds, the method is expected to be extensible to other compounding languages. The approach is based on the annotated suffix tree (AST) method proposed and modified by Mirkin et al. To the best of ...
Added: October 10, 2017