Higher School of Economics
National Research University

Globally Optimal Parsimoniously Lifting a Fuzzy Query Set Over a Taxonomy Tree

P. 779–789.
Frolov D., Mirkin B., Nascimento S., Fenner T.

This paper presents a relatively rare case of an optimization problem in data analysis that admits a globally optimal solution found by a recursive algorithm. We are concerned with finding a most specific generalization of a fuzzy set of topics assigned to leaves of a domain taxonomy represented by a rooted tree. The idea is to “lift” the set to its “head subject” in the higher ranks of the taxonomy tree. The head subject is supposed to “tightly” cover the query set, possibly bringing in some errors: “gaps”, “offshoots”, or both. Our method globally minimizes a penalty function that combines the numbers of head subjects, gaps, and offshoots, each weighted differently. We apply this to a collection of 17,645 research papers on Data Science published in 17 Springer journals over the past 20 years. We extract a taxonomy of Data Science (TDS) from the international Association for Computing Machinery Computing Classification System 2012. We find fuzzy clusters of leaf topics over the text collection, optimally lift them to head subjects in TDS, and comment on the tendencies of current research that follow from the lifting results.
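
The recursive idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simplified penalty that counts 1 per head subject, `lam` per gap, and `gamma` per offshoot, treats any membership above zero as "in the query set", and defines a gap as a maximal subtree with no query leaves under a head subject. The names `Lift`, `lift`, `lam`, and `gamma`, the toy tree, and the weight values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Lift:
    """Best lifting found for one subtree (simplified, count-based penalty)."""
    penalty: float
    heads: list = field(default_factory=list)
    gaps: list = field(default_factory=list)
    offshoots: list = field(default_factory=list)


def lift(tree, membership, root, lam=0.5, gamma=0.75):
    """Bottom-up recursion: at each node, pick the cheaper of two options.

    tree: dict mapping a node to its list of children (leaves are absent).
    membership: dict mapping a leaf to its fuzzy membership (assumed > 0 = in set).
    lam, gamma: assumed weights of one gap and one offshoot (a head costs 1).
    """
    def visit(node):
        # Returns (has_query_leaf, gaps_if_an_ancestor_or_node_is_head, best Lift).
        children = tree.get(node, [])
        if not children:  # leaf
            if membership.get(node, 0.0) > 0:
                return True, [], Lift(gamma, offshoots=[node])
            return False, [node], Lift(0.0)
        results = [visit(child) for child in children]
        if not any(support for support, _, _ in results):
            # Empty subtree: counts as a single maximal gap under a head.
            return False, [node], Lift(0.0)
        gaps_if_head = [g for _, gaps, _ in results for g in gaps]
        # Option 1: make this node a head subject and pay for the gaps it covers.
        as_head = Lift(1.0 + lam * len(gaps_if_head),
                       heads=[node], gaps=gaps_if_head)
        # Option 2: do not lift here; combine the children's best solutions.
        split = Lift(
            sum(best.penalty for _, _, best in results),
            heads=[h for _, _, best in results for h in best.heads],
            gaps=[g for _, _, best in results for g in best.gaps],
            offshoots=[o for _, _, best in results for o in best.offshoots],
        )
        # Ties go to the split, i.e. the more specific solution.
        best = as_head if as_head.penalty < split.penalty else split
        return True, gaps_if_head, best

    return visit(root)[2]


# Toy taxonomy and query set (hypothetical data).
tree = {"root": ["A", "B"],
        "A": ["a1", "a2", "a3", "a4"],
        "B": ["b1", "b2", "b3"]}
membership = {"a1": 0.9, "a2": 0.6, "a3": 0.4, "b1": 0.7}

result = lift(tree, membership, "root")
print(result.heads, result.gaps, result.offshoots, result.penalty)
# prints: ['A'] ['a4'] ['b1'] 2.25
```

With these weights the three query leaves under `A` are lifted to `A` at the cost of one gap, while the lone leaf `b1` stays an offshoot. Lowering `lam` makes gaps cheap enough that the whole set is lifted to `root`.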

Language: English
Keywords: additive fuzzy clustering, parsimony, annotated suffix tree, hierarchical taxonomy, generalization, spectral clustering
Publication based on the results of:
Development of methods for structuring and conceptualizing text data based on a domain taxonomy (2019)

In book

Optimization of Complex Systems: Theory, Models, Algorithms and Applications
Switzerland: Springer Publishing Company, 2020.
Similar publications
The Benefits of Query-Based KGQA Systems for Complex and Temporal Questions in LLM Era
Alekseev A., Chaichuk M., Butko M. et al., in: 30th International Conference on Applications of Natural Language to Information Systems, NLDB 2025, Kanazawa, Japan, July 4–6, 2025, Proceedings, Part I. Natural Language Processing and Information Systems (LNCS, Vol. 15836). Springer, 2025. P. 426–441.
Large language models excel in question-answering (QA) but struggle with multi-hop reasoning and temporal questions. Query-based knowledge graph QA (KGQA) offers a modular alternative by generating executable queries instead of direct answers. We explore a multi-stage query-based framework for WikiData QA, proposing a multi-stage approach that enhances performance on challenging multi-hop and temporal benchmarks. Through generalization and ...
Added: February 3, 2026
Modeling Generalization in Domain Taxonomies Using a Maximum Likelihood Criterion
Zhirayr Hayrapetyan, Nascimento S., Trevor F. et al., in: Information Systems and Technologies: WorldCIST 2022, Volume 2, Issue 469. Springer, 2022. P. 141–147.
We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its “head subject” node in the higher ranks of the taxonomy tree. The head subject is supposed to “tightly” cover the query set, possibly involving some ...
Added: November 18, 2022
A Hybrid Approach to the Analysis of a Collection of Research Papers
Mirkin B., Frolov D., Vlasov A. et al., in: Intelligent Data Engineering and Automated Learning – IDEAL 2020: 21st International Conference, Guimaraes, Portugal, November 4–6, 2020, Proceedings, Part II. Lecture Notes in Computer Science, Vol. 12490. Cham: Springer, 2020. P. 423–433.
We define and find a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. This generalization lifts the set to a “head subject” in the higher ranks of the taxonomy, that is supposed to “tightly” cover the query set, possibly bringing in some errors, both ...
Added: November 13, 2020
A Hybrid Approach to Interpretable Analysis of Research Paper Collections
Mirkin B., Frolov D., Vlasov A. et al., in: WIMS 2020: Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics. Association for Computing Machinery (ACM), 2020. P. 184–189.
We define and find a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. This generalization lifts the set to a “head subject” in the higher ranks of the taxonomy, that is supposed to “tightly” cover the query set, possibly bringing in some errors, both ...
Added: August 28, 2020
Computational Generalization in Taxonomies Applied to: (1) Analyze Tendencies of Research and (2) Extend User Audiences
Frolov D., Mirkin B., Nascimento S. et al., in: Intelligent Data Engineering and Automated Learning – IDEAL 2019, Vol. 2. Springer, 2019. P. 3–11.
We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its “head subject” node in the higher ranks of the taxonomy tree. The head subject is supposed to “tightly” cover the query set, possibly bringing in some errors referred to ...
Added: December 7, 2019
Intelligent Data Engineering and Automated Learning – IDEAL 2019
Springer, 2019.
Added: December 7, 2019
Using Taxonomy Tree to Generalize a Fuzzy Thematic Cluster
Frolov D., Mirkin B., Nascimento S. et al., in: Fuzzy Systems (FUZZ-IEEE), IEEE International Conference Proceedings. IEEE, 2019. P. 1–6.
This paper presents an algorithm, ParGenFS, for generalizing, or “lifting”, a fuzzy set of topics to higher ranks of a hierarchical taxonomy of a research domain. The algorithm ParGenFS finds a globally optimal generalization of the topic set to minimize a penalty function, by balancing the number of introduced “head subjects” and related errors, the ...
Added: October 30, 2019
Parsimonious Generalization of Fuzzy Thematic Sets in Taxonomies Applied to the Analysis of Tendencies of Research in Data Science
Frolov D., Nascimento S., Fenner T. et al., Information Sciences, 2020, Vol. 512, P. 595–615.
This paper proposes a novel method, referred to as ParGenFS, for finding a most specific generalization of a query set represented by a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. The query set is generalized by “lifting” it to one or more “head subjects” in the higher ranks ...
Added: October 9, 2019
A Method for Audience Extending in Programmatic Advertising by Using Parsimonious Generalization of User Segments
Frolov D., Taran Z., Mirkin B., in: International Conference on Human Interaction and Emerging Technologies. Springer, 2020. P. 837–841.
We propose a novel method for efficient target audience augmentation in programmatic digital advertising. This method utilizes a novel ParGenFS algorithm for most adequate generalization in taxonomies which was developed by the authors in a joint work. The ParGenFS extends user segments by parsimoniously lifting them off-line as a fuzzy set over IAB content taxonomy ...
Added: July 31, 2019
Using Domain Taxonomy to Model Generalization of Thematic Fuzzy Clusters
Frolov D., Mirkin B., Nascimento S. et al., in: CONTENT 2019, The Eleventh International Conference on Creative Content Technologies. International Academy, Research, and Industry Association (IARIA), 2019. P. 20–25.
We define a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a domain taxonomy. This generalization lifts the set to its 'head subject' in the higher ranks of the taxonomy tree. The head subject is supposed to 'tightly' cover the query set, possibly bringing in some ...
Added: June 4, 2019
CONTENT 2019, The Eleventh International Conference on Creative Content Technologies
International Academy, Research, and Industry Association (IARIA), 2019.
Added: June 4, 2019
Method for Generalization of Fuzzy Sets
Frolov D., Mirkin B., Nascimento S. et al., in: International Conference on Artificial Intelligence and Soft Computing. 18th International Conference, ICAISC 2019, Zakopane, Poland, June 16–20, 2019, Proceedings, Part 1. Issue 11508. Cham: Springer, 2019. P. 273–286.
We define and find a most specific generalization of a fuzzy set of topics assigned to leaves of the rooted tree of a taxonomy. This generalization lifts the set to a “head subject” in the higher ranks of the taxonomy, that is supposed to “tightly” cover the query set, possibly bringing in some errors, both ...
Added: June 3, 2019
International Conference on Artificial Intelligence and Soft Computing. 18th International Conference, ICAISC 2019, Zakopane, Poland, June 16–20, 2019, Proceedings
Cham: Springer, 2019.
The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI) and Lecture Notes in Bioinformatics (LNBI), has established itself as a medium for the publication of new developments in computer science and information technology research and teaching - quickly, informally, and at a high level. The two-volume set LNCS ...
Added: June 3, 2019