Pattern structures for news clustering
P. 35–42.
Usually, web search results are presented as a long list of document snippets, and it is difficult for users to navigate through this collection of text. We propose a clustering method that uses a pattern structure constructed on augmented syntactic parse trees. In addition, we compare our method to other clustering methods and demonstrate the limitations of the competing methods.
Keywords: pattern structures
In book
Buenos Aires: [s.n.], 2015.
Sergei O. Kuznetsov, Parakal E. G., Lecture Notes in Networks and Systems 2023 Vol. 776 P. 423–434
Inherently explainable Machine Learning (ML) models are able to provide explanations for their predictions by virtue of their construction. The explanations of an ML model are more comprehensible if they are expressed in terms of its input features. Our paper proposes an inherently explainable pipeline for document classification using pattern structures and Abstract Meaning Representation ...
Added: February 5, 2024
Ilya Semenkov, Sergei O. Kuznetsov, in: Proceedings of the 9th International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2021). Vol. 2972. CEUR-WS, 2021. P. 105–112.
This paper presents different versions of classification ensemble methods based on pattern structures. Each of these methods is described and tested on multiple datasets (including datasets with exclusively numerical and exclusively nominal features). As a baseline model Random Forest generation is used. For some classification tasks the classification algorithms based on pattern structures showed better ...
Added: December 19, 2022
Kuznetsov S., Goncharova E., in: Proceedings of the Fifth International Scientific Conference "Intelligent Information Technologies for Industry" (IITI'21). Vol. 330. Springer, 2022. P. 410–420.
Added: October 28, 2021
Goncharova E., Ilvovsky D., Galitsky B., in: Proceedings of the 9th International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI 2021). Vol. 2972. CEUR-WS, 2021. P. 51–58.
Added: October 28, 2021
Belfodil A., Kuznetsov S., Kaytoue M., International Journal of General Systems 2020 Vol. 49 No. 8 P. 785–818
Order and lattice theory provides convenient mathematical tools for pattern mining, in particular for condensed irredundant representations of pattern spaces and their efficient generation. Formal Concept Analysis (FCA) offers a generic framework, called pattern structures, to formalize many types of patterns, such as itemsets, intervals, graphs, and sequence sets. Moreover, FCA provides generic algorithms to generate irredundantly all ...
Added: January 25, 2021
Kuznetsov S., Demko C., Bertet K. et al., in: Electronic Proceedings Theoretical Computer Science. Vol. 845. [s.n.], 2020. P. 1–20.
In this article, we present a new data type agnostic algorithm calculating a concept lattice from heterogeneous and complex data. Our NextPriorityConcept algorithm is first introduced and proved in the binary case as an extension of Bordat's algorithm with the notion of strategies to select only some predecessors of each concept, avoiding the generation of ...
Added: October 29, 2020
[s.n.], 2020.
Theoretical Computer Science is mathematical and abstract in spirit, but it derives its motivation from practical and everyday computation. Its aim is to understand the nature of computation and, as a consequence of this understanding, provide more efficient methodologies. All papers introducing or studying mathematical, logic and formal concepts and methods are welcome, provided that their motivation is ...
Added: October 29, 2020
Gizdatullin D., Baixeries J., Ignatov D. I. et al., in: Intelligent Data Processing 11th International Conference, IDP 2016, Barcelona, Spain, October 10–14, 2016, Revised Selected Papers. Vol. 794. Switzerland: Springer, 2019. Ch. 6 P. 74–91.
There are many different methods for computing relevant patterns in sequential data and interpreting the results. In this paper, we compute emerging patterns (EP) in demographic sequences using sequence-based pattern structures, along with different algorithmic solutions. The purpose of this method is to meet the following domain requirement: the obtained patterns must be (closed) frequent contiguous prefixes of the input sequences. ...
Added: February 9, 2020
Makhalova T., Kuznetsov S., Napoli A., in: 2019 Data Compression Conference Proceedings. IEEE, 2019. P. 112–121.
Pattern Mining (PM) has a prominent place in Data Science and finds its application in a wide range of domains. To avoid the exponential explosion of patterns, different methods have been proposed. They are based on assumptions on interestingness and usually return very different pattern sets. In this paper, we propose to use a compression-based ...
Added: July 2, 2019
Korepanova N., Kuznetsov S., in: Formal Concept Analysis for Knowledge Discovery. Proceedings of International Workshop on Formal Concept Analysis for Knowledge Discovery (FCA4KD 2017), Moscow, Russia, June 1, 2017. Vol. 1921. CEUR-WS.org, 2017. P. 13–21.
Today personalized medicine is one of the most popular interdisciplinary research fields, risk group identification being one of its most important tasks. Even though the first attempts to estimate the effect of a patient's characteristics on the outcome were proposed in statistics in the middle of the twentieth century, it is still an open question how ...
Added: October 4, 2017
Alam M., Buzmakov A. V., Napoli A., Discrete Applied Mathematics 2018 Vol. 249 P. 2–17
With an increased interest in machine processable data and with the progress of semantic technologies, many datasets are now published in the form of RDF triples for constituting the so-called Web of Data. Data can be queried using SPARQL but there are still needs for integrating, classifying and exploring the data for data analysis and ...
Added: September 26, 2017
Buzmakov A. V., Kuznetsov S., Napoli A., in: 2017 IEEE 17th International Conference on Data Mining (ICDM). New Orleans: IEEE, 2017. Ch. 89 P. 757–762.
A scalable method for mining graph patterns stable under subsampling is proposed. The existing subsample stability and robustness measures are not antimonotonic according to definitions known so far. We study a broader notion of antimonotonicity for graph patterns, so that measures of subsample stability become antimonotonic. Then we propose gSOFIA for mining the most subsample-stable graph patterns. The ...
Added: September 26, 2017
Gizdatullin D., Ignatov D. I., Mitrofanova E. et al., in: 14th International Conference on Formal Concept Analysis - Supplementary Proceedings. University Rennes 1, 2017. P. 49–66.
This paper presents recent results of studies in application of sequence-based pattern structures and emerging patterns to analysis of demographic sequences in Russia. This study is performed on data of 11 generations from 1930 till 1984 for the panel of three waves of the Russian part of Generation and Gender Survey, which took place in ...
Added: June 20, 2017
Muratova A., Gizdatullin D., Ignatov D. I. et al., in: Sociology and Society: Social Inequality and Social Justice (Yekaterinburg, October 19–21, 2016). Proceedings of the V All-Russian Sociological Congress. Moscow: Russian Society of Sociologists, 2016. P. 9601–9615.
In this paper, we summarize the results of recent studies on the application of pattern mining and machine learning to the analysis of demographic sequences. The main goal is the demonstration of demographers’ needs, including next-event prediction and the extraction of interesting patterns from substantial datasets of demographic data, which cannot be handled by conventional ...
Added: November 24, 2016
Buzmakov A. V., Napoli A., in: Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at ECAI 2016). M.: [s.n.], 2016. P. 89–96.
FCA is a mathematical formalism having many applications in data mining and knowledge discovery. Originally it deals with binary data tables. However, there is a number of extensions that enrich standard FCA. In this paper we consider two important extensions: fuzzy FCA and pattern structures, and discuss the relation between them. In particular we introduce a scaling procedure that ...
Added: October 14, 2016
Buzmakov A. V., Napoli A., in: CLA 2016: Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications. CEUR Workshop Proceedings. Vol. 1624. M.: Higher School of Economics, National Research University, 2016. P. 85–96.
FCA is a mathematical formalism having many applications in data mining and knowledge discovery. Originally it deals with binary data tables. However, there is a number of extensions that enrich standard FCA. In this paper we consider two important extensions: fuzzy FCA and pattern structures, and discuss the relation between them. In particular we ...
Added: October 14, 2016
Natalia V. Korepanova, Sergei O. Kuznetsov, in: CLA 2016: Proceedings of the Thirteenth International Conference on Concept Lattices and Their Applications. CEUR Workshop Proceedings. Vol. 1624. M.: Higher School of Economics, National Research University, 2016. P. 217–229.
A comparison of different treatment strategies does not always result in determining the best one for all patients; one needs to study subgroups of patients with a significant difference in efficiency between treatment strategies. To solve this problem an approach to subgroups generation is proposed, where data are described in terms of a pattern structure and ...
Added: October 12, 2016
Makhalova T., Ilvovsky D., Galitsky B., in: ACL-IJCNLP 2015, Proceedings of the First Workshop on Computing News Storylines. Beijing: [s.n.], 2015. P. 16–20.
A web search engine usually returns a long list of documents and it may be difficult for users to navigate through this collection and find the most relevant ones. We present an approach to post-retrieval snippet clustering based on pattern structures construction on augmented syntactic parse trees. Since an algorithm may be too slow for a ...
Added: October 11, 2016
Kashnitsky Y., Kuznetsov S., in: Proceedings of the International Workshop "What can FCA do for Artificial Intelligence?" (FCA4AI at ECAI 2016). M.: [s.n.], 2016. P. 105–112.
Decision tree learning is one of the most popular classification techniques. However, by its nature it is a greedy approach to finding a classification hypothesis that optimizes some information-based criterion. It is very fast but may lead to finding suboptimal classification hypotheses. Moreover, in spite of decision trees being easily interpretable, ensembles ...
Added: October 6, 2016
Buzmakov A. V., Egho E., Jay N. et al., International Journal of General Systems 2016 Vol. 45 No. 2 P. 135–159
Nowadays data-sets are available in very complex and heterogeneous ways. Mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of “complex” sequential data by means of interesting sequential patterns. We approach the problem using the elegant mathematical framework of ...
Added: February 25, 2016