From Triconcepts to Triclusters

D. I. Ignatov; S. Kuznetsov

P. 185–200.

A vast number of documents on the Web have duplicates, which makes it challenging to develop efficient methods for computing clusters of similar documents. In this paper we use an approach based on computing (closed) sets of attributes having large support (large extent) as clusters of similar documents. The method is tested in a series of computer experiments on large public collections of web documents and compared with other established methods and software, such as biclustering, on the same datasets. The practical efficiency of different algorithms for computing frequent closed sets of attributes is also compared.

Language: English

Keywords: data mining; information retrieval; formal concept analysis; near-duplicate detection; closed frequent itemsets

In book

Conceptual Structures: Leveraging Semantic Technologies. 17th International Conference on Conceptual Structures, ICCS 2009, Moscow, Russia, July 26-31, 2009, Proceedings

Vol. 5662. Berlin, Heidelberg: Springer, 2009.

Advances in Information Retrieval: 48th European Conference on Information Retrieval, ECIR 2026, Delft, The Netherlands, March 29 – April 2, 2026, Proceedings, Part II. (LNCS, volume 16484)

Cham: Springer Publishing Company, 2026.

The four-volume set LNCS 16483-16486 constitutes the refereed conference proceedings of the 48th European Conference on Information Retrieval, ECIR 2026, held in Delft, The Netherlands, during March 29–April 2, 2026. The 46 full papers and 37 short papers presented together with 10 findings papers, 9 reproducibility papers, 17 resource papers, 11 workshop papers, 7 tutorial papers, ...

Added: June 18, 2026

SMMR: Sampling-Based MMR Reranking for Faster, More Diverse, and Balanced Recommendations and Retrieval

Liakhnovich K., Lashinin O., Babkin A. et al., Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval 2025 P. 2754–2758

Relevance and diversity are critical objectives in modern information retrieval (IR), particularly in recommender systems. Achieving a balance between relevance (exploitation) and diversity (exploration) optimizes user satisfaction and business goals such as catalog coverage and novelty. While existing post-processing reranking methods address this trade-off, they usually rely on greedy strategies, leading to suboptimal outcomes for ...

Added: February 3, 2026

Advances in Information Retrieval: 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6–10, 2025, Proceedings, Part I

Springer, 2025.

The five-volume set LNCS 15572, 15573, 15574, 15575 and 15576 constitutes the refereed conference proceedings of the 47th European Conference on Information Retrieval, ECIR 2025, held in Lucca, Italy, during April 6–10, 2025. The 52 full papers, 11 findings, 42 short papers and 76 papers of other types presented in these proceedings were carefully reviewed and selected from 530 submissions. The accepted papers ...

Added: April 17, 2025

Advances in Information Retrieval: 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6–10, 2025, Proceedings, Part IV

Springer, 2025.

Added: April 10, 2025

Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part X. LNCS, volume 14950

Cham: Springer, 2024.

This multi-volume set, LNAI 14941 to LNAI 14950, constitutes the refereed proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, ECML PKDD 2024, held in Vilnius, Lithuania, in September 2024. ...

Added: November 22, 2024

2023 IEEE International Conference on Data Mining Workshops (ICDMW) 1–4 December 2023, Shanghai, China

Shanghai: IEEE Computer Society, 2023.

The IEEE International Conference on Data Mining (ICDM) has established itself as the world’s premier research conference in data mining. It provides an international forum for presentation of original research results, as well as exchange and dissemination of innovative and practical development experiences. The conference covers all aspects of data mining, including algorithms, software, systems, ...

Added: March 20, 2024

Finding patterns and feature importance in victimization survey data

D'yakonov A., Головина А. М., Прикладная математика и информатика 2023 Vol. 61 No. 74 P. 91–108

A methodology for finding patterns by solving supervised machine learning problems is described and applied to the analysis of national victimization survey data. Important features for machine learning models, interesting patterns, and inconsistencies in the data are found. Experiments on estimating feature importance using different methods are described. ...

Added: March 18, 2024

A Note on the Number of (Maximal) Antichains in the Lattice of Set Partitions

Ignatov D. I., in: LNAI 14133: 28th International Conference on Conceptual Structures, ICCS 2023, Berlin, Germany, September 11–13, 2023, Proceedings. Graph-Based Representation and Reasoning. Berlin: Springer, 2023. P. 56–69.

Set partitions and partition lattices are well-known objects in combinatorics and play an important role as a search space in many applied problems, including ensemble clustering. Searching for antichains in such lattices is similar to searching for them in Boolean lattices. Counting the number of antichains in Boolean lattices is known as the Dedekind problem. In ...

Added: November 23, 2023

Sentiment analysis as a method for studying the information agenda and public opinion (the case of the media and social networks of the PRC)

Анташева М. С., Lobanova P., Isaeva J. K. et al., Социология: методология, методы, математическое моделирование 2023 No. 57 P. 7–41

The information agenda broadcast by Chinese media resources is a source of up-to-date data on public opinion on key issues of social welfare. Due to the technical peculiarities of the organization of Chinese websites and the need to attract additional resources for automatic processing (parsing) of texts in Chinese, this topic is not widely represented in domestic and foreign studies. The ...

Added: November 9, 2023

FCA4AI 2023 What can FCA do for Artificial Intelligence 2023 Proceedings of the 11th International Workshop "What can FCA do for Artificial Intelligence?" co-located with the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023) Macao, S.A.R. China; August 20, 2023

CEUR-WS.org, 2023.

Added: September 27, 2023

17th International Conference, ICFCA 2023, Kassel, Germany, July 17–21, 2023, Proceedings. Formal Concept Analysis (LNCS, volume 13934)

Switzerland: Springer, 2023.

Added: September 27, 2023

Data Analysis and Optimization. In Honor of Boris Mirkin's 80th Birthday

Springer, 2023.

This book presents the state-of-the-art in the emerging field of data science and includes models for layered security with applications in the protection of sites—such as large gathering places—through high-stake decision-making tasks. Such tasks include cancer diagnostics, self-driving cars, and others where wrong decisions can possibly have catastrophic consequences. Additionally, this book provides readers with ...

Added: August 31, 2023

Knowledge Discovery, Knowledge Engineering and Knowledge Management: 13th International Joint Conference, IC3K 2021, Virtual Event, October 25–27, 2021, Revised Selected Papers

Springer, 2023.

This book constitutes the extended and revised versions of a set of selected papers from the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2021, on October 25–27, 2021. The conference was held virtually due to the COVID-19 crisis. The 9 full papers included in this book were carefully reviewed and ...

Added: July 8, 2023

On the Number of Maximal Antichains in Boolean Lattices for 𝑛 up to 7

Ignatov D. I., Lobachevskii Journal of Mathematics 2023 No. 44 P. 137–146

We consider two ways to compute the number of maximal antichains in the Boolean lattice on 𝑛 elements. The first one is based on full direct enumeration, while the second one relies on concept lattices or Galois lattices (studied in Formal Concept Analysis, an applied branch of lattice theory) and the Dedekind–MacNeille completion of a partial ...

Added: June 13, 2023

Cognitive load measurement during navigation and information retrieval in digital text

Ledneva T., Kovalev A., Procedia Computer Science 2021 Vol. 192 P. 2720–2730

Interaction with digital text permeates practically all types of educational, professional, and leisure activities of modern life. Working effectively with digital instructions and materials determines success in solving a number of real problems. At the same time, the negative impact of overly complex digital environments on learning outcomes, work efficiency, and subjective well-being ...

Added: April 27, 2023

Investigating and identifying features of hidden attacks on an enterprise for machine learning algorithms

Золотухина М. А., Zykov S. V., Вестник Российского нового университета 2023 No. 1 P. 20–28

Often it is the human factor that leads to the spread of threats at enterprises. While a technical device is a precisely functioning, well-coordinated mechanism whose fault parameters can be measured with diagnostic equipment and then corrected, studying hidden attacks requires a new system component. Enterprises and industry as a whole need an intelligent system for protecting against and detecting hidden ...

Added: April 11, 2023

Information Systems and Design. Third International Conference, ICID 2022, Tashkent, Uzbekistan, September 12–13, 2022, Revised Selected Papers

Springer, 2023.

This book constitutes the proceedings of Third International Conference on Information Systems and Design, ICID 2022, which took place in Tashkent, Uzbekistan, in September 2022. The 12 papers presented in this volume were carefully reviewed and selected from 35 submissions. They were organized in topical sections as follows: methodological support of analysis and management tools: theoretical-focused ...

Added: March 31, 2023

Analysis of the Structure of Time Series of the Number of Court Cases

Lukianchenko P., Gromov V., Beschastnov Y. et al., Вестник кибернетики 2022 Vol. 4 No. 48 P. 37–48

The study analyzes the time series of the number of new cases in the administrative courts of the Russian Federation using two methods of time series grouping according to the chaotic, stochastic, and regular structure. The first model is based on the entropy‒complexity plane, the second one is presented by the attribute‒object graph. As a result, four groups ...

Added: March 20, 2023

On Shapley value interpretability in concept-based learning with formal concept analysis

Ignatov D. I., Kwuida L., Annals of Mathematics and Artificial Intelligence 2022 Vol. 90 No. 11 P. 1197–1222

We propose the usage of two power indices from cooperative game theory and public choice theory for ranking attributes of closed sets, namely intents of formal concepts (or closed itemsets). The introduced indices are related to extensional concept stability and are also based on counting of generators, especially of those that contain a selected attribute. ...

Added: January 31, 2023

Applying formal concept analysis methods to the analysis of blood flow time series for hemodialysis patients

Gromov V., Урманцева Н. Р., [s.n.], 2021.

The report considers clustering-based approaches to forecasting that rely on the methodology of formal concept analysis. The methodology is applied to cluster segments of a time series in order to identify characteristic segments (motifs) corresponding to patients with varying degrees of fistula clogging. ...

Added: January 30, 2023