Summable and nonsummable data‐driven models for community detection in feature‐rich networks

S. Shalileh; B. Mirkin

doi:10.1007/s13278-021-00774-8

Social Network Analysis and Mining. 2021. Vol. 11. No. 1. P. 1–23.

Shalileh S., Mirkin B.

A feature-rich network is a network whose nodes are characterized by categorical or quantitative features. We propose a data-driven model for finding a partition of the nodes that approximates both the network link data and the feature data. The model involves summary quantitative characteristics of both network links and features. We distinguish between two modes of using the network link data. The first assumes that the link values are comparable and summable across the network (summability); the second models the case in which different nodes represent different measurement systems, so that the link data are neither comparable nor summable across different nodes (nonsummability). We derive a Pythagorean decomposition of the combined data scatter involving our least-squares data recovery criterion. Minimizing this criterion is equivalent to maximizing its complementary part, the contribution of the found partition to the combined data scatter, and we address that maximization problem. We follow a doubly greedy strategy: first, communities are found one by one; second, within each community, entities are added one by one. Our algorithms determine the number of clusters automatically. The nonsummability version proves to have a niche of its own; it is also faster than the summability version. In our experiments, both versions appear to be competitive on generated synthetic data sets and on six real-world data sets from the literature.
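The doubly greedy strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the paper's formulas, so the cluster contribution used here, rho * |S| * <c, c> + xi * lam^2 * |S|^2 (with c the within-cluster feature mean and lam the mean within-cluster link value), is an assumption modeled on standard least-squares clustering decompositions for the summable mode. The function names, the weights `rho` and `xi`, the seeding rule, and the stopping rule are all hypothetical, and the inputs `Y` (features) and `P` (links) are assumed to be already centered or shifted.

```python
import numpy as np

def contribution(Y, P, members, rho=1.0, xi=1.0):
    """Assumed contribution of one cluster to the combined data scatter."""
    idx = np.asarray(members)
    c = Y[idx].mean(axis=0)            # within-cluster feature means
    lam = P[np.ix_(idx, idx)].mean()   # mean within-cluster link value
    return rho * len(idx) * float(c @ c) + xi * lam ** 2 * len(idx) ** 2

def grow_cluster(Y, P, free, rho=1.0, xi=1.0):
    """Inner greedy loop: add entities one by one while the contribution grows."""
    order = sorted(free)
    # Hypothetical seeding rule: the free node with the largest own contribution.
    seed = max(order, key=lambda i: contribution(Y, P, [i], rho, xi))
    cluster = [seed]
    best = contribution(Y, P, cluster, rho, xi)
    candidates = [i for i in order if i != seed]
    while candidates:
        scores = [contribution(Y, P, cluster + [i], rho, xi) for i in candidates]
        j = int(np.argmax(scores))
        if scores[j] <= best:          # no candidate increases the contribution
            break
        best = scores[j]
        cluster.append(candidates.pop(j))
    return cluster

def doubly_greedy_partition(Y, P, rho=1.0, xi=1.0):
    """Outer greedy loop: extract communities one by one until no nodes remain."""
    n = Y.shape[0]
    labels = np.full(n, -1, dtype=int)
    free = set(range(n))
    k = 0
    while free:
        cluster = grow_cluster(Y, P, free, rho, xi)
        labels[cluster] = k
        free -= set(cluster)
        k += 1                          # the number of clusters is not preset
    return labels
```

In this sketch the number of clusters is simply the number of passes of the outer loop, which mirrors the abstract's claim that it is determined automatically. Calling `doubly_greedy_partition(Y, P)` on an `n x V` feature array and an `n x n` link array returns one integer label per node.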

Research target: Computer Science

Priority areas: IT and mathematics

Keywords: social network analysis; community detection algorithms; community detection; clustering algorithms; feature-rich networks

Publication based on the results of:

Data analysis and choice of solutions in the studies of socio-economic and political systems (2021)

HoTPP benchmark: Are we good at the long horizon events forecasting?

Karpukhin I., Shipilov F., Savchenko A., Neurocomputing 2026 Vol. 672 Article 132771

Forecasting multiple future events within a given time horizon is essential for applications in finance, retail, social networks, and healthcare. This problem is typically addressed using Marked Temporal Point Processes (MTPP), which provide a principled framework for modeling both event timing and event labels. While most existing research focuses on predicting only the next event, forecasting distant future ...

Added: February 25, 2026

Comparative analysis of the characteristics of promising APSK modulation schemes in wireless telecommunications

Kazakov G. N., Nguyen H. T., Shevgunov T. et al., T-Comm: Telecommunications and transport 2025 Vol. 19 No. 9 P. 59–76

The growing requirements for the use of high-speed and energy-efficient high-capacity data transmission channels in modern and future telecommunication networks have led to an increasing interest in the formation and application of signals with new constellations. Requirements for the shape of signal constellations in connection with the emergence of new technologies of wireless telecommunications are ...

Added: February 25, 2026

A method for estimating the fraction-of-time probability density of a digital signal using linear interpolation

Shevgunov T., T-Comm: Telecommunications and transport 2024 Vol. 18 No. 7 P. 4–12

The paper presents a new tool for the fraction-of-time approach, in which a random process is described using functional models synthesized from its single observed realization, without the need to build abstract probabilistic models when no reliable a priori information is available about whether the process exhibits ergodicity. Based on a previously obtained analytical formula expressing the fraction-of-time density of a continuous signal in explicit ...

Added: February 25, 2026

Proceedings of the Ninth International Scientific Conference “Intelligent Information Technologies for Industry”

Cham: Springer Publishing Company, 2026.

This book contains the works connected with the key advances in Intelligent Information Technologies for Industry presented at IITI 2025, the Ninth International Scientific Conference on Intelligent Information Technologies for Industry held on November 5-7, 2025 in Sirius Federal Territory, Russia. The book is written by the experts in the field of applied artificial intelligence ...

Added: February 25, 2026

Measuring External Conflict in Dempster-Shafer Theory Based on Kantorovich Problems

Bronevich A., Lepskiy A., International Journal of Approximate Reasoning 2026 Vol. 190 Article 109597

In the paper, we consider three possible types of external conflict in Dempster-Shafer theory and propose its measurement based on functionals evaluating intersection, inclusion and distance between random sets. All proposed functionals can be viewed as extensions of known functionals like Jaccard metric, Jaccard index, and Dice coefficient from usual sets to random sets based ...

Added: February 25, 2026

Proceedings of the International Conference on Data Mining (ICDM 2022)

Sergei O. Kuznetsov, Buzmakov A., Makhalova T. et al., IEEE, 2022.

In this paper, we revisit pattern mining and study the distribution underlying a binary dataset thanks to the closure structure which is based on passkeys, i.e., minimum generators in equivalence classes robust to noise. We introduce △-closedness, a generalization of the closure operator, where △ measures how a closed set differs from its upper neighbors ...

Added: February 25, 2026

Sentiment analysis of Russian drama of the 18th–20th centuries as a tool for modeling artistic structure

Anisimova K., Digital Humanities Research 2025 No. 2 P. 24–47

The study describes emotional dynamics as a manifestation of the artistic structure of Russian drama of the 18th–20th centuries, based on automatic sentiment annotation of characters' lines using neural network models of the BERT architecture. Such models, even when fine-tuned on non-literary texts, show satisfactory results in sentiment analysis of dramatic lines, which was verified on a manually annotated test set. Based on this automatic emotional annotation, it was shown, ...

Added: February 24, 2026

Explainable artificial intelligence for smart and ethical healthcare

Elena Yu. Pesotskaya, Avdoshin S. M., Advanced SmartHealth 2026 Vol. 1 No. 1 P. 1–15

SmartHealth technologies are evolving rapidly, and the emerging Medicine 5.0 paradigm highlights the need for artificial intelligence that pairs high performance with explainability, transparency, and ethical soundness. However, many neural network approaches remain “black boxes,” limiting their uptake in clinical practice, where justification and trust are essential. This article reviews their applications in diagnosis, monitoring ...

Added: February 24, 2026

Advanced SmartHealth

Avdoshin S. M., Pesotskaya E. Y., Singapore: AccScience Publishing, 2026.

Added: February 24, 2026

Explainable artificial intelligence for smart and ethical healthcare

Avdoshin S. M., Elena Yu. P., Advanced Journal of SmartHealth 2026 Vol. X P. 1–15

SmartHealth technologies are evolving rapidly, and the emerging Medicine 5.0 paradigm highlights the need for artificial intelligence that pairs high performance with explainability, transparency, and ethical soundness. However, many neural-network approaches remain “black boxes,” limiting their uptake in clinical practice, where justification and trust are essential. This article reviews their applications in diagnosis, monitoring of ...

Added: February 24, 2026

A foundation model for time series and how (not) to train it on synthetic data

Temirkhanov A., Костромина А. М., Цымбой О. А. et al., Doklady of the Russian Academy of Sciences. Mathematics, Informatics, Control Processes (formerly Doklady of the Academy of Sciences. Mathematics) 2025 Vol. 527 No. S P. 485–494

Industry offers many cases in which forecasts must be made for a large number of time series at once. However, we may be in a situation where we cannot afford to train a separate model for each of them. This issue in time series modeling has not received due attention. The remedy for ...

Added: February 24, 2026

GraphiCon 2025: Proceedings of the 35th International Conference on Computer Graphics and Machine Vision (Yoshkar-Ola, Russia, September 30 – October 2, 2025)

Yoshkar-Ola: Volga State University of Technology, 2025.

The volume presents the proceedings of the 35th International Conference GraphiCon 2025, hosted by Volga State University of Technology. It includes papers by conference participants on methods and technologies of computer image analysis, visual and cognitive analytics, 3D reconstruction, visual navigation and human-machine interaction, virtual and augmented reality, pattern recognition, and other topics. The publication is intended for staff of research and educational organizations, specialists at IT industry enterprises, postgraduate students, and students. ...

Added: February 21, 2026

BIG DATA and Advanced Analytics: collection of scientific articles of the XI International Scientific and Practical Conference (Minsk, Republic of Belarus, April 23–24, 2025)

Minsk: BSUIR, 2025.

BIG DATA and Advanced Analytics: collection of scientific articles of the XI International Scientific and Practical Conference (Minsk, Republic of Belarus, April 23–24, 2025) / editorial board: В. А. Богуш [et al.]. – Minsk: BSUIR, 2025. – 498 p. ISBN 978-985-543-814-5. The volume publishes the results of research and development in the field of BIG DATA and Advanced Analytics for the optimization of ...

Added: February 21, 2026

Real numbers equally compressible in every base

Nandakumar S., Pulari S., ACM Transactions on Computation Theory 2025 Vol. 17 No. 3 P. 1–28

This work solves an open question in finite-state compressibility posed by Lutz and Mayordomo about compressibility of real numbers in different bases. Finite-state compressibility, or equivalently, finite-state dimension, quantifies the asymptotic lower density of information in an infinite sequence. Absolutely normal numbers, being finite-state incompressible in every base, are precisely those numbers which have finite-state dimension equal to ...

Added: February 20, 2026

Finite-state relative dimension, dimensions of A. P. subsequences and a finite-state van Lambalgen's theorem

Nandakumar S., Pulari S., S A., Information and Computation 2024 Vol. 298 Article 105156

Finite-state dimension, introduced by Dai, Lathrop, Lutz and Mayordomo, quantifies the information rate in an infinite sequence as measured by finite-state automata. In this paper, we define a relative version of finite-state dimension. The finite-state relative dimension $\dim_{FS}^{Y}(X)$ of a sequence $X$ relative to $Y$ is the finite-state dimension of $X$ measured using the class of finite-state gamblers with oracle access to $Y$. We ...

Added: February 20, 2026

Modeling the Light Curves of Cosmic Gamma-Ray Bursts

Khabibullin A., Позанэнко А. et al., Lobachevskii Journal of Mathematics 2025 Vol. 46 No. 4 P. 1459–1470

It is known that the light curve of a prompt emission of cosmic gamma-ray bursts consists of separate pulses. Moreover, in a real light curve these pulses often overlap so that it becomes impossible to study separately the pulses in each light curve. However, spectral analysis, in particular, the power spectral density (PSD) of light ...

Added: February 20, 2026

Mapping medical science: results of intelligent big data analysis

Grebenyuk A. Y., Lobanova P., Саввин Н. В. et al., Medical Technologies. Assessment and Choice 2026 Vol. 1 No. 48 P. 3–34

Objective: Analysis of the current global agenda in medical science. Material and methods: The article proposes an approach to building a medical research landscape based on semantic analysis and mapping of medical topics. For this purpose, 2252 topics from English-language articles published in 2024 related to the field of medicine were vectorized, the embeddings were obtained ...

Added: February 20, 2026

Iterative Ricci-Foster Curvature Flow with GMM-Based Edge Pruning: A Novel Approach to Community Detection

Sorokin K., Beketov M., Онучин А. et al., / arxiv.org. Series cs.SI "Social and Information Networks". 2025.

Community detection in complex networks is a fundamental problem, open to new approaches in various scientific settings. We introduce a novel community detection method, based on Ricci flow on graphs. Our technique iteratively updates edge weights (their metric lengths) according to their (combinatorial) Foster version of Ricci curvature computed from effective resistance distance between the ...

Added: January 15, 2026

Implementing Transport Coding in OMNeT++ for Message Delay Reduction

Petrovanov I., Sergeev A., / Series Computer Science "arxiv.org". 2025. No. 2512.18332.

Transport coding reduces message delay in packet-switched networks by introducing controlled redundancy at the transport layer: original packets are encoded into coded packets, and the message is reconstructed after the first successful deliveries, effectively shifting latency from the maximum packet delay to the -th order statistic. We present a concise, reproducible discrete-event implementation of transport coding in OMNeT++, including ...

Added: December 24, 2025

Hessian-based lightweight neural network for brain vessel segmentation on a minimal training dataset

Меньшиков И. А., Бернадотт А. К., Elvimov N. S., / Series arXiv "Statistical mechanics". 2025.

Accurate segmentation of blood vessels in brain magnetic resonance angiography (MRA) is essential for successful surgical procedures, such as aneurysm repair or bypass surgery. Currently, annotation is primarily performed through manual segmentation or classical methods, such as the Frangi filter, which often lack sufficient accuracy. Neural networks have emerged as powerful tools for medical image ...

Added: December 1, 2025

Determining the boundary of dynamical chaos in the generalized Chirikov map via machine learning

Чернышов Д. П., Satanin A., Shchur L., / Series arXiv "math". 2025.

We investigate the boundary separating regular and chaotic dynamics in the generalized Chirikov map, an extension of the standard map with phase-shifted secondary kicks. Lyapunov maps were computed across the parameter space (K,K(α, τ)) and used to train a convolutional neural network (ResNet18) for binary classification of dynamical regimes. The model reproduces the known critical ...

Added: November 21, 2025

An efficient stock market trading algorithm: a retrospective analysis based on S&P-500 data.

Rubchinskiy A., Chubarova D., / Series WP7 "Mathematical methods for decision making in economics, business and politics". 2025. No. WP7/2025/01.

The article examines one of the most famous examples of socio-economic systems, characterized by significant uncertainty – the S&P-500 stock market, where shares of 500 largest US companies are traded. No assumptions are made about the probabilistic characteristics of the stock market. A flexible algorithm for daily trading has been developed, based on both known fixed data ...

Added: November 9, 2025

Diffusion on language model embeddings for protein sequence generation

Meshchaninov V., Strashnov P., Shevtsov A. et al., / Cornell University. Series CoRR, arXiv:2403.03726 "Computing Research Repository". 2025.

Protein design requires a deep understanding of the inherent complexities of the protein universe. While many efforts lean towards conditional generation or focus on specific families of proteins, the foundational task of unconditional generation remains underexplored and undervalued. Here, we explore this pivotal domain, introducing DiMA, a model that leverages continuous diffusion on embeddings derived ...

Added: October 5, 2025

Smoothie: Smoothing Diffusion on Token Embeddings for Text Generation

Shabalin A., Meshchaninov V., Vetrov D., / Series cs.CL, arXiv:2505.18853 "Computation and Language". 2025.

Diffusion models have achieved state-of-the-art performance in generating images, audio, and video, but their adaptation to text remains challenging due to its discrete nature. Prior approaches either apply Gaussian diffusion in continuous latent spaces, which inherits semantic structure but struggles with token decoding, or operate in categorical simplex space, which respect discreteness but disregard semantic ...

Added: October 5, 2025