A Method for Community Detection in Networks with Mixed Scale Features at its Nodes
The problem of community detection in a network with features at its nodes takes into account both the graph structure and node features. The goal is to find relatively dense groups of interconnected entities sharing some features in common. Algorithms based on probabilistic community models require the node features to be categorical. We use a data-driven model by combining the least-squares data recovery criteria for both, the graph structure and node features. This allows us to take into account both quantitative and categorical features. After deriving an equivalent complementary criterion to optimize, we apply a greedy-wise algorithm for detecting communities in sequence. We experimentally show that our proposed method is effective on both real-world data and synthetic data. In the cases at which attributes are categorical, we compare our approach with state-of-the-art algorithms. Our algorithm appears competitive against them.
Multimodal clustering is an unsupervised technique for mining interesting patterns in n-ary relations or n-mode networks. Among different types of such generalised patterns one can find biclusters and formal concepts (maximal bicliques) for two-mode case, triclusters and triconcepts for three-mode case, closed n-sets for n-mode case, etc. Object-attribute biclustering (OA-biclustering) for mining large binary datatables (formal contexts or two-mode networks) arose by the end of the last decade due to intractability of computation problems related to formal concepts; this type of patterns was proposed as a meaningful and scalable approximation of formal concepts. In this paper, our aim is to present recent advance in OA-biclustering and its extensions to mining multi-mode communities in SNA setting. We also discuss connection between clustering coefficients known in SNA community for one-mode and two-mode networks and OA-bicluster density, the main quality measure of an OA-bicluster. Our experiments with two-, three-, and four-mode large real-world networks show that this type of patterns is suitable for community detection in multi-mode cases within reasonable time even though the number of corresponding n-cliques is still unknown due to computation difficulties. An interpretation of OA-biclusters for one-mode networks is provided as well.
In this article, our ultimate goal is to transform a graph’s adjacency matrix into a distance matrix. Because cluster density is not observable prior to the actual clustering, our goal is to find a distance whose pairwise minimisation will lead to densely connected clusters. Our thesis is centred on the widely accepted notion that strong clusters are sets of vertices with high induced subgraph density. We posit that vertices sharing more connections are closer to each other than vertices sharing fewer connections. This definition of distance differs from the usual shortest-path distance. At the cluster level, our thesis translates into low mean intra-cluster distances, which reflect high densities. We compare three distance measures from the literature. Our benchmark is the accuracy of each measure’s reflection of intra-cluster density, when aggregated (averaged) at the cluster level. We conduct our tests on synthetic graphs, where clusters and intra-cluster density are known in advance. In this article, we restrict our attention to unweighted graphs with no self-loops or multiple edges. We examine the relationship between mean intra-cluster distances and intra-cluster densities. Our numerical experiments show that Jaccard and Otsuka-Ochiai offer very accurate measures of density, when averaged over vertex pairs within clusters.
The paper is devoted to game-theoretic methods for community detection in networks. The traditional methods for detecting community structure are based on selecting dense subgraphs inside the network. Here we propose to use the methods of cooperative game theory that highlight not only the link density but also the mechanisms of cluster formation. Specifically, we suggest two approaches from cooperative game theory: the first approach is based on the Myerson value, whereas the second approach is based on hedonic games. Both approaches allow to detect clusters with various resolutions. However, the tuning of the resolution parameter in the hedonic games approach is particularly intuitive. Furthermore, the modularity-based approach and its generalizations as well as ratio cut and normalized cut methods can be viewed as particular cases of the hedonic games. Finally, for approaches based on potential hedonic games we suggest a very efficient computational scheme using Gibbs sampling.
ASONAM '20: International Conference on Advances in Social Networks Analysis and Mining, 7-10 December 2020, The Hague, Netherlands (Virtual).
The problem of community detection in a network with features at its nodes takes into account both the graph structure and node features. The goal is to find relatively dense groups of interconnected entities sharing some features in common. Existing approaches require the number of communities pre-specified. We apply the so-called data recovery approach to allow a relaxation of the criterion for finding communities one-by-one. We show that our proposed method is effective on real-world data, as well as on synthetic data involving either only quantitative features or only categorical attributes or both. In the cases at which attributes are categorical, state-of-the-art algorithms are available. Our algorithm appears competitive against them.
MARAMI 2020 Modèles & Analyse des Réseaux : Approches Mathématiques & Informatiques - Network Modeling and Analysis 2020
Proceedings of MARAMI 2020 - Modèles & Analyse des Réseaux : Approches Mathématiques & Informatiques - The 11th Conference on Network Modeling and Analysis
Virtual Conference, October 14-15, 2020.
CIRAD, UMR Tetis, Montpellier, France TETIS, Univ. Montpellier, AgroParisTech, CIRAD, CNRS, INRAE, Montpellier, France
In this paper, we propose and implement a method for detecting intersecting and nested communities in graphs of interacting objects of different natures. For this, two classical algorithms are taken: a hierarchical agglomerate and one based on the search for k-cliques. The combined algorithm presented is based on their consistent application. In addition, parametric options are developed that are responsible for actions with communities whose sizes are smaller than the given k, and also with single vertices. Varying these parameters allows us to take into account differences in the topology of the original graph and thus to correct the algorithm. The testing was carried out on real data, including on a group of graphs of a social network, and the qualitative content of the resulting partition was investigated. To assess the differences between the integrated method and the classical algorithms of community detections, a common measure of similarity was used. As a result, it is clearly shown that the resulting partitions are significantly different. We found that for the approach proposed in the article the index of the numerical characteristic of the partitioning accuracy, modularity, can be lower than the corresponding value for other approaches. At the same time, the result of an integrated method is often more informative due to intersections and nested community structure. A visualization of the partition obtained for one of the examples by an integrated method at the first and last steps is presented. Along with the successfully found set of parameters of the integrated method for small communities and cut offvertices in the case of social networks, some shortcomings of the proposed model are noted. Proposals are made to develop this approach by using a set of parametric algorithms.
A method for detecting intersecting and nested communities in graphs of interacting objects of different nature is proposed and implemented. For this, two classical algorithms are taken: a hierarchical agglomerate and based on the search for k -cliques. The presented combined method is based on their consistent application. In addition, parametric options are developed that are responsible for actions with communities whose sizes are smaller than the given k , and also with single vertices. Varying these parameters allows to take into account differences in the topology of the original graph and thus to correct the algorithm. The testing was carried out on real data, including on the group of graphs of the social network, the qualitative content of the resulting partition was investigated. Proposals to develop this approach in the form of using a set of parametric algorithms are made.