Decentralized Diffusion-Controlled Algorithm for Community Detection : Initialization and Resolution Study

University essay from KTH/Skolan för informations- och kommunikationsteknik (ICT)

Abstract: Community detection in graphs has been an important research topic for many fields. The aim of community detection is to extract from graphs those groups of nodes that present more connections between them than with the rest of the network. Detecting such groups at different scales can help understanding the global behaviour of the system. However, recent studies have shown that realworld graphs follow power-law distributions for degree and community sizes. Specifically, these graphs present many small communities but just a few large ones. This unbalanced community size distribution poses a great challenge for community detection algorithms.Most of the existing methods are based on global approaches that require information about the network to be processed as a whole. Thus, those techniques can not be applied when the graph is too big to fit into one single machine, or in distributed setting when the graph is partitioned among multiple machines. To solve this limitations, a completely decentralized community detection algorithm is presented. It is based on diffusion, following a vertex-centric approach that allows each node to decide the diffusion rates based on local information. It adds as well a mechanism for controlling the diffusion speed through a customizable function.We evaluate the algorithm with a variety of graphs with different levels of imbalance and community structures. Our algorithm is able to detect (almost) perfectly the communities when the imbalance between community sizes is not extreme. We show as well how the sizes of the detected communities can be controlled by the diffusion strategy, allowing for better detection of finer or coarser resolutions in hierarchical graphs. The algorithm is also compared to other two well-known existing methods, achieving similar results in most of the cases though with a higher computation time.

  AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)