Clustering and community detection with imbalanced clusters
Date
2017
Authors
Aksoylar, Cem
Qian, Jing
Saligrama, Venkatesh
Version
Accepted manuscript
Citation
Cem Aksoylar, Jing Qian, Venkatesh Saligrama. 2017. "Clustering and Community Detection With Imbalanced Clusters." IEEE Transactions on Signal and Information Processing over Networks, Volume 3, pp. 61-76.
Abstract
Spectral clustering methods that are frequently used in clustering and community detection applications are sensitive to the specific graph constructions, particularly when imbalanced clusters are present. We show that ratio cut (RCut) or normalized cut (NCut) objectives are not tailored to imbalanced cluster sizes since they tend to emphasize cut sizes over cut values. We propose a graph partitioning problem that seeks minimum cut partitions under minimum size constraints on partitions to deal with imbalanced cluster sizes. Our approach parameterizes a family of graphs by adaptively modulating node degrees on a fixed node set, yielding a set of parameter-dependent cuts reflecting varying levels of imbalance. The solution to our problem is then obtained by optimizing over these parameters. We present rigorous limit cut analysis results to justify our approach and demonstrate the superiority of our method through experiments on synthetic and real datasets for data clustering, semi-supervised learning, and community detection.
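The abstract's claim that RCut can emphasize cut sizes over cut values can be illustrated with a small sketch. The code below is not the authors' method. It uses the standard textbook two-way definitions of RCut and NCut, and the six-node weighted path graph, the edge weights, and the function names (`cut_value`, `rcut`, `ncut`) are assumptions chosen only for illustration. In this toy graph the imbalanced 2/4 split has the smaller cut value, yet RCut scores the balanced 3/3 split as better.

```python
import numpy as np

def cut_value(W, in_a):
    """Total weight of edges crossing between A (in_a is True) and its complement."""
    return W[np.ix_(in_a, ~in_a)].sum()

def rcut(W, in_a):
    """Two-way ratio cut: cut(A, B) * (1/|A| + 1/|B|)."""
    return cut_value(W, in_a) * (1.0 / in_a.sum() + 1.0 / (~in_a).sum())

def ncut(W, in_a):
    """Two-way normalized cut: cut(A, B) * (1/vol(A) + 1/vol(B))."""
    deg = W.sum(axis=1)  # weighted node degrees
    return cut_value(W, in_a) * (1.0 / deg[in_a].sum() + 1.0 / deg[~in_a].sum())

# Hypothetical toy graph: a weighted path 0-1-2-3-4-5 with two weak edges.
edges = [(0, 1, 5.0), (1, 2, 1.0), (2, 3, 1.1), (3, 4, 5.0), (4, 5, 5.0)]
W = np.zeros((6, 6))
for i, j, w in edges:
    W[i, j] = W[j, i] = w

imbalanced = np.array([True, True, False, False, False, False])  # sizes 2 and 4
balanced = np.array([True, True, True, False, False, False])     # sizes 3 and 3

for name, part in [("imbalanced", imbalanced), ("balanced", balanced)]:
    print(name, cut_value(W, part), rcut(W, part), ncut(W, part))
```

Here the imbalanced split cuts weight 1.0 against 1.1 for the balanced split, while the RCut scores are 0.75 and roughly 0.733 respectively, so minimizing RCut would select the partition with the larger cut value. A minimum cut formulation with a minimum size constraint, as the abstract proposes, would instead accept the 2/4 split whenever the size bound allows clusters of two nodes.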