Efficient Parallel Subgraph Counting using G-Tries

Pedro Ribeiro, Fernando Silva and Luís Lopes

2010

Abstract

Finding and counting the occurrences of a collection of subgraphs within another, larger network is a computationally hard problem, closely related to graph isomorphism. The subgraph count is by itself a very powerful characterization of a network, and it is crucial for other important network measurements. G-tries are a specialized data structure designed to store and search for subgraphs. By taking advantage of the common substructure among subgraphs, g-tries can provide considerable speedups over previously used methods. In this paper we present a parallel algorithm based on g-tries that is able to efficiently find and count subgraphs. The algorithm relies on randomized receiver-initiated dynamic load balancing and is able to stop its computation at any given time, efficiently store its search position, divide what is left to compute into two halves, and resume from where it left off. We apply our algorithm to several representative real complex networks from various domains and examine its scalability. We obtain an almost linear speedup up to 128 processors, thus allowing us to reach previously infeasible limits. We showcase the multidisciplinary potential of the algorithm by also applying it to network motif discovery.
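
As a rough illustration of the load-balancing idea mentioned in the abstract, the sketch below simulates randomized receiver-initiated work sharing in Python. It is not the authors' implementation: the names `split_work` and `simulate` are hypothetical, the "work" is a flat list of opaque items rather than a g-trie search position, and the workers take turns in a single sequential loop instead of running as separate processes.

```python
import random


def split_work(pending):
    """Divide a list of unexplored work items into two interleaved halves.

    Interleaving (rather than cutting in the middle) is an assumption made
    here so that both halves get a similar mix of early and late items.
    """
    keep = pending[0::2]
    give = pending[1::2]
    return keep, give


def simulate(num_workers, work_items, seed=0):
    """Toy, sequential simulation of receiver-initiated load balancing.

    All work starts on worker 0. In each round, a worker with pending work
    processes one item; an idle worker picks a random other worker and, if
    that worker has more than one item left, takes half of its work.
    Returns the number of items processed by each worker.
    """
    rng = random.Random(seed)
    queues = [[] for _ in range(num_workers)]
    queues[0] = list(work_items)
    processed = [0] * num_workers

    while any(queues):
        for w in range(num_workers):
            if queues[w]:
                queues[w].pop()  # "process" one unit of work
                processed[w] += 1
            else:
                # Receiver-initiated: the idle worker asks for work.
                donor = rng.choice([i for i in range(num_workers) if i != w])
                if len(queues[donor]) > 1:
                    queues[donor], queues[w] = split_work(queues[donor])
    return processed


if __name__ == "__main__":
    counts = simulate(num_workers=4, work_items=range(100))
    print(counts, sum(counts))
```

Whatever the random choices, every item is processed exactly once, so the per-worker counts always sum to the number of work items; how evenly they are spread depends on the seed and the number of workers.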

Keywords

Parallel Algorithms; Adaptive Load Balancing; Complex Networks; Graph Mining; G-Tries

Digital Object Identifier (DOI)

doi:10.1109/CLUSTER.2010.27

Journal/Conference/Book

IEEE International Conference on Cluster Computing

Reference (text)

Pedro Ribeiro, Fernando Silva and Luís Lopes. Efficient Parallel Subgraph Counting using G-Tries. Proceedings of the IEEE International Conference on Cluster Computing (CLUSTER), pp. 217-226, IEEE CS Press, Crete, Greece, September 2010.

Bibtex

@inproceedings{ribeiro-CLUSTER2010,
  author = {Pedro Ribeiro and Fernando Silva and Luís Lopes},
  title = {Efficient Parallel Subgraph Counting using G-Tries},
  doi = {10.1109/CLUSTER.2010.27},
  booktitle = {IEEE International Conference on Cluster Computing},
  pages = {217--226},
  publisher = {IEEE},
  month = {September},
  year = {2010}
}