Extending the Applicability of Graphlets to Directed Networks

David Aparício, Pedro Ribeiro and Fernando Silva

2016

Abstract

With recent advances in high-throughput cell biology, the amount of cellular biological data has grown drastically. Such data is often modeled as graphs (also called networks), and studying them can lead to new insights into molecule-level organization. A possible way to understand their structure is by analysing the smaller components that constitute them, namely network motifs and graphlets. Graphlets are particularly well suited to comparing networks and assessing their level of similarity, due to the rich topological information that they offer, but they are almost always used as small undirected graphs of up to five nodes, which limits their applicability to directed networks. However, a large set of interesting biological networks, such as metabolic, cell signaling or transcriptional regulatory networks, are intrinsically directional, and using metrics that ignore edge direction may gravely hinder information extraction. Our main purpose in this work is to extend the applicability of graphlets to directed networks by considering their edge direction, thus providing a powerful basis for the analysis of directed biological networks. We tested our approach on two network sets, one composed of synthetic graphs and another of real directed biological networks, and verified that the networks were more accurately grouped using directed graphlets than undirected graphlets. It is also evident that directed graphlets offer substantially more topological information than simple graph metrics such as degree distribution or reciprocity. However, enumerating graphlets in large networks is a computationally demanding task. Our implementation addresses this concern by using a state-of-the-art data structure, the g-trie, which is able to greatly reduce the necessary computation. We compared our tool to other state-of-the-art methods and verified that it is the fastest general tool for graphlet counting.
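To illustrate why edge direction matters, the following is a minimal brute-force sketch that counts weakly connected 3-node induced subgraphs of a directed network, grouped by isomorphism class. It is not the g-trie method described in the abstract, and it counts subgraph types only, not the per-node graphlet statistics used in the paper. The function names `canonical_form` and `count_directed_3_node_graphlets` are hypothetical, chosen for this example.

```python
from itertools import combinations, permutations


def canonical_form(nodes, edges):
    """Return a label shared by all isomorphic directed subgraphs on `nodes`."""
    best = None
    for order in permutations(nodes):
        # Flatten the adjacency matrix (diagonal excluded) under this ordering.
        key = tuple(
            int((u, v) in edges)
            for u in order
            for v in order
            if u != v
        )
        if best is None or key < best:
            best = key
    return best


def count_directed_3_node_graphlets(edge_list):
    """Count weakly connected induced 3-node subgraphs per isomorphism class.

    Brute force over all node triples: illustrative only, not the g-trie.
    """
    edges = {(u, v) for u, v in edge_list if u != v}  # drop self-loops
    nodes = sorted({n for e in edges for n in e})
    counts = {}
    for trio in combinations(nodes, 3):
        # Number of node pairs linked in at least one direction.
        linked = sum(
            1
            for a, b in combinations(trio, 2)
            if (a, b) in edges or (b, a) in edges
        )
        if linked < 2:
            continue  # three nodes need at least two linked pairs
        key = canonical_form(trio, edges)
        counts[key] = counts.get(key, 0) + 1
    return counts


# A feed-forward loop (a, b, c) and a directed cycle (x, y, z).
example = [("a", "b"), ("b", "c"), ("a", "c"),
           ("x", "y"), ("y", "z"), ("z", "x")]
counts = count_directed_3_node_graphlets(example)
print(len(counts), "distinct directed 3-node subgraph types")
```

Ignoring direction, both triples in the example are plain triangles and fall into the same class; with direction they are distinguished. On three nodes there are 13 weakly connected directed subgraph types, against 2 connected undirected ones.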

Keywords

Graphs and Networks, Pattern matching, Network topology, Graph Algorithms

Digital Object Identifier (DOI)

doi:10.1109/TCBB.2016.2586046

Journal/Conference/Book

IEEE/ACM Transactions on Computational Biology and Bioinformatics

Awards/Notice

to appear - preprint available

Reference (text)

David Aparício, Pedro Ribeiro and Fernando Silva. Extending the Applicability of Graphlets to Directed Networks. (to appear - preprint available) In IEEE/ACM Transactions on Computational Biology and Bioinformatics, IEEE, 2016.

Bibtex

@article{ribeiro-TCBB2016,
  author = {David Aparício and Pedro Ribeiro and Fernando Silva},
  title = {Extending the Applicability of Graphlets to Directed Networks},
  doi = {10.1109/TCBB.2016.2586046},
  journal = {IEEE/ACM Transactions on Computational Biology and Bioinformatics},
  publisher = {IEEE},
  year = {2016}
}