Research Article: Effective Identification of Conserved Pathways in Biological Networks Using Hidden Markov Models

Date Published: December 7, 2009

Publisher: Public Library of Science

Author(s): Xiaoning Qian, Byung-Jun Yoon, Diego Di Bernardo.

Abstract: The advent of various high-throughput experimental techniques for measuring molecular interactions has enabled the systematic study of biological interactions on a global scale. Since biological processes are carried out by elaborate collaborations of numerous molecules that give rise to a complex network of molecular interactions, comparative analysis of these biological networks can bring important insights into the functional organization and regulatory mechanisms of biological systems.

Partial Text: Recent advances in high-throughput experimental techniques for measuring molecular interactions [1]–[4] have enabled the systematic study of biological interactions on a global scale for an increasing number of organisms [5]. Genome-scale interaction networks provide invaluable resources for investigating the functional organization of cells and understanding their regulatory mechanisms. Biological networks can be conveniently represented as graphs, in which the nodes represent the basic entities in a given network and the edges indicate the interactions between them. Network alignment provides an effective means for comparing the networks of different organisms by aligning these graphs and finding their common substructures. This can facilitate the discovery of conserved functional modules and ultimately help us study their functions and the detailed molecular mechanisms that contribute to these functions. For this reason, there have been growing efforts to develop efficient network alignment algorithms that can effectively detect conserved interaction patterns in various biological networks, including protein-protein interaction (PPI) networks [6]–[20], metabolic networks [7], [12], [21], gene regulatory networks [22], and signal transduction networks [23]. It has been demonstrated that network alignment algorithms can detect many known biological pathways and also make statistically significant predictions of novel pathways.

In this section, we present an algorithm for solving the local network alignment problem based on HMMs. For simplicity, we first focus on the problem of aligning two networks, which can be formally defined as follows: Given two biological networks and and a specified length , find the most similar pair of linear paths, where belongs to the network and belongs to , and each of them have nodes. As we show later, the pairwise network alignment algorithm can be easily extended for aligning multiple networks in a straightforward manner.

To demonstrate the effectiveness of the HMM-based network alignment algorithm, we carried out the following experiments. First, we used our algorithm to align two pairs of small synthetic networks that were used to validate the network alignment algorithm proposed in [24]. Second, we used the proposed algorithm for finding putative pathways in the fruit fly PPI network that look similar to known human pathways. Finally, we applied the algorithm for aligning microbial PPI networks to assess its ability to find conserved functional modules.

In this paper, we proposed an HMM-based network alignment algorithm that can be used for finding conserved pathways in two or more biological networks. The HMM framework and the proposed alignment algorithm has a number of important advantages compared to other existing local network alignment algorithms. First of all, despite its generality, the proposed algorithm is very simple and efficient. In fact, the alignment algorithm based on the proposed HMM framework is a variant of the Viterbi algorithm. As a result, it has a very low polynomial computational complexity, which grows only linearly with respect to the length of the identified pathways and the number of edges in each network. This makes it possible to find conserved pathways with more than 10 nodes in networks with thousands of nodes and tens of thousands of interactions within a few minutes on a personal computer. Furthermore, the HMM-based framework can handle a large class of path isomorphism, which allows us to find pathway alignments with any number of gaps (node insertions and deletions) at arbitrary locations. In addition to this, the proposed framework is very flexible in choosing the scoring scheme for pathway alignments, where different penalties can be used for mismatches, insertions and deletions. We can also assign different penalties for gap opening and gap extension, which can be convenient when comparing networks that are remotely related to each other. Another important advantage of the proposed framework is that it allows us to use an efficient dynamic programming algorithm for finding the mathematically optimal alignment. Considering that many available algorithms rely on heuristics that cannot guarantee the optimality of the obtained solutions, this is certainly a significant merit of the HMM-based approach. Although the mathematical optimality does not guarantee the biological significance of the obtained solution, it can certainly lead to more accurate predictions if combined with a realistic scoring scheme for assessing pathway homology. As demonstrated in our experiments, the proposed algorithm yields accurate and biologically meaningful results both for querying known pathways in the network of another organism and also for finding conserved functional modules in the networks of different organisms. Finally, the HMM-based framework presented in this paper can be extended for aligning multiple networks. While many current multiple network alignment algorithms adopt a progressive approach for comparing multiple networks [9], [14]–[17], our HMM-based framework provides a potential way to simultaneously align multiple networks to find the optimal set of conserved pathways with maximum alignment score.



0 0 vote
Article Rating
Notify of
Inline Feedbacks
View all comments