Limits...
Optimal network alignment with graphlet degree vectors.

Milenković T, Ng WL, Hayes W, Przulj N - Cancer Inform (2010)

Bottom Line: The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments.Our method detects topologically similar regions in large networks that are statistically significant.It does this independent of protein sequence or any other information external to network topology.

View Article: PubMed Central - PubMed

Affiliation: Department of Computing, Imperial College London SW7 2AZ, UK.

ABSTRACT
Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.

No MeSH data available.


Statistics of H-GRAAL’s core yeast-human alignment for α = 0.5. We present the percentage of yeast proteins, out of 2,390 of them, that participate in n “optimizing pairs” (defined in Methods), for n = 1, 2, ..., 6, 7–48. Recall that an aligned pair is optimizing if it appears in at least one optimal alignment. Hence, when we examine all optimal alignments, we compute the percentage of yeast proteins that are aligned to n human proteins by optimal alignments. Around 72% of all yeast proteins have a unique human protein that they are aligned to by every optimal alignment. These yeast-human protein pairs form H-GRAAL’s core alignment.
© Copyright Policy - open-access
Related In: Results  -  Collection


getmorefigures.php?uid=PMC2901631&req=5

f4-cin-2010-121: Statistics of H-GRAAL’s core yeast-human alignment for α = 0.5. We present the percentage of yeast proteins, out of 2,390 of them, that participate in n “optimizing pairs” (defined in Methods), for n = 1, 2, ..., 6, 7–48. Recall that an aligned pair is optimizing if it appears in at least one optimal alignment. Hence, when we examine all optimal alignments, we compute the percentage of yeast proteins that are aligned to n human proteins by optimal alignments. Around 72% of all yeast proteins have a unique human protein that they are aligned to by every optimal alignment. These yeast-human protein pairs form H-GRAAL’s core alignment.

Mentions: Even with the improved efficiency from using the dynamic Hungarian algorithm, network alignment problems will typically yield far too many optimal alignments to make exhaustive enumeration practical; yet it would be undeniably useful to somehow summarize information about the set of all possible optimal alignments. To this end, we say that an aligned pair is optimizing if it appears in at least one optimal alignment. The set of all optimizing aligned pairs is clearly the union of all aligned pairs from all optimal alignments, which gives a short, O(n2)-sized, synopsis of the set of all optimal alignments (Fig. 4). Furthermore, the set of optimizing pairs can be computed fairly efficiently in at worst O(n4) time. The procedure for enumerating optimizing pairs basically uses the same pair-removal trick described earlier for generating new optimal alignments. The pseudo-code for this procedure is shown below, with an explanation for each line afterwards.


Optimal network alignment with graphlet degree vectors.

Milenković T, Ng WL, Hayes W, Przulj N - Cancer Inform (2010)

Statistics of H-GRAAL’s core yeast-human alignment for α = 0.5. We present the percentage of yeast proteins, out of 2,390 of them, that participate in n “optimizing pairs” (defined in Methods), for n = 1, 2, ..., 6, 7–48. Recall that an aligned pair is optimizing if it appears in at least one optimal alignment. Hence, when we examine all optimal alignments, we compute the percentage of yeast proteins that are aligned to n human proteins by optimal alignments. Around 72% of all yeast proteins have a unique human protein that they are aligned to by every optimal alignment. These yeast-human protein pairs form H-GRAAL’s core alignment.
© Copyright Policy - open-access
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC2901631&req=5

f4-cin-2010-121: Statistics of H-GRAAL’s core yeast-human alignment for α = 0.5. We present the percentage of yeast proteins, out of 2,390 of them, that participate in n “optimizing pairs” (defined in Methods), for n = 1, 2, ..., 6, 7–48. Recall that an aligned pair is optimizing if it appears in at least one optimal alignment. Hence, when we examine all optimal alignments, we compute the percentage of yeast proteins that are aligned to n human proteins by optimal alignments. Around 72% of all yeast proteins have a unique human protein that they are aligned to by every optimal alignment. These yeast-human protein pairs form H-GRAAL’s core alignment.
Mentions: Even with the improved efficiency from using the dynamic Hungarian algorithm, network alignment problems will typically yield far too many optimal alignments to make exhaustive enumeration practical; yet it would be undeniably useful to somehow summarize information about the set of all possible optimal alignments. To this end, we say that an aligned pair is optimizing if it appears in at least one optimal alignment. The set of all optimizing aligned pairs is clearly the union of all aligned pairs from all optimal alignments, which gives a short, O(n2)-sized, synopsis of the set of all optimal alignments (Fig. 4). Furthermore, the set of optimizing pairs can be computed fairly efficiently in at worst O(n4) time. The procedure for enumerating optimizing pairs basically uses the same pair-removal trick described earlier for generating new optimal alignments. The pseudo-code for this procedure is shown below, with an explanation for each line afterwards.

Bottom Line: The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments.Our method detects topologically similar regions in large networks that are statistically significant.It does this independent of protein sequence or any other information external to network topology.

View Article: PubMed Central - PubMed

Affiliation: Department of Computing, Imperial College London SW7 2AZ, UK.

ABSTRACT
Important biological information is encoded in the topology of biological networks. Comparative analyses of biological networks are proving to be valuable, as they can lead to transfer of knowledge between species and give deeper insights into biological function, disease, and evolution. We introduce a new method that uses the Hungarian algorithm to produce optimal global alignment between two networks using any cost function. We design a cost function based solely on network topology and use it in our network alignment. Our method can be applied to any two networks, not just biological ones, since it is based only on network topology. We use our new method to align protein-protein interaction networks of two eukaryotic species and demonstrate that our alignment exposes large and topologically complex regions of network similarity. At the same time, our alignment is biologically valid, since many of the aligned protein pairs perform the same biological function. From the alignment, we predict function of yet unannotated proteins, many of which we validate in the literature. Also, we apply our method to find topological similarities between metabolic networks of different species and build phylogenetic trees based on our network alignment score. The phylogenetic trees obtained in this way bear a striking resemblance to the ones obtained by sequence alignments. Our method detects topologically similar regions in large networks that are statistically significant. It does this independent of protein sequence or any other information external to network topology.

No MeSH data available.