Local network topology in human protein interaction data predicts functional association.

Li H, Liang S - PLoS ONE (2009)

Bottom Line: The application of our algorithms to human PPI data yielded 4,233 significant functional associations among 1,754 proteins. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigations. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotation in this post-genomic era.


Affiliation: Department of Bioinformatics & Computational Biology, The University of Texas M. D. Anderson Cancer Center, Houston, Texas, United States of America.

ABSTRACT
The use of high-throughput techniques to generate large volumes of protein-protein interaction (PPI) data has increased the need for methods that systematically and automatically suggest functional relationships among proteins. Previous work on a yeast PPI network has shown that local connection topology, particularly two proteins sharing an unusually large number of neighbors, can predict functional association. In this study we improved the prediction scheme by developing a new algorithm and applied it to a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting function-associated protein pairs. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as benchmarks to compare and evaluate functional relevance. The application of our algorithms to human PPI data yielded 4,233 significant functional associations among 1,754 proteins. Further functional comparisons between them allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins, with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and made functional inferences from a detailed analysis of one subcluster highly enriched in the TGF-beta signaling pathway (P < 10^-50). Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigations. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotation in this post-genomic era.
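
The abstract does not spell out the scoring algorithm, so the following is only a minimal sketch of the general common-neighbor idea, not the authors' method. It assumes a hypergeometric null model: given the degrees of two proteins, how likely is it that they share at least the observed number of neighbors by chance? The function names (`shared_neighbor_pvalue`, `score_pairs`), the toy edge list, and the choice of the whole protein set as the sampling pool are all illustrative assumptions, and the hub-influence correction described in the abstract is not reproduced here.

```python
from itertools import combinations
from math import comb


def shared_neighbor_pvalue(n_total, deg_a, deg_b, shared):
    """Hypergeometric upper tail: P(at least `shared` common neighbors).

    Assumed null model (a simplification): the deg_b neighbors of protein B
    are drawn at random from n_total proteins, deg_a of which are neighbors
    of protein A.
    """
    upper = min(deg_a, deg_b)
    total = comb(n_total, deg_b)
    tail = sum(
        comb(deg_a, k) * comb(n_total - deg_a, deg_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


def score_pairs(edges):
    """Return {(a, b): p_value} for every protein pair with a shared neighbor."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n_total = len(neighbors)
    scores = {}
    for a, b in combinations(sorted(neighbors), 2):
        shared = len(neighbors[a] & neighbors[b])
        if shared:
            scores[(a, b)] = shared_neighbor_pvalue(
                n_total, len(neighbors[a]), len(neighbors[b]), shared
            )
    return scores


# Hypothetical toy network: A and B share all three of their neighbors.
toy_edges = [
    ("A", "X"), ("A", "Y"), ("A", "Z"),
    ("B", "X"), ("B", "Y"), ("B", "Z"),
    ("C", "Z"),
]
toy_scores = score_pairs(toy_edges)
```

In a sketch like this, pairs with small p-values would be the candidate functional associations. A genome-scale analysis would also need multiple-testing control, which the abstract reports as estimated false discovery rates.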

Figure 5. TGF-β signaling pathway–related subcluster. (a) One subcluster identified by our method consists of proteins presumably involved in the TGF-β signaling pathway. (b) Detailed interpretation of the relationships among the proteins in the subcluster. According to Ingenuity Pathway Analysis 5.0, the 35 blue-green proteins on the circle participate in the TGF-β signaling pathway, and the 10 red proteins inside the circle are unrelated to it. The violet proteins outside the circle are common neighbors that do not belong to the subcluster in panel a. Red lines represent significant protein pairs, green lines represent direct protein–protein interactions, and yellow lines represent both.

Mentions: The TGF-β signaling pathway–related subcluster (Fig. 5a) has a total of 45 protein members, 35 of which are known to participate in the TGF-β signaling pathway, according to the Ingenuity database. The probability of observing this by chance is <10^-54, as calculated by the Ingenuity software (right-tailed Fisher's exact test). Given this extreme P value, we reasoned that probably all the cluster members cooperate to mediate signal transduction. To investigate the role of the other 10 proteins in the TGF-β signaling pathway, we generated a functional relationship network using Osprey software (http://biodata.mshri.on.ca/osprey) [36] to explicitly elucidate the relationships among the 45 proteins (Fig. 5b): the 10 proteins not related to TGF-β according to the Ingenuity database are located inside a circle, whereas the other 35 TGF-β member proteins lie on the circle; common neighbors that do not belong to the 45-member subcluster lie outside the circle.
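
To make the enrichment test concrete, here is a rough sketch of a right-tailed Fisher's exact test on a 2×2 table, written with exact integer arithmetic. Only the cluster counts (35 pathway members and 10 non-members among 45 proteins) come from the text above. The background counts (10,000 proteins, 100 of them pathway members) are hypothetical placeholders, since the excerpt does not state the reference set Ingenuity used, and `fisher_right_tail` is an illustrative name rather than any part of the Ingenuity software.

```python
from math import comb


def fisher_right_tail(a, b, c, d):
    """Right-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a: in cluster, in pathway        b: in cluster, not in pathway
    c: outside cluster, in pathway   d: outside cluster, not in pathway
    Returns P(X >= a) with all row and column totals held fixed.
    """
    row1 = a + b            # cluster size
    row2 = c + d            # proteins outside the cluster
    col1 = a + c            # pathway members overall
    n = row1 + row2
    total = comb(n, col1)
    tail = sum(
        comb(row1, x) * comb(row2, col1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return tail / total


# Cluster counts are from the text; the background is an assumption:
# 10,000 proteins overall, 100 of them annotated to the pathway.
p_enrichment = fisher_right_tail(35, 10, 65, 9890)
```

The resulting value depends entirely on the assumed background, so it should not be expected to match the reported P value.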

