Iteration method for predicting essential proteins based on orthology and protein-protein interaction networks.

Peng W, Wang J, Wang W, Liu Q, Wu FX, Pan Y - BMC Syst Biol (2012)

Bottom Line: Using as many reference organisms as possible can improve the performance of ION. Additionally, ION shows good prediction performance in E. coli K-12. The accuracy of predicting essential proteins can be improved by integrating orthology with PPI networks.


Affiliation: School of Information Science and Engineering, Central South University, Changsha, Hunan 410083, People's Republic of China.

ABSTRACT

Background: Identification of essential proteins plays a significant role in understanding the minimal requirements for cellular survival and development. Many computational methods have been proposed for predicting essential proteins by using the topological features of protein-protein interaction (PPI) networks. However, most of these methods ignore the intrinsic biological meaning of proteins. Moreover, PPI data contain many false positives and false negatives. To overcome these limitations, many research groups have recently started to focus on identifying essential proteins by integrating PPI networks with other biological information. However, none of their methods has been widely acknowledged.

Results: Based on the observations that essential proteins are more evolutionarily conserved than nonessential proteins and that essential proteins frequently bind each other, we propose an iteration method, named ION, for predicting essential proteins by integrating orthology with PPI networks. Unlike other methods, ION identifies essential proteins based not only on the connections between proteins but also on their orthologous properties and the features of their neighbors. ION is applied to predict essential proteins in S. cerevisiae. Experimental results show that ION achieves higher identification accuracy than eight other existing centrality methods in terms of area under the curve (AUC). Moreover, ION identifies a large number of essential proteins that are ignored by the eight other centrality methods because of their low connectivity. Many proteins ranked in the top 100 by ION are both essential and belong to complexes with certain biological functions. Furthermore, no matter how many reference organisms are selected, ION outperforms all eight other centrality methods, and using as many reference organisms as possible can improve its performance. Additionally, ION shows good prediction performance in E. coli K-12.

Conclusions: The accuracy of predicting essential proteins can be improved by integrating orthology with PPI networks.
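
The abstract describes ION as an iteration that combines each protein's orthology with the scores of its interaction partners, but it does not state the update rule. The sketch below is therefore only an illustration under stated assumptions, not the published method: it assumes a damped, random-walk-style update in which a protein's score is a mix of its normalized orthology score and the weight-normalized scores passed on by its neighbors. The function name `iterate_scores`, the mixing parameter `alpha`, the edge weights, and the toy data are all hypothetical.

```python
def iterate_scores(edges, orthology, alpha=0.5, tol=1e-9, max_iter=1000):
    """Damped neighbor-propagation ranking (illustrative; not the published ION rule).

    edges: dict mapping (u, v) pairs to positive weights; treated as undirected.
    orthology: dict mapping every protein to a non-negative orthology score
               (at least one score must be positive).
    """
    nodes = sorted(orthology)

    # Symmetric weighted adjacency lists.
    nbrs = {n: {} for n in nodes}
    for (u, v), w in edges.items():
        nbrs[u][v] = w
        nbrs[v][u] = w

    # Total edge weight per node, used to normalize what a node passes on.
    out = {n: sum(nbrs[n].values()) for n in nodes}

    # Orthology scores rescaled to sum to 1 serve as the fixed prior.
    total = sum(orthology.values())
    prior = {n: orthology[n] / total for n in nodes}

    score = dict(prior)
    for _ in range(max_iter):
        new = {}
        for n in nodes:
            # Share of each neighbor's current score that flows to n.
            spread = sum(score[m] * w / out[m] for m, w in nbrs[n].items())
            new[n] = (1 - alpha) * spread + alpha * prior[n]
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


# Toy path network A - B - C - D with made-up orthology scores.
toy_edges = {("A", "B"): 1.0, ("B", "C"): 1.0, ("C", "D"): 1.0}
toy_orthology = {"A": 3, "B": 3, "C": 1, "D": 0}
toy_scores = iterate_scores(toy_edges, toy_orthology)
toy_ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy network, protein D has no orthology score of its own yet still receives a positive score from its neighbor, which is the kind of neighbor dependency the abstract refers to.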

© Copyright Policy - open-access

Figure 7: Proteins ranked in the top 100 by ION, PeC, NC and DC and the complexes they belong to. The figure shows the proteins ranked in the top 100 by ION, PeC, NC and DC, and the networks constructed from these proteins. Proteins enclosed in a red square belong to a common complex. Yellow nodes denote true essential proteins. In (a), the rounded-rectangle nodes represent the proteins detected by ION but ignored by all eight other existing centrality methods.

Mentions: Since ION is designed by considering the orthology, connectivity, modularity and neighbor dependency of proteins, the proteins with high ranking scores computed by ION should be conserved, essential and connected with each other. To verify this hypothesis, we select the proteins ranked in the top 100 by ION, NC, PeC and DC, respectively. Using the 408 known manually annotated complexes [47], proteins in each list are annotated with the index of the complexes to which they belong. The interaction networks of these proteins are visualized with the software CYTOSCAPE [48]. Figure 7 shows these networks. Proteins enclosed in a red square belong to a common complex. Yellow nodes denote true essential proteins. Table 4 lists the statistics of these proteins. More detailed information about these proteins and the corresponding complexes is listed in Additional file 3. From Figure 7 and Table 4, we can clearly see that, compared with PeC, NC and DC, ION not only detects more true essential proteins, but more of the proteins it ranks in the top 100 also belong to complexes with certain biological functions. The proteins ranked in the top 100 by ION have orthologs in about 93 reference organisms on average; 78 of these 100 proteins are essential and 72 belong to complexes. By contrast, the proteins ranked in the top 100 by PeC have orthologs in about 78 reference organisms on average; 74 of them are essential and 57 belong to complexes. Additionally, as indicated in Table 4, the sub-complexes containing the proteins ranked in the top 100 by ION have a higher interaction rate with known complexes than those containing the proteins ranked by the other methods. For example, 18 of the proteins ranked in the top 100 by ION belong to complex 370. Complex 370 is the 19/22 S regulator, and its GO term is GO:0008541 (proteasome regulatory particle, lid subcomplex). For PeC and NC, only 14 and 13 of the top 100 proteins, respectively, belong to complex 370.
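
The per-method counts reported above (essential proteins, complex members, and members of a single complex among the top-ranked proteins) amount to simple tallies over a ranked list. A minimal sketch of such a tally follows; the function name `top_k_summary`, the protein and complex identifiers, and the toy data are hypothetical and do not reproduce the paper's data or results.

```python
def top_k_summary(ranking, essential, complexes, k=100):
    """Tally essentiality and complex membership among the top-k ranked proteins.

    ranking: list of protein ids, best first.
    essential: set of protein ids known to be essential.
    complexes: dict mapping a complex id to the set of its member proteins.
    """
    top = ranking[:k]

    # Every protein that belongs to at least one annotated complex.
    in_any_complex = {p for members in complexes.values() for p in members}

    # Number of top-k proteins falling into each individual complex.
    per_complex = {
        cid: sum(1 for p in top if p in members)
        for cid, members in complexes.items()
    }

    return {
        "essential": sum(1 for p in top if p in essential),
        "in_complex": sum(1 for p in top if p in in_any_complex),
        # Keep only the complexes that are actually hit by the top-k list.
        "per_complex": {cid: n for cid, n in per_complex.items() if n > 0},
    }


# Toy data: six ranked proteins, three essential, two small complexes.
toy_ranking = ["P1", "P2", "P3", "P4", "P5", "P6"]
toy_essential = {"P1", "P2", "P5"}
toy_complexes = {"cA": {"P1", "P2", "P6"}, "cB": {"P4"}}
toy_summary = top_k_summary(toy_ranking, toy_essential, toy_complexes, k=4)
```

Running the same tally on the lists produced by different ranking methods gives directly comparable counts, as in the ION versus PeC comparison above.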

