Genome-scale identification and characterization of moonlighting proteins.

Khan I, Chen Y, Dong T, Hong X, Takeuchi R, Mori H, Kihara D - Biol. Direct (2014)

Bottom Line: We found that the GO annotations of moonlighting proteins can be clustered into multiple groups reflecting their diverse functions. We found that moonlighting proteins physically interact with a higher number of distinct functional classes of proteins than non-moonlighting ones and also found that most of the physically interacting partners of moonlighting proteins share the latter's primary functions. Interestingly, we also found that moonlighting proteins tend to interact with other moonlighting proteins.


ABSTRACT

Background: Moonlighting proteins perform two or more cellular functions, which are selected based on various contexts including the cell type in which they are expressed, their oligomerization status, and the binding of different ligands at different sites. To understand the overall landscape of their functional diversity, it is important to establish methods that can identify moonlighting proteins in a systematic fashion. Here, we have developed a computational framework to find moonlighting proteins on a genome scale and identified multiple proteomic characteristics of these proteins.

Results: First, we analyzed Gene Ontology (GO) annotations of known moonlighting proteins. We found that the GO annotations of moonlighting proteins can be clustered into multiple groups reflecting their diverse functions. Then, by considering the observed GO term separations, we identified 33 novel moonlighting proteins in Escherichia coli and confirmed them by literature review. Next, we analyzed moonlighting proteins in terms of protein-protein interaction, gene expression, phylogenetic profile, and genetic interaction networks. We found that moonlighting proteins physically interact with a higher number of distinct functional classes of proteins than non-moonlighting ones and also found that most of the physically interacting partners of moonlighting proteins share the latter's primary functions. Interestingly, we also found that moonlighting proteins tend to interact with other moonlighting proteins. In terms of gene expression and phylogenetically related proteins, a weak trend was observed that moonlighting proteins interact with more functionally diverse proteins. Structural characteristics of moonlighting proteins, i.e., intrinsically disordered regions and ligand binding sites, were also investigated.

Conclusion: Additional functions of moonlighting proteins are difficult to identify by experiments and these proteins also pose a significant challenge for computational function annotation. Our method enables identification of novel moonlighting proteins from current functional annotations in public databases. Moreover, we showed that potential moonlighting proteins without sufficient functional annotations can be identified by analyzing available omics-scale data. Our findings open up new possibilities for investigating the multi-functional nature of proteins at the systems level and for exploring the complex functional interplay of proteins in a cell.

Reviewers: This article was reviewed by Michael Galperin, Eugene Koonin, and Nick Grishin.


Figure 5: Interacting proteins of moonlighting and non-moonlighting proteins. Physically interacting proteins were obtained from the STRING database. (A) Histogram of the number of interacting proteins. Five datasets are shown: known moonlighting proteins in the MPR1-3 sets (MPR-ALL), the identified moonlighting proteins in E. coli (Ecoli-MP), moonlighting proteins detected in E. coli that have clear experimental evidence for their dual functions and are classified into category 1 (Ecoli-MP-Cat1), E. coli proteins whose multi-functionality originates from different domains (Ecoli-MultiDomain), and non-moonlighting proteins in E. coli (Ecoli-nonMP). Values on the y-axis are the fraction of all proteins in each dataset. The bin size used was five. (B) Average number of clusters of interacting proteins clustered using the funsim score (Eqn. 4). Seven datasets are plotted: MPR1, MPR2, MPR3, Ecoli-MP, Ecoli-MP-Cat1, Ecoli-MultiDomain, and Ecoli-nonMP. (C) Clustering was performed using the funsim score of BP terms only (Eqn. 3).

Mentions: First, we examined the number of interacting proteins of moonlighting and non-moonlighting proteins (Figure 5A). In addition to the E. coli moonlighting and non-moonlighting proteins, histograms for the MPR1-3 sets are shown for comparison. Among the E. coli MP set, 11 proteins in the first category (those that have clear experimental evidence of their dual functions) were also plotted separately to verify that the trend observed for the entire E. coli MP set was consistent with its most reliable subset. Overall, MP and nonMP have similar distributions, with the largest peak at 0–5 interacting proteins. A small peak at 20–25 interacting proteins was observed for E. coli MP. This peak consists of two proteins, pepA (P68767) and frdB (P0AC47).
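The normalized histogram described for Figure 5A (bin size five, y-axis as the fraction of proteins in a dataset) can be sketched as follows. This is only an illustration, not the authors' code: the function name `binned_fractions`, the half-open bin convention [lo, hi), and the example partner counts are all assumptions made here, and the counts are invented rather than taken from STRING.

```python
from collections import Counter


def binned_fractions(partner_counts, bin_size=5):
    """Return {(lo, hi): fraction} for half-open bins [lo, hi).

    partner_counts: one integer per protein, the number of its
    physically interacting partners. The half-open bin convention is
    an assumption; the source only states that the bin size is five.
    """
    if not partner_counts:
        return {}
    # Map each count to the lower edge of its bin, then tally per bin.
    bins = Counter((n // bin_size) * bin_size for n in partner_counts)
    total = len(partner_counts)
    # Normalize by dataset size so datasets of different sizes compare.
    return {(lo, lo + bin_size): bins[lo] / total for lo in sorted(bins)}


# Hypothetical partner counts for one dataset (not real STRING data).
example_counts = [0, 2, 3, 4, 7, 12, 21, 24]
fractions = binned_fractions(example_counts)
for (lo, hi), frac in fractions.items():
    print(f"{lo}-{hi}: {frac:.3f}")
```

Normalizing each dataset by its own size is what makes the MP and nonMP curves comparable even though the sets differ greatly in the number of proteins.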

