Genome-scale identification and characterization of moonlighting proteins.

Khan I, Chen Y, Dong T, Hong X, Takeuchi R, Mori H, Kihara D - Biol. Direct (2014)

Bottom Line: We found that the GO annotations of moonlighting proteins can be clustered into multiple groups reflecting their diverse functions. We found that moonlighting proteins physically interact with a higher number of distinct functional classes of proteins than non-moonlighting ones and also found that most of the physically interacting partners of moonlighting proteins share the latter's primary functions. Interestingly, we also found that moonlighting proteins tend to interact with other moonlighting proteins.


ABSTRACT

Background: Moonlighting proteins perform two or more cellular functions, which are selected based on various contexts including the cell type in which they are expressed, their oligomerization status, and the binding of different ligands at different sites. To understand the overall landscape of their functional diversity, it is important to establish methods that can identify moonlighting proteins in a systematic fashion. Here, we have developed a computational framework to find moonlighting proteins on a genome scale and identified multiple proteomic characteristics of these proteins.

Results: First, we analyzed Gene Ontology (GO) annotations of known moonlighting proteins. We found that the GO annotations of moonlighting proteins can be clustered into multiple groups reflecting their diverse functions. Then, by considering the observed GO term separations, we identified 33 novel moonlighting proteins in Escherichia coli and confirmed them by literature review. Next, we analyzed moonlighting proteins in terms of protein-protein interaction, gene expression, phylogenetic profile, and genetic interaction networks. We found that moonlighting proteins physically interact with a higher number of distinct functional classes of proteins than non-moonlighting ones and also found that most of the physically interacting partners of moonlighting proteins share the latter's primary functions. Interestingly, we also found that moonlighting proteins tend to interact with other moonlighting proteins. In terms of gene expression and phylogenetically related proteins, a weak trend was observed that moonlighting proteins interact with more functionally diverse proteins. Structural characteristics of moonlighting proteins, i.e., intrinsically disordered regions and ligand binding sites, were also investigated.

Conclusion: Additional functions of moonlighting proteins are difficult to identify by experiments and these proteins also pose a significant challenge for computational function annotation. Our method enables identification of novel moonlighting proteins from current functional annotations in public databases. Moreover, we showed that potential moonlighting proteins without sufficient functional annotations can be identified by analyzing available omics-scale data. Our findings open up new possibilities for investigating the multi-functional nature of proteins at the systems level and for exploring the complex functional interplay of proteins in a cell.

Reviewers: This article was reviewed by Michael Galperin, Eugene Koonin, and Nick Grishin.


Fig7: Gene expression profile analysis. Average number of clusters of interacting proteins relative to the number of proteins interacting by gene expression. Proteins considered to be interacting are the top 2% of proteins in the Gene Expression network of E. coli sorted in terms of the Pearson correlation coefficient. (A) Histogram of number of interacting proteins. (B) Functional clustering using Funsim (BP, MF, CC) score thresholds between 0.1 and 1.0. (C) Functional clustering using Funsim (BP) score thresholds between 0.1 and 1.0.

Mentions: Next, we investigated the functions of genes co-expressed with moonlighting proteins in E. coli. The E. coli gene expression data were taken from the COLOMBOS database [115], which contains expression data of 4295 genes in 2369 contrasts. We calculated the Pearson correlation coefficient of the expression levels of each pair of genes and selected a pair as co-expressed if the absolute value of its correlation coefficient ranked within the top 2% of all pairs. The number of co-expressed genes does not differ greatly between moonlighting and non-moonlighting proteins, except for a peak observed at 65 for the moonlighting proteins (Figure 7A), which consists of four moonlighting proteins (P77489, P0A8Q3, P0AC47, and P25516). Then, similar to the analysis in Figure 5B and 5C, we computed the functional clustering profile for co-expressed genes of E. coli moonlighting proteins to see whether the co-expressed genes are functionally divergent. The clustering profiles using the funsim score (Figure 7B) and the BP-funsim score (Figure 7C) showed that moonlighting proteins have a slightly larger average number of clusters of functionally similar proteins per co-expressed gene than non-moonlighting proteins, although this difference is not statistically significant (Additional file 1: Table S1). The same conclusion was obtained when we defined co-expressed genes as those with a correlation coefficient above 0.4 (data not shown).
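The co-expression selection described above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `profiles` dictionary layout, the handling of constant profiles, and the rule of keeping at least one pair are all assumptions made for illustration. It computes the Pearson correlation coefficient for every gene pair, ranks pairs by absolute value, keeps the top fraction (2% in the text), and counts partners per gene, which is the quantity shown as a histogram in Figure 7A.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat constant profiles as uncorrelated
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(profiles, top_fraction=0.02):
    """Return gene pairs whose |r| ranks within the top fraction of all pairs.

    profiles: hypothetical dict mapping a gene ID to its expression values
    across contrasts (one value per contrast).
    """
    scored = [
        (abs(pearson(profiles[g1], profiles[g2])), g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
    ]
    scored.sort(reverse=True)  # largest |r| first
    # assumption: always keep at least one pair, even for tiny inputs
    n_keep = max(1, int(len(scored) * top_fraction))
    return [(g1, g2) for _, g1, g2 in scored[:n_keep]]


def partner_counts(pairs):
    """Count the number of co-expressed partners of each gene."""
    counts = {}
    for g1, g2 in pairs:
        counts[g1] = counts.get(g1, 0) + 1
        counts[g2] = counts.get(g2, 0) + 1
    return counts
```

The alternative definition mentioned in the text, a fixed cutoff of 0.4 on the correlation coefficient, would replace the ranking step with a simple filter on the score. For a data set of the size quoted (4295 genes), an all-pairs loop in pure Python would be slow, so a vectorized correlation matrix would be the more practical choice.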

