Codon Bias Patterns of E. coli's Interacting Proteins.

Dilucca M, Cimini G, Semmoloni A, Deiana A, Giansanti A - PLoS ONE (2015)

Bottom Line: We show that CompAI and tAI capture similar information by being positively correlated with gene conservation, measured by the Evolutionary Retention Index (ERI), and essentiality, whereas CAI and Nc appear to be less sensitive to evolutionary-functional parameters. Notably, the rate of variation of tAI and CompAI with ERI makes it possible to obtain sets of genes that consistently belong to specific clusters of orthologous genes (COGs). CompAI may also correlate with translation speed measurements, since it accounts for the specific delay induced by wobble-pairing between codons and anticodons.

Affiliation: Dipartimento di Fisica, Sapienza University of Rome, Rome, Italy.

ABSTRACT
Synonymous codons, i.e., DNA nucleotide triplets coding for the same amino acid, are used differently across the variety of living organisms. The biological meaning of this phenomenon, known as codon usage bias, is still controversial. In order to shed light on this point, we propose a new codon bias index, CompAI, that is based on the competition between cognate and near-cognate tRNAs during translation, without being tuned to the usage bias of highly expressed genes. We perform a genome-wide evaluation of codon bias for E. coli, comparing CompAI with other widely used indices: tAI, CAI, and Nc. We show that CompAI and tAI capture similar information by being positively correlated with gene conservation, measured by the Evolutionary Retention Index (ERI), and essentiality, whereas CAI and Nc appear to be less sensitive to evolutionary-functional parameters. Notably, the rate of variation of tAI and CompAI with ERI makes it possible to obtain sets of genes that consistently belong to specific clusters of orthologous genes (COGs). We also investigate the correlation of codon bias at the genomic level with the network features of protein-protein interactions in E. coli. We find that the most densely connected communities of the network share a similar level of codon bias (as measured by CompAI and tAI). Conversely, a small difference in codon bias between two genes is, statistically, a prerequisite for the corresponding proteins to interact. Importantly, among all codon bias indices, CompAI turns out to have the most coherent distribution over the communities of the interactome, pointing to the significance of competition among cognate and near-cognate tRNAs for explaining codon usage adaptation. CompAI may also correlate with translation speed measurements, since it accounts for the specific delay induced by wobble-pairing between codons and anticodons.

Fig 9 (pone.0142127.g009): Histogram of the Z-score for Pr(link/d) for each pair of genes and their respectively encoded proteins. d is the Euclidean distance between pairs of genes in the space of the first two PCA components of codon bias, and Pr(link/d) is the conditional probability of having a link in the PIN between two proteins given that their encoding genes lie within a distance d in the PC1-PC2 plane. The Z-score is obtained as Z[Pr(link/d)] = [Pr(link/d) − 〈Pr(link/d)〉_Ω] / σ_Ω[Pr(link/d)]. The gray dashed lines mark the significance interval of ±3σ.

Mentions: Finally, we perform PCA over the space of the four codon bias indices (CompAI, CAI, tAI, Nc) measured for each E. coli gene. The first two principal components (PC1 and PC2) account for as much as 85% of the total variance of codon bias over the genome (left plot of Fig 7). The projection of the first two principal components onto the individual codon bias indices (loadings) shows that none of the four indices dominates the data variability (right plot of Fig 7). Thus, the placement of a gene in the PC1-PC2 plane depends on a weighted contribution of all the indices. Interestingly, the genes encoding the proteins of the eight top MCODE communities are well localized and separated in this reduced space (Fig 8). In particular, the first community (i.e., the core of ribosomal proteins, characterized by high values of both CompAI and tAI) is located in the upper left part of the graph, isolated from the others. This is important evidence: proteins that belong to the most densely connected cores of the interactome are well localized in the space of the two principal components. In other words, if a set of proteins is physically and functionally connected in a module, then their corresponding genes should share common codon bias features. Conversely, we can estimate the conditional probability Pr(link/d) of a functional interaction between two proteins, given that their respective genes fall within a distance d of each other in the PC1-PC2 plane. We compare Pr(link/d) estimated on the real interactome with 〈Pr(link/d)〉_Ω estimated on the Configuration Model (CM) which, we recall, is a degree-conserving randomization (re-wiring) of the network.
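As a rough illustration of the two steps described above (projecting genes onto PC1-PC2, then estimating Pr(link/d)), the following Python sketch may help. It is not the authors' code. The function names `pca_first_two` and `link_probability_by_distance`, the `bin_width` parameter, the standardization of each index before PCA, and the reading of Pr(link/d) as a per-bin fraction of linked gene pairs are all assumptions made for this example.

```python
import numpy as np

def pca_first_two(X):
    """Project rows of X (genes x codon bias indices) onto PC1 and PC2."""
    # Assumption: each index is standardized (zero mean, unit variance) first.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized matrix; rows of Vt are the principal axes (loadings).
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)      # fraction of variance per component
    scores = Z @ Vt[:2].T                # gene coordinates in the PC1-PC2 plane
    return scores, explained[:2], Vt[:2]

def link_probability_by_distance(scores, edges, bin_width=1.0):
    """Estimate Pr(link/d) as the fraction of linked gene pairs per distance bin."""
    n = scores.shape[0]
    i, j = np.triu_indices(n, k=1)       # every unordered gene pair once
    d = np.linalg.norm(scores[i] - scores[j], axis=1)
    linked = np.zeros((n, n), dtype=bool)
    for a, b in edges:                   # edges are pairs of row indices
        linked[a, b] = linked[b, a] = True
    bins = np.floor(d / bin_width).astype(int)
    n_bins = bins.max() + 1
    pairs = np.bincount(bins, minlength=n_bins)
    links = np.bincount(bins, weights=linked[i, j].astype(float), minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        prob = np.where(pairs > 0, links / pairs, np.nan)  # NaN for empty bins
    return prob, pairs
```

Here `X` would be a genes-by-4 array holding (CompAI, CAI, tAI, Nc) and `edges` a list of index pairs taken from the interactome. The pairwise distance step is quadratic in the number of genes, so this form is only practical for modest gene sets.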
Fig 9 shows the Z-score for Pr(link/d) as a function of d and reveals a peculiar behavior: for small distances (d ≤ 2), the probability of finding a connection between two proteins is much higher than would be expected from a (degree-conserving) random link placement. Conversely, for medium distances (3 ≤ d ≤ 9), the linking probability is lower than that of the CM, whereas the real network and the CM become compatible at large distances, where, however, connections are rather few. This analysis shows that sets of genes sharing similar codon usage patterns encode proteins that are much more likely to interact than would be expected if chance alone were responsible for the structure of the interactome.
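The comparison against a degree-conserving null model can be sketched as follows. This is a minimal illustration, not the authors' procedure. It assumes the Configuration Model ensemble is sampled by repeated double-edge swaps on a simple undirected graph. The names `rewire_degree_preserving`, `link_probability` and `z_score`, the number of swaps, and the number of randomizations are hypothetical choices, and `dist` is assumed to map each unordered gene pair (a `frozenset`) to its distance in the PC1-PC2 plane.

```python
import random
import statistics

def rewire_degree_preserving(edges, n_swaps, rng):
    """Randomize an undirected simple graph with double-edge swaps.

    A swap replaces (a, b), (c, d) with (a, d), (c, b), so every node keeps
    its degree. Swaps that would create self-loops or duplicates are skipped.
    """
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:           # random orientation of the second edge
            c, d = d, c
        if a == d or c == b:
            continue
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def link_probability(edges, dist, lo, hi):
    """Fraction of gene pairs with lo <= distance < hi that are linked."""
    n_pairs = sum(1 for v in dist.values() if lo <= v < hi)
    n_links = sum(1 for a, b in edges if lo <= dist[frozenset((a, b))] < hi)
    return n_links / n_pairs if n_pairs else float("nan")

def z_score(edges, dist, lo, hi, n_random=200, seed=0):
    """Z = (observed - mean over null ensemble) / std over null ensemble."""
    rng = random.Random(seed)
    observed = link_probability(edges, dist, lo, hi)
    null = [
        link_probability(
            rewire_degree_preserving(edges, 10 * len(edges), rng), dist, lo, hi
        )
        for _ in range(n_random)
    ]
    mean, sd = statistics.fmean(null), statistics.pstdev(null)
    return (observed - mean) / sd if sd > 0 else float("nan")
```

A Z-score computed this way for a given distance bin can be compared with the ±3σ band mentioned in the Fig 9 caption. Values above +3 would indicate more links than the degree-conserving null model produces at that distance.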

