Codon and Amino Acid Usage Are Shaped by Selection Across Divergent Model Organisms of the Pancrustacea.

Whittle CA, Extavour CG - G3 (Bethesda) (2015)

Bottom Line: We report strong signals of AT3 optimal codons (those favored in highly expressed genes) in G. bimaculatus and O. fasciatus, whereas weaker signs of GC3 optimal codons were found in P. hawaiensis, suggesting selection on codon usage in all three organisms. Further, in G. bimaculatus and O. fasciatus, high expression was associated with lowered frequency of amino acids with large size/complexity (S/C) scores in favor of those with intermediate S/C values; thus, selection may favor smaller amino acids while retaining those of moderate size for protein stability or conformation. Together, based on examination of 1,680,067, 1,667,783, and 1,326,896 codon sites in G. bimaculatus, O. fasciatus, and P. hawaiensis, respectively, we conclude that translational selection shapes codon and amino acid usage in these three Pancrustacean arthropods.


Affiliation: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138.



Figure 3: The two amino acids with the largest positive (left) and negative (right) correlation to Fop/expression level in (A) G. bimaculatus; (B) O. fasciatus; and (C) P. hawaiensis. Fop was binned into four categories as shown. Spearman R correlations in Table 2 were calculated with the use of all (unbinned) data points.


Mentions: We further studied the relationship between expression and the frequency of each of the 20 individual amino acids. For this, we used Fop as a proxy for relative expression level at the genome-wide level (Coghlan and Wolfe 2000; Drummond et al. 2005; Popescu et al. 2006; Wall et al. 2005), which is apt to be less noisy than RPM or RPKM at intermediate expression levels (outside the upper and lower 5%; see Identification of optimal codons). In addition, we wished to assess the relationship between Fop and amino acid frequency in and of itself. The Spearman rank correlations between Fop and amino acid frequency per CDS at the genome-wide level are shown in Table 2. For G. bimaculatus, 15 amino acids exhibited a statistically significant correlation. The negative R values, which indicate amino acids used rarely in genes with high Fop/expression level, include four of the six amino acids with high S/C scores (>40, Table S3), namely Arg, Met, His, and Trp (the latter two are nonsignificant). This is consistent with the inverse relationship described above between PrHigh SC and Fop (see Optimal codon usage correlates to amino acid size and complexity) and suggests selection against these amino acids in highly expressed genes. The positive correlations indicate that the more frequently an amino acid appeared in a CDS, the more likely that CDS was to display elevated optimal codon usage. Accordingly, the amino acids most favored under high expression (with R > 0.269, P < 10^-15) included Glu and Asp, which have moderate S/C scores (between 32.7 and 36.5, Table S3). The amino acids Asn, Lys, and Ile also exhibited substantial positive correlations between frequency in CDS and Fop/expression level (R = 0.169, 0.151, and 0.115, respectively, P < 10^-15), and have moderate or low S/C scores (33.7, 30.1, and 16.0, respectively, Table S3).
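The per-CDS correlation described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (aa_frequency, average_ranks, spearman) and the four toy sequences are hypothetical, Fop is taken as a precomputed input per CDS, and Spearman R is computed as the Pearson correlation of tie-averaged ranks. Significance testing (the P values reported in Table 2) is not covered.

```python
from collections import Counter

def aa_frequency(protein, aa):
    """Fraction of residues in one translated CDS that equal `aa`."""
    return Counter(protein)[aa] / len(protein)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman R: Pearson correlation of the tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical toy data: (Fop, translated CDS) pairs, not values from the study.
toy_cds = [
    (0.25, "MRRWHTAL"),
    (0.35, "MRETALKD"),
    (0.45, "MEEDTALK"),
    (0.55, "MEEDEDLK"),
]
fop = [f for f, _ in toy_cds]
for aa in ("E", "R"):
    freq = [aa_frequency(p, aa) for _, p in toy_cds]
    print(aa, round(spearman(fop, freq), 3))
```

In this toy set, Glu (E) becomes more frequent as Fop rises and Arg (R) less frequent, so the two correlations take opposite signs, mirroring the direction of the pattern reported for G. bimaculatus.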
The two amino acids with the largest positive R values with respect to Fop, namely Glu and Asp, and those with the most negative values, Arg and Thr, are illustrated in Figure 3A (Fop values are binned into four distinct categories: Fop < 0.3, 0.3 ≤ Fop < 0.4, 0.4 ≤ Fop < 0.5, and Fop ≥ 0.5). These results further confirm the striking shifts in amino acid frequency under high Fop/expression. In summary, specific amino acids are preferred under high expression in G. bimaculatus, and these tend to be of moderate or low size and complexity.
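The four-category binning used for Figure 3 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the lower bound of the second bin is read as 0.3, the names BIN_EDGES, BIN_LABELS, fop_bin, and mean_frequency_by_bin are hypothetical, and summarizing each bin by its mean frequency is an assumption about how the bins are summarized.

```python
from bisect import bisect_right
from collections import defaultdict

# Bin boundaries: Fop < 0.3, 0.3 <= Fop < 0.4, 0.4 <= Fop < 0.5, Fop >= 0.5.
BIN_EDGES = [0.3, 0.4, 0.5]
BIN_LABELS = ["Fop < 0.3", "0.3 <= Fop < 0.4", "0.4 <= Fop < 0.5", "Fop >= 0.5"]

def fop_bin(fop):
    """Return the label of the bin that a single Fop value falls into."""
    # bisect_right places a value equal to an edge into the higher bin,
    # which matches the closed lower bounds of the categories.
    return BIN_LABELS[bisect_right(BIN_EDGES, fop)]

def mean_frequency_by_bin(records):
    """Average amino acid frequency per Fop bin.

    `records` is an iterable of (Fop, frequency) pairs, one per CDS.
    Bins with no CDS are omitted from the result.
    """
    grouped = defaultdict(list)
    for fop, freq in records:
        grouped[fop_bin(fop)].append(freq)
    return {
        label: sum(grouped[label]) / len(grouped[label])
        for label in BIN_LABELS
        if label in grouped
    }

# Hypothetical (Fop, Glu frequency) pairs, not values from the study.
toy_records = [(0.22, 0.04), (0.28, 0.06), (0.35, 0.07), (0.45, 0.08), (0.52, 0.10)]
for label, mean_freq in mean_frequency_by_bin(toy_records).items():
    print(label, round(mean_freq, 3))
```

Binning is for display only; as the figure caption notes, the Spearman correlations in Table 2 use all unbinned data points.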

