Estimating Gene Expression and Codon-Specific Translational Efficiencies, Mutation Biases, and Selection Coefficients from Genomic Data Alone.

Gilchrist MA, Chen WC, Shah P, Landerer CL, Zaretzki R - Genome Biol Evol (2015)

Bottom Line: We also observe strong agreement between our parameter estimates and those derived from alternative data sets. Our estimates of codon-specific translational inefficiencies and tRNA copy number-based estimates of ribosome pausing time ([Formula: see text]), and mRNA and ribosome profiling footprint-based estimates of gene expression ([Formula: see text]) are also highly correlated, thus supporting the hypothesis that selection against translational inefficiency is an important force driving the evolution of CUB. In conclusion, our method demonstrates that an enormous amount of biologically important information is encoded within genome-scale patterns of codon usage; accessing this information requires not gene expression measurements but carefully formulated, biologically interpretable models.


Affiliation: Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville; National Institute for Mathematical and Biological Synthesis, Knoxville, Tennessee; mikeg@utk.edu.


Fig. 6. Model predictions and observed codon usage frequencies as a function of estimated protein synthesis rate [Formula: see text] for the S. cerevisiae S288c genome. The units for [Formula: see text] are proteins/t, and time t is scaled such that the prior for [Formula: see text] satisfies [Formula: see text]. Each amino acid is represented by a separate subplot. Solid, dashed, and dotted lines represent the without and with ROC SEMPPR model fits and a simple logistic regression approach in which the estimation error in [Formula: see text] is ignored, respectively. None of the parameter estimates' 95% CIs overlaps with 0 except [Formula: see text]. Genes are binned by their expression levels, with solid dots indicating the mean codon frequency of the genes in the respective bin. Error bars indicate the standard deviation in codon frequency across genes within a bin. For each amino acid, the codon favored by natural selection for reducing translational inefficiency is indicated by a [Formula: see text]. The four [Formula: see text] indicate codons that have been previously identified as optimal, but our ROC SEMPPR model fits indicate that these codons are actually the second most efficient codons. A histogram of the [Formula: see text] values is presented in the lower right corner. Estimates of protein synthesis rates [Formula: see text] are based on the with ROC SEMPPR model fits and thus represent our best estimate of their values.
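
The binning behind the solid dots and error bars can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `bin_codon_frequencies`, the choice of equal-count bins, the use of the population standard deviation, and the toy data are all assumptions made for the example.

```python
import statistics

def bin_codon_frequencies(phi, freq, n_bins):
    """Group genes into equal-count bins by ascending synthesis rate.

    phi  : per-gene protein synthesis rate estimates
    freq : per-gene frequency of one codon among its synonyms
    Returns one (mean phi, mean frequency, SD of frequency) tuple per bin.
    """
    if len(phi) != len(freq):
        raise ValueError("phi and freq must have the same length")
    # Gene indices ordered from lowest to highest synthesis rate.
    order = sorted(range(len(phi)), key=lambda i: phi[i])
    size, extra = divmod(len(order), n_bins)
    bins, start = [], 0
    for b in range(n_bins):
        # The first `extra` bins take one additional gene each.
        stop = start + size + (1 if b < extra else 0)
        idx = order[start:stop]
        start = stop
        if not idx:
            continue
        f = [freq[i] for i in idx]
        p = [phi[i] for i in idx]
        # Population SD (assumption); it is defined even for one-gene bins.
        bins.append((statistics.fmean(p), statistics.fmean(f),
                     statistics.pstdev(f)))
    return bins

# Hypothetical toy data: six genes, one codon.
phi = [0.1, 0.2, 0.5, 1.0, 2.0, 4.0]
freq = [0.30, 0.34, 0.40, 0.50, 0.70, 0.80]
summary = bin_codon_frequencies(phi, freq, 3)
for mean_phi, mean_freq, sd_freq in summary:
    print(f"{mean_phi:.2f}  {mean_freq:.2f}  {sd_freq:.3f}")
```

Each tuple corresponds to one dot (mean frequency) and its error bar (standard deviation) in a subplot. Whether the original figure used equal-count or equal-width bins is not stated in the caption.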
© Copyright Policy - creative-commons

Mentions: As first shown in Shah and Gilchrist (2011), the relationship between codon usage and protein synthesis rate can range from simple and monotonic to complex. Figure 6 illustrates how codon usage changes across approximately 2 orders of magnitude of [Formula: see text] for each of the multicodon amino acids. Both ROC SEMPPR's with and without model fits accurately predict how CUB changes with protein synthesis rates (fig. 6). Indeed, the predicted changes in CUB between the with and without ROC SEMPPR model fits are almost indistinguishable from one another, reflecting the strong agreement between their estimates of ΔM and Δη across models as discussed above.
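
To make the "simple and monotonic to complex" behavior concrete, here is a small sketch. It assumes a multinomial logistic form in which the probability of codon i is proportional to exp(-ΔM_i - Δη_i φ); that form, its sign convention, the function name `codon_probs`, and the parameter values are assumptions for illustration and are not taken from this excerpt.

```python
import math

def codon_probs(delta_m, delta_eta, phi):
    """Codon probabilities for one amino acid at synthesis rate phi.

    Assumed form: p_i proportional to exp(-delta_m[i] - delta_eta[i] * phi).
    """
    logits = [-(m + e * phi) for m, e in zip(delta_m, delta_eta)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - top) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical three-codon amino acid. Codon 0 is the reference
# (both parameters are 0) and is the most efficient; codon 2 is the
# most favored by mutation but the least efficient.
delta_m = [0.0, -1.0, -2.0]
delta_eta = [0.0, 0.5, 2.0]

# Sweep roughly two orders of magnitude in phi.
for phi in (0.1, 1.0, 10.0):
    probs = codon_probs(delta_m, delta_eta, phi)
    print(phi, [round(p, 3) for p in probs])
```

With these made-up values, the reference codon rises steadily with φ, while the intermediate codon first rises and then falls, which is one way a non-monotonic pattern can arise when mutation bias and selection pull in different directions.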

