Estimating Gene Expression and Codon-Specific Translational Efficiencies, Mutation Biases, and Selection Coefficients from Genomic Data Alone.

Gilchrist MA, Chen WC, Shah P, Landerer CL, Zaretzki R - Genome Biol Evol (2015)

Bottom Line: We also observe strong agreement between our parameter estimates and those derived from alternative data sets. Our estimates of codon-specific translational inefficiencies and tRNA copy number-based estimates of ribosome pausing time ([Formula: see text]), and mRNA and ribosome profiling footprint-based estimates of gene expression ([Formula: see text]), are also highly correlated, thus supporting the hypothesis that selection against translational inefficiency is an important force driving the evolution of CUB. In conclusion, our method demonstrates that an enormous amount of biologically important information is encoded within genome-scale patterns of codon usage; accessing this information does not require gene expression measurements, but instead carefully formulated, biologically interpretable models.

Affiliation: Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville; National Institute for Mathematical and Biological Synthesis, Knoxville, Tennessee. mikeg@utk.edu


Fig. 8. Comparison of gene-specific selection coefficients on synonymous codon usage from the without [Formula: see text] model fit to the S. cerevisiae genome and those from fitting the FMutSel model from Yang and Nielsen (2008) for 106 yeast genes used in Rokas et al. (2003), as estimated by Kubatko LS, Shah P, Herbei R, Gilchrist M (unpublished data). For more details, see the main text. Selection coefficient S was calculated on a gene-by-gene basis and relative to the most translationally efficient codon for a given amino acid (which is the codon listed first in the legend). Lines indicate the linear regression best fit; the corresponding correlation coefficients are listed as well, with an asterisk indicating model fits with P < 0.05. Under the FMutSel model, monomorphic sites across species can lead to estimates of S = −∞; these observations are plotted on the x axis.
© Copyright Policy - creative-commons
Source: PubMed Central (PMC4494061)

Mentions: Figure 8 compares our without [Formula: see text] ROC SEMPPR-based estimates of S with those estimated using the FMutSel phylogenetic model of Yang and Nielsen (2008), fit with PAML (Yang 2007), for the 106 genes in the Rokas et al. (2003) data set. Overall, we observe reasonable qualitative agreement between the two models, with the majority of codon-specific predictions having correlation coefficients [Formula: see text]. Unfortunately, although PAML provides maximum-likelihood point estimates of parameters, it does not provide confidence intervals for them. Given the large number of parameters (>60) estimated from each coding sequence by FMutSel, the confidence intervals for each parameter are likely to be large and, hence, could explain much of the variation we observe between ROC SEMPPR and FMutSel parameter estimates. Nonetheless, for 85% of the codons examined (34/40), we observe a significant (P < 0.05) and positive linear relationship between the ROC SEMPPR and the FMutSel estimates of S (see supplementary table S11, Supplementary Material online). Of the remaining six codons, half exhibit a positive but nonsignificant relationship between the ROC SEMPPR and FMutSel estimates of S, whereas the other half exhibit a negative, but again nonsignificant, relationship. Thus, for 92% of the codons (37/40), the ROC SEMPPR and FMutSel estimates of S agree qualitatively.
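The per-codon comparison and the percentages quoted above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names (`finite_pairs`, `pearson_r`) and the toy S values are hypothetical. Dropping genes whose FMutSel estimate is S = −∞ before computing the correlation is an assumption made here for illustration; the text states only that such points are plotted on the x axis. The tally at the end uses only the codon counts reported in the text (34 significant positive, 3 nonsignificant positive, 3 nonsignificant negative, out of 40).

```python
import math


def finite_pairs(s_a, s_b):
    """Keep gene-wise pairs in which both S estimates are finite.

    Assumption for this sketch: pairs containing S = -inf are dropped.
    """
    return [(a, b) for a, b in zip(s_a, s_b)
            if math.isfinite(a) and math.isfinite(b)]


def pearson_r(s_a, s_b):
    """Pearson correlation between two sets of S estimates for one codon."""
    pairs = finite_pairs(s_a, s_b)
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


# Toy values for five hypothetical genes; NOT data from the paper.
toy_roc = [-0.1, -0.4, -0.8, -1.2, -0.3]
toy_fmutsel = [-0.2, -0.5, -1.1, -1.9, -math.inf]  # last gene: monomorphic site
r = pearson_r(toy_roc, toy_fmutsel)

# Tally using the codon counts reported in the text.
n_codons = 40
sig_positive = 34       # significant (P < 0.05) positive relationship
nonsig_positive = 3     # positive but nonsignificant
nonsig_negative = 3     # negative but nonsignificant
assert sig_positive + nonsig_positive + nonsig_negative == n_codons

frac_significant = sig_positive / n_codons                   # the "85%" figure
frac_positive = (sig_positive + nonsig_positive) / n_codons  # the "92%" figure
print(f"toy r = {r:.3f}")
print(f"significant positive: {frac_significant:.1%}; positive overall: {frac_positive:.1%}")
```

The tally shows where the two percentages come from: 34 of 40 codons give the significant fraction, and adding the three positive but nonsignificant codons gives 37 of 40, which the text reports as 92%.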

