Estimating Gene Expression and Codon-Specific Translational Efficiencies, Mutation Biases, and Selection Coefficients from Genomic Data Alone.
Bottom Line: We also observe strong agreement between our parameter estimates and those derived from alternative data sets. Our estimates of codon-specific translational inefficiencies and tRNA copy number-based estimates of ribosome pausing time ([Formula: see text]), and mRNA and ribosome profiling footprint-based estimates of gene expression ([Formula: see text]) are also highly correlated, thus supporting the hypothesis that selection against translational inefficiency is an important force driving the evolution of CUB. In conclusion, our method demonstrates that an enormous amount of biologically important information is encoded within genome-scale patterns of codon usage; accessing this information does not require gene expression measurements, but instead carefully formulated, biologically interpretable models.
Affiliation: Department of Ecology & Evolutionary Biology, University of Tennessee, Knoxville; National Institute for Mathematical and Biological Synthesis, Knoxville, Tennessee. firstname.lastname@example.org
Mentions: The assumptions of the ROC SEMPPR model imply that the codon-specific translational inefficiencies are independent of codon position within a sequence. As a result, the relative strength of purifying selection on synonymous codon j in comparison to codon i in a gene with an average protein synthesis rate $\Phi$ is $S(\Delta\eta_{i,j}, \Phi) = -\Delta\eta_{i,j}\Phi$ (5). We remind the reader that $\Delta\eta$ includes the effective population size, $N_e$, in its definition. As a result, our selection coefficients $S$ are measured relative to the strength of genetic drift, $1/N_e$, as is commonly done. The distribution of $S$ across all genes for each alternative to an amino acid's reference codon is illustrated in figure 7 and summarized in table 1. Tables with genome-wide gene and codon-specific estimates of $S$ can be found in the Supplementary Material online. Recall that $S$ is scaled by $\Phi$ and that the distribution of $\Phi$ values across genes appears to follow a heavy-tailed distribution. As a result, even though, by definition, the average value of $\Phi$ is 1, the large majority of genes have $\Phi$ values less than 1. Consequently, although purifying selection on synonymous codons is universal, its selection coefficients are usually quite small (i.e., [Formula: see text]). Nevertheless, because our framework utilizes information on CUB held across genes, we can clearly detect the signature of selection at the genome level, specifically in the form of $\Delta\eta$ values whose posterior CIs differ from 0, whereas other approaches might fail.
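The relationship in equation (5) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes that the protein synthesis rates follow a lognormal distribution with mean 1 (the text says only that the distribution is heavy tailed), and the function names and the value of `delta_eta` are hypothetical choices made for illustration.

```python
import random


def selection_coefficient(delta_eta, phi):
    """Equation (5): S = -delta_eta * phi, measured relative to drift."""
    return -delta_eta * phi


def sample_phi_lognormal(n, sigma, seed=0):
    """Draw n synthesis rates from a lognormal with expected value 1.

    ASSUMPTION: the lognormal form is illustrative only. Setting
    mu = -sigma**2 / 2 makes the expected value of phi equal to 1.
    """
    rng = random.Random(seed)
    mu = -0.5 * sigma ** 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


# Hypothetical inputs: 20,000 genes, one codon pair with delta_eta = 0.5.
phis = sample_phi_lognormal(20000, sigma=1.5)
delta_eta = 0.5
s_values = [selection_coefficient(delta_eta, phi) for phi in phis]

mean_phi = sum(phis) / len(phis)
frac_below_one = sum(phi < 1.0 for phi in phis) / len(phis)
frac_weak = sum(abs(s) < delta_eta for s in s_values) / len(s_values)

print(f"mean phi: {mean_phi:.3f}")
print(f"fraction of genes with phi < 1: {frac_below_one:.3f}")
print(f"fraction of genes with |S| < delta_eta: {frac_weak:.3f}")
```

Under these assumptions the sample mean of the synthesis rates stays close to 1 while most genes fall below 1, so most per-gene selection coefficients are smaller in magnitude than the codon-specific `delta_eta`, which is the pattern the passage describes.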