Global Shifts in Genome and Proteome Composition Are Very Tightly Coupled.

Brbić M, Warnecke T, Kriško A, Supek F - Genome Biol Evol (2015)

Bottom Line: We disentangle these effects by systematically evaluating the correspondence between intergenic nucleotide composition, where protein-level selection is absent, the AAC, and ecological parameters of 909 prokaryotes. Moreover, highly expressed genes do not exhibit more prominent environment-related AAC signatures than lowly expressed genes, despite contributing more to the effective proteome. We discuss these results in light of contravening evidence from biophysical data and further reading frame-specific analyses that suggest that adaptation takes place at the protein level.


Affiliation: Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia; Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia.


Figure 6 (evv088-F6): Differences in environment-specific trends in dinucleotide composition of 1st, 2nd, and 3rd codon sites in protein-coding genes. Shifts in G + C and dinucleotide frequencies between thermophilic and nonthermophilic (A), halophilic and nonhalophilic (B), strictly anaerobic and aerotolerant (C), and psychrophilic and nonpsychrophilic (D) organisms at different codon positions. Bars show AUROC scores, a measure of how well a given feature separates two distributions: 0.5 signifies maximal overlap, and the most extreme values (0 or 1) indicate complete separation of, for example, thermophiles and mesophiles by the frequency of a given dinucleotide. Values less than 0.5 and greater than 0.5 indicate opposite directions of the shift. Error bars are 95% confidence intervals (CI) of the AUROC. Asterisks mark significant differences in the environment-associated shifts between codon positions at a false discovery rate (FDR) of less than 10%.
© Copyright Policy - creative-commons

Mentions: We analyzed the positions in coding DNA separately, calculating the G + C and dinucleotide frequencies for each of the three codon positions. Here, the “first codon position” denotes the dinucleotide formed by the first and second nucleotides of a codon, the “second position” the dinucleotide formed by the second and third nucleotides, and the “third position” the dinucleotide formed by the third nucleotide and the first nucleotide of the next codon. For each environment and each dinucleotide frequency, we determined the Mann–Whitney statistic separately (using R) and normalized it to the readily interpretable AUROC score by dividing it by the product of the sample sizes of the two classes. The analyses in figure 6 implicitly account for phylogenetic relatedness, because the first sites are compared with the second sites (and 2nd vs. 3rd, and 3rd vs. 1st) in exactly the same set of genomes. In other words, if a high AUROC score were purely due to phylogenetic signal confounded with the environmental labels, it should be equally high at all codon sites, and no significant difference in AUROC scores would be found.
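The two steps described above (dinucleotide frequencies per codon position, then the Mann–Whitney statistic normalized to an AUROC) can be sketched as follows. This is a minimal Python illustration, not the authors' code, which was written in R. The function names are hypothetical. The sketch assumes each coding sequence is in frame starting at its first nucleotide, skips dinucleotides containing ambiguous bases, counts ties as one half, and treats the first group as the "positive" class, which fixes the direction of the score.

```python
from collections import Counter
from itertools import product

# All 16 dinucleotides over the unambiguous DNA alphabet.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_freqs_by_codon_position(cds):
    """Dinucleotide frequencies for codon positions 1, 2 and 3.

    Position 1: codon nucleotides 1-2; position 2: nucleotides 2-3;
    position 3: nucleotide 3 plus nucleotide 1 of the next codon.
    Assumes `cds` is in frame, starting at index 0 (an assumption).
    """
    cds = cds.upper()
    counts = {1: Counter(), 2: Counter(), 3: Counter()}
    for i in range(len(cds) - 1):
        pair = cds[i:i + 2]
        if set(pair) <= set("ACGT"):  # skip ambiguous bases such as N
            counts[i % 3 + 1][pair] += 1
    freqs = {}
    for pos, c in counts.items():
        total = sum(c.values())
        freqs[pos] = {d: (c[d] / total if total else 0.0)
                      for d in DINUCLEOTIDES}
    return freqs


def auroc_from_mann_whitney(group_a, group_b):
    """AUROC = U / (n_a * n_b), where U is the Mann-Whitney statistic.

    U counts the pairs (a, b) with a > b; ties contribute 0.5.
    The quadratic loop is adequate for a few hundred genomes.
    """
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u / (len(group_a) * len(group_b))
```

In use, one would compute a given dinucleotide's frequency at a given codon position for every genome, split the values by environmental label (for example, thermophilic vs. nonthermophilic), and pass the two lists to `auroc_from_mann_whitney`. A result near 0.5 indicates overlapping distributions, and a result near 0 or 1 indicates near-complete separation in one direction or the other.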

