Global Shifts in Genome and Proteome Composition Are Very Tightly Coupled.

Brbić M, Warnecke T, Kriško A, Supek F - Genome Biol Evol (2015)

Bottom Line: We disentangle these effects by systematically evaluating the correspondence between intergenic nucleotide composition (where protein-level selection is absent), the amino acid composition (AAC), and ecological parameters of 909 prokaryotes. Moreover, highly expressed genes do not exhibit more prominent environment-related AAC signatures than lowly expressed genes, despite contributing more to the effective proteome. We discuss these results in light of contravening evidence from biophysical data and from further reading frame-specific analyses, which suggest that adaptation takes place at the protein level.

Affiliation: Division of Electronics, Rudjer Boskovic Institute, Zagreb, Croatia; Molecular Basis of Ageing, Mediterranean Institute for Life Sciences (MedILS), Split, Croatia.

Composition of noncoding DNA in 49 fungal genomes is highly predictive of the corresponding proteome composition. (A) Explained variance (as squared Pearson correlation coefficient, R²) in amino acid usage of proteomes in a multiple regression against different sets of features; obtained by considering only the G + C content (blue bars), and by progressively including also the dinucleotide frequencies (red), the trinucleotides (teal), and phylogenetic groups (purple). Error bars are standard deviations from ten runs of cross-validation. (B) The median variance explained using the same sets of features over all 20 amino acids. (C) Cross-validation ROC curves describing the accuracy of discrimination of 13 thermophilic fungi by their AAC (orange) or by the genome composition-normalized AAC (the “AAC residuals,” blue). Inlaid numbers are AUROC scores.

Mentions: Next, we examined the genomes and proteomes of 49 fungi, of which 13 were thermophilic. Results are broadly consistent with our findings in prokaryotes: the G + C content of noncoding DNA (here encompassing introns and intergenic regions) can explain 60% of the variability in AAC across fungi (fig. 4A and B). Incorporating di- and trinucleotide composition as features in the regression raises predictive power (R² = 0.73), and further adding phylogenetic categories accounts for 80% of the variance in proteome composition. As observed for prokaryotes, thermophilic fungi can be recognized with high accuracy from the AAC of their proteomes (AUROC = 0.940; fig. 4C), whereas prediction from AAC residuals, after nucleotide composition is factored out, is considerably less accurate (AUROC = 0.639; fig. 4C). These findings indicate that the putatively adaptive signatures in AAC emanate from the nucleotide level not only in prokaryotes but also in eukaryotes.
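
To make the logic of the residual analysis concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a single predictor (noncoding G + C content) and a single amino acid, fits ordinary least squares without cross-validation, and scores the thermophile/mesophile separation with a rank-based AUROC. All function names and all numbers are invented toy values for illustration; none are taken from the paper.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the fitted values."""
    my = mean(y)
    ss_res = sum((yi - hi) ** 2 for yi, hi in zip(y, y_hat))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def auroc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy, made-up values (NOT data from the study): six hypothetical genomes.
gc = [0.30, 0.40, 0.50, 0.60, 0.70, 0.80]        # noncoding G + C fraction
aa = [0.051, 0.059, 0.071, 0.079, 0.091, 0.099]  # frequency of one amino acid
thermophile = [0, 0, 0, 1, 1, 1]                 # hypothetical labels

a, b = fit_line(gc, aa)
predicted = [a + b * g for g in gc]
residuals = [y - p for y, p in zip(aa, predicted)]  # the "AAC residuals"

r2 = r_squared(aa, predicted)
auc_raw = auroc(aa, thermophile)           # separation using raw AAC
auc_resid = auroc(residuals, thermophile)  # separation after removing G + C

print(f"R^2 = {r2:.3f}")
print(f"AUROC raw = {auc_raw:.3f}, residual = {auc_resid:.3f}")
```

In this toy setup the labels track G + C content, so the raw amino acid frequency separates the two groups well, while the residuals left after regressing out G + C separate them much less well. That is the same qualitative pattern the passage reports for fungi, here produced by construction rather than observed.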

