Increased polymorphism near low-complexity sequences across the genomes of Plasmodium falciparum isolates.

Haerty W, Golding GB - Genome Biol Evol (2011)

Bottom Line: However, due to the scarcity of individual genomes within a species, most of the analyses so far have been performed at the species level with the implicit assumption that the variation both in composition and size within species is too small relative to the between-species divergence to affect the conclusions of the analysis. We report that more than half of the 7,711 low-complexity sequences found within aligned coding sequences are variable in size among strains. This observation strongly suggests the joint effects of lowered selective constraints on low-complexity sequences and a mutagenic effect of these simple sequences.


Affiliation: Department of Biology, McMaster University, Hamilton, Ontario, Canada.

ABSTRACT
Low-complexity regions (LCRs) within protein sequences are often considered to evolve neutrally even though recent studies reported evidence for selection acting on some of them. Because of their widespread distribution among eukaryotic genomes and the potential deleterious effect of expansion/contraction of some of them in humans, low-complexity sequences are of major interest, and numerous studies have attempted to describe their dynamics between genomes as well as the factors correlated with their variation and to assess their selective value. However, due to the scarcity of individual genomes within a species, most of the analyses so far have been performed at the species level with the implicit assumption that the variation both in composition and size within species is too small relative to the between-species divergence to affect the conclusions of the analysis. Here we used the available genomes of 14 Plasmodium falciparum isolates to assess the relationship between low-complexity sequence variation and factors such as nucleotide polymorphism across strains, sequence composition, and protein expression. We report that more than half of the 7,711 low-complexity sequences found within aligned coding sequences are variable in size among strains. Across strains, we observed an increasing density of polymorphic sites toward the LCR boundaries. This observation strongly suggests the joint effects of lowered selective constraints on low-complexity sequences and a mutagenic effect of these simple sequences.



Fig. 1: (A) Comparison of the average amino acid composition of low-complexity sequences and the remaining proteome in five strains of P. falciparum with genome annotations. (B) Average amino acid composition of variable and nonvariable low-complexity sequences. The error bars represent the standard error.

Mentions: Low-complexity sequences are strongly enriched in asparagine and aspartic acid in comparison to the other amino acids (fig. 1A). We observed differences in the composition of low-complexity sequences between our results and those reported by DePristo et al. (2006). The differences originate from the use of different parameter values to identify LCRs. DePristo et al. (2006) used the default values for SEG (Wootton and Federhen 1993), whereas we chose more stringent values for our analysis, as indicated by the reduced number of LCRs detected.
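
To illustrate why parameter stringency changes the number and extent of LCRs detected, here is a minimal Python sketch of a sliding-window Shannon entropy filter. It is not the SEG algorithm and does not reproduce the parameters used by either study; the function names, the window length of 12, and the entropy thresholds are all assumptions chosen for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def find_low_complexity(seq: str, window: int = 12, max_entropy: float = 2.2):
    """Return merged half-open (start, end) intervals of low-complexity sequence.

    A window is flagged when its entropy is <= max_entropy; overlapping or
    adjacent flagged windows are merged. Both defaults are hypothetical and
    are NOT the SEG defaults or the values used in the paper.
    """
    regions = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= max_entropy:
            start, end = i, i + window
            if regions and start <= regions[-1][1]:
                # Extend the previous region instead of opening a new one.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Toy protein: an asparagine-rich tract flanked by diverse sequence.
    toy = "MKTLVAGQWERTYIPSDFGH" + "NNNNNDNNNNNNNKN" + "ACDEFGHIKLMPQRSTVWY"
    print("lenient  :", find_low_complexity(toy, max_entropy=2.2))
    print("stringent:", find_low_complexity(toy, max_entropy=1.0))
```

Lowering `max_entropy` can only shrink or remove flagged regions, because every window that passes the stricter threshold also passes the looser one. This is the general sense in which more stringent settings yield fewer or shorter LCRs, although SEG's actual criteria differ from this simplified filter.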

