Extensive purifying selection acting on synonymous sites in HIV-1 Group M sequences.

Ngandu NK, Scheffler K, Moore P, Woodman Z, Martin D, Seoighe C - Virol. J. (2008)

Bottom Line: Synonymous substitution rates were found to vary significantly within and between genes. We found evidence of strong purifying selection pressure affecting synonymous mutations in fourteen regions with known functions. We also found four conserved regions located in env and vpu which have not been characterized previously.


Affiliation: National Bioinformatics Network Node, Institute of Infectious Diseases and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Anzio Road, Observatory, 7925, South Africa. nobubelo@cbio.uct.ac.za

ABSTRACT

Background: Positive selection pressure acting on protein-coding sequences is usually inferred when the rate of nonsynonymous substitution is greater than the synonymous rate. However, purifying selection acting directly on the nucleotide sequence can lower the synonymous substitution rate. This could result in false inference of positive selection because when synonymous changes at some sites are under purifying selection, the average synonymous rate is an underestimate of the neutral rate of evolution. Even though HIV-1 coding sequences contain a number of regions that function at the nucleotide level, and are thus likely to be affected by purifying selection, studies of positive selection assume that synonymous substitutions can be used to estimate the neutral rate of evolution.
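The bias described in the Background can be illustrated with a small arithmetic sketch. It is not the authors' model: the function name and all rate values below are hypothetical, chosen only to show how averaging over constrained synonymous sites lowers the mean dS and can push an apparent dN/dS ratio above 1 even when dN is below the neutral rate.

```python
def apparent_omega(dn, neutral_ds, constrained_fraction, constrained_ds):
    """Return (true_omega, apparent_omega) for a hypothetical gene.

    true_omega uses the neutral synonymous rate as the denominator.
    apparent_omega uses the average synonymous rate, which is lowered
    when a fraction of synonymous sites is under purifying selection.
    All inputs are illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= constrained_fraction <= 1.0:
        raise ValueError("constrained_fraction must be between 0 and 1")
    # Average dS over unconstrained and constrained synonymous sites.
    mean_ds = ((1.0 - constrained_fraction) * neutral_ds
               + constrained_fraction * constrained_ds)
    return dn / neutral_ds, dn / mean_ds


# Hypothetical example: dN is slightly below the neutral rate, but
# 30% of synonymous sites evolve at one tenth of the neutral rate.
true_w, apparent_w = apparent_omega(
    dn=0.9, neutral_ds=1.0, constrained_fraction=0.3, constrained_ds=0.1
)
print(f"true dN/dS = {true_w:.2f}, apparent dN/dS = {apparent_w:.2f}")
```

With these assumed values the true ratio stays below 1 while the apparent ratio exceeds 1, which is the kind of false signal of positive selection the Background warns about.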

Results: We modelled site-to-site variation in the synonymous substitution rate across coding regions of the HIV-1 genome. Synonymous substitution rates were found to vary significantly within and between genes. Surprisingly, regions of the genome that encode proteins in more than one frame had significantly higher synonymous substitution rates than regions coding in a single frame. We found evidence of strong purifying selection pressure affecting synonymous mutations in fourteen regions with known functions. These included an exonic splicing enhancer, the rev-responsive element, the poly-purine tract and a transcription factor binding site. A further five highly conserved regions were located within known functional domains. We also found four conserved regions located in env and vpu which have not been characterized previously.

Conclusion: We provide the coordinates of genomic regions with markedly lower synonymous substitution rates, which are putatively under the influence of strong purifying selection pressure at the nucleotide level as well as regions encoding proteins in more than one frame. These regions should be excluded from studies of positive selection acting on HIV-1 coding regions.

Figure 3: Mean (blue) synonymous substitution rates observed across the gag, pol, vif and vpr genes. Mean dS was calculated over sliding windows of three codons. Horizontal lines mark the most stringent (red) and less stringent (purple) significance thresholds. (a) dS across the gag gene. 'sl4'; the fourth stem loop of the encapsidation signal, 'INS1'; a motif within the first inhibitory sequence region, 'INS2'; a motif within the second inhibitory sequence region. (b) dS across the pol gene. 'crs'; start of the cis-repressive sequence. The horizontal dotted line is the nuclease hypersensitive region, and sites 'B', 'G', 'C' and 'D' are confirmed transcription factor binding sites known as site-B, GC-box, site-C and site-D respectively. 'p1' and 'p2'; conserved sites within the nuclease hypersensitive region. 'ese'; exonic splicing enhancer. (c) dS across the vif gene. (d) dS across the vpr gene. 'ssa3'; 3' splice acceptor site A3, 'rnase'; RNase-V1 cleavage site.
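The sliding-window averaging mentioned in the caption can be sketched as follows. This is a minimal illustration, assuming overlapping windows that advance one codon at a time; the per-codon dS values and the threshold are hypothetical, and the paper's actual significance thresholds come from its own statistical analysis rather than a fixed cutoff like the one used here.

```python
def sliding_mean(values, window=3):
    """Mean of each run of `window` consecutive values (step of one)."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def windows_below(means, threshold):
    """Start indices of windows whose mean falls below `threshold`."""
    return [i for i, m in enumerate(means) if m < threshold]


# Hypothetical per-codon dS estimates with a conserved stretch in the middle.
per_codon_ds = [1.2, 0.9, 1.1, 0.1, 0.2, 0.1, 1.0, 1.3]
means = sliding_mean(per_codon_ds, window=3)

# Arbitrary illustrative cutoff, not a threshold from the paper.
conserved_starts = windows_below(means, threshold=0.3)
print(conserved_starts)
```

Here only the window starting at the fourth codon (index 3) falls below the assumed cutoff, marking the conserved stretch.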

Mentions: We found evidence of strong purifying selection acting directly on the nucleotide sequence at twenty-three sites across the HIV-1 genome (Figures 3, 4, 5 and 6). Fourteen of these regions (marked in black in Figures 3 and 4) coincided exactly with well-characterized functional motifs, while for another five (marked in green in Figure 3) we were able to identify possible functions based on the known functions of the sequence domains in which they were situated. We could not, however, find plausible explanations for the high degree of sequence conservation observed within a twelve-nucleotide region of the env gene and three other regions in vpu (marked in red in Figure 4). Sequence logos illustrating the conservation in each of these twenty-three significantly conserved regions are shown in Figure 5 (for those with known specific function) and Figure 6 (for those with predicted and unknown functions).

