Patterns of recombination in HIV-1M are influenced by selection disfavouring the survival of recombinants with disrupted genomic RNA and protein structures.

Golden M, Muhire BM, Semegni Y, Martin DP - PLoS ONE (2014)

Bottom Line: We confirm that chimaeric Gag p24, reverse transcriptase, integrase, gp120 and Nef proteins that are expressed by natural HIV-1 recombinants have significantly lower degrees of predicted folding disruption than randomly generated recombinants. Similarly, we use a novel single-stranded RNA folding disruption test to show that there is significant, albeit weak, evidence that natural HIV recombinants tend to have genomic secondary structures that more closely resemble parental structures than do randomly generated recombinants. These results are consistent with the hypothesis that natural selection has acted both in the short term to purge recombinants with disrupted RNA and protein folds, and in the longer term to modify the genome architecture of HIV to ensure that recombination-prone sites correspond with those where recombination will be minimally deleterious.


Affiliation: Department of Statistics, University of Oxford, Oxford, United Kingdom.

ABSTRACT
Genetic recombination is a major contributor to the ongoing diversification of HIV. Across the HIV genome there are defined recombination hot and cold spots which tend to co-localise both with genomic secondary structures and with either inter-gene boundaries or intra-gene domain boundaries. There is also good evidence that most recombination breakpoints that are detectable within the genes of natural HIV recombinants are likely to be minimally disruptive of intra-protein amino acid contacts and that these breakpoints should therefore have little impact on protein folding. Here we further investigate how selection favouring the maintenance of functional RNA and protein structures affects patterns of genetic recombination in HIV. We confirm that chimaeric Gag p24, reverse transcriptase, integrase, gp120 and Nef proteins that are expressed by natural HIV-1 recombinants have significantly lower degrees of predicted folding disruption than randomly generated recombinants. Similarly, we use a novel single-stranded RNA folding disruption test to show that there is significant, albeit weak, evidence that natural HIV recombinants tend to have genomic secondary structures that more closely resemble parental structures than do randomly generated recombinants. These results are consistent with the hypothesis that natural selection has acted both in the short term to purge recombinants with disrupted RNA and protein folds, and in the longer term to modify the genome architecture of HIV to ensure that recombination-prone sites correspond with those where recombination will be minimally deleterious.


Figure 1. Diagrammatic representation of how simulated recombinants were generated. For a particular recombination event specifying a major parent, a minor parent, and a pair of recombination breakpoint locations delineating a fragment of sequence derived from the minor parent (containing in this particular case two nucleotides that vary between the major and minor parents), an in silico mimic of the real recombinant sequence is created using the minor and the major parent sequences. Following that, a set of N simulated recombinants is generated in a similar way to the mimic recombinant, but using random starting and ending positions, whilst maintaining the same number of either variable nucleotides (for the RNA folding tests) or non-synonymous codon sites (for the protein folding tests) between the randomized breakpoint sites as occur in the mimic recombinant. In this example the mimic and simulated recombinants all have two such “informative” sites between the 3′ and 5′ breakpoints that are not identical between the parental sequences.

The protein folding and secondary structure disruption tests performed here both relied on a permutation test involving the generation of sets of simulated recombinants with precisely the same genetic distances to the parental viruses but with breakpoints in random genome locations. Genetic distances were computed as the number of nucleotide differences between a pair of sequences (treating gap characters inserted during alignment as a fifth nucleotide state).

Given the breakpoint positions and parental sequences associated with a particular recombination event, an in silico generated recombinant sequence with breakpoint positions corresponding to those of a real recombinant was produced from the minor and major parental sequences. In silico generated recombinants of this type were called “mimic” or M-recombinants because, although they resembled the actual recombinants as these were at the moment of their generation, they were not expected to be identical to them. This is because the parental sequences, rather than being the actual parental sequences of the recombinant (it is extremely unlikely that these would ever be sampled), were simply those identified in our dataset as most closely resembling the actual parents.

For each detected recombination event we refer hereafter to the 5′ breakpoint in its corresponding M-recombinant as the “start position”, the 3′ breakpoint as the “end position”, and the number of sites differing between the major and minor parents between these two positions as the “event-length”. For each of the M-recombinants, multiple simulated recombinant sequences, called S-recombinants, were then generated from the same major and minor parental sequences and with the same event-length but with randomly selected start positions (such that the end positions were simply determined by the event-length). Start positions that resulted in end positions falling beyond either the end of the gene of interest (for the protein folding disruption tests) or the end of the genome alignment (for the RNA folding disruption tests) were excluded.

Keeping the event-length constant between the M- and S-recombinants ensured that these all had either exactly the same number of polymorphic translated amino acid sites (for the protein folding test) or exactly the same genetic distance (for the RNA folding tests) from both the major and minor parental sequences, something that was crucial for our permutation-based tests of recombination-induced protein and nucleic acid structural disruption. See Figure 1 for a diagrammatic representation of this procedure.
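
The generation of M- and S-recombinants described above can be illustrated with a minimal Python sketch for the nucleotide (RNA folding) case. It is not the authors' implementation. The function names, the use of plain aligned strings, the 0-based inclusive breakpoint coordinates, the uniform sampling of start positions with replacement, and the rule that the end position is the last variable site needed to reach the event-length are all assumptions made for illustration. The protein-folding variant, which counts non-synonymous codon sites within a gene, is not shown.

```python
import bisect
import random


def genetic_distance(seq_a, seq_b):
    """Count differing aligned sites; a gap ('-') simply acts as a fifth state."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def make_recombinant(major, minor, start, end):
    """Major-parent backbone carrying the minor-parent fragment start..end (inclusive)."""
    return major[:start] + minor[start:end + 1] + major[end + 1:]


def simulate_recombinants(major, minor, start, end, n, seed=None):
    """Return (M-recombinant, list of n S-recombinants) for one recombination event.

    Hypothetical helper: sampling scheme and coordinates are assumptions.
    """
    rng = random.Random(seed)
    # Alignment positions at which the two parents differ.
    variable = [i for i, (a, b) in enumerate(zip(major, minor)) if a != b]
    # Event-length: number of variable sites between the real breakpoints.
    event_length = genetic_distance(major[start:end + 1], minor[start:end + 1])
    if event_length == 0:
        raise ValueError("no variable sites between the breakpoints")

    # For each candidate start, the end is the event_length-th variable site
    # at or after it. Starts whose end would fall beyond the alignment are
    # excluded.
    windows = []
    for s in range(len(major)):
        k = bisect.bisect_left(variable, s)
        if k + event_length <= len(variable):
            windows.append((s, variable[k + event_length - 1]))

    mimic = make_recombinant(major, minor, start, end)
    simulated = [
        make_recombinant(major, minor, *rng.choice(windows)) for _ in range(n)
    ]
    return mimic, simulated
```

For example, calling `simulate_recombinants("AAAAAAAAAA", "ACAAGAATAC", 1, 4, n=100)` on these toy parents, which differ at four sites, gives an event-length of two, so the mimic and every simulated sequence differ from each parent at exactly two sites. Holding the distances to both parents fixed in this way is what makes the M-recombinant comparable with its S-recombinants in a permutation test.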

