An integrated map of HIV genome-wide variation from a population perspective.

Li G, Piampongsant S, Faria NR, Voet A, Pineda-Peña AC, Khouri R, Lemey P, Vandamme AM, Theys K - Retrovirology (2015)

Bottom Line: We found that amino acid diversity in monomeric proteins was higher than in multimeric proteins, and diversified positions were preferably located within human CD4 T cell and antibody epitopes. This first large-scale analysis provided a detailed mapping of HIV genomic diversity and highlighted drug-target regions conserved across different groups, subtypes and CRFs. Our findings suggest that, in addition to the impact of protein multimerization and immune selective pressure on HIV-1 diversity, HIV-human protein interactions are facilitated by high variability within intrinsically disordered structures.


ABSTRACT

Background: The HIV pandemic is characterized by extensive genetic variability, which has challenged the development of HIV drugs and vaccines. Although HIV genomes have been classified into different types, groups, subtypes and recombinants, a comprehensive study that maps HIV genome-wide diversity at the population level is still lacking to date. This study aims to characterize HIV genomic diversity in large-scale sequence populations, and to identify driving factors that shape HIV genome diversity.

Results: A total of 2996 full-length genomic sequences from 1705 patients infected with 16 major HIV groups, subtypes and circulating recombinant forms (CRFs) were analyzed along with structural, immunological and peptide inhibitor information. Average nucleotide diversity of HIV genomes was almost 50% between HIV-1 and HIV-2 types, 37.5% between HIV-1 groups, 14.7% between HIV-1 subtypes, 8.2% within individual HIV-1 subtypes and less than 1% within single patients. Along the HIV genome, diversity patterns and compositions of nucleotides and amino acids were highly similar across different groups, subtypes and CRFs. Current HIV-derived peptide inhibitors were predominantly derived from conserved, solvent accessible and intrinsically ordered structures in the HIV-1 subtype B genome. We identified these conserved regions in Capsid, Nucleocapsid, Protease, Integrase, Reverse transcriptase, Vpr and the GP41 N terminus as potential drug targets. In the analysis of factors that impact HIV-1 genomic diversity, we focused on protein multimerization, immunological constraints and HIV-human protein interactions. We found that amino acid diversity in monomeric proteins was higher than in multimeric proteins, and diversified positions were preferably located within human CD4 T cell and antibody epitopes. Moreover, intrinsic disorder regions in HIV-1 proteins coincided with high levels of amino acid diversity, facilitating a large number of interactions between HIV-1 and human proteins.

Conclusions: This first large-scale analysis provided a detailed mapping of HIV genomic diversity and highlighted drug-target regions conserved across different groups, subtypes and CRFs. Our findings suggest that, in addition to the impact of protein multimerization and immune selective pressure on HIV-1 diversity, HIV-human protein interactions are facilitated by high variability within intrinsically disordered structures.
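The nucleotide diversity percentages quoted in the Results can be read as average pairwise differences between aligned genomes. The abstract does not say how the distances were computed, so the following is only a minimal sketch that assumes a plain proportion of differing sites (p-distance) on pre-aligned sequences. The function names and the toy alignment are hypothetical and are not taken from the study.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: positions where either sequence has a gap or an
    ambiguous base (anything other than A, C, G, T) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def mean_pairwise_diversity(seqs) -> float:
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (made up for illustration, not HIV data).
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
print(f"{100 * mean_pairwise_diversity(toy_alignment):.1f}% mean pairwise diversity")
```

Under this reading, a within-subtype figure would use all pairs of genomes from one subtype, while a between-subtype figure would restrict the pairs to genomes from different subtypes.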


Figure 5: Characterization of HIV-derived peptide inhibitors. (A) Cartoon representation of GP41 structure. The red structure indicates the region from which peptide inhibitor T20 was derived (PDB: 3H01). (B) Bar plot of sequence similarities between peptide inhibitor sequences and the sequences of HIV-derived regions in the consensus genome of different HIV clades. The x-axis presents the HIV groups, subtypes and CRFs. The y-axis shows the sequence similarity between peptide inhibitor sequences and the sequences of HIV-derived regions in the consensus genomes of HIV groups, subtypes or CRFs. (C) Amino acid replacements between peptide inhibitor sequences and HIV-derived regions in the subtype B genome. The percentage values (%) are colored using heat maps. (D) Distribution (bee-swarm) plots of amino acid diversity in the full-length subtype B genome (black crosses), peptide-derived regions (blue diamonds) and peptide-derived regions of those inhibitors whose IC50/EC50 are less than 1 μM (red circles). Each shape represents the amino acid diversity at one protein position. Two-sample Kolmogorov-Smirnov tests were performed to compare diversity distributions (significance level: 0.05). (E) Plot of amino acid diversity (x-axis), disorder score (y-axis) and solvent accessible surface area of peptide-inhibitor-derived regions (contour map, darker red indicates larger accessible surface areas). GP41 inhibitor T20 is also annotated. For individual peptide inhibitors, the average amino acid diversity, disorder score and solvent accessible surface areas are shown in Additional file 1: Figure S9, S10 and S11, respectively.
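The two-sample Kolmogorov-Smirnov comparison named in panel D can be sketched in a few lines. This is an illustrative sketch only: it computes the KS statistic as the largest gap between the two empirical distribution functions and compares it with the standard large-sample critical value at the 0.05 level. The diversity values below are invented for the example and the function names are hypothetical. The study's own software and exact p-value method are not stated here.

```python
import math
from bisect import bisect_right


def ks_statistic(xs, ys) -> float:
    """Largest absolute difference between the two empirical CDFs."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    n, m = len(xs_sorted), len(ys_sorted)
    d = 0.0
    # The ECDF difference only changes at observed values, so it is
    # enough to evaluate both ECDFs at every data point.
    for v in set(xs_sorted) | set(ys_sorted):
        fx = bisect_right(xs_sorted, v) / n
        fy = bisect_right(ys_sorted, v) / m
        d = max(d, abs(fx - fy))
    return d


def ks_critical_value(n: int, m: int, alpha: float = 0.05) -> float:
    """Large-sample approximation of the rejection threshold for D."""
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)  # about 1.358 for alpha = 0.05
    return c_alpha * math.sqrt((n + m) / (n * m))


# Invented per-position diversity values, for illustration only.
genome_wide = [0.02, 0.05, 0.11, 0.20, 0.31, 0.08, 0.15, 0.40, 0.27, 0.12]
peptide_regions = [0.01, 0.02, 0.03, 0.05, 0.04, 0.02, 0.06, 0.03]

d = ks_statistic(genome_wide, peptide_regions)
threshold = ks_critical_value(len(genome_wide), len(peptide_regions))
print(f"D = {d:.3f}, threshold = {threshold:.3f}, reject H0: {d > threshold}")
```

The threshold formula is an asymptotic approximation, so for samples as small as the toy ones an exact or permutation-based p-value would be more appropriate.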

Mentions: We investigated the 121 HIV-derived peptide inhibitors reported between 1993 and 2013 (Additional file 2: Table S4). Figure 5A illustrates the GP41 structure and the GP41-derived region of T20 as an example of HIV-derived peptide inhibitors. Peptide inhibitors had on average a length of 25 amino acids (range: 3 to 73), a charge of +0.27 at pH 7.2 and a molecular weight of 2953 g/mol. The most common amino acids in these peptide inhibitors were leucine, glutamic acid and isoleucine (Additional file 1: Figure S8). Comparisons between the 121 peptide sequences and the consensus sequences of 16 HIV group, subtype and CRF genomes showed the highest sequence similarity with subtype B (79.8%) (Figure 5B). Aspartic acid to asparagine (25.7%) was the most common amino acid substitution between the consensus subtype B sequence and the peptide inhibitor sequences (Figure 5C).
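The similarity and substitution figures above can be illustrated with a small sketch. It assumes that "sequence similarity" means percent identity over positions already aligned one-to-one between a consensus region and a peptide, which the excerpt does not confirm. The sequences and function names are hypothetical and were made up for the example.

```python
from collections import Counter


def percent_identity(reference: str, peptide: str) -> float:
    """Percentage of aligned positions with the same residue."""
    if len(reference) != len(peptide) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == p for r, p in zip(reference, peptide))
    return 100.0 * matches / len(reference)


def substitution_counts(pairs) -> Counter:
    """Count (reference residue, peptide residue) mismatches over all pairs."""
    counts = Counter()
    for reference, peptide in pairs:
        for r, p in zip(reference, peptide):
            if r != p:
                counts[(r, p)] += 1
    return counts


# Made-up consensus region and peptide, aligned without gaps.
toy_pairs = [("LEDKWASLW", "LENKWASLF")]

for reference, peptide in toy_pairs:
    print(f"{percent_identity(reference, peptide):.1f}% identity")

for (ref_aa, pep_aa), n in substitution_counts(toy_pairs).most_common():
    print(f"{ref_aa} -> {pep_aa}: {n}")
```

Dividing each count by the total number of mismatches would give percentages comparable in form to the aspartic acid to asparagine share reported above.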

