An integrated map of HIV genome-wide variation from a population perspective.

Li G, Piampongsant S, Faria NR, Voet A, Pineda-Peña AC, Khouri R, Lemey P, Vandamme AM, Theys K - Retrovirology (2015)

Bottom Line: We found that amino acid diversity in monomeric proteins was higher than in multimeric proteins, and diversified positions were preferably located within human CD4 T cell and antibody epitopes. This first large-scale analysis provided a detailed mapping of HIV genomic diversity and highlighted drug-target regions conserved across different groups, subtypes and CRFs. Our findings suggest that, in addition to the impact of protein multimerization and immune selective pressure on HIV-1 diversity, HIV-human protein interactions are facilitated by high variability within intrinsically disordered structures.


ABSTRACT

Background: The HIV pandemic is characterized by extensive genetic variability, which has challenged the development of HIV drugs and vaccines. Although HIV genomes have been classified into different types, groups, subtypes and recombinants, a comprehensive study that maps HIV genome-wide diversity at the population level is still lacking to date. This study aims to characterize HIV genomic diversity in large-scale sequence populations, and to identify driving factors that shape HIV genome diversity.

Results: A total of 2996 full-length genomic sequences from 1705 patients infected with 16 major HIV groups, subtypes and circulating recombinant forms (CRFs) were analyzed along with structural, immunological and peptide inhibitor information. Average nucleotide diversity of HIV genomes was almost 50% between HIV-1 and HIV-2 types, 37.5% between HIV-1 groups, 14.7% between HIV-1 subtypes, 8.2% within individual HIV-1 subtypes and less than 1% within single patients. Along the HIV genome, diversity patterns and compositions of nucleotides and amino acids were highly similar across different groups, subtypes and CRFs. Current HIV-derived peptide inhibitors were predominantly derived from conserved, solvent accessible and intrinsically ordered structures in the HIV-1 subtype B genome. We identified these conserved regions in Capsid, Nucleocapsid, Protease, Integrase, Reverse transcriptase, Vpr and the GP41 N terminus as potential drug targets. In the analysis of factors that impact HIV-1 genomic diversity, we focused on protein multimerization, immunological constraints and HIV-human protein interactions. We found that amino acid diversity in monomeric proteins was higher than in multimeric proteins, and diversified positions were preferably located within human CD4 T cell and antibody epitopes. Moreover, intrinsic disorder regions in HIV-1 proteins coincided with high levels of amino acid diversity, facilitating a large number of interactions between HIV-1 and human proteins.

Conclusions: This first large-scale analysis provided a detailed mapping of HIV genomic diversity and highlighted drug-target regions conserved across different groups, subtypes and CRFs. Our findings suggest that, in addition to the impact of protein multimerization and immune selective pressure on HIV-1 diversity, HIV-human protein interactions are facilitated by high variability within intrinsically disordered structures.


Figure 2: Plots of amino acid and nucleotide diversity in the HIV full-length genome. (A) Amino acid diversity along the HIV full-length genome using sliding windows (window size: 100 amino acids; also see the plots of exact diversity values in Additional file 1: Figure S5). Each colored plot shows the density of amino acid diversity for one HIV group, subtype or CRF genome, as indicated by the figure legend. Six layers are shown beneath the plots: (1) HIV-1 protein regions (HXB2 reference), concatenated and shown with abbreviated names (e.g. MA: matrix); (2) peptide-inhibitor-derived regions; (3) CD8+ T cell epitope positions; (4) CD4+ T cell epitope positions; (5) antibody epitope positions; (6) HIV-2 protein regions (BEN reference). (B) Nucleotide diversity along the full-length HIV genome using sliding windows (window size: 300 nucleotides; also see the plots of exact diversity values in Additional file 1: Figure S6). Each colored plot shows the density of nucleotide diversity for one HIV group, subtype or CRF genome, as indicated by the figure legend. Annotated HIV-1 and HIV-2 reference genomes are shown beneath; each track contains one open reading frame (ORF). Long terminal regions in the HIV genome are not shown. (C) Contour map of inter-clade amino acid diversity between HIV-1 subtype B and the other HIV genomes. Inter-clade amino acid diversity was calculated with a sliding window of 30 amino acids over the HIV genome (low: ≤1 AA difference, high: ≥25 AA differences). The five colored layers beneath the contour map are annotated as in (A).

Mentions: We next quantified genomic diversity within and between individual HIV clades (Figure 1C, Additional file 1: Figure S3). Within each HIV clade, amino acid diversity was consistently higher than nucleotide diversity (Figure 1D). CRF01_AE showed the lowest genomic diversity (nucleotide: 5.7%, amino acid: 8.7%) among the 10 HIV-1 subtypes with at least 10 sequences available (Figure 1D). Sequence variability was not uniformly distributed along the full-length HIV genome, but similar patterns were consistently observed in HIV group, subtype and CRF genomes at the nucleotide and amino acid levels (Figure 2A, B). Moreover, the estimated geographical distribution of HIV-1 genomic diversity (Additional file 1: Figure S4) showed a good agreement with the reported geographical distribution of HIV-1 subtypes [29].
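The sliding-window diversity profiles described for Figure 2 can be illustrated with a minimal sketch. It assumes that diversity within a window is the mean proportion of differing positions over all sequence pairs (a simple p-distance) and that positions with an alignment gap in either sequence are skipped; the excerpt here does not state the exact distance measure the authors used, so this is an illustration only. The function names `pairwise_diversity` and `sliding_window_diversity` are hypothetical.

```python
from itertools import combinations


def pairwise_diversity(seqs, start, end):
    """Mean proportion of differing positions over all sequence pairs
    within the alignment columns [start, end).

    Assumption: columns with a gap ('-') in either sequence are skipped.
    """
    total, pairs = 0.0, 0
    for a, b in combinations(seqs, 2):
        compared = diffs = 0
        for x, y in zip(a[start:end], b[start:end]):
            if x == "-" or y == "-":
                continue
            compared += 1
            diffs += x != y
        if compared:
            total += diffs / compared
            pairs += 1
    return total / pairs if pairs else 0.0


def sliding_window_diversity(seqs, window, step=1):
    """Diversity profile along an alignment of equal-length sequences.

    `window` is measured in alignment columns (for example 100 for the
    amino acid plot or 300 for the nucleotide plot in Figure 2).
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    return [
        pairwise_diversity(seqs, i, i + window)
        for i in range(0, length - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy alignment, not real HIV data.
    toy = ["AAAA", "AAAT", "AATT"]
    print(sliding_window_diversity(toy, window=2))
```

On real data the input would be a codon-aware multiple alignment of one clade's genomes, and the same routine applied per clade would give one profile per group, subtype or CRF.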

