The Influence of Hepatitis C Virus Genetic Region on Phylogenetic Clustering Analysis.

Lamoury FM, Jacka B, Bartlett S, Bull RA, Wong A, Amin J, Schinkel J, Poon AF, Matthews GV, Grebely J, Dore GJ, Applegate TL - PLoS ONE (2015)

Bottom Line: Our results demonstrated that the genomic region of HCV analysed influenced phylogenetic tree topology and clustering results. The HCV Core region alone was not suitable for clustering analysis; NS5B concatenation, the inclusion of reference sequences and removal of HVR1 all influenced clustering outcome. The Core-E2 region, which represented the highest genetic diversity and longest sequence length in this study, provides an ideal method for clustering analysis to address a range of molecular epidemiological questions.

Affiliation: The Kirby Institute, University of New South Wales Australia, Sydney, Australia.

ABSTRACT
Sequencing is important for understanding the molecular epidemiology and viral evolution of hepatitis C virus (HCV) infection. To date, there is little standardisation among sequencing protocols, in part because of the high genetic diversity observed within HCV. This study aimed to develop a novel, practical sequencing protocol that covered both conserved and variable regions of the viral genome, and to assess the influence of each subregion, sequence concatenation and unrelated reference sequences on phylogenetic clustering analysis. The Core to the hypervariable region 1 (HVR1) of envelope-2 (E2) and non-structural-5B (NS5B) regions of the HCV genome were amplified and sequenced from participants in the Australian Trial in Acute Hepatitis C (ATAHC), a prospective study of the natural history and treatment of recent HCV infection. Phylogenetic trees were constructed using a general time-reversible substitution model, and sensitivity analyses were completed for every subregion. Pairwise distance, genetic distance and bootstrap support were computed to assess the impact of HCV region on clustering results, as measured by the identification and percentage of participants falling within all clusters, cluster size, average patristic distance, and bootstrap value. The Robinson-Foulds metric was also used to compare phylogenetic trees among the different HCV regions. Our results demonstrated that the genomic region of HCV analysed influenced phylogenetic tree topology and clustering results. The HCV Core region alone was not suitable for clustering analysis; NS5B concatenation, the inclusion of reference sequences and removal of HVR1 all influenced clustering outcome. The Core-E2 region, which represented the highest genetic diversity and longest sequence length in this study, provides an ideal method for clustering analysis to address a range of molecular epidemiological questions.
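To illustrate how a Robinson-Foulds comparison of two tree topologies works, the following is a minimal sketch in Python. It is not the software used in the study. Trees are written as nested tuples of leaf names, a representation assumed here purely for illustration, and the function names `leaves`, `bipartitions` and `robinson_foulds` are hypothetical. The distance is taken as the number of non-trivial leaf bipartitions found in one tree but not the other.

```python
def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree (illustrative format)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial splits of the leaf set induced by internal edges."""
    all_leaves = leaves(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - clade
        # Skip trivial splits (a single leaf, or nothing, on one side).
        if len(clade) > 1 and len(other) > 1:
            # Store the split as an unordered pair so rooting does not matter.
            splits.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Count splits present in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Hypothetical five-taxon trees that differ in one grouping (A,B versus A,C).
tree_core_e2 = ((("A", "B"), "C"), ("D", "E"))
tree_ns5b = ((("A", "C"), "B"), ("D", "E"))
rf_distance = robinson_foulds(tree_core_e2, tree_ns5b)
```

A distance of zero means the two trees imply the same groupings; larger values indicate more topological disagreement between regions.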

pone.0131437.g003: Flow chart describing the selection of ATAHC participants for inclusion in this analysis. Among the ATAHC participants (n = 163), 143 samples were tested and sequences were generated for Core-E2 and NS5B regions. Fifty participants with both Core-E2 and NS5B sequences were used for clustering analysis.

Mentions: Overall, samples with detectable HCV RNA were available from 143 of 163 participants enrolled in the ATAHC study. In total, 106 Core-E2 and 128 NS5B sequences were generated, giving success rates of 74% and 90%, respectively. The genotype distribution among these 128 samples was 49% GT1a, 40% GT3a, 6% GT1b, 3% GT2a, 2% GT2b and 1% GT6k. As a greater number of related sequences were found within GT1a participants, only GT1a participants for whom both Core-E2 and NS5B amplicons were available were included in the phylogenetic analysis (n = 50) (Fig 3).
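The reported success rates can be checked with a short calculation. The sketch below assumes the denominator is the 143 samples with detectable HCV RNA; the text does not state this explicitly, but it is the denominator that reproduces the 74% and 90% figures. The helper name `success_rate` is hypothetical.

```python
def success_rate(sequences_generated, samples_tested):
    """Return the sequencing success rate as a whole-number percentage."""
    return round(100 * sequences_generated / samples_tested)


SAMPLES_TESTED = 143  # assumed denominator: samples with detectable HCV RNA

core_e2_rate = success_rate(106, SAMPLES_TESTED)  # Core-E2 amplicons
ns5b_rate = success_rate(128, SAMPLES_TESTED)     # NS5B amplicons
print(f"Core-E2: {core_e2_rate}%, NS5B: {ns5b_rate}%")
```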

