The Influence of Hepatitis C Virus Genetic Region on Phylogenetic Clustering Analysis.

Lamoury FM, Jacka B, Bartlett S, Bull RA, Wong A, Amin J, Schinkel J, Poon AF, Matthews GV, Grebely J, Dore GJ, Applegate TL - PLoS ONE (2015)

Bottom Line: Our results demonstrated that the genomic region of HCV analysed influenced phylogenetic tree topology and clustering results. The HCV Core region alone was not suitable for clustering analysis; NS5B concatenation, the inclusion of reference sequences and removal of HVR1 all influenced clustering outcome. The Core-E2 region, which represented the highest genetic diversity and longest sequence length in this study, provides an ideal method for clustering analysis to address a range of molecular epidemiological questions.


Affiliation: The Kirby Institute, University of New South Wales Australia, Sydney, Australia.

ABSTRACT
Sequencing is important for understanding the molecular epidemiology and viral evolution of hepatitis C virus (HCV) infection. To date, there is little standardisation among sequencing protocols, in part due to the high genetic diversity that is observed within HCV. This study aimed to develop a novel, practical sequencing protocol that covered both conserved and variable regions of the viral genome and to assess the influence of each subregion, sequence concatenation and unrelated reference sequences on phylogenetic clustering analysis. The Core to the hypervariable region 1 (HVR1) of envelope-2 (E2) and non-structural-5B (NS5B) regions of the HCV genome were amplified and sequenced from participants in the Australian Trial in Acute Hepatitis C (ATAHC), a prospective study of the natural history and treatment of recent HCV infection. Phylogenetic trees were constructed using a general time-reversible substitution model and sensitivity analyses were completed for every subregion. Pairwise distance, genetic distance and bootstrap support were computed to assess the impact of HCV region on clustering results as measured by the identification and percentage of participants falling within all clusters, cluster size, average patristic distance, and bootstrap value. The Robinson-Foulds metric was also used to compare phylogenetic trees among the different HCV regions. Our results demonstrated that the genomic region of HCV analysed influenced phylogenetic tree topology and clustering results. The HCV Core region alone was not suitable for clustering analysis; NS5B concatenation, the inclusion of reference sequences and removal of HVR1 all influenced clustering outcome. The Core-E2 region, which represented the highest genetic diversity and longest sequence length in this study, provides an ideal method for clustering analysis to address a range of molecular epidemiological questions.
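
To make the ideas of mean genetic distance and sequence concatenation concrete, the following is a minimal sketch, not the study's pipeline. It assumes a plain proportion-of-differing-sites distance (p-distance) rather than the general time-reversible model used in the study, and the participant labels, toy sequences and function names (`p_distance`, `mean_pairwise_distance`, `concatenate`) are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    This is a simplification; the study used a GTR substitution model.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_pairwise_distance(alignment):
    """Average p-distance over all pairs of sequences in an alignment."""
    pairs = list(combinations(alignment.values(), 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def concatenate(region_a, region_b):
    """Join two regions end to end for participants present in both."""
    return {name: region_a[name] + region_b[name]
            for name in region_a if name in region_b}


# Hypothetical toy alignments standing in for two sequenced regions.
core_e2 = {"P1": "ACGTACGTAC", "P2": "ACGTACGTAT", "P3": "ACGAACGTAT"}
ns5b = {"P1": "GGCCTT", "P2": "GGCCTT", "P3": "GGCATT"}
core_e2_ns5b = concatenate(core_e2, ns5b)

for label, aln in [("Core-E2", core_e2), ("NS5B", ns5b),
                   ("Core-E2_NS5B", core_e2_ns5b)]:
    print(label, round(mean_pairwise_distance(aln), 4))
```

Concatenation changes both the sequence length and the mean distance of the alignment, which is why the two were examined as separate factors when comparing regions.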



pone.0131437.g010: Weighted Robinson-Foulds tree distances among HCV regions, relative to the Core-E2_NS5B tree, compared with the mean genetic distance of ATAHC HCV sequences. Values are weighted Robinson-Foulds tree distances computed by RAxML1.3 for each HCV region relative to the reference region, Core-E2 concatenated to NS5B (length of 1684 bp and a mean genetic distance of 0.076).

Mentions: Robinson-Foulds (RF) tree distances, computed with the tree from Core-E2 concatenated to NS5B as the reference, were compared with either mean genetic distance (Fig 10) or sequence length (S8 and S9 Figs). Among the non-concatenated regions, RF distance accumulated (that is, tree topology diverged increasingly from the reference) from Core-E2 to Core, with NS5B demonstrating the highest RF distance.
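
As an illustration of what an RF distance measures, the sketch below counts the bipartitions (splits) that appear in one tree but not the other. It is a simplified, unweighted version under stated assumptions: the figure reports weighted RF distances computed by RAxML, whereas here trees are written as nested Python tuples, branch lengths and support values are ignored, and the participant labels and function names are hypothetical.

```python
def leaf_set(tree):
    """Return the set of leaf labels in a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Collect the non-trivial bipartitions induced by internal branches.

    Each split is stored as an unordered pair of leaf sets, so the result
    does not depend on where the tree happens to be rooted.
    """
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - clade
        # Splits with fewer than two leaves on a side occur in every tree.
        if len(clade) >= 2 and len(other) >= 2:
            found.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Unweighted RF distance: number of splits found in only one tree."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same set of leaves")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical five-participant trees; the first plays the reference role.
reference = ((("P1", "P2"), "P3"), ("P4", "P5"))
same_topology = (("P4", "P5"), ("P3", ("P2", "P1")))
one_swap = ((("P1", "P3"), "P2"), ("P4", "P5"))
rearranged = ((("P1", "P4"), "P3"), ("P2", "P5"))

for name, tree in [("same_topology", same_topology),
                   ("one_swap", one_swap),
                   ("rearranged", rearranged)]:
    print(name, rf_distance(reference, tree))
```

A distance of zero means the two trees imply the same groupings; larger values mean more groupings differ from the reference tree. The weighted variant additionally takes branch information into account instead of counting every split equally.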

