The evolutionary rates of HCV estimated with subtype 1a and 1b sequences over the ORF length and in different genomic regions.

Yuan M, Lu T, Li C, Lu L - PLoS ONE (2013)

Bottom Line: Significantly lower rates were estimated for 1b and some of the rate distribution curves resulted in a one-sided truncation, particularly under the exponential model. Therefore, an applied estimation of the HCV epidemic history requires the proper selection of the rate priors, which should match the actual dataset so that they can fit for the subtype, the genomic region and even the length. By referencing the findings here, future evolutionary analysis of the HCV subtype 1a and 1b datasets may become more accurate and hence prove useful for tracing their patterns.


Affiliation: Department of Pathology and Laboratory Medicine, Center for Viral Oncology, University of Kansas Medical Center, Kansas City, Kansas, United States of America.

ABSTRACT

Background: Considerable progress has been made in HCV evolutionary analysis since the BEAST software was released. However, prior information, especially the prior evolutionary rate, which plays a critical role in BEAST analysis, is always difficult to ascertain because of various uncertainties. Providing a proper prior HCV evolutionary rate is thus of great importance.

Methods/results: A total of 176 full-length sequences of HCV subtype 1a and 144 of subtype 1b were assembled, taking into consideration the balance of the sampling dates and an even dispersion in the phylogenetic trees. According to the HCV genomic organization and biological functions, each dataset was partitioned into nine genomic regions and two routinely amplified regions. A uniform prior rate was applied to the BEAST analysis for each region and for the entire ORF. All the posterior rates obtained for 1a are on the order of 10⁻³ substitutions/site/year and follow a bell-shaped distribution. Significantly lower rates were estimated for 1b, and some of the rate distribution curves showed a one-sided truncation, particularly under the exponential model. This indicates that some of the rates for subtype 1b are less accurate, so they were adjusted by including more sequences to improve the temporal structure.

Conclusion: Evolutionary patterns differ among the various HCV subtypes and genomic regions. Therefore, an applied estimation of the HCV epidemic history requires the proper selection of the rate priors, which should match the actual dataset so that they fit the subtype, the genomic region, and even the length. By referencing the findings here, future evolutionary analyses of HCV subtype 1a and 1b datasets may become more accurate and hence prove useful for tracing their patterns.


Figure 1. Root-to-tip regression to estimate the tMRCAs and clock rates. A simple linear regression of the root-to-tip genetic distances against the sampling dates was performed using the Path-o-gen software. The root was chosen to maximize the coefficient of determination R². The vertical axis shows the genetic distance between each sample and the root, and the horizontal axis shows the sampling date (year). For subtype 1a (A), the mean evolutionary rate (the slope of the regression line) is 9.05E-4 substitutions/site/year and the tMRCA (the X-intercept) is located at 1941. For subtype 1b (B), the mean evolutionary rate is 4.82E-4 substitutions/site/year and the tMRCA is located at 1808.
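To make the caption's numbers concrete, the short Python sketch below plugs the reported slopes and X-intercepts into the line d = rate × (t − tMRCA). The function name `expected_distance` and the sampling year 2000 are assumptions chosen only for illustration; they do not come from the article.

```python
# Values reported in the figure caption (slope and X-intercept of each line).
RATE_1A, TMRCA_1A = 9.05e-4, 1941   # subtype 1a
RATE_1B, TMRCA_1B = 4.82e-4, 1808   # subtype 1b


def expected_distance(rate, tmrca, year):
    """Root-to-tip distance predicted by the line d = rate * (year - tmrca)."""
    return rate * (year - tmrca)


# How much faster 1a evolves than 1b, and how far apart the tMRCAs are.
rate_ratio = RATE_1A / RATE_1B        # about 1.88
tmrca_gap = TMRCA_1A - TMRCA_1B       # 133 years

# Predicted distances for a hypothetical sample collected in 2000.
d_1a_2000 = expected_distance(RATE_1A, TMRCA_1A, 2000)
d_1b_2000 = expected_distance(RATE_1B, TMRCA_1B, 2000)

print(f"rate ratio 1a/1b: {rate_ratio:.2f}")
print(f"tMRCA gap: {tmrca_gap} years")
print(f"predicted distance in 2000: 1a={d_1a_2000:.4f}, 1b={d_1b_2000:.4f}")
```

Although 1a has the steeper slope, the older intercept for 1b means a 1b sample from the same year is predicted to sit farther from its root.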

Mentions: Model testing showed GTR+Γ+I to be the best of the 24 models based on the AICc. Using this model, ML trees were reconstructed and root-to-tip regression analyses were performed (Figure 1). For 1a, the estimated linear regression function is d = 0.0009 × (t − 1941), where d is the distance from a sample to the selected root and t is the sampling date. For 1b, the function is d = 0.00048 × (t − 1808). The molecular rates and tMRCAs were taken from the slope and the X-intercept of the regression lines, respectively. The rates indicated that 1a evolved almost twice as fast as 1b and diverged approximately 133 years later. Root-to-tip regression was also performed for the nine genomic regions (Table 1). Unexpectedly, the molecular rate and the tMRCA estimated for the NS5B dataset of subtype 1b are unrealistic: the former is negative and the latter falls in the future, a result that may be ascribed to the stochastic nature of the substitution process. That the sequences sampled earlier show a greater divergence from the root than those sampled later may suggest that the evolution of the NS5B region of subtype 1b is not clock-like or, alternatively, may simply reflect one of the limitations of root-to-tip regression analysis [26]. Regardless, all of the results showed that the nine genomic regions of 1a have faster evolutionary rates than those of 1b, consistent with the analyses over the entire ORF and with the BEAST analyses described below.
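The regression step described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the Path-o-gen implementation: it takes root-to-tip distances as given and does not search for the root that maximizes R². The function name `root_to_tip_fit` and both data sets are hypothetical; the first set lies exactly on the reported 1a line, and the second is an invented case in which earlier samples are more divergent, mimicking the kind of result reported for NS5B of subtype 1b.

```python
import numpy as np


def root_to_tip_fit(dates, distances):
    """Fit d = rate * (t - tmrca) by least squares; return (rate, tmrca, r2)."""
    dates = np.asarray(dates, dtype=float)
    distances = np.asarray(distances, dtype=float)

    # Centre the dates so the fit is numerically well conditioned.
    t0 = dates.mean()
    rate, d_at_t0 = np.polyfit(dates - t0, distances, 1)

    # The X-intercept (distance zero) is the estimated tMRCA.
    tmrca = t0 - d_at_t0 / rate

    # Coefficient of determination of the fitted line.
    predicted = rate * (dates - t0) + d_at_t0
    ss_res = np.sum((distances - predicted) ** 2)
    ss_tot = np.sum((distances - distances.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return rate, tmrca, r2


# Hypothetical sampling years.
years = np.array([1980.0, 1990.0, 2000.0, 2010.0])

# Case 1: noise-free tips placed exactly on the reported 1a line.
clocklike = 0.0009 * (years - 1941.0)
rate, tmrca, r2 = root_to_tip_fit(years, clocklike)

# Case 2: invented data where earlier samples are MORE divergent.
reversed_d = np.array([0.08, 0.07, 0.06, 0.05])
neg_rate, neg_tmrca, _ = root_to_tip_fit(years, reversed_d)

print(f"clock-like: rate={rate:.2e}, tMRCA={tmrca:.0f}, R2={r2:.3f}")
print(f"reversed:   rate={neg_rate:.2e}, tMRCA={neg_tmrca:.0f}")
```

In the second case the slope is negative, so the X-intercept lands after the latest sampling date, which is why a negative rate and a tMRCA in the future appear together.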

