Limits...
The CNVrd2 package: measurement of copy number at complex loci using high-throughput sequencing data.

Nguyen HT, Merriman TR, Black MA - Front Genet (2014)

Bottom Line: The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population.The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci.The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEF103A).

View Article: PubMed Central - PubMed

Affiliation: Department of Biochemistry, University of Otago Dunedin, New Zealand ; Department of Mathematics and Statistics, University of Otago Dunedin, New Zealand ; Department of Biochemistry, Virtual Institute of Statistical Genetics, University of Otago Dunedin, New Zealand.

ABSTRACT
Recent advances in high-throughout sequencing technologies have made it possible to accurately assign copy number (CN) at CN variable loci. However, current analytic methods often perform poorly in regions in which complex CN variation is observed. Here we report the development of a read depth-based approach, CNVrd2, for investigation of CN variation using high-throughput sequencing data. This methodology was developed using data from the 1000 Genomes Project from the CCL3L1 locus, and tested using data from the DEFB103A locus. In both cases, samples were selected for which paralog ratio test data were also available for comparison. The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population. Then a linear regression model is applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model to cluster segmentation scores into groups for individual CN counts. The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci. The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEF103A). CNVrd2 is available as an R package as part of the Bioconductor project: http://www.bioconductor.org/packages/release/bioc/html/CNVrd2.html.

No MeSH data available.


Related in: MedlinePlus

Concordance between the CNVrd and microarray based results. Window sizes are shown on the x-axis, while the y-axis shows the percentage of identical results.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC4117933&req=5

Figure 6: Concordance between the CNVrd and microarray based results. Window sizes are shown on the x-axis, while the y-axis shows the percentage of identical results.

Mentions: Our previous work was focused on loci having low copy-number ranges. Even though this present study is concentrated on loci with CN range from zero to >5 CN, we also compared with CNVrd (Nguyen et al., 2013) to assess the performance of the modified pipeline, especially in the steps requiring obtaining and adjusting segmentation scores for each sample. Loci used in this comparison had copy number ranging from 0 to 5, and their boundaries were known (Table 3 and Nguyen et al., 2013). Therefore, a simple normal mixture model was used in CNVrd2 to cluster segmentation scores into different groups. CNVrd2 had more stable results than those of CNVrd at almost all window sizes (Figures 5, 6). Merging sub-regions and adjusting boundaries (Step D and E, Figure 1) produced more reliable results for CNVrd2 at two complex regions (chr17:34436344-34481815 and chr1:110222301-110242933). At these two regions, CNVrd showed very low concordance with microarray-based approaches (Figure 6).


The CNVrd2 package: measurement of copy number at complex loci using high-throughput sequencing data.

Nguyen HT, Merriman TR, Black MA - Front Genet (2014)

Concordance between the CNVrd and microarray based results. Window sizes are shown on the x-axis, while the y-axis shows the percentage of identical results.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC4117933&req=5

Figure 6: Concordance between the CNVrd and microarray based results. Window sizes are shown on the x-axis, while the y-axis shows the percentage of identical results.
Mentions: Our previous work was focused on loci having low copy-number ranges. Even though this present study is concentrated on loci with CN range from zero to >5 CN, we also compared with CNVrd (Nguyen et al., 2013) to assess the performance of the modified pipeline, especially in the steps requiring obtaining and adjusting segmentation scores for each sample. Loci used in this comparison had copy number ranging from 0 to 5, and their boundaries were known (Table 3 and Nguyen et al., 2013). Therefore, a simple normal mixture model was used in CNVrd2 to cluster segmentation scores into different groups. CNVrd2 had more stable results than those of CNVrd at almost all window sizes (Figures 5, 6). Merging sub-regions and adjusting boundaries (Step D and E, Figure 1) produced more reliable results for CNVrd2 at two complex regions (chr17:34436344-34481815 and chr1:110222301-110242933). At these two regions, CNVrd showed very low concordance with microarray-based approaches (Figure 6).

Bottom Line: The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population.The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci.The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEF103A).

View Article: PubMed Central - PubMed

Affiliation: Department of Biochemistry, University of Otago Dunedin, New Zealand ; Department of Mathematics and Statistics, University of Otago Dunedin, New Zealand ; Department of Biochemistry, Virtual Institute of Statistical Genetics, University of Otago Dunedin, New Zealand.

ABSTRACT
Recent advances in high-throughout sequencing technologies have made it possible to accurately assign copy number (CN) at CN variable loci. However, current analytic methods often perform poorly in regions in which complex CN variation is observed. Here we report the development of a read depth-based approach, CNVrd2, for investigation of CN variation using high-throughput sequencing data. This methodology was developed using data from the 1000 Genomes Project from the CCL3L1 locus, and tested using data from the DEFB103A locus. In both cases, samples were selected for which paralog ratio test data were also available for comparison. The CNVrd2 method first uses observed read-count ratios to refine segmentation results in one population. Then a linear regression model is applied to adjust the results across multiple populations, in combination with a Bayesian normal mixture model to cluster segmentation scores into groups for individual CN counts. The performance of CNVrd2 was compared to that of two other read depth-based methods (CNVnator, cn.mops) at the CCL3L1 and DEFB103A loci. The highest concordance with the paralog ratio test method was observed for CNVrd2 (77.8/90.4% for CNVrd2, 36.7/4.8% for cn.mops and 7.2/1% for CNVnator at CCL3L1 and DEF103A). CNVrd2 is available as an R package as part of the Bioconductor project: http://www.bioconductor.org/packages/release/bioc/html/CNVrd2.html.

No MeSH data available.


Related in: MedlinePlus