Limits...
Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays.

Seiser EL, Innocenti F - Cancer Inform (2015)

Bottom Line: QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated.Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods.Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.

View Article: PubMed Central - PubMed

Affiliation: Center for Pharmacogenomics and Individualized Therapy, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

ABSTRACT
Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.

No MeSH data available.


Related in: MedlinePlus

Size of CNVs detected in the HapMap samples using QuantiSNP, PennCNV, and GenoCN. The HMM-based CNV detection tools were applied to Illumina Human610-Quad BeadChip v1.0 data from three HapMap samples of European ancestry (NA06985, NA06991, and NA06993). For each sample, boxplots were generated for CNV sizes from each HMM-based detection method. Boxplots on the left are CNV sizes measured in genomic length (base pairs), and boxplots on the right are CNV sizes measured by the number of genotype microarray probes in the detected region.
© Copyright Policy - open-access
Related In: Results  -  Collection


getmorefigures.php?uid=PMC4310714&req=5

f1-cin-suppl.7-2014-077: Size of CNVs detected in the HapMap samples using QuantiSNP, PennCNV, and GenoCN. The HMM-based CNV detection tools were applied to Illumina Human610-Quad BeadChip v1.0 data from three HapMap samples of European ancestry (NA06985, NA06991, and NA06993). For each sample, boxplots were generated for CNV sizes from each HMM-based detection method. Boxplots on the left are CNV sizes measured in genomic length (base pairs), and boxplots on the right are CNV sizes measured by the number of genotype microarray probes in the detected region.

Mentions: To provide a performance evaluation between the three HMM-based CNV detection algorithms, data for three HapMap individuals of European ancestry (NA06985, NA06991, and NA06993) genotyped on the Illumina Human610-Quad BeadChip v1.0 were obtained from the NCBI GEO database44 (GEO accession: GSE17205). CNV detection for these samples was performed using the default settings in Quanti SNP, PennCNV, and GenoCN. Total CNV regions were identified, along with their associated copy number state, and region overlap between methods was defined as regions having at least one shared probe and copy numbers of the same type (gain or deletion). QuantiSNP and GenoCN identify more CNV regions in these samples when compared to PennCNV (Table 1). Moreover, the majority of CNV regions identified by PennCNV overlap with both QuantiSNP and GenoCN, and only a limited amount of unique CNV regions are identified by PennCNV. In contrast, the majority of CNV regions identified by QuantiSNP and GenoCN in the HapMap samples are unique to the algorithm. The distribution of CNV size for each sample, in terms of kilobases and also with regard to the number of probes on the genotyping array, was compared between detection methods (Fig. 1). For all three samples, PennCNV identified regions with the largest size (both in length and number of probes), whereas the CNV regions identified by GenoCN were predominantly the smallest (both in length and number of probes). Interestingly, instances were observed in which QuantiSNP and GenoCN identified multiple CNV regions within a single CNV region identified by another algorithm.


Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays.

Seiser EL, Innocenti F - Cancer Inform (2015)

Size of CNVs detected in the HapMap samples using QuantiSNP, PennCNV, and GenoCN. The HMM-based CNV detection tools were applied to Illumina Human610-Quad BeadChip v1.0 data from three HapMap samples of European ancestry (NA06985, NA06991, and NA06993). For each sample, boxplots were generated for CNV sizes from each HMM-based detection method. Boxplots on the left are CNV sizes measured in genomic length (base pairs), and boxplots on the right are CNV sizes measured by the number of genotype microarray probes in the detected region.
© Copyright Policy - open-access
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC4310714&req=5

f1-cin-suppl.7-2014-077: Size of CNVs detected in the HapMap samples using QuantiSNP, PennCNV, and GenoCN. The HMM-based CNV detection tools were applied to Illumina Human610-Quad BeadChip v1.0 data from three HapMap samples of European ancestry (NA06985, NA06991, and NA06993). For each sample, boxplots were generated for CNV sizes from each HMM-based detection method. Boxplots on the left are CNV sizes measured in genomic length (base pairs), and boxplots on the right are CNV sizes measured by the number of genotype microarray probes in the detected region.
Mentions: To provide a performance evaluation between the three HMM-based CNV detection algorithms, data for three HapMap individuals of European ancestry (NA06985, NA06991, and NA06993) genotyped on the Illumina Human610-Quad BeadChip v1.0 were obtained from the NCBI GEO database44 (GEO accession: GSE17205). CNV detection for these samples was performed using the default settings in Quanti SNP, PennCNV, and GenoCN. Total CNV regions were identified, along with their associated copy number state, and region overlap between methods was defined as regions having at least one shared probe and copy numbers of the same type (gain or deletion). QuantiSNP and GenoCN identify more CNV regions in these samples when compared to PennCNV (Table 1). Moreover, the majority of CNV regions identified by PennCNV overlap with both QuantiSNP and GenoCN, and only a limited amount of unique CNV regions are identified by PennCNV. In contrast, the majority of CNV regions identified by QuantiSNP and GenoCN in the HapMap samples are unique to the algorithm. The distribution of CNV size for each sample, in terms of kilobases and also with regard to the number of probes on the genotyping array, was compared between detection methods (Fig. 1). For all three samples, PennCNV identified regions with the largest size (both in length and number of probes), whereas the CNV regions identified by GenoCN were predominantly the smallest (both in length and number of probes). Interestingly, instances were observed in which QuantiSNP and GenoCN identified multiple CNV regions within a single CNV region identified by another algorithm.

Bottom Line: QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated.Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods.Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.

View Article: PubMed Central - PubMed

Affiliation: Center for Pharmacogenomics and Individualized Therapy, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

ABSTRACT
Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.

No MeSH data available.


Related in: MedlinePlus