A scale-space method for detecting recurrent DNA copy number changes with analytical false discovery rate control.

van Dyk E, Reinders MJ, Wessels LF - Nucleic Acids Res. (2013)

Bottom Line: The method does not require segmentation or calling on the input dataset and therefore reduces the potential loss of information due to discretization. An important characteristic of the approach is that the error rate is controlled across all scales and that the algorithm outputs a single profile of significant events selected from the appropriate scales. Importantly, ADMIRE detects focal events that are missed by GISTIC, including two events involving known glioma tumor-suppressor genes: CDKN2C and NF1.


Affiliation: Bioinformatics and Statistics group, Division of Molecular Carcinogenesis, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands.

ABSTRACT
Tumor formation is partially driven by DNA copy number changes, which are typically measured using array comparative genomic hybridization, SNP arrays and DNA sequencing platforms. Many techniques are available for detecting recurring aberrations across multiple tumor samples, including CMAR, STAC, GISTIC and KC-SMART. GISTIC is widely used and detects both broad and focal (potentially overlapping) recurring events. However, GISTIC performs false discovery rate control on probes instead of events. Here we propose Analytical Multi-scale Identification of Recurrent Events (ADMIRE), a multi-scale Gaussian smoothing approach for the detection of both broad and focal (potentially overlapping) recurring copy number alterations. Importantly, false discovery rate control is performed analytically (no need for permutations) on events rather than probes. The method does not require segmentation or calling on the input dataset and therefore reduces the potential loss of information due to discretization. An important characteristic of the approach is that the error rate is controlled across all scales and that the algorithm outputs a single profile of significant events selected from the appropriate scales. We perform extensive simulations and showcase the method's utility on a glioblastoma SNP array dataset. Importantly, ADMIRE detects focal events that are missed by GISTIC, including two events involving known glioma tumor-suppressor genes: CDKN2C and NF1.
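The multi-scale Gaussian smoothing idea can be illustrated with a minimal sketch. It is not the ADMIRE implementation: the function names, the kernel widths and the toy data are assumptions made for illustration, and the analytical event-based FDR step is not reproduced. The sketch sums per-sample profiles (no segmentation or calling) and smooths the aggregate at several scales, so that a focal event stands out at a narrow kernel and a broad event at a wide one.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=4.0):
    """Discrete Gaussian kernel with unit sum, truncated at truncate * sigma."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def scale_space(profiles, sigmas):
    """Sum profiles (samples x probes) and smooth the sum at each scale."""
    aggregate = np.asarray(profiles, dtype=float).sum(axis=0)
    smoothed = {}
    for sigma in sigmas:
        # mode="same" keeps the probe axis length (zero padding at the ends).
        smoothed[sigma] = np.convolve(aggregate, gaussian_kernel(sigma), mode="same")
    return aggregate, smoothed

# Hypothetical toy data: 20 samples, 500 probes, Gaussian noise.
rng = np.random.default_rng(0)
profiles = rng.normal(0.0, 0.3, size=(20, 500))
profiles[:, 100:300] += 0.2   # broad recurrent gain (assumed)
profiles[:, 400:405] -= 1.0   # focal recurrent loss (assumed)

aggregate, smoothed = scale_space(profiles, sigmas=[2, 8, 32])
focal_peak = int(np.argmin(smoothed[2]))   # probe index of the strongest loss at the finest scale
broad_level = float(smoothed[32][200])     # smoothed gain in the middle of the broad event
```

In this toy setting the focal loss dominates the finest scale, while the widest kernel averages it away and retains the broad gain. Deciding which extrema are significant at which scale is the part the paper handles analytically.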


Figure 8: Comparison of recurring events detected by ADMIRE and GISTIC2.0 on the glioma dataset. (A) Summary of the recurrent aberrations found by both ADMIRE and GISTIC2.0 on the entire genome. (A.I) The SNP array profiles for 141 glioma samples. Red (green) represents amplifications (deletions). (A.II) The sum of all the SNP array profiles. (A.III) A multi-level representation of the recurring events found by ADMIRE at 25% event-based FDR. The first recursive level shows all the broad and focal events that are not embedded in broad events. The second level shows more focal (or less broad) events embedded in broad first-level events, etc. (A.IV) Results found by GISTIC2.0 at 25% probe-based FDR. The first level (+1/−1 for gains or losses, respectively) represents all the broad recurrent events found at the chromosome arm level. After removing segments that stretch across whole chromosome arms, all segments with q-values below 0.25 are represented on the second level. Finally, focal regions are detected using the RegBounder algorithm and represented on the third level. In panels A.III and A.IV, red events (positive levels) represent recurring gains (levels move upwards) and black events (negative levels) represent deletions (levels move downwards). (B) A zoom of the result in Panel A, showing the first part of chromosome 1p. (C) The top recursive level (most focal) event found by ADMIRE containing the CHD5 gene. GISTIC2.0 finds a much more focal area close to CHD5; however, the aggregated profile in (B.II) indicates that ADMIRE cannot call a focal event with high significance at this point. (D) The recurring region found by ADMIRE containing the known glioma tumor suppressor gene CDKN2C, which was missed by GISTIC2.0.
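The recursive levels in panel A.III follow a simple containment rule, sketched below. This illustrates only the display convention described in the caption, not how ADMIRE selects events; the function name and the probe coordinates are hypothetical. An event that lies inside no other event gets level 1, and an embedded event gets one level more than the deepest event that contains it.

```python
def nesting_levels(events):
    """Map each (start, end) event to 1 + the level of the deepest event containing it."""
    # Process the widest events first, so that every container already has
    # a level by the time the events embedded in it are visited.
    ordered = sorted(set(events), key=lambda e: e[1] - e[0], reverse=True)
    levels = {}
    for start, end in ordered:
        containing = [
            level
            for (s, e), level in levels.items()
            if s <= start and end <= e
        ]
        levels[(start, end)] = 1 + max(containing, default=0)
    return levels

# Hypothetical events as (start, end) probe indices.
events = [(0, 1000), (100, 400), (150, 200), (2000, 2050)]
levels = nesting_levels(events)
```

Here `(0, 1000)` and `(2000, 2050)` sit on the first level, `(100, 400)` on the second and `(150, 200)` on the third. For deletions the same depth would be plotted with a negative sign.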

Mentions: We compare the recurring events found by ADMIRE and by the latest version of GISTIC2.0 at 25% FDR on the glioma dataset described earlier. The results in Figure 8 reveal that ADMIRE finds many more events (223 focal and broad events in total) than GISTIC2.0 (50 focal and broad events). All the known glioma tumor suppressors and oncogenes found by GISTIC2.0 are also recovered by ADMIRE. Although GISTIC2.0 performs probe-based FDR control, and is therefore expected to be optimistic (see Figure 4), there are many sources of power loss that are overcome by ADMIRE as follows:

