SIRAC: Supervised Identification of Regions of Aberration in aCGH datasets.

Lai C, Horlings HM, van de Vijver MJ, van Beers EH, Nederlof PM, Wessels LF, Reinders MJ - BMC Bioinformatics (2007)

Bottom Line: SIRAC does not need any preprocessing of the aCGH datasets, and requires only a few intuitive parameters. We illustrate the features of the algorithm with the use of a simple artificial dataset. The results on two breast cancer datasets show promising outcomes that are in agreement with previous findings, but SIRAC better pinpoints the dissimilarities between the classes of interest.


Affiliation: Bioinformatics group, Delft University, Delft, The Netherlands. c.lai@tudelft.nl

ABSTRACT

Background: Array comparative genome hybridization (aCGH) provides information about genomic aberrations. Alterations in the DNA copy number may cause the cell to malfunction, leading to cancer. Therefore, the identification of DNA amplifications or deletions across tumors may reveal key genes involved in cancer and improve our understanding of the underlying biological processes associated with the disease.

Results: We propose a supervised algorithm for the analysis of aCGH data and the identification of regions of chromosomal alteration (SIRAC). We first determine the DNA-probes that are important to distinguish the classes of interest, and then evaluate in a systematic and robust scheme whether these relevant DNA-probes are closely located, i.e. form a region of amplification/deletion. SIRAC does not need any preprocessing of the aCGH datasets, and requires only a few intuitive parameters.

Conclusion: We illustrate the features of the algorithm with the use of a simple artificial dataset. The results on two breast cancer datasets show promising outcomes that are in agreement with previous findings, but SIRAC better pinpoints the dissimilarities between the classes of interest.



Figure 1: Description of the SIRAC algorithm. Illustration of the algorithmic steps of SIRAC, with the corresponding results for the NKI dataset. The data are labeled according to the cancer subtypes introduced by Sorlie and Perou [30, 31, 32]; in this example, the labeling is Luminal A subtype versus all other subtypes. In Step 1, the relevant DNA-probes are selected. Each DNA-probe is plotted at its genomic location with two circles of different colors representing the median value of the samples in each of the two classes. In Step 2, the vertical axis represents the different window sizes, and the blue lines along the genome (the horizontal axis) show the regions judged significant by the algorithm. In the final step, Step 3, the number of window sizes for which the location is judged significant by the hypergeometric test is shown along the vertical axis. The relevant region selected when s = 9 is highlighted by the red curve.

Figure 1 illustrates our algorithm SIRAC (Supervised Identification of Regions of Aberration in aCGH data). A detailed description is given in Appendix 1. An aCGH dataset D and its label set y provide the starting point. The procedure consists of three steps.
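
The three steps can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the procedure of Appendix 1: the ranking criterion in `select_relevant` (absolute difference of class medians), the significance level `alpha`, the way windows slide over probe indices, and the reading of `s` as a minimum number of supporting window sizes are all assumptions made here for illustration. All function names are hypothetical.

```python
from math import comb
from statistics import median


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n probes are drawn from N, of which K are relevant."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def select_relevant(data, labels, top_k):
    """Step 1 (assumed criterion): keep the top_k probes ranked by the
    absolute difference between the two class medians."""
    scores = []
    for probe in data:  # one list of sample values per probe, in genomic order
        a = [v for v, y in zip(probe, labels) if y == 1]
        b = [v for v, y in zip(probe, labels) if y == 0]
        scores.append(abs(median(a) - median(b)))
    ranked = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:top_k])


def window_support(relevant, n_probes, window_sizes, alpha=0.01):
    """Steps 2 and 3: for each probe, count the window sizes for which at
    least one significant window covers it."""
    support = [0] * n_probes
    K = len(relevant)
    for w in window_sizes:
        covered = [False] * n_probes
        for start in range(n_probes - w + 1):
            k = sum(1 for i in range(start, start + w) if i in relevant)
            if hypergeom_tail(k, n_probes, K, w) < alpha:
                for i in range(start, start + w):
                    covered[i] = True
        for i, is_covered in enumerate(covered):
            support[i] += is_covered
    return support


def regions(support, s):
    """Contiguous runs of probes supported by at least s window sizes
    (s >= 1), returned as (first_index, last_index) pairs."""
    out, start = [], None
    for i, v in enumerate(support + [0]):  # trailing 0 closes an open run
        if v >= s and start is None:
            start = i
        elif v < s and start is not None:
            out.append((start, i - 1))
            start = None
    return out
```

Given a dataset `D` stored as one list of sample values per probe (in genomic order) and a binary label list `y`, a call sequence would be `select_relevant(D, y, top_k)`, then `window_support(...)` over a range of window sizes, then `regions(support, s)`. Counting support across several window sizes, rather than fixing a single size, is what lets a threshold such as `s` trade sensitivity against robustness.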

