CGHnormaliter: an iterative strategy to enhance normalization of array CGH data with imbalanced aberrations.

van Houte BP, Binsl TW, Hettling H, Pirovano W, Heringa J - BMC Genomics (2009)

Bottom Line: Results were compared to a conventional normalization approach and two more recent state-of-the-art aCGH normalization strategies. Our findings show that, compared to these three methods, CGHnormaliter yields a higher specificity and precision in terms of identifying the 'true' copy numbers. We demonstrate that the normalization of aCGH data can be significantly enhanced using an iterative procedure that effectively eliminates the effect of imbalanced copy numbers.


Affiliation: Centre for Integrative Bioinformatics VU (IBIVU), VU University Amsterdam, De Boelelaan 1081A, 1081 HV Amsterdam, the Netherlands.

ABSTRACT

Background: Array comparative genomic hybridization (aCGH) is a popular technique for detection of genomic copy number imbalances. These play a critical role in the onset of various types of cancer. In the analysis of aCGH data, normalization is deemed a critical pre-processing step. In general, aCGH normalization approaches are similar to those used for gene expression data, although the two data types differ inherently. A particular problem with aCGH data is that imbalanced copy numbers lead to improper normalization using conventional methods.

Results: In this study we present a novel method, called CGHnormaliter, which addresses this issue by means of an iterative normalization procedure. First, provisional balanced copy numbers are identified and subsequently used for normalization. These two steps are then iterated to refine the normalization. We tested our method on three well-studied tumor-related aCGH datasets with experimentally confirmed copy numbers. Results were compared to a conventional normalization approach and two more recent state-of-the-art aCGH normalization strategies. Our findings show that, compared to these three methods, CGHnormaliter yields a higher specificity and precision in terms of identifying the 'true' copy numbers.

Conclusion: We demonstrate that the normalization of aCGH data can be significantly enhanced using an iterative procedure that effectively eliminates the effect of imbalanced copy numbers. This also leads to a more reliable assessment of aberrations. An R-package containing the implementation of CGHnormaliter is available at http://www.ibi.vu.nl/programs/cghnormaliterwww.
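
The two-step loop described in the Results (identify provisionally balanced probes, normalize on them, repeat) might be sketched roughly as follows. This is only an illustration under strong assumptions, not the CGHnormaliter implementation: the abstract does not specify how balanced probes are called or how the normalization is computed, so a simple absolute-value threshold and median centering are substituted here. The function name, the `threshold`, `max_iter` and `tol` parameters, and the toy values are all hypothetical.

```python
from statistics import median

def iterative_normalize(log2_ratios, threshold=0.3, max_iter=10, tol=1e-6):
    """Toy iterative normalization (assumed stand-in, not the published method)."""
    values = list(log2_ratios)
    for _ in range(max_iter):
        # Step 1: provisionally treat probes close to zero as balanced.
        balanced = [v for v in values if abs(v) < threshold]
        if not balanced:
            break
        # Step 2: normalize all probes using the balanced probes only.
        shift = median(balanced)
        values = [v - shift for v in values]
        # Stop once the correction has become negligible.
        if abs(shift) < tol:
            break
    return values

# Hypothetical sample: four balanced probes carrying a positive offset,
# and six gained probes (a majority) at a higher level.
raw = [0.25, 0.28, 0.31, 0.34] + [0.88] * 6
result = iterative_normalize(raw)
balanced_after = result[:4]
gained_after = result[4:]
```

In this toy input the first pass sees only two of the four balanced probes, and later passes pick up the rest, so the balanced group ends up centered near zero while the gained probes stay clearly positive. A plain median over all ten values would instead be dominated by the gained majority.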



Figure 3: Example of the effects of over-normalization using global-median normalization. Calling results on an ALL tumor sample (sample 4) after (A) global-median and (B) CGHnormaliter normalization are shown. In these figures normalized log2 intensity ratios and segments are represented by dots and blue horizontal lines, respectively. Aberration probabilities are indicated by the length of the green downward (gain) and red upward (loss) bars. Note that segments are designated gain or loss if their probabilities exceed 0.5. G-banding and FISH analyses revealed gains in 14 chromosomes (4, 5, 6, 7, 8, 10, 11, 12, 14, 17, 18, 21, 22, 23(X)), most of which are confirmed using CGHnormaliter. Over-normalization caused by global-median normalization instead leads to many incorrect calls.

Mentions: Figure 2A displays the average performance of all methods on the ALL dataset. It is clear from this figure that global-median normalization is outperformed by all other methods. popLowess and CGHnormaliter yield comparable results for all evaluation criteria (0.81 on average). The method of Chen et al. performs slightly worse (0.77 on average), whereas global-median normalization scores considerably lower for sensitivity (0.57) and precision (0.62). We also investigated the underlying causes of the inferior performance of global-median normalization. As expected, we found that global-median normalization does not properly recover the normal copy number level, particularly in samples with a large number of imbalanced aberrations. In such cases, 'over-normalization' of the data occurs, leading to excessively shifted spot intensity ratios. A salient example is given in Figure 3, which shows the calling results of a tumor sample after global-median and CGHnormaliter normalization. In this sample, gains were experimentally verified in 14 out of 24 chromosomes. With the global-median approach the median is rather high, leading to an overestimation of the number of losses and an underestimation of the number of gains. In fact, only 11 out of 14 gains were (partially) recognized. CGHnormaliter (and also popLowess) attempts to correct for this problem and is able to properly identify 13 gains. Finally, in Table 2 we compare the effect of each normalization method on the resulting M values. The alternative strategies clearly lead to considerably different shifts in the M values, whereas the final calling results are more similar (see Figure 2A).
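
The over-normalization effect described above can be illustrated with a small noise-free example. It assumes, purely for illustration, one log2 ratio per chromosome, a gain level of about log2(3/2) ≈ 0.58, and a balanced level of 0; none of these values come from the paper's data, and the function name is hypothetical.

```python
from statistics import median

def global_median_normalize(log2_ratios):
    """Subtract the median of all log2 ratios, so the overall median becomes 0."""
    shift = median(log2_ratios)
    return [r - shift for r in log2_ratios]

# Hypothetical sample mirroring the proportions in the text:
# 14 of 24 chromosomes gained, 10 balanced.
GAIN = 0.58  # roughly log2(3/2), an assumed single-copy gain level
raw = [GAIN] * 14 + [0.0] * 10

normalized = global_median_normalize(raw)
gained_after = normalized[:14]     # gained chromosomes after normalization
balanced_after = normalized[14:]   # balanced chromosomes after normalization
```

Because the gained chromosomes form the majority, the median sits at the gain level. After subtraction the gained chromosomes land at 0 and look balanced, while the truly balanced ones are pushed to about -0.58 and look like losses, which matches the pattern of missed gains and spurious losses reported for the global-median approach.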

