Experimental analysis of oligonucleotide microarray design criteria to detect deletions by comparative genomic hybridization.

Flibotte S, Moerman DG - BMC Genomics (2008)

Bottom Line: We have quantified the effects of filtering various oligonucleotide properties by measuring the resolving power for detecting deletions in the human and C. elegans genomes using NimbleGen microarrays. A similarity level above 80% to non-target sequences over the length of the probe produces significant cross-hybridization. We have determined experimentally the effects of varying several key oligonucleotide microarray design criteria for detection of deletions in C. elegans and humans with NimbleGen's CGH technology.


Affiliation: Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, Canada. sflibotte@bcgsc.ca

ABSTRACT

Background: Microarray comparative genomic hybridization (CGH) is currently one of the most powerful techniques to measure DNA copy number in large genomes. In humans, microarray CGH is widely used to assess copy number variants in healthy individuals and copy number aberrations associated with various diseases, syndromes and disease susceptibility. In model organisms such as Caenorhabditis elegans (C. elegans) the technique has been applied to detect mutations, primarily deletions, in strains of interest. Although various constraints on oligonucleotide properties have been suggested to minimize non-specific hybridization and improve the data quality, there have been few experimental validations for CGH experiments. For genomic regions where strict design filters would limit the coverage it would also be useful to quantify the expected loss in data quality associated with relaxed design criteria.

Results: We have quantified the effects of filtering various oligonucleotide properties by measuring the resolving power for detecting deletions in the human and C. elegans genomes using NimbleGen microarrays. Approximately twice as many oligonucleotides are typically required to be affected by a deletion in human DNA samples in order to achieve the same statistical confidence as one would observe for a deletion in C. elegans. Surprisingly, the ability to detect deletions strongly depends on the oligonucleotide 15-mer count, which is defined as the sum of the genomic frequency of all the constituent 15-mers within the oligonucleotide. A similarity level above 80% to non-target sequences over the length of the probe produces significant cross-hybridization. We recommend the use of a fairly large melting temperature window of up to 10 degrees C, the elimination of repeat sequences, the elimination of homopolymers longer than 5 nucleotides, and a threshold of -1 kcal/mol on the oligonucleotide self-folding energy. We observed very little difference in data quality when varying the oligonucleotide length between 50 and 70, and even when using an isothermal design strategy.
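The 15-mer count defined above can be illustrated with a minimal sketch. The function names are hypothetical, the genome is treated as a single in-memory string, and only the given strand is counted; the abstract does not say whether the reverse complement is included, so that is an assumption of this example rather than the authors' procedure.

```python
from collections import Counter


def kmer_frequencies(genome: str, k: int = 15) -> Counter:
    """Count how often each k-mer occurs in the genome (given strand only)."""
    genome = genome.upper()
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))


def kmer_count(oligo: str, freqs: Counter, k: int = 15) -> int:
    """Sum the genomic frequency of every constituent k-mer of the oligo."""
    oligo = oligo.upper()
    # A Counter returns 0 for k-mers that never occur in the genome.
    return sum(freqs[oligo[i:i + k]] for i in range(len(oligo) - k + 1))


if __name__ == "__main__":
    # Toy sequences with k=4 so the arithmetic is easy to follow by hand.
    freqs = kmer_frequencies("ACGTACGTAC", k=4)
    print(kmer_count("ACGTA", freqs, k=4))
```

A 50-mer contains 36 constituent 15-mers, so under this definition a 50-mer whose 15-mers are all unique in the genome has a 15-mer count of 36.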

Conclusion: We have determined experimentally the effects of varying several key oligonucleotide microarray design criteria for detection of deletions in C. elegans and humans with NimbleGen's CGH technology. Our oligonucleotide design recommendations should be applicable for CGH analysis in most species.
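As a rough illustration, three of the recommended filters could be combined as in the sketch below. The `Probe` record and function names are hypothetical, and the melting temperature and self-folding energy are assumed to have been computed beforehand by whatever thermodynamic tool is in use. Two interpretations are assumptions of this example and not stated in the abstract: the 10 degree window is taken as a total width centred on a target temperature, and probes with a folding energy below -1 kcal/mol (more stable secondary structure) are rejected.

```python
import re
from dataclasses import dataclass


@dataclass
class Probe:
    sequence: str
    tm_celsius: float      # precomputed melting temperature (assumed input)
    folding_energy: float  # precomputed self-folding energy, kcal/mol (assumed input)


def has_long_homopolymer(sequence: str, max_run: int = 5) -> bool:
    """True if any base repeats more than max_run times in a row."""
    pattern = r"([ACGT])\1{%d,}" % max_run  # one base plus max_run or more copies
    return re.search(pattern, sequence.upper()) is not None


def passes_filters(probe: Probe, tm_center: float,
                   tm_window: float = 10.0,
                   min_folding_energy: float = -1.0) -> bool:
    """Apply the homopolymer, melting temperature and folding energy filters."""
    if has_long_homopolymer(probe.sequence):
        return False
    # Assumption: the window is a total width centred on tm_center.
    if abs(probe.tm_celsius - tm_center) > tm_window / 2:
        return False
    # Assumption: energies below the threshold indicate too stable a fold.
    if probe.folding_energy < min_folding_energy:
        return False
    return True
```

Repeat masking and the similarity check against non-target sequences are left out because they depend on external resources that the abstract does not specify.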


Figure 6: Effect of the position of a stretch of perfect identity within 50-mer oligonucleotides. LOESS regression of the difference in fluorescence intensity (in log2 scale) between the original and perturbed 50-mer oligonucleotides as a function of the length of the stretch of perfect identity. Solid (dashed) lines correspond to C. elegans (human) data. The perfect stretch of identity is either on the left (5') side (green lines), right (3') side (blue lines) or middle (red lines) of the 50-mer oligonucleotide. With NimbleGen's manufacturing process the oligonucleotides are synthesized from 3' to 5' and therefore the left side is protruding and freely floating in the solution while the right side is closer to the slide.

Mentions: The red boxplot in Figure 5 shows the difference in fluorescence intensity one is expected to observe between a 50-mer oligonucleotide mapping perfectly to the C. elegans genome and a random 50-mer oligonucleotide with the same GC content. This is the basis for comparison, and oligonucleotides with sequence identity associated with a smaller difference in intensity present some level of cross-hybridization. As can be seen from the green boxplots in Figure 5, a stretch of perfect identity of length about 22 and above in the middle of the oligonucleotide will produce some level of cross-hybridization, and of course the longer the perfectly matched sequence is, the worse the effect will be on the performance of the oligonucleotide. The elimination of non-unique 20-mers in our standard filters therefore seems a little too conservative. However, as can be seen in Figure 6, the position of the stretch of perfect identity within the oligonucleotide is important. Presumably due to steric effects, a stretch of perfect identity close to the slide will produce fewer cross-hybridization problems than a perfect stretch of identical length located at the other end of the oligonucleotide. For example, in C. elegans a perfect match of length 30 in the middle of the oligonucleotide will introduce similar cross-hybridization noise as a perfect match of length 23 close to the slide or length 36 at the end away from the slide. In fact, a perfect match of length 20 at the end away from the slide will produce a measurable fluorescence intensity above background, so our standard elimination of non-unique 20-mers is justifiable in these instances. Similar positional effects are manifest in our human data set, except that the overall amplitude of the intensity difference between original and perturbed oligonucleotides is smaller. This is because the human data are noisier and span a smaller dynamic range.
This effect is compatible with the asymmetry previously reported for experiments performed with a one-colour hybridization scheme on NimbleGen microarrays [10]. Furthermore, such asymmetry could explain the difference in performance sometimes observed [21] between oligonucleotides designed following the plus and minus strand templates at a given genomic location.
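The quantities discussed above, the length and position of the longest stretch of perfect identity and the overall similarity over the probe length, can be measured for a probe and an equally long non-target sequence with a sketch like the following. The function names are hypothetical, the comparison is position by position without gaps (an assumption made for simplicity), and sequences are assumed to be written 5' to 3', so that the right-hand end is the one attached to the slide as described in the Figure 6 caption.

```python
def longest_identity_stretch(probe: str, other: str) -> tuple[int, int]:
    """Return (start, length) of the longest run of identical positions."""
    if len(probe) != len(other):
        raise ValueError("sequences must have the same length")
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, (a, b) in enumerate(zip(probe.upper(), other.upper())):
        if a == b:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def stretch_position(start: int, length: int, probe_len: int) -> str:
    """Classify a stretch as touching the 5' end, the 3' end, or neither."""
    if length == 0:
        return "none"
    if length == probe_len:
        return "full length"
    if start == 0:
        return "5' end (away from slide)"
    if start + length == probe_len:
        return "3' end (close to slide)"
    return "middle"


def fraction_identity(probe: str, other: str) -> float:
    """Fraction of identical positions over the length of the probe."""
    if len(probe) != len(other):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(probe.upper(), other.upper()))
    return matches / len(probe)
```

For example, `longest_identity_stretch("AAAACGTACG", "TTTTCGTACG")` reports a stretch of length 6 starting at position 4, which `stretch_position` classifies as touching the 3' end.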

