An evaluation of custom microarray applications: the oligonucleotide design challenge.

Lemoine S, Combes F, Le Crom S - Nucleic Acids Res. (2009)

Bottom Line: Finally, we used a set of tests for the in silico benchmark of the oligo sets obtained from each type of software. We show that the design software must be selected according to the goal of the scientist, depending on factors such as the organism used, the number of probes required and their localization on the target sequence. The present work provides keys to the choice of the most relevant software, according to the various parameters we tested.


Affiliation: INSERM, CNRS, IFR36, Plate-forme Transcriptome, Paris, France.

ABSTRACT
The increase in feature resolution and the availability of multipack formats from microarray providers have opened the way to various custom genomic applications. However, oligonucleotide design and selection remains a bottleneck of the microarray workflow. Several tools are available to perform this work, and choosing the best one is not an easy task, nor are the choices obvious. Here we review the oligonucleotide design field to help users make their choice. We first performed a comparative evaluation of the available solutions based on a set of criteria including ease of installation, user-friendly access, and the number of parameters and settings available. In a second step, we submitted two real cases to a selection of programs. Finally, we used a set of tests for the in silico benchmark of the oligo sets obtained from each type of software. We show that the design software must be selected according to the goal of the scientist, depending on factors such as the organism used, the number of probes required and their localization on the target sequence. The present work provides keys to the choice of the most relevant software, according to the various parameters we tested.


Figure 4: Evaluation of tiling oligonucleotide specificity. (A) Distribution of the distance in base pairs between consecutive oligonucleotides on the tiling path. (B) Distribution of the number of oligonucleotides per transcript. (C) Distribution of the number of BLAST hits per oligonucleotide, using the parameters described in the ‘Material and methods’ section. The y-axis is log scaled. To display these distributions clearly, we removed all oligonucleotides with only one hit.

Mentions: The critical step in tiling array design is the choice of the tiling path. For the same coverage (number of oligonucleotides on the genome), a uniform distribution of oligonucleotides provides better detection of individual gene features. Uniformity can be measured using the interval between adjacent oligonucleotides. Figure 4A shows that the median interval designed by OligoTiler matches our expectation (150 bp) with only a small variation (±27.8 bp), whereas ArrayDesign's intervals have a median of 103 bp (±172.3 bp). The wider spread of intervals with ArrayDesign may be a direct consequence of the ‘specificity’ optimization that the program performs, with a design mainly focused on conserved regions such as exons. We then calculated the number of oligonucleotides designed for each transcript on the genome. Figure 4B shows that OligoTiler supplies a uniform distribution of oligonucleotides, and therefore achieves better coverage of transcripts than ArrayDesign. Indeed, OligoTiler designs eight probes per transcript, while ArrayDesign finds only four probes for each coding sequence. Finally, we evaluated oligonucleotide specificity using the first Kane parameter. For each designed oligonucleotide, we counted the number of BLAST hits with an identity percentage ≥75% over a full-length alignment (60 bp). The proportion of oligonucleotides with only one hit is slightly greater with OligoTiler (97%) than with ArrayDesign (96%). However, considering only the oligonucleotides with more than one hit (Figure 4C), the median number of hits per oligonucleotide is four for ArrayDesign and three for OligoTiler. This comparison indicates that the two approaches achieve nearly the same specificity, as measured by BLAST hit counts according to the first Kane parameter.
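The two measurements described above (interval between adjacent oligonucleotides, and BLAST hit counts under the first Kane parameter) could be reproduced along the following lines. This is a minimal sketch and not the authors' pipeline. It makes several assumptions: intervals are taken start-to-start, the ± value is computed as a population standard deviation (the text does not say which dispersion measure was used), and BLAST results are in the standard 12-column tabular format. All function names are hypothetical, and the actual BLAST parameters are those of the ‘Material and methods’ section, which is not reproduced here.

```python
from collections import Counter
from statistics import median, pstdev


def adjacent_intervals(starts):
    """Distances in bp between consecutive oligo start positions.

    Assumption: the interval is measured start-to-start on one sequence.
    """
    ordered = sorted(starts)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def interval_summary(starts):
    """Return (median, spread) of the adjacent intervals.

    Assumption: spread is the population standard deviation.
    """
    gaps = adjacent_intervals(starts)
    return median(gaps), pstdev(gaps)


def count_kane_hits(blast_lines, min_identity=75.0, oligo_length=60):
    """Count, per oligo, the BLAST hits passing the first Kane criterion.

    Expects tabular lines whose first four columns are
    qseqid, sseqid, pident, length. A hit is counted when identity
    is >= min_identity over an alignment spanning the full oligo.
    """
    hits = Counter()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query = fields[0]
        pident = float(fields[2])
        length = int(fields[3])
        if pident >= min_identity and length >= oligo_length:
            hits[query] += 1
    return hits


def multi_hit_median(hits):
    """Median hit count over oligos with more than one hit (as in Figure 4C)."""
    multi = [n for n in hits.values() if n > 1]
    return median(multi) if multi else None


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(interval_summary([0, 150, 300, 460]))
    example = [
        "oligo1\tchr1\t100.00\t60\t0\t0\t1\t60\t100\t159\t1e-25\t111",
        "oligo1\tchr2\t80.00\t60\t12\t0\t1\t60\t500\t559\t1e-05\t50",
        "oligo1\tchr3\t95.00\t40\t2\t0\t1\t40\t10\t49\t1e-08\t60",
        "oligo2\tchr1\t100.00\t60\t0\t0\t1\t60\t900\t959\t1e-25\t111",
    ]
    hits = count_kane_hits(example)
    print(dict(hits), multi_hit_median(hits))
```

Excluding single-hit oligonucleotides before taking the median mirrors the filtering described for Figure 4C: an oligonucleotide with exactly one hit matches only its own target.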