Shared probe design and existing microarray reanalysis using PICKY.

Chou HH - BMC Bioinformatics (2010)

Bottom Line: This limitation is due to thermodynamic restrictions and cannot be resolved by any computational method. PICKY 2.0 uses novel algorithms to track sharable regions among genes and to strictly distinguish them from other highly similar but nontarget regions during thermodynamic comparisons. In addition, more precise nonlinear salt effect estimates and other improvements are added, making PICKY 2.1 more versatile for microarray users.


Affiliation: Department of Genetics, Development and Cell Biology, and Department of Computer Science, Iowa State University, Ames, IA, 50011-3223, USA. hhchou@iastate.edu

ABSTRACT

Background: Large genomes contain families of highly similar genes that cannot be individually identified by microarray probes. This limitation is due to thermodynamic restrictions and cannot be resolved by any computational method. Since gene annotations are updated more frequently than microarrays, another common issue facing microarray users is that existing microarrays must be routinely reanalyzed to determine probes that are still useful with respect to the updated annotations.

Results: PICKY 2.0 can design shared probes for sets of genes that cannot be individually identified using unique probes. PICKY 2.0 uses novel algorithms to track sharable regions among genes and to strictly distinguish them from other highly similar but nontarget regions during thermodynamic comparisons. Therefore, PICKY does not sacrifice the quality of shared probes when choosing them. The latest PICKY 2.1 adds the capability to reanalyze existing microarray probes against updated gene sets to determine which probes remain valid. In addition, more precise nonlinear salt effect estimates and other improvements are added, making PICKY 2.1 more versatile for microarray users.

Conclusions: Shared probes allow expressed gene family members to be detected; this capability is generally more desirable than not knowing anything about these genes. Shared probes also enable the design of cross-genome microarrays, which facilitate multiple species identification in environmental samples. The new nonlinear salt effect calculation significantly increases the precision of probes at a lower buffer salt concentration, and the probe reanalysis function improves existing microarray result interpretations.


Figure 2: Example implementation to discover all common region groups that can accommodate probes. suffix_array and common_array are always the same size; i and j are the left and right boundaries of an identified common region group; k saves its left overlap length with nontargets; m saves its right overlap length with nontargets; and n holds the shortest common region within the group.

Step 1 discovers all regions that may accommodate probes. PICKY allows users to provide a list of nontarget sequences to be avoided during the microarray design; these can be any transcripts that the microarray might encounter (e.g., mitochondrial RNA). PICKY also treats the reverse-complements of all input sequences as nontargets, which prevents secondary structures from forming on the probes or on their targets. The details are described in the PICKY 1.0 paper [1]. If a group contains suffixes from nontarget sequences or from the reverse-complement of any sequence, the group cannot be used to design probes. If a group is bounded on either side by an overlap longer than the maximum allowable length of an exact nontarget match, the suffixes in the group overlap too much with nontarget sequences, and the group cannot be used either. The probe size and the maximum length of a nontarget match are user-specified parameters. Steps 1 and 2 can be combined in implementation and run in linear time. Step 3 runs in either constant or logarithmic time per lookup, depending on whether a hash table or a balanced binary tree is used for the lookup table. The worst-case complexity of the algorithm is thus O(n log n), where n is the number of suffixes from all input sequences (i.e., the total number of bases). Figure 2 presents an example implementation of this algorithm in C++.
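Because the Figure 2 listing is not reproduced on this page, the following is a minimal sketch of the group discovery scan described above. It is not the author's code. It assumes the suffixes are already sorted, that common_array[t] holds the length of the common prefix shared by suffixes t-1 and t (with common_array[0] equal to 0), and that a hypothetical is_nontarget flag marks suffixes from nontarget sequences or reverse-complements. The struct CommonGroup, the function find_common_groups, and the restriction to groups of two or more suffixes are illustrative choices; only the variable names i, j, k, m and n follow the figure caption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one usable group; field names follow the
// Figure 2 caption (i, j, k, m, n).
struct CommonGroup {
    std::size_t i;  // left boundary (index of the first suffix in the group)
    std::size_t j;  // right boundary (index of the last suffix, inclusive)
    int k;          // overlap length with the suffix just left of the group
    int m;          // overlap length with the suffix just right of the group
    int n;          // shortest common region shared inside the group
};

// Assumed inputs (illustrative, not taken from the paper):
//   is_nontarget[t] - true if sorted suffix t comes from a nontarget
//                     sequence or from a reverse-complement
//   common_array[t] - common prefix length of suffixes t-1 and t,
//                     with common_array[0] == 0
std::vector<CommonGroup> find_common_groups(
    const std::vector<bool>& is_nontarget,
    const std::vector<int>& common_array,
    int probe_size,
    int max_match)
{
    // The two arrays describe the same suffixes, so sizes must agree.
    assert(is_nontarget.size() == common_array.size());

    std::vector<CommonGroup> groups;
    const std::size_t total = common_array.size();

    std::size_t i = 0;
    while (i < total) {
        // Extend j while the next suffix shares at least probe_size bases.
        std::size_t j = i;
        int n = 0;
        bool clean = !is_nontarget[i];
        while (j + 1 < total && common_array[j + 1] >= probe_size) {
            ++j;
            n = (j == i + 1) ? common_array[j] : std::min(n, common_array[j]);
            clean = clean && !is_nontarget[j];
        }
        if (j > i) {  // this sketch reports only groups of two or more
            const int k = common_array[i];
            const int m = (j + 1 < total) ? common_array[j + 1] : 0;
            // Reject groups that contain a nontarget suffix or that
            // overlap a neighbor by more than max_match bases.
            if (clean && k <= max_match && m <= max_match) {
                groups.push_back({i, j, k, m, n});
            }
        }
        i = j + 1;  // each suffix is visited once, so the scan is linear
    }
    return groups;
}
```

With probe_size 5 and max_match 3, a common_array of {0, 6, 7, 2, 5, 4} in which only the fifth suffix is a nontarget yields a single group covering suffixes 0 to 2 with k = 0, m = 2 and n = 6; the run formed by suffixes 3 and 4 is rejected because it contains the nontarget suffix.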

