Limits...
HeliCis: a DNA motif discovery tool for colocalized motif pairs with periodic spacing.

Larsson E, Lindahl P, Mostad P - BMC Bioinformatics (2007)

Bottom Line: We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs.It provides a simple to use web interface which interactively visualizes the current settings and thereby makes it easy to understand the parameters and the model structure.HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing.

View Article: PubMed Central - HTML - PubMed

Affiliation: Wallenberg Laboratory for Cardiovascular Research, Bruna Stråket 16, Sahlgrenska University Hospital, SE-413 45 Göteborg, SWEDEN. erik.larsson@wlab.gu.se

ABSTRACT

Background: Correct temporal and spatial gene expression during metazoan development relies on combinatorial interactions between different transcription factors. As a consequence, cis-regulatory elements often colocalize in clusters termed cis-regulatory modules. These may have requirements on organizational features such as spacing, order and helical phasing (periodic spacing) between binding sites. Due to the turning of the DNA helix, a small modification of the distance between a pair of sites may sometimes drastically disrupt function, while insertion of a full helical turn of DNA (10-11 bp) between cis elements may cause functionality to be restored. Recently, de novo motif discovery methods which incorporate organizational properties such as colocalization and order preferences have been developed, but there are no tools which incorporate periodic spacing into the model.

Results: We have developed a web based motif discovery tool, HeliCis, which features a flexible model which allows de novo detection of motifs with periodic spacing. Depending on the parameter settings it may also be used for discovering colocalized motifs without periodicity or motifs separated by a fixed gap of known or unknown length. We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides a simple to use web interface which interactively visualizes the current settings and thereby makes it easy to understand the parameters and the model structure.

Conclusion: HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing. Our evaluations show that it can detect weak periodic patterns which are not easily discovered using a sequential approach, i.e. first finding the binding sites and second analyzing the properties of their pairwise distances.

Show MeSH

Related in: MedlinePlus

Performance on synthetic sequence datasets containing colocalized and periodically spaced CArG and ETS motifs with varying information content. HeliCis with different settings was compared to MEME and BioProspector. The information content of the motifs was gradually reduced by varying the number of pseudocounts and the sensitivity of the different tools was determined by calculating the fraction of correctly identified motifs. Results are from 5 averaged trials.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2200674&req=5

Figure 3: Performance on synthetic sequence datasets containing colocalized and periodically spaced CArG and ETS motifs with varying information content. HeliCis with different settings was compared to MEME and BioProspector. The information content of the motifs was gradually reduced by varying the number of pseudocounts and the sensitivity of the different tools was determined by calculating the fraction of correctly identified motifs. Results are from 5 averaged trials.

Mentions: HeliCis with default settings for periodic spacing (period 10, motif distance 0...50 bp), HeliCis with colocalization settings (period 1, distance 0...50 bp) and HeliCis with single motif settings were compared to an established single motif discovery tool based on the EM algorithm, MEME [26], and a motif discovery tool based on Gibbs sampling, BioProspector [27]. The latter was run in "two-block" mode, searching for motif pairs with a maximum gap of 50 bp. All were configured to search the forward strand only with a fixed motif width of 12 bp, and with forced presence of a motif in each sequence (oops = "one occurrence per sequence" model in MEME, "-a 1" switch for BioProspector, "-p 1" switch for HeliCis). The quality of the resulting alignments was determined by calculating the fraction of correctly identified sites (Figure 3). Results shown are average values from five independent trials where the sequence sets were regenerated each time. It should be noted that BioProspector, unlike HeliCis and MEME, cannot be forced to detect exactly one occurrence per sequence, but will often assign several motifs per sequence. This should be taken into account when evaluating the results, as this model may be slightly disadvantageous on this dataset.


HeliCis: a DNA motif discovery tool for colocalized motif pairs with periodic spacing.

Larsson E, Lindahl P, Mostad P - BMC Bioinformatics (2007)

Performance on synthetic sequence datasets containing colocalized and periodically spaced CArG and ETS motifs with varying information content. HeliCis with different settings was compared to MEME and BioProspector. The information content of the motifs was gradually reduced by varying the number of pseudocounts and the sensitivity of the different tools was determined by calculating the fraction of correctly identified motifs. Results are from 5 averaged trials.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2200674&req=5

Figure 3: Performance on synthetic sequence datasets containing colocalized and periodically spaced CArG and ETS motifs with varying information content. HeliCis with different settings was compared to MEME and BioProspector. The information content of the motifs was gradually reduced by varying the number of pseudocounts and the sensitivity of the different tools was determined by calculating the fraction of correctly identified motifs. Results are from 5 averaged trials.
Mentions: HeliCis with default settings for periodic spacing (period 10, motif distance 0...50 bp), HeliCis with colocalization settings (period 1, distance 0...50 bp) and HeliCis with single motif settings were compared to an established single motif discovery tool based on the EM algorithm, MEME [26], and a motif discovery tool based on Gibbs sampling, BioProspector [27]. The latter was run in "two-block" mode, searching for motif pairs with a maximum gap of 50 bp. All were configured to search the forward strand only with a fixed motif width of 12 bp, and with forced presence of a motif in each sequence (oops = "one occurrence per sequence" model in MEME, "-a 1" switch for BioProspector, "-p 1" switch for HeliCis). The quality of the resulting alignments was determined by calculating the fraction of correctly identified sites (Figure 3). Results shown are average values from five independent trials where the sequence sets were regenerated each time. It should be noted that BioProspector, unlike HeliCis and MEME, cannot be forced to detect exactly one occurrence per sequence, but will often assign several motifs per sequence. This should be taken into account when evaluating the results, as this model may be slightly disadvantageous on this dataset.

Bottom Line: We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs.It provides a simple to use web interface which interactively visualizes the current settings and thereby makes it easy to understand the parameters and the model structure.HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing.

View Article: PubMed Central - HTML - PubMed

Affiliation: Wallenberg Laboratory for Cardiovascular Research, Bruna Stråket 16, Sahlgrenska University Hospital, SE-413 45 Göteborg, SWEDEN. erik.larsson@wlab.gu.se

ABSTRACT

Background: Correct temporal and spatial gene expression during metazoan development relies on combinatorial interactions between different transcription factors. As a consequence, cis-regulatory elements often colocalize in clusters termed cis-regulatory modules. These may have requirements on organizational features such as spacing, order and helical phasing (periodic spacing) between binding sites. Due to the turning of the DNA helix, a small modification of the distance between a pair of sites may sometimes drastically disrupt function, while insertion of a full helical turn of DNA (10-11 bp) between cis elements may cause functionality to be restored. Recently, de novo motif discovery methods which incorporate organizational properties such as colocalization and order preferences have been developed, but there are no tools which incorporate periodic spacing into the model.

Results: We have developed a web based motif discovery tool, HeliCis, which features a flexible model which allows de novo detection of motifs with periodic spacing. Depending on the parameter settings it may also be used for discovering colocalized motifs without periodicity or motifs separated by a fixed gap of known or unknown length. We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides a simple to use web interface which interactively visualizes the current settings and thereby makes it easy to understand the parameters and the model structure.

Conclusion: HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing. Our evaluations show that it can detect weak periodic patterns which are not easily discovered using a sequential approach, i.e. first finding the binding sites and second analyzing the properties of their pairwise distances.

Show MeSH
Related in: MedlinePlus