HeliCis: a DNA motif discovery tool for colocalized motif pairs with periodic spacing.

Larsson E, Lindahl P, Mostad P - BMC Bioinformatics (2007)

Bottom Line: We show on simulated data that HeliCis can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides an easy-to-use web interface that interactively visualizes the current settings, making the parameters and the model structure easy to understand. HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing.


Affiliation: Wallenberg Laboratory for Cardiovascular Research, Bruna Stråket 16, Sahlgrenska University Hospital, SE-413 45 Göteborg, SWEDEN. erik.larsson@wlab.gu.se

ABSTRACT

Background: Correct temporal and spatial gene expression during metazoan development relies on combinatorial interactions between different transcription factors. As a consequence, cis-regulatory elements often colocalize in clusters termed cis-regulatory modules. These may have requirements on organizational features such as spacing, order and helical phasing (periodic spacing) between binding sites. Due to the turning of the DNA helix, a small modification of the distance between a pair of sites may sometimes drastically disrupt function, while insertion of a full helical turn of DNA (10-11 bp) between cis elements may cause functionality to be restored. Recently, de novo motif discovery methods which incorporate organizational properties such as colocalization and order preferences have been developed, but there are no tools which incorporate periodic spacing into the model.

Results: We have developed a web-based motif discovery tool, HeliCis, featuring a flexible model that allows de novo detection of motifs with periodic spacing. Depending on the parameter settings, it may also be used for discovering colocalized motifs without periodicity, or motifs separated by a fixed gap of known or unknown length. We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides an easy-to-use web interface that interactively visualizes the current settings, making the parameters and the model structure easy to understand.

Conclusion: HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing. Our evaluations show that it can detect weak periodic patterns that are not easily discovered using a sequential approach, i.e. first finding the binding sites and then analyzing the properties of their pairwise distances.
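
To make the second step of the sequential approach concrete, the sketch below scores how strongly a set of pairwise site distances clusters at one helical phase. It is not taken from HeliCis: the function name `phase_concentration`, the 10.5 bp default period (chosen from the 10-11 bp range given in the Background) and the use of the mean resultant length as a score are all assumptions made for illustration.

```python
import cmath
import math


def phase_concentration(distances, period=10.5):
    """Return the mean resultant length of distances wrapped onto the helix.

    Each distance d (in bp) is mapped to the angle 2*pi*d/period. The result
    lies between 0 and 1: values near 1 mean the distances share one helical
    phase, values near 0 mean the phases cancel out.
    The 10.5 bp period is an illustrative assumption, not a HeliCis parameter.
    """
    if not distances:
        raise ValueError("need at least one distance")
    total = sum(cmath.exp(2j * math.pi * d / period) for d in distances)
    return abs(total) / len(distances)


# Hypothetical example data: distances between paired sites, in bp.
phased = [21, 42, 63, 84]   # whole numbers of helical turns apart
mixed = [21, 26, 42, 47]    # half of the pairs shifted by about half a turn

print(phase_concentration(phased))
print(phase_concentration(mixed))
```

A score like this is only as good as the site predictions fed into it, which is the weakness of the sequential approach that the Conclusion points to: if weak motifs are missed in the first step, no distances are available for the second.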

Figure 4: Performance on synthetic sequence datasets with varying motif coverage. Datasets of 20 sequences with colocalized and periodically spaced CArG and ETS motifs were generated. The proportion of sequences containing the motifs was gradually reduced, thus making them increasingly difficult to detect. HeliCis with different settings was compared to MEME and BioProspector. The plots show sensitivity and positive predictive value (PPV = TP/(TP + FP)). Results are averages over 5 trials.

Mentions: In a second evaluation, sets of 20 sequences containing artificially planted CArG and ETS motifs were generated as described above. However, this time the information content of the motif matrices was kept constant (one pseudocount added). Instead, the fraction of sequences containing motifs was gradually reduced from 20/20 to 10/20, thus making them increasingly difficult to detect. In this case, the tools were not forced to detect motifs in all sequences (zoops = "zero or one occurrences per sequence" model in MEME, default for BioProspector and HeliCis). Other settings were as described above. To account for false positive predictions, a PPV score (positive predictive value, i.e. the fraction of predicted sites which are correct) was calculated, in addition to sensitivity. The results, shown in Figure 4, are average values from 5 independent trials.
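
The two scores can be illustrated with a minimal sketch. PPV follows the formula in the Figure 4 caption; sensitivity uses the standard definition TP/(TP + FN). The function name `sensitivity_and_ppv`, the representation of a site as a `(sequence_index, start_position)` pair and the exact-match criterion for a correct prediction are assumptions; the matching criterion used in the paper is not stated in this excerpt.

```python
def sensitivity_and_ppv(true_sites, predicted_sites):
    """Compute (sensitivity, PPV) for one trial.

    Sites are (sequence_index, start_position) tuples. A prediction counts as
    a true positive only if it matches a planted site exactly; this matching
    rule is an illustrative assumption.
    """
    true_sites = set(true_sites)
    predicted_sites = set(predicted_sites)
    tp = len(true_sites & predicted_sites)   # planted sites that were found
    fp = len(predicted_sites - true_sites)   # predictions with no planted site
    fn = len(true_sites - predicted_sites)   # planted sites that were missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, ppv


# Hypothetical trial: 4 planted sites, 5 predictions, 3 of them correct.
planted = [(0, 12), (1, 40), (2, 7), (3, 55)]
predicted = [(0, 12), (1, 40), (2, 9), (3, 55), (4, 3)]

sens, ppv = sensitivity_and_ppv(planted, predicted)
print(sens, ppv)  # 0.75 0.6
```

Reporting both values matters in this evaluation because the tools are allowed to skip sequences: a tool that predicts a site in every sequence can keep sensitivity high while its PPV drops as fewer sequences actually contain the motifs.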