Limits...
A stitch in time: efficient computation of genomic DNA melting bubbles.

Tøstesen E - Algorithms Mol Biol (2008)

Bottom Line: No approximations, such as windowing, have been introduced to reduce the time complexity.Sequences of several megabases have been computed, only limited by computer memory.Possible applications are the genome-wide comparisons of bubbles with promotors, TSS, viral integration sites, and other melting-related regions.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Tumor Biology, Norwegian Radium Hospital, N-0310, Oslo, Norway. eivindto@math.uio.no

ABSTRACT

Background: It is of biological interest to make genome-wide predictions of the locations of DNA melting bubbles using statistical mechanics models. Computationally, this poses the challenge that a generic search through all combinations of bubble starts and ends is quadratic.

Results: An efficient algorithm is described, which shows that the time complexity of the task is O(NlogN) rather than quadratic. The algorithm exploits that bubble lengths may be limited, but without a prior assumption of a maximal bubble length. No approximations, such as windowing, have been introduced to reduce the time complexity. More than just finding the bubbles, the algorithm produces a stitch profile, which is a probabilistic graphical model of bubbles and helical regions. The algorithm applies a probability peak finding method based on a hierarchical analysis of the energy barriers in the Poland-Scheraga model.

Conclusion: Exact and fast computation of genomic stitch profiles is thus feasible. Sequences of several megabases have been computed, only limited by computer memory. Possible applications are the genome-wide comparisons of bubbles with promotors, TSS, viral integration sites, and other melting-related regions.

No MeSH data available.


Related in: MedlinePlus

What is a stitch profile diagram?. At the top are sketched three alternative DNA conformations at the same temperature. In the middle diagrams, the sequence location of each helical region (blue) and each bubble or single-stranded region (red) is represented by a stitch. At the bottom, the three "rows of stitches" are merged into a stitch profile diagram.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2553405&req=5

Figure 1: What is a stitch profile diagram?. At the top are sketched three alternative DNA conformations at the same temperature. In the middle diagrams, the sequence location of each helical region (blue) and each bubble or single-stranded region (red) is represented by a stitch. At the bottom, the three "rows of stitches" are merged into a stitch profile diagram.

Mentions: In previous work, we developed an algorithm that identifies regions of four types: helical regions, bubbles (internal loops), and unzipped 5' and 3' end regions (tails) [29-31]. The algorithm produces a stitch profile, which is a probabilistic graphical model of DNA's conformational space. A stitch profile contains a set of regions of the four types. Each region is called a stitch, because of the way they can be connected in paths. The stitch profile algorithm computes the location (start and end) of each stitch and the probability of that region being in the corresponding state (ds or ss) at the specified temperature. A stitch profile can be plotted in a stitch profile diagram, as illustrated in Figure 1. The location of a bubble or helix stitch is not given as a precise coordinate pair (x, y), but rather as a pair of ds/ss boundaries with fuzzy locations. For each ds/ss boundary, the range of thermal fluctuations is computed and given as an interval. A stitch profile indicates a number of alternative configurations, both optimal and suboptimal, as illustrated in Figure 1. In contrast, a melting map would indicate the single configuration at each temperature, in which each basepair is in its most probable state.


A stitch in time: efficient computation of genomic DNA melting bubbles.

Tøstesen E - Algorithms Mol Biol (2008)

What is a stitch profile diagram?. At the top are sketched three alternative DNA conformations at the same temperature. In the middle diagrams, the sequence location of each helical region (blue) and each bubble or single-stranded region (red) is represented by a stitch. At the bottom, the three "rows of stitches" are merged into a stitch profile diagram.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2553405&req=5

Figure 1: What is a stitch profile diagram?. At the top are sketched three alternative DNA conformations at the same temperature. In the middle diagrams, the sequence location of each helical region (blue) and each bubble or single-stranded region (red) is represented by a stitch. At the bottom, the three "rows of stitches" are merged into a stitch profile diagram.
Mentions: In previous work, we developed an algorithm that identifies regions of four types: helical regions, bubbles (internal loops), and unzipped 5' and 3' end regions (tails) [29-31]. The algorithm produces a stitch profile, which is a probabilistic graphical model of DNA's conformational space. A stitch profile contains a set of regions of the four types. Each region is called a stitch, because of the way they can be connected in paths. The stitch profile algorithm computes the location (start and end) of each stitch and the probability of that region being in the corresponding state (ds or ss) at the specified temperature. A stitch profile can be plotted in a stitch profile diagram, as illustrated in Figure 1. The location of a bubble or helix stitch is not given as a precise coordinate pair (x, y), but rather as a pair of ds/ss boundaries with fuzzy locations. For each ds/ss boundary, the range of thermal fluctuations is computed and given as an interval. A stitch profile indicates a number of alternative configurations, both optimal and suboptimal, as illustrated in Figure 1. In contrast, a melting map would indicate the single configuration at each temperature, in which each basepair is in its most probable state.

Bottom Line: No approximations, such as windowing, have been introduced to reduce the time complexity.Sequences of several megabases have been computed, only limited by computer memory.Possible applications are the genome-wide comparisons of bubbles with promotors, TSS, viral integration sites, and other melting-related regions.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Tumor Biology, Norwegian Radium Hospital, N-0310, Oslo, Norway. eivindto@math.uio.no

ABSTRACT

Background: It is of biological interest to make genome-wide predictions of the locations of DNA melting bubbles using statistical mechanics models. Computationally, this poses the challenge that a generic search through all combinations of bubble starts and ends is quadratic.

Results: An efficient algorithm is described, which shows that the time complexity of the task is O(NlogN) rather than quadratic. The algorithm exploits that bubble lengths may be limited, but without a prior assumption of a maximal bubble length. No approximations, such as windowing, have been introduced to reduce the time complexity. More than just finding the bubbles, the algorithm produces a stitch profile, which is a probabilistic graphical model of bubbles and helical regions. The algorithm applies a probability peak finding method based on a hierarchical analysis of the energy barriers in the Poland-Scheraga model.

Conclusion: Exact and fast computation of genomic stitch profiles is thus feasible. Sequences of several megabases have been computed, only limited by computer memory. Possible applications are the genome-wide comparisons of bubbles with promotors, TSS, viral integration sites, and other melting-related regions.

No MeSH data available.


Related in: MedlinePlus