Limits...
Accelerating the Smith-Waterman algorithm with interpair pruning and band optimization for the all-pairs comparison of base sequences.

Okada D, Ino F, Hagihara K - BMC Bioinformatics (2015)

Bottom Line: Given the results of the pairs of sequences, our method realizes efficient block pruning by computing a lower bound for other pairs that have not yet been processed.This acceleration was achieved at the first phase of SW#, where our method significantly improved the initial lower bound.However, our interpair optimization was not effective for the comparison of the sequences of different species such as comparing human, chimpanzee, and gorilla.

View Article: PubMed Central - PubMed

Affiliation: Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, 565-0871, Japan.

ABSTRACT

Background: The Smith-Waterman algorithm is known to be a more sensitive approach than heuristic algorithms for local sequence alignment algorithms. Despite its sensitivity, a greater time complexity associated with the Smith-Waterman algorithm prevents its application to the all-pairs comparisons of base sequences, which aids in the construction of accurate phylogenetic trees. The aim of this study is to achieve greater acceleration using the Smith-Waterman algorithm (by realizing interpair block pruning and band optimization) compared with that achieved using a previous method that performs intrapair block pruning on graphics processing units (GPUs).

Results: We present an interpair optimization method for the Smith-Waterman algorithm with the aim of accelerating the all-pairs comparison of base sequences. Given the results of the pairs of sequences, our method realizes efficient block pruning by computing a lower bound for other pairs that have not yet been processed. This lower bound is further used for band optimization. We integrated our interpair optimization method into SW#, a previous GPU-based implementation that employs variants of a banded Smith-Waterman algorithm and a banded Myers-Miller algorithm. Evaluation using the six genomes of Bacillus anthracis shows that our method pruned 88% of the matrix cells on a single GPU and 73% of the matrix cells on two GPUs. For the genomes of the human chromosome 21, the alignment performance reached 202 giga-cell updates per second (GCUPS) on two Tesla K40 GPUs.

Conclusions: Efficient interpair pruning and band optimization makes it possible to complete the all-pairs comparisons of the sequences of the same species 1.2 times faster than the intrapair pruning method. This acceleration was achieved at the first phase of SW#, where our method significantly improved the initial lower bound. However, our interpair optimization was not effective for the comparison of the sequences of different species such as comparing human, chimpanzee, and gorilla. Consequently, our method is useful in accelerating the applications that require optimal local alignments scores for the same species. The source code is available for download from http://www-hagi.ist.osaka-u.ac.jp/research/code/.

No MeSH data available.


Pruned matrix cells for pair 〈2,6〉. (a) Proposed method on a single GPU, (b) previous method on a single GPU, (c) proposed method on dual GPUs, and (d) previous method on dual GPUs. Matrix cells in the gray area are pruned
© Copyright Policy - OpenAccess
Related In: Results  -  Collection

License 1 - License 2
getmorefigures.php?uid=PMC4595212&req=5

Fig11: Pruned matrix cells for pair 〈2,6〉. (a) Proposed method on a single GPU, (b) previous method on a single GPU, (c) proposed method on dual GPUs, and (d) previous method on dual GPUs. Matrix cells in the gray area are pruned

Mentions: We then analyzed how pruning was triggered during forward matrix-filling. There are four triggering patterns, each shaping a different border of the pruned area: (1) horizontal, (2) vertical, (3) diagonal, and (4) anti-diagonal. Horizontal and vertical borders appear when the highest score does not increase during the earlier phase of forward matrix-filling. Figure 11 shows an example of horizontal and vertical borders; in that example, the similar region of pair 〈2,6〉 existed in the latter part of the sequences. Consequently, the lower bound was rarely updated during the earlier phase of forward matrix-filling; thus, the triggering cells satisfy i=m−d1 or j=n−d2, where d1 and d2 are constant values. Diagonal borders appear around the highest scoring cell because its right and bottom neighbors can be pruned; such borders can be observed in our method and the previous method. Finally, anti-diagonal borders appear when the lower bound increases during forward matrix-filling. Anti-diagonal borders appear in the area that satisfies i≥n−j and i≤⌈m/2⌉, as shown in Fig. 10(b). Our method failed to update the lower bound within this region; consequently, anti-diagonal borders were observed only in the previous method.Fig. 11


Accelerating the Smith-Waterman algorithm with interpair pruning and band optimization for the all-pairs comparison of base sequences.

Okada D, Ino F, Hagihara K - BMC Bioinformatics (2015)

Pruned matrix cells for pair 〈2,6〉. (a) Proposed method on a single GPU, (b) previous method on a single GPU, (c) proposed method on dual GPUs, and (d) previous method on dual GPUs. Matrix cells in the gray area are pruned
© Copyright Policy - OpenAccess
Related In: Results  -  Collection

License 1 - License 2
Show All Figures
getmorefigures.php?uid=PMC4595212&req=5

Fig11: Pruned matrix cells for pair 〈2,6〉. (a) Proposed method on a single GPU, (b) previous method on a single GPU, (c) proposed method on dual GPUs, and (d) previous method on dual GPUs. Matrix cells in the gray area are pruned
Mentions: We then analyzed how pruning was triggered during forward matrix-filling. There are four triggering patterns, each shaping a different border of the pruned area: (1) horizontal, (2) vertical, (3) diagonal, and (4) anti-diagonal. Horizontal and vertical borders appear when the highest score does not increase during the earlier phase of forward matrix-filling. Figure 11 shows an example of horizontal and vertical borders; in that example, the similar region of pair 〈2,6〉 existed in the latter part of the sequences. Consequently, the lower bound was rarely updated during the earlier phase of forward matrix-filling; thus, the triggering cells satisfy i=m−d1 or j=n−d2, where d1 and d2 are constant values. Diagonal borders appear around the highest scoring cell because its right and bottom neighbors can be pruned; such borders can be observed in our method and the previous method. Finally, anti-diagonal borders appear when the lower bound increases during forward matrix-filling. Anti-diagonal borders appear in the area that satisfies i≥n−j and i≤⌈m/2⌉, as shown in Fig. 10(b). Our method failed to update the lower bound within this region; consequently, anti-diagonal borders were observed only in the previous method.Fig. 11

Bottom Line: Given the results of the pairs of sequences, our method realizes efficient block pruning by computing a lower bound for other pairs that have not yet been processed.This acceleration was achieved at the first phase of SW#, where our method significantly improved the initial lower bound.However, our interpair optimization was not effective for the comparison of the sequences of different species such as comparing human, chimpanzee, and gorilla.

View Article: PubMed Central - PubMed

Affiliation: Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, 565-0871, Japan.

ABSTRACT

Background: The Smith-Waterman algorithm is known to be a more sensitive approach than heuristic algorithms for local sequence alignment algorithms. Despite its sensitivity, a greater time complexity associated with the Smith-Waterman algorithm prevents its application to the all-pairs comparisons of base sequences, which aids in the construction of accurate phylogenetic trees. The aim of this study is to achieve greater acceleration using the Smith-Waterman algorithm (by realizing interpair block pruning and band optimization) compared with that achieved using a previous method that performs intrapair block pruning on graphics processing units (GPUs).

Results: We present an interpair optimization method for the Smith-Waterman algorithm with the aim of accelerating the all-pairs comparison of base sequences. Given the results of the pairs of sequences, our method realizes efficient block pruning by computing a lower bound for other pairs that have not yet been processed. This lower bound is further used for band optimization. We integrated our interpair optimization method into SW#, a previous GPU-based implementation that employs variants of a banded Smith-Waterman algorithm and a banded Myers-Miller algorithm. Evaluation using the six genomes of Bacillus anthracis shows that our method pruned 88% of the matrix cells on a single GPU and 73% of the matrix cells on two GPUs. For the genomes of the human chromosome 21, the alignment performance reached 202 giga-cell updates per second (GCUPS) on two Tesla K40 GPUs.

Conclusions: Efficient interpair pruning and band optimization makes it possible to complete the all-pairs comparisons of the sequences of the same species 1.2 times faster than the intrapair pruning method. This acceleration was achieved at the first phase of SW#, where our method significantly improved the initial lower bound. However, our interpair optimization was not effective for the comparison of the sequences of different species such as comparing human, chimpanzee, and gorilla. Consequently, our method is useful in accelerating the applications that require optimal local alignments scores for the same species. The source code is available for download from http://www-hagi.ist.osaka-u.ac.jp/research/code/.

No MeSH data available.