Limits...
Improving the Mapping of Smith-Waterman Sequence Database Searches onto CUDA-Enabled GPUs.

Huang LT, Wu CC, Lai LF, Li YJ - Biomed Res Int (2015)

Bottom Line: In this paper, we focused on how to improve the mapping, especially for short query sequences, by better usage of shared memory.We performed and evaluated the proposed method on two different platforms (Tesla C1060 and Tesla K20) and compared it with two classic methods in CUDASW++.Further, the performance on different numbers of threads and blocks has been analyzed.

View Article: PubMed Central - PubMed

Affiliation: Department of Medical Informatics, Tzu Chi University, Hualien 970, Taiwan.

ABSTRACT
Sequence alignment lies at heart of the bioinformatics. The Smith-Waterman algorithm is one of the key sequence search algorithms and has gained popularity due to improved implementations and rapidly increasing compute power. Recently, the Smith-Waterman algorithm has been successfully mapped onto the emerging general-purpose graphics processing units (GPUs). In this paper, we focused on how to improve the mapping, especially for short query sequences, by better usage of shared memory. We performed and evaluated the proposed method on two different platforms (Tesla C1060 and Tesla K20) and compared it with two classic methods in CUDASW++. Further, the performance on different numbers of threads and blocks has been analyzed. The results showed that the proposed method significantly improves Smith-Waterman algorithm on CUDA-enabled GPUs in proper allocation of block and thread numbers.

No MeSH data available.


The overview of the proposed mapping algorithm.
© Copyright Policy - open-access
Related In: Results  -  Collection


getmorefigures.php?uid=PMC4538332&req=5

fig3: The overview of the proposed mapping algorithm.

Mentions: Registers are the fastest memory and shared memory is faster than global memory one hundred times. To use registers to solve dependence as much as possible and to efficiently use shared memory to buffer spilled registers, we propose the following algorithm to perform the Smith-Waterman algorithm, as shown in Figure 3. Every K consecutive residue on the assigned subject sequence for a thread is grouped in order, padding dummy residues at the end of the subject sequence when necessary. Similarly, every P consecutive residue on the query sequence is in an ordered partition, padding dummy residues at the end of the query sequence when necessary. A tile is defined as the alignment of one group of ordered residues on the subject sequence with a partition of ordered residues on the query sequence. To enforce the dependence, tiles will be aligned one by one in the column-major order, where the tiles on the same column are processed from top to bottom. Furthermore, for a tile, the first residue on the query sequence is aligned with the K residues on the subject sequence one by one, from left to right; then the second, the third, and to the Pth residues are aligned with the K residues on the subject sequence, respectively.


Improving the Mapping of Smith-Waterman Sequence Database Searches onto CUDA-Enabled GPUs.

Huang LT, Wu CC, Lai LF, Li YJ - Biomed Res Int (2015)

The overview of the proposed mapping algorithm.
© Copyright Policy - open-access
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC4538332&req=5

fig3: The overview of the proposed mapping algorithm.
Mentions: Registers are the fastest memory and shared memory is faster than global memory one hundred times. To use registers to solve dependence as much as possible and to efficiently use shared memory to buffer spilled registers, we propose the following algorithm to perform the Smith-Waterman algorithm, as shown in Figure 3. Every K consecutive residue on the assigned subject sequence for a thread is grouped in order, padding dummy residues at the end of the subject sequence when necessary. Similarly, every P consecutive residue on the query sequence is in an ordered partition, padding dummy residues at the end of the query sequence when necessary. A tile is defined as the alignment of one group of ordered residues on the subject sequence with a partition of ordered residues on the query sequence. To enforce the dependence, tiles will be aligned one by one in the column-major order, where the tiles on the same column are processed from top to bottom. Furthermore, for a tile, the first residue on the query sequence is aligned with the K residues on the subject sequence one by one, from left to right; then the second, the third, and to the Pth residues are aligned with the K residues on the subject sequence, respectively.

Bottom Line: In this paper, we focused on how to improve the mapping, especially for short query sequences, by better usage of shared memory.We performed and evaluated the proposed method on two different platforms (Tesla C1060 and Tesla K20) and compared it with two classic methods in CUDASW++.Further, the performance on different numbers of threads and blocks has been analyzed.

View Article: PubMed Central - PubMed

Affiliation: Department of Medical Informatics, Tzu Chi University, Hualien 970, Taiwan.

ABSTRACT
Sequence alignment lies at heart of the bioinformatics. The Smith-Waterman algorithm is one of the key sequence search algorithms and has gained popularity due to improved implementations and rapidly increasing compute power. Recently, the Smith-Waterman algorithm has been successfully mapped onto the emerging general-purpose graphics processing units (GPUs). In this paper, we focused on how to improve the mapping, especially for short query sequences, by better usage of shared memory. We performed and evaluated the proposed method on two different platforms (Tesla C1060 and Tesla K20) and compared it with two classic methods in CUDASW++. Further, the performance on different numbers of threads and blocks has been analyzed. The results showed that the proposed method significantly improves Smith-Waterman algorithm on CUDA-enabled GPUs in proper allocation of block and thread numbers.

No MeSH data available.