TS-AMIR: a topology string alignment method for intensive rapid protein structure comparison.

Razmara J, Deris S, Parvizpour S - Algorithms Mol Biol (2012)

Bottom Line: The performance of the method was assessed in different information retrieval tests and the results were compared with those of CE and TM-align, representing two geometrical tools, and YAKUSA, 3D-BLAST and SARST as three representatives of linear encoding schemes. In addition, the method runs about 800 and 7200 times faster than TM-align and CE respectively, while maintaining a competitive accuracy with TM-align and CE. The experimental results demonstrate that linear encoding techniques are capable of reaching the same high degree of accuracy as that achieved by geometrical methods, while generally running hundreds of times faster than conventional programs.


Affiliation: Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia 81310, Johor Bahru, Malaysia. razmaraj@gmail.com.

ABSTRACT

Background: In structural biology, similarity analysis of protein structure is a crucial step in studying the relationship between proteins. Despite the considerable number of techniques that have been explored within the past two decades, the development of new alternative methods is still an active research area due to the need for high performance tools.

Results: In this paper, we present TS-AMIR, a Topology String Alignment Method for Intensive Rapid comparison of protein structures. The proposed method works in two stages. In the first stage, the method generates a topology string based on the geometric details of secondary structure elements, and then applies an n-gram modelling technique based on the concept of entropy to capture similarities in these strings. The resulting initial correspondence map between secondary structure elements is passed to the second stage in order to obtain the alignment at the residue level. In the second stage, a heuristic step-by-step algorithm that applies the Kabsch method is adopted to align the residues, resulting in an optimal rotation matrix and a minimized RMSD. The performance of the method was assessed in different information retrieval tests and the results were compared with those of CE and TM-align, representing two geometrical tools, and YAKUSA, 3D-BLAST and SARST as three representatives of linear encoding schemes. It is shown that the method achieves a high running speed similar to that of the linear encoding schemes. In addition, the method runs about 800 and 7200 times faster than TM-align and CE respectively, while maintaining a competitive accuracy with TM-align and CE.
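The Kabsch step mentioned above can be illustrated with a minimal sketch. It assumes two already paired sets of 3D coordinates (for example, C-alpha atoms of residues matched in the first stage) and uses NumPy; the function name `kabsch` and the input layout (one point per row) are illustrative assumptions, not part of TS-AMIR itself, and the heuristic step-by-step residue selection described in the abstract is not reproduced here.

```python
import numpy as np

def kabsch(P, Q):
    """Superimpose point set P onto Q (both N x 3, rows paired one-to-one).

    Returns (R, rmsd): R is the 3x3 rotation that minimizes the RMSD between
    the centered P and the centered Q, and rmsd is that minimum value.
    Hypothetical helper for illustration only.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Remove translation by moving both centroids to the origin.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # 3x3 cross-covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Correct for a possible reflection so that det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # RMSD after applying the optimal rotation to the centered P.
    diff = (R @ Pc.T).T - Qc
    rmsd = float(np.sqrt((diff ** 2).sum() / len(P)))
    return R, rmsd
```

When Q is an exactly rotated and translated copy of P, the returned rotation should match the one that was applied and the RMSD should be close to zero.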

Conclusions: The experimental results demonstrate that linear encoding techniques are capable of reaching the same high degree of accuracy as that achieved by geometrical methods, while generally running hundreds of times faster than conventional programs.




Figure 5: Average TMscore obtained at different gap penalties.

Mentions: Figure 5 shows the average TMscore obtained at different gap penalty values. In this figure, a higher TMscore indicates a more precise and/or longer alignment. As the figure shows, the average TMscore rises as the gap penalty is made more negative, from -1.0 to -3.0. The TMscore peaks at -3.0 and decreases slightly for stronger penalties. Indeed, strong gap penalties prevent the alignment from extending along the protein structure, whereas weak penalties yield numerous gaps in the alignment, which then has low biological significance. Accordingly, the value of -3.0 seems to be the optimal gap penalty value in our experiment.
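The trade-off described above can be seen in a small, generic example. The sketch below is a plain Needleman-Wunsch global alignment with a linear gap penalty, not the TS-AMIR scoring scheme; the function name `global_align`, the match and mismatch scores, and the H/E letters used as stand-ins for secondary structure symbols are all assumptions made for illustration.

```python
def global_align(a, b, match=2.0, mismatch=-1.0, gap=-3.0):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    Scores are illustrative defaults, not values taken from the paper.
    """
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal path from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Same pair of strings under a strong and a weak gap penalty.
strong = global_align("HEEH", "HHEE", gap=-3.0)
weak = global_align("HEEH", "HHEE", gap=-0.5)
```

With the strong penalty the best alignment of this pair contains no gaps and accepts two mismatches, while the weak penalty opens two gaps to gain a third match. The scores here use exactly representable values, so the equality checks in the traceback are safe; arbitrary floating-point scores would need a tolerance.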

