Multiple structure alignment with msTALI.

Shealy P, Valafar H - BMC Bioinformatics (2012)

Bottom Line: Although multiple structure alignment algorithms can potentially be applied to a number of problems, they have primarily been used for protein core identification. We also demonstrate success at building a database of protein cores using 341 randomly selected CATH domains and highlight the contribution of msTALI compared to the CATH classifications. In addition to its performance on standard comparison databases, it utilizes clear, informative features, allowing further customization for domain-specific applications.


Affiliation: Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA.

ABSTRACT

Background: Multiple structure alignments have received increasing attention in recent years as an alternative to multiple sequence alignments. Although multiple structure alignment algorithms can potentially be applied to a number of problems, they have primarily been used for protein core identification. A method that is capable of solving a variety of problems using structure comparison is still absent. Here we introduce a program msTALI for aligning multiple protein structures. Our algorithm uses several informative features to guide its alignments: torsion angles, backbone Cα atom positions, secondary structure, residue type, surface accessibility, and properties of nearby atoms. The algorithm allows the user to weight the types of information used to generate the alignment, which expands its utility to a wide variety of problems.

Results: msTALI exhibits competitive results on 824 families from the Homstrad and SABmark databases when compared to Matt and Mustang. We also demonstrate success at building a database of protein cores using 341 randomly selected CATH domains and highlight the contribution of msTALI compared to the CATH classifications. Finally, we present an example applying msTALI to the problem of detecting hinges in a protein undergoing rigid-body motion.

Conclusions: msTALI is an effective algorithm for multiple structure alignment. In addition to its performance on standard comparison databases, it utilizes clear, informative features, allowing further customization for domain-specific applications. The C++ source code for msTALI is available for Linux on the web at http://ifestos.cse.sc.edu/mstali.



Figure 3: Comparison of msTALI to competing algorithms on Homstrad and SABmark. Comparison plots of the backbone RMSD (on top) and core size (on bottom) between msTALI and Mustang (left), POSA (middle), and Matt (right) on Homstrad. Backbone RMSD is measured in angstroms; core size is measured in residues. In all plots, msTALI is on the y-axis.

Mentions: Figure 3 further illustrates msTALI’s performance on Homstrad. This figure plots backbone RMSD and core size for msTALI compared to Matt, Mustang, and POSA. The POSA core size plot is skewed well above the dividing line, clearly demonstrating that msTALI identifies many cores with larger sizes. Furthermore, a majority of the points in the RMSD plot lie below the line, illustrating the results from Table 4 that msTALI frequently locates protein cores with smaller RMSDs. The Mustang RMSD plot is skewed to the right; in particular, a number of Mustang cores have RMSDs higher than 7 Å, while msTALI has only a few. The core size plot is less conclusive; some core sizes are better for msTALI while others favor Mustang. The Matt RMSD plot is centered about the equality line, but the core size plot clearly shows that msTALI identified larger cores for a significant majority of the families.

SABmark results on the 425 superfamily groups are shown in Table 5. POSA results are not available for SABmark, so the SABmark analysis includes only Matt and Mustang. msTALI exhibits excellent performance against Mustang, outperforming it on both core size and backbone RMSD for 43.6% of all groups. msTALI also outperforms Matt on 26.3% of all families. The results without the training families are 43.6% and 26.6% respectively. The competing applications perform better than msTALI on both measures in only 22.5% and 9.2% of the families, respectively. SABmark contains more challenging structure comparisons, and yet msTALI is able to achieve better results than the best competing algorithms.
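The "outperforms on both measures" percentages quoted above can be read as a simple per-family tally. The Python sketch below is illustrative only and is not taken from the msTALI source: the names `CoreResult` and `tally_both_measures` are hypothetical, the toy numbers are invented for the example, and the use of strict inequalities (larger core and lower RMSD) is an assumption about how a "win" is counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreResult:
    """Hypothetical per-family summary of one method's alignment."""
    core_size: int  # number of residues in the aligned core
    rmsd: float     # backbone RMSD of the core, in angstroms


def tally_both_measures(ours, theirs):
    """Return (fraction where `ours` wins both, fraction where `theirs` wins both).

    A method "wins both" on a family when its core is strictly larger AND
    its backbone RMSD is strictly lower (assumed definition). Families where
    each method wins one measure, or where they tie, count for neither.
    """
    families = sorted(ours.keys() & theirs.keys())
    if not families:
        raise ValueError("no families in common")

    ours_wins = theirs_wins = 0
    for fam in families:
        a, b = ours[fam], theirs[fam]
        if a.core_size > b.core_size and a.rmsd < b.rmsd:
            ours_wins += 1
        elif b.core_size > a.core_size and b.rmsd < a.rmsd:
            theirs_wins += 1

    n = len(families)
    return ours_wins / n, theirs_wins / n


# Toy data, invented purely to exercise the function.
method_a = {
    "fam1": CoreResult(100, 1.5),
    "fam2": CoreResult(80, 2.0),
    "fam3": CoreResult(90, 1.0),
    "fam4": CoreResult(50, 3.0),
}
method_b = {
    "fam1": CoreResult(90, 2.0),   # A wins both
    "fam2": CoreResult(85, 1.8),   # B wins both
    "fam3": CoreResult(95, 1.2),   # split: B has the larger core, A the lower RMSD
    "fam4": CoreResult(40, 3.5),   # A wins both
}

a_frac, b_frac = tally_both_measures(method_a, method_b)
print(f"A wins both: {a_frac:.1%}; B wins both: {b_frac:.1%}")
```

Under this reading, the two fractions need not sum to 100%, because families with a split result (a larger core but a higher RMSD, or the reverse) fall into neither bucket. That is consistent with the figures reported above, such as 43.6% versus 22.5% for Mustang.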

