Multiple structure alignment with msTALI.

Shealy P, Valafar H - BMC Bioinformatics (2012)

Bottom Line: Although multiple structure alignment algorithms can potentially be applied to a number of problems, they have primarily been used for protein core identification. We also demonstrate success at building a database of protein cores using 341 randomly selected CATH domains and highlight the contribution of msTALI compared to the CATH classifications. In addition to its performance on standard comparison databases, it utilizes clear, informative features, allowing further customization for domain-specific applications.


Affiliation: Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA.

ABSTRACT

Background: Multiple structure alignments have received increasing attention in recent years as an alternative to multiple sequence alignments. Although multiple structure alignment algorithms can potentially be applied to a number of problems, they have primarily been used for protein core identification. A method that is capable of solving a variety of problems using structure comparison is still absent. Here we introduce a program msTALI for aligning multiple protein structures. Our algorithm uses several informative features to guide its alignments: torsion angles, backbone Cα atom positions, secondary structure, residue type, surface accessibility, and properties of nearby atoms. The algorithm allows the user to weight the types of information used to generate the alignment, which expands its utility to a wide variety of problems.

Results: msTALI exhibits competitive results on 824 families from the Homstrad and SABmark databases when compared to Matt and Mustang. We also demonstrate success at building a database of protein cores using 341 randomly selected CATH domains and highlight the contribution of msTALI compared to the CATH classifications. Finally, we present an example applying msTALI to the problem of detecting hinges in a protein undergoing rigid-body motion.

Conclusions: msTALI is an effective algorithm for multiple structure alignment. In addition to its performance on standard comparison databases, it utilizes clear, informative features, allowing further customization for domain-specific applications. The C++ source code for msTALI is available for Linux on the web at http://ifestos.cse.sc.edu/mstali.

Figure 10: An illustration of neighboring residues from a portion of a protein structure. Cα atoms are shown as balls. For the residue in question, Ser 10, the N-terminal neighbor Thr 7 was identified, its distance d1 to Ser 10 was measured, and its residue type s1 was noted. Similarly, the C-terminal neighbor Phe 13 was identified, its distance d2 was measured, and its residue type s2 was noted. Although both neighbors here are three residues from the target, this need not be the case.

Mentions: Several properties of neighboring residues can play an important role in determining an overall alignment of structures. Features of relevant neighboring residues are therefore incorporated into the algorithm through the functions dp, ds, sp, and ss to resolve potential ambiguities. Consider a series of antiparallel β-strands that form a β-sheet: if one strand is missing from a structure to be aligned, a flexible alignment algorithm may have difficulty identifying the correct correspondence between β-strands from different structures. We introduce neighboring residues to reduce this type of uncertainty. An example is shown in Figure 10. For any residue i (for example, Ser 10 in Figure 10), the closest preceding residue (in Euclidean distance measured between Cα atoms) among all preceding residues is identified, and that residue’s type s1 and the distance d1 are noted. To prevent one of the two immediately preceding residues, i-1 or i-2, from always being chosen, the chosen residue must be > 2 residues away in the primary sequence. This is repeated for the succeeding residues (residues i + 1 and i + 2 excluded), where the residue type is labeled s2 and the distance is labeled d2. The comparison functions dp and ds each accept two residues numbered i and j from structures m and n and compute the differences in d1 and d2:
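
The equations for dp and ds are not reproduced in this excerpt. As a rough illustration of the neighbor search described above, the following Python sketch finds the closest preceding and succeeding residues for a residue i, skipping anything within two positions in the sequence. The function names neighbor_features and distance_difference, the input layout (a list of Cα coordinates and a parallel list of residue types), and the use of an absolute difference to compare distances are all assumptions made for illustration; they are not taken from the msTALI source code.

```python
import math

def neighbor_features(ca_coords, residue_types, i, min_sep=3):
    """Return (s1, d1, s2, d2) for the residue at index i.

    ca_coords     : list of (x, y, z) C-alpha coordinates, one per residue
    residue_types : list of residue type labels, parallel to ca_coords
    min_sep       : minimum sequence separation; 3 encodes "> 2 residues away"

    s1/d1 describe the spatially closest preceding residue, s2/d2 the
    closest succeeding one. A side with no eligible residue yields
    (None, None). Hypothetical helper, not the msTALI implementation.
    """
    def closest(candidates):
        best_type, best_dist = None, None
        for k in candidates:
            d = math.dist(ca_coords[i], ca_coords[k])
            if best_dist is None or d < best_dist:
                best_type, best_dist = residue_types[k], d
        return best_type, best_dist

    # Preceding side: indices 0 .. i - min_sep (i-1 and i-2 are excluded).
    s1, d1 = closest(range(0, i - min_sep + 1))
    # Succeeding side: indices i + min_sep .. end (i+1 and i+2 are excluded).
    s2, d2 = closest(range(i + min_sep, len(ca_coords)))
    return s1, d1, s2, d2

def distance_difference(d_a, d_b):
    """Assumed comparison: absolute difference of two neighbor distances.

    Returns None when either residue has no eligible neighbor.
    """
    if d_a is None or d_b is None:
        return None
    return abs(d_a - d_b)
```

Under these assumptions, residue i of structure m and residue j of structure n would be compared by calling neighbor_features once per structure and passing the two d1 values (or the two d2 values) to distance_difference. The residue types s1 and s2 would feed the analogous comparisons sp and ss, whose form is likewise not given here.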
