Detailed protein sequence alignment based on Spectral Similarity Score (SSS).

Gupta K, Thomas D, Vidya SV, Venkatesh KV, Ramakumar S - BMC Bioinformatics (2005)

Bottom Line: Detailed comparison established close similarities between subsequences that do not have any significant character identity. The method captures subsequences that do not align by traditional character-based alignment tools but give rise to similar secondary and tertiary structures. The Spectral Similarity Score (SSS) is an extension to the conventional similarity methods, and results indicate that it holds strong potential for analysis of various biological sequences and structural variations in proteins.


Affiliation: Department of Computer Science & Engineering, Indian Institute of Technology, Bombay, Mumbai, India. kshitiz@cse.iitb.ac.in

ABSTRACT

Background: The chemical properties and biological function of a protein are a direct consequence of its primary structure. Several algorithms have been developed which determine alignment and similarity of primary protein sequences. However, character-based similarity cannot provide insight into the structural aspects of a protein. We present a method based on spectral similarity that compares subsequences of amino acids which behave similarly but are not aligned well when amino acids are treated as mere characters. This approach computes a similarity score between sequences for any given attribute, such as the hydrophobicity of amino acids, from spectral information obtained after partial conversion to the frequency domain.
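The idea of comparing sequences by an attribute spectrum rather than by characters can be sketched as follows. This is a minimal illustration, not the authors' SSS definition, which is not given in this abstract. It assumes the Kyte-Doolittle hydropathy scale as the attribute, a plain discrete Fourier transform over the whole subsequence, and a Euclidean distance between magnitude spectra; the function names `to_signal`, `dft_magnitudes` and `spectral_distance` are hypothetical.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values. Assumption: the abstract names
# hydrophobicity as an example attribute but does not say which scale is used.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def to_signal(seq):
    """Map a one-letter amino acid string to a list of hydropathy values."""
    return [KD[aa] for aa in seq]


def dft_magnitudes(signal):
    """Magnitudes of the DFT coefficients for frequencies 0 .. n//2."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        mags.append(abs(coeff))
    return mags


def spectral_distance(seq_a, seq_b):
    """Euclidean distance between the magnitude spectra of two
    equal-length subsequences (illustrative stand-in for a spectral score)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("subsequences must have equal length")
    mags_a = dft_magnitudes(to_signal(seq_a))
    mags_b = dft_magnitudes(to_signal(seq_b))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mags_a, mags_b)))


if __name__ == "__main__":
    # Subsequence pair (d) from the Figure 5 caption.
    print(spectral_distance("EDGSLRQT", "SGNGLRSS"))
```

Because only magnitudes are compared, a circularly shifted subsequence has the same spectrum as the original even when no position matches character for character, which is the kind of similarity a character-based score cannot see. A real implementation would presumably work on short windows, in line with the "partial conversion" mentioned above.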

Results: Distance matrices of various branches of the human kinome (the full complement of human kinases) were developed that matched the phylogenetic tree of the human kinome, establishing the efficacy of the algorithm's global alignment. PKCd and PKCe kinases share close biological properties and structural similarities but do not give high scores with character-based alignments. Detailed comparison established close similarities between subsequences that do not have any significant character identity. We compared their known 3D structures to establish that the algorithm is able to pick subsequences that are not considered similar by character-based matching algorithms but share structural similarities. Similarly, many subsequences with low character identity were picked between the xyna-theau and xynz-clotm F/10 xylanases. Comparison of the 3D structures of these subsequences confirmed the claim of structural similarity.
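A distance matrix of the kind mentioned above is simply the table of pairwise distances over a set of sequences. The sketch below shows only that bookkeeping. The `distance_matrix` helper is hypothetical, and `mismatch_fraction` is a toy character-based distance used as a placeholder; the study itself uses SSS-derived distances, whose formula is not given here.

```python
from itertools import combinations


def mismatch_fraction(a, b):
    """Toy distance: fraction of positions that differ (equal lengths only).
    Placeholder for a spectral distance, which this abstract does not define."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(items, dist):
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = dist(items[i], items[j])
        matrix[i][j] = d
        matrix[j][i] = d
    return matrix


if __name__ == "__main__":
    # Made-up sequences, purely for illustration.
    for row in distance_matrix(["AAAA", "AAAT", "TTTT"], mismatch_fraction):
        print(row)
```

Passing the distance as a function keeps the matrix construction independent of the scoring scheme, so any pairwise measure can be swapped in before the matrix is handed to a tree-building method.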

Conclusion: An algorithm was developed, inspired by the successful application of spectral similarity to music sequences. The method captures subsequences that do not align by traditional character-based alignment tools but give rise to similar secondary and tertiary structures. The Spectral Similarity Score (SSS) is an extension to the conventional similarity methods, and results indicate that it holds strong potential for analysis of various biological sequences and structural variations in proteins.



Figure 5: 3D matching for xyna-theau and xynz-clotm using SPDBV magic fit. 3D images of the fits obtained with the "magic fit" tool of the SPDBV software [30, 31]. The first value in the bracket is the SSS for the subsequence and the second is the rms value obtained by the tool, in Å. Red is used for xyna-theau and yellow for xynz-clotm. The two proteins are similar, with a high BLAST score and overlapping 3D structures. SSS, however, is still able to catch subsequences that BLAST leaves as dissimilar, and the low rms values for the captured subsequences confirm the findings. The subsequences in the figures are (a) SCVGITVM & NCNTFVMW, (b) GITVWGVA & TFVMWGFT, (c) RVKQWRAA & MIKSMKER, and (d) EDGSLRQT & SGNGLRSS, belonging to xyna-theau and xynz-clotm respectively.
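The low character identity of the four subsequence pairs listed in the caption can be checked directly. The short sketch below counts position-wise identical residues for each pair; the helper name `identical_positions` is hypothetical, and the sequences are copied from the caption as given.

```python
# Subsequence pairs (xyna-theau, xynz-clotm) as listed in the Figure 5 caption.
PAIRS = {
    "a": ("SCVGITVM", "NCNTFVMW"),
    "b": ("GITVWGVA", "TFVMWGFT"),
    "c": ("RVKQWRAA", "MIKSMKER"),
    "d": ("EDGSLRQT", "SGNGLRSS"),
}


def identical_positions(a, b):
    """Number of positions at which two equal-length sequences share a residue."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b))


if __name__ == "__main__":
    for label, (first, second) in PAIRS.items():
        matches = identical_positions(first, second)
        print(f"({label}) {first} / {second}: {matches} of {len(first)} identical")
```

Each pair shares at most two of eight residues (25% identity), which is consistent with the statement that these stretches are left as dissimilar by a character-based comparison.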

Mentions: 4. The algorithm was run on xyna-theau (pdbid: 1gor) and xynz-clotm chain A (pdbid: 1xyz), and the results were compared with those from BLAST. Subsequences that matched with large distance values (meaning that the similarity is not very high, but the pair is still reported among the matching segments) were examined for their secondary structures. Appreciable similarity in secondary structure was found, though the alignment was not perfect (see table 5). Figure 5 shows the fits obtained using SPDBV for the individual subsequences picked by the SSS. Xyna-theau and xynz-clotm abound in H (helix), but the algorithm is able to catch subsequences where short β strands are located between two bends and align them with a similar stretch in the other sequence. Note that results of interest, meaning those not expected from character-based alignment, can be expected from the algorithm only when the distance value D is not very small; a micro-analysis of such matching segments may produce results that are unobtainable otherwise.

