The mining of toxin-like polypeptides from EST database by single residue distribution analysis.

Kozlov S, Grishin E - BMC Genomics (2011)

Bottom Line: The adequacy of the motifs for mining toxin-like sequences was confirmed by their ability to identify 100% of the toxin-like anemone polypeptides in the reference polypeptide database. Analysis of 39939 ESTs of the sea anemone Anemonia viridis resulted in the identification of five protein precursors of earlier described toxins, the discovery of 43 novel polypeptide toxins, and the prediction of 39 putative polypeptide toxin sequences. In addition, two precursors of novel peptides presumably displaying neuronal function were identified.


Affiliation: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, ul. Miklukho-Maklaya 16/10, 117997 Moscow, Russia.

ABSTRACT

Background: Novel high-throughput sequencing technologies require continual development of bioinformatics data processing methods. Among these, rapid and reliable identification of encoded proteins plays a pivotal role. To search for particular protein families, amino acid sequence motifs suitable for selective screening of nucleotide sequence databases may be used. In this work, we suggest a novel method for simplified representation of protein amino acid sequences, named Single Residue Distribution Analysis, which is applicable to both homology search and database screening.

Results: Using the procedure developed, a search for amino acid sequence motifs in sea anemone polypeptides was performed, and 14 different motifs with broad and low specificity were distinguished. The adequacy of the motifs for mining toxin-like sequences was confirmed by their ability to identify 100% of the toxin-like anemone polypeptides in the reference polypeptide database. Using the novel motifs to search for polypeptide toxins in the Anemonia viridis EST dataset allowed us to identify 89 putative toxin precursors. The translated and modified ESTs were scanned using a special algorithm. In addition to direct comparison with the motifs developed, putative signal peptides were predicted and homology with known structures was examined.

Conclusions: The suggested method may be used to retrieve structures of interest from EST databases using simple amino acid sequence motifs as templates. The efficiency of the procedure for directed search of polypeptides is higher than that of most currently used methods. Analysis of 39939 ESTs of the sea anemone Anemonia viridis resulted in the identification of five protein precursors of earlier described toxins, the discovery of 43 novel polypeptide toxins, and the prediction of 39 putative polypeptide toxin sequences. In addition, two precursors of novel peptides presumably displaying neuronal function were identified.
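
The screening step summarized in the Results (translate ESTs, then compare the translations with simple sequence motifs) can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in the abstract: it assumes a plain six-frame translation with the standard genetic code and represents a motif as a regular expression. The names `find_motif_hits`, `TOY_MOTIF`, and `TOY_EST` are hypothetical, and the toy motif and toy sequence are invented for illustration only; they are not among the 14 motifs or the Anemonia viridis data described in the paper.

```python
import re

# Standard genetic code, laid out in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def translate(dna):
    """Translate DNA in frame 0; codons with unknown bases become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels ('+1'..'+3', '-1'..'-3') to translated sequences."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_motif_hits(dna, motif):
    """Return (frame, peptide) pairs whose stop-free segment matches the motif."""
    pattern = re.compile(motif)
    hits = []
    for frame, protein in six_frame_translations(dna).items():
        # Split at stop codons so a match cannot span two reading segments.
        for segment in protein.split("*"):
            if pattern.search(segment):
                hits.append((frame, segment))
    return hits


# Hypothetical cysteine-spacing motif and toy EST, for illustration only.
TOY_MOTIF = r"C.{2}C.{3}C"
TOY_EST = "ATGAAATGTGCTGCTTGCGGTGGTGGTTGTTGGTAA"

if __name__ == "__main__":
    for frame, peptide in find_motif_hits(TOY_EST, TOY_MOTIF):
        print(frame, peptide)
```

In a real screen, the hits would still need the follow-up checks the abstract mentions, namely signal peptide prediction and homology comparison with known structures, neither of which is attempted here.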


Figure 7: Alignment of polypeptide structures retrieved with motif 4 vs. BPTI/Kunitz family of serine proteinase inhibitors and toxins (P10280, Q9TWG0, Q9TWF9, Q9TWF8). Mature polypeptides are shown in black, while signal peptides and propeptide domains are given in light brown; amino acids that differ from the kalicludine-1 sequence are shown in red.

Mentions: The Kunitz-type polypeptides were retrieved using motif 4 (see Figure 7). The Kunitz-type scaffold is found not only in inhibitors of proteolytic enzymes but also in toxins, for example in kalicludines. Some other polypeptides with antifungal and antimicrobial activities, and those showing analgesic properties, adopt the same scaffold [5,38,42,43]. In this group, the most represented sequences corresponded to the earlier described kalicludine-3 and to a new polypeptide, kalicludine-4 (AsKC4). Another, less abundant sequence, AsKC1a, had an additional residue at the C-terminus compared to kalicludine-1. Conversely, a novel homologue of the known proteinase inhibitor 5 II, named proteinase inhibitor 5 III and C-terminally truncated by three amino acid residues, was discovered in the database. Owing to their high homology to kalicludines, the other members of the family were designated AsKC4-AsKC16.

