Exploiting publicly available biological and biochemical information for the discovery of novel short linear motifs.

Sayadi A, Briganti L, Tramontano A, Via A - PLoS ONE (2011)

Bottom Line: Consequently, only a small fraction of them have been discovered so far. We describe here an approach for the discovery of SLiMs based on their occurrence in evolutionarily unrelated proteins belonging to the same biological, signalling or metabolic pathway and give specific examples of its effectiveness in both rediscovering known motifs and in discovering novel ones. An automatic implementation of the procedure, available for download, allows significant motifs to be identified, automatically annotated with functional, evolutionary and structural information and organized in a database that can be inspected and queried.


Affiliation: Department of Physics, Sapienza University of Rome, Rome, Italy.

ABSTRACT
The function of proteins is often mediated by short linear segments of their amino acid sequence, called Short Linear Motifs or SLiMs, the identification of which can provide important information about a protein's function. However, the short length of the motifs and their variable degree of conservation make their identification hard, since it is difficult to correctly estimate the statistical significance of their occurrence. Consequently, only a small fraction of them have been discovered so far. We describe here an approach for the discovery of SLiMs based on their occurrence in evolutionarily unrelated proteins belonging to the same biological, signalling or metabolic pathway and give specific examples of its effectiveness in both rediscovering known motifs and in discovering novel ones. An automatic implementation of the procedure, available for download, allows significant motifs to be identified, automatically annotated with functional, evolutionary and structural information and organized in a database that can be inspected and queried. An instance of the database populated with pre-computed data on seven organisms is accessible through a publicly available server and we believe it constitutes by itself a useful resource for the life sciences (http://www.biocomputing.it/modipath).

Figure 3. The information provided by MoDiPath for the hsa04640 KEGG pathway. (a) First column: the SLiM regular expression; second column: a ‘+’ is reported if the motif overlaps with a similar motif in other databases (the list of which is shown by moving the mouse over the ‘+’); third column: the hypergeometric p-value of the number of motif hits in the hsa04640 pathway compared to the number of motif hits in the SwissProt database; fourth column: the fraction of proteins in the hsa04640 pathway that contain the WS.WS motif. (b) Multiple sequence alignment of the hsa04640 pathway proteins containing the WS.WS motif. (c) Information about each of the hsa04640 proteins containing the WS.WS motif. Clicking on the ‘Show’ button provides more detailed information, including the protein structure visualization with the motif hit(s) highlighted. (d) List of motif overlaps with similar motifs in other databases; the last column reports the CompariMotif [32] similarity score (NormIC). (e) GO terms shared by the hsa04640 pathway proteins that have the motif; the last column reports the fraction of the proteins hosting the motif that share a GO term.
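The third and fourth columns described in panel (a) can be illustrated with a small sketch. This is not MoDiPath's code: it assumes that a "hit" means a protein containing at least one match to the regular expression, that the p-value is the upper tail of the hypergeometric distribution, and that the pathway proteins are a subset of the background set. The sequences and the function names `count_hits` and `hypergeom_pvalue` are hypothetical, made up for illustration only.

```python
import re
from math import comb


def count_hits(pattern, sequences):
    """Count sequences that contain at least one match to the motif regex."""
    regex = re.compile(pattern)
    return sum(1 for seq in sequences if regex.search(seq))


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: proteins in the background, K: background proteins with the motif,
    n: proteins in the pathway,    k: pathway proteins with the motif.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy data (hypothetical sequences, not real pathway or SwissProt entries).
motif = "WS.WS"
pathway = ["AAWSEWSKK", "GGWSAWSLL", "PWSGWS", "ACDEFGHIK"]
others = ["MKTLLVAG", "WSWS", "QQQWSQQQ", "LLLLL", "KWSK", "AWSWA"]
background = pathway + others  # assumption: pathway is part of the background

k = count_hits(motif, pathway)
K = count_hits(motif, background)
p_value = hypergeom_pvalue(k, len(pathway), K, len(background))
fraction = k / len(pathway)  # fourth column: share of pathway proteins with the motif
```

A small p-value here means the motif occurs in the pathway more often than expected from its frequency in the background; how MoDiPath counts hits and corrects for multiple testing is not stated in this excerpt.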

Mentions: Figure 3 shows a screenshot with the information provided by MoDiPath for the WS.WS motif, which is specific to the hsa04640 KEGG pathway. For each protein sharing the motif, a page containing functional and structural details is provided. In particular, if the protein is of known structure, the position of the matching sub-sequence is displayed in the context of its three-dimensional structure.
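Locating the matching sub-sequence within a protein can be sketched as follows. This is an illustrative sketch, not the MoDiPath implementation: the function name `motif_hits` and the example sequences are hypothetical, positions are reported as 1-based inclusive residue indices, and overlapping matches are reported via a lookahead. Mapping these sequence positions onto a three-dimensional structure would additionally require the residue numbering of the structure file, which is not covered here.

```python
import re


def motif_hits(pattern, sequence):
    """Return (start, end, matched_text) for every motif match.

    Positions are 1-based and inclusive. The lookahead wrapper lets
    overlapping matches be reported, which a plain finditer would skip.
    """
    regex = re.compile(f"(?=({pattern}))")
    hits = []
    for m in regex.finditer(sequence):
        matched = m.group(1)
        start = m.start() + 1            # convert 0-based index to 1-based
        end = m.start() + len(matched)   # inclusive end position
        hits.append((start, end, matched))
    return hits


# Hypothetical sequences used only to exercise the function.
single = motif_hits("WS.WS", "AAWSEWSKK")
overlapping = motif_hits("WS.WS", "WSWWSWWS")
none = motif_hits("WS.WS", "ACDEFG")
```

In the `WS.WS` expression the dot stands for any single residue, so both `WSEWS` and `WSWWS` count as matches.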

