QSLiMFinder: improved short linear motif prediction using specific query protein data.

Palopoli N, Lythgow KT, Edwards RJ - Bioinformatics (2015)

Bottom Line: QSLiMFinder was extensively benchmarked using known SLiM-containing proteins and simulated protein interaction datasets of real human proteins. Exploiting prior knowledge of a query protein likely to be involved in a SLiM-mediated interaction increased the proportion of true positives correctly returned and reduced the proportion of datasets returning a false positive prediction. The biggest improvement was seen if a short region of the query protein flanking the interaction site was known.


Affiliation: Centre for Biological Sciences, University of Southampton, Southampton, UK.



Fig. 5. Comparison of the effect of incorporating ambiguity in motif definition on the proportion of SimBench datasets returning (a) at least one TP (SN) and (b) at least one FP (FPX) when searches are performed using QSLiMFinder (QSF) and SLiMFinder (SF). Results are plotted at different SLiMChance significance cut-offs (0.05, 0.01, 0.005, 0.001, 5e-04, 1e-04, 1e-05, 1e-06, 1e-07, 1e-08, 1e-09, 1e-10; in panel (b) results are truncated at 1e-04, the least significant cut-off for which FPX = 0). Searches were made with the whole protein (‘none’, circles), with a window of five residues flanking the known ELM at each side (‘flank5’, triangles) or with the region of the motif only (‘site’, squares).
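The two quantities plotted in the figure are per-dataset proportions, which can be illustrated with a minimal sketch. The data layout (a list of `(p_value, is_true_positive)` pairs per dataset), the function name `sn_and_fpx` and the use of `<=` for the cut-off comparison are assumptions made for illustration; they are not taken from QSLiMFinder or the benchmark code.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: each returned motif is (SLiMChance p-value, is it a TP?).
Prediction = Tuple[float, bool]


def sn_and_fpx(datasets: Dict[str, List[Prediction]], cutoff: float) -> Tuple[float, float]:
    """Return (SN, FPX) at one significance cut-off.

    SN  = proportion of datasets returning at least one significant TP.
    FPX = proportion of datasets returning at least one significant FP.
    """
    if not datasets:
        raise ValueError("at least one dataset is required")
    with_tp = 0
    with_fp = 0
    for predictions in datasets.values():
        # Keep only the TP/FP flags of motifs passing the cut-off (assumed inclusive).
        significant = [is_tp for p_value, is_tp in predictions if p_value <= cutoff]
        if any(significant):
            with_tp += 1
        if any(not is_tp for is_tp in significant):
            with_fp += 1
    total = len(datasets)
    return with_tp / total, with_fp / total


# Toy example with four invented datasets.
toy = {
    "a": [(0.001, True)],
    "b": [(0.03, False)],
    "c": [(0.2, True)],
    "d": [(0.0001, True), (0.04, False)],
}
for cutoff in (0.05, 0.01):
    print(cutoff, sn_and_fpx(toy, cutoff))
```

Tightening the cut-off can only remove motifs, so both proportions are non-increasing as the cut-off becomes more stringent, which matches the way the figure is read from left to right.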
© Copyright Policy - creative-commons

Mentions: Reducing the motif space to that of the query does not come without cost. In addition to removing one of the TP instances, the ability to incorporate ambiguity is compromised. SLiMBuild constructs ambiguous positions by combining different fixed SLiM patterns according to an ‘equivalence list’ of permitted ambiguities, provided that they extend dataset coverage (support) versus the individual fixed patterns. Because QSLiMFinder builds the motif space from the query alone, it cannot incorporate pattern variants found elsewhere in the data without violating the SLiMChance model. Incorporating ambiguity in QSLiMFinder therefore results in over-prediction and elevated FP rates, whilst SLiMFinder is less affected (Fig. 5). However, ambiguity can be useful for providing a more nuanced motif definition than fixed-position motifs alone (Edwards et al., 2007) and does give a marginal improvement in SN (Fig. 5a). A possible workaround is to enable the return of ambiguous motifs but exclude them as FPs unless a significant fixed-position pattern is returned in the same motif cloud (set of overlapping motifs [Edwards et al., 2007]). This is provided as a new option (cloudfix = T) in SLiMFinder and QSLiMFinder.
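The workaround described above can be sketched as a simple post-filter. This is only an illustration of the stated rule, not the cloudfix implementation: the `Motif` record, the `cloud` identifier, the function names and the test for ambiguity (a bracketed position such as `[IL]`, with `.` wildcards treated as fixed-position) are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Motif:
    pattern: str    # e.g. "P.LP" (fixed positions) or "P.[IL]P" (ambiguous)
    cloud: int      # hypothetical identifier of the motif cloud
    p_value: float  # significance assigned to the motif


def is_ambiguous(pattern: str) -> bool:
    # Assumption: ambiguity means a bracketed set of residues at some position.
    return "[" in pattern


def cloudfix_filter(motifs: List[Motif], cutoff: float = 0.05) -> List[Motif]:
    """Keep significant motifs, dropping ambiguous ones whose cloud
    contains no significant fixed-position pattern."""
    significant = [m for m in motifs if m.p_value <= cutoff]
    clouds_with_fixed = {m.cloud for m in significant if not is_ambiguous(m.pattern)}
    return [m for m in significant if m.cloud in clouds_with_fixed]


# Invented example: cloud 1 has a significant fixed pattern, cloud 2 does not.
example = [
    Motif("P.LP", 1, 0.001),
    Motif("P.[IL]P", 1, 0.0005),
    Motif("[KR]..L", 2, 0.01),
    Motif("R.L", 3, 0.2),
]
for motif in cloudfix_filter(example):
    print(motif.pattern)
```

In this toy input the ambiguous motif in cloud 2 is dropped because no fixed-position pattern in that cloud is significant, while the ambiguous variant in cloud 1 is retained alongside its fixed-position counterpart.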

