Enhanced peptide identification by electron transfer dissociation using an improved Mascot Percolator.

Wright JC, Collins MO, Yu L, Käll L, Brosch M, Choudhary JS - Mol. Cell Proteomics (2012)

Bottom Line: We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.


Affiliation: Proteomic Mass Spectrometry, Wellcome Trust Sanger Institute, Hinxton, Cambridge.

ABSTRACT
Peptide identification using tandem mass spectrometry is a core technology in proteomics. Latest generations of mass spectrometry instruments enable the use of electron transfer dissociation (ETD) to complement collision induced dissociation (CID) for peptide fragmentation. However, a critical limitation to the use of ETD has been optimal database search software. Percolator is a post-search algorithm, which uses semi-supervised machine learning to improve the rate of peptide spectrum identifications (PSMs) together with providing reliable significance measures. We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Here, we report recent developments in the Mascot Percolator V2.0 software including an improved feature calculator and support for a wider range of ion series. The updated software is applied to the analysis of several CID and ETD fragmented peptide data sets. This version of Mascot Percolator increases the number of CID PSMs by up to 80% and ETD PSMs by up to 60% at a 0.01 q-value (1% false discovery rate) threshold over a standard Mascot search, notably recovering PSMs from high charge state precursor ions. The greatly increased number of PSMs and peptide coverage afforded by Mascot Percolator has enabled a fuller assessment of CID/ETD complementarity to be performed. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.



Figure 6: E. coli sequential experiments—Heat maps highlighting the numbers of unique peptide identifications for Mascot, OMSSA, and Mascot Percolator across the range of m/z and precursor charge state, where a PSM q-value threshold of 0.01 has been applied. The top left six maps show the distribution of unique peptide identifications for CID or ETcaD PSMs for each identification method. The three right hand heat maps show the difference in numbers of CID and ETcaD peptide identifications; a negative number reflects a greater number of ETcaD peptides and a positive number reflects a greater number of CID peptides. The lower four heat maps show the differences in unique peptide identifications between Mascot versus Mascot Percolator, and OMSSA versus Mascot Percolator; a positive percentage represents a gain in the number of significant peptides identified with Mascot Percolator.

Mentions: We also conducted sequential fragmentation experiments using the partially digested E. coli sample, in which each precursor is analyzed sequentially by CID and ETcaD, thereby generating spectral pairs for direct comparison of fragmentation patterns (Table 4B). In total, 20,016 CID/ETD spectral pairs were collected in this data set. Fig. 6 compares the number of PSMs identified from the CID and ETcaD spectra by each search method across the full range of m/z and charge state. The search methods are compared directly by calculating the percentage increase in PSMs identified by Mascot Percolator relative to Mascot and OMSSA in each m/z and charge state bin, using a PSM q-value threshold of 0.01. The heat maps show that Mascot Percolator boosts spectral identifications across the whole mass and charge ranges of both fragmentation types. Mascot Percolator is especially effective in improving the identification of spectra from larger and more highly charged peptides: it identifies eight significant CID PSMs with a 6+ charge state, compared with the two and three PSMs identified by Mascot and OMSSA, respectively, while also increasing the number of PSMs at the highest m/z for each charge state. This increase in range is also seen in the ETcaD data set, where Mascot Percolator finds three significant PSMs with an 8+ charge state when none are significant in the Mascot and OMSSA results. In addition, 83 PSMs with an m/z of 1000 or greater are significant, compared with only 37 for Mascot and 58 for OMSSA. The three right hand heat maps in Fig. 6, in which the ETcaD PSMs have been subtracted from the CID PSMs, show that ETcaD performs better than CID for high-charge low-mass peptides with both Mascot and OMSSA, as reported in previous studies (5, 27, 37). Moreover, this difference is apparent for OMSSA, where the 3+ charge m/z bin at which the number of CID PSMs becomes greater than ETcaD PSMs is 500 m/z, compared with the 800 m/z bin for Mascot.
[...] number of ETcaD PSMs at 4+ charge states below 700 m/z increases. Interestingly, Mascot Percolator extends CID spectral identifications to provide better coverage of higher charge states. This effect is also seen with the ETcaD data, where it increases the number of PSMs passing the strict q-value threshold at very high charge states (greater than 4+).
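The per-bin comparison described above can be illustrated with a minimal Python sketch. It is not the authors' code. The record layout (`mz`, `charge`, `q_value` keys), the function names, the 100 m/z bin width, and the toy values are all assumptions made for illustration; only the 0.01 q-value threshold and the sign conventions come from the text and the Figure 6 caption.

```python
from collections import Counter

Q_THRESHOLD = 0.01   # PSM q-value threshold stated in the text
MZ_BIN_WIDTH = 100   # assumed bin width in m/z units; not given in the text


def bin_counts(psms, q_threshold=Q_THRESHOLD, mz_bin_width=MZ_BIN_WIDTH):
    """Count significant PSMs per (charge, lower edge of m/z bin)."""
    counts = Counter()
    for psm in psms:
        if psm["q_value"] <= q_threshold:
            mz_bin = int(psm["mz"] // mz_bin_width) * mz_bin_width
            counts[(psm["charge"], mz_bin)] += 1
    return counts


def percent_gain(baseline, improved):
    """Percentage change per bin from baseline to improved counts.

    A positive value is a gain for the improved method. Bins where the
    baseline count is zero are reported as None, because a percentage
    increase is undefined there.
    """
    gains = {}
    for key in set(baseline) | set(improved):
        base = baseline.get(key, 0)
        new = improved.get(key, 0)
        gains[key] = None if base == 0 else 100.0 * (new - base) / base
    return gains


def count_difference(cid, etcad):
    """CID minus ETcaD counts per bin: positive means more CID PSMs,
    negative means more ETcaD PSMs (the caption's sign convention)."""
    return {key: cid.get(key, 0) - etcad.get(key, 0)
            for key in set(cid) | set(etcad)}


# Toy records, invented purely to exercise the functions.
mascot_psms = [
    {"mz": 512.3, "charge": 3, "q_value": 0.005},
    {"mz": 530.0, "charge": 3, "q_value": 0.200},
]
percolator_psms = [
    {"mz": 512.3, "charge": 3, "q_value": 0.001},
    {"mz": 530.0, "charge": 3, "q_value": 0.008},
    {"mz": 1010.0, "charge": 6, "q_value": 0.009},
]

gain = percent_gain(bin_counts(mascot_psms), bin_counts(percolator_psms))
```

Returning `None` for bins with an empty baseline is one possible design choice. It keeps cases such as the 8+ ETcaD PSMs, where the baseline methods report no significant identifications, distinct from bins with a finite percentage gain.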


