Enhanced peptide identification by electron transfer dissociation using an improved Mascot Percolator.

Wright JC, Collins MO, Yu L, Käll L, Brosch M, Choudhary JS - Mol. Cell Proteomics (2012)

Bottom Line: We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.


Affiliation: Proteomic Mass Spectrometry, Wellcome Trust Sanger Institute, Hinxton, Cambridge.

ABSTRACT
Peptide identification using tandem mass spectrometry is a core technology in proteomics. Latest generations of mass spectrometry instruments enable the use of electron transfer dissociation (ETD) to complement collision induced dissociation (CID) for peptide fragmentation. However, a critical limitation to the use of ETD has been optimal database search software. Percolator is a post-search algorithm, which uses semi-supervised machine learning to improve the rate of peptide spectrum identifications (PSMs) together with providing reliable significance measures. We have previously interfaced the Mascot search engine with Percolator and demonstrated sensitivity and specificity benefits with CID data. Here, we report recent developments in the Mascot Percolator V2.0 software including an improved feature calculator and support for a wider range of ion series. The updated software is applied to the analysis of several CID and ETD fragmented peptide data sets. This version of Mascot Percolator increases the number of CID PSMs by up to 80% and ETD PSMs by up to 60% at a 0.01 q-value (1% false discovery rate) threshold over a standard Mascot search, notably recovering PSMs from high charge state precursor ions. The greatly increased number of PSMs and peptide coverage afforded by Mascot Percolator has enabled a fuller assessment of CID/ETD complementarity to be performed. Using a data set of CID and ETcaD spectral pairs, we find that at a 1% false discovery rate, the overlap in peptide identifications by CID and ETD is 83%, which is significantly higher than that obtained using either stand-alone Mascot (69%) or OMSSA (39%). We conclude that Mascot Percolator is a highly sensitive and accurate post-search algorithm for peptide identification and allows direct comparison of peptide identifications using multiple alternative fragmentation techniques.



Figure 4: Yeast unique peptide venn plots—The overlap in unique peptides identified between Mascot, OMSSA, and Mascot Percolator at a PSM q-value threshold of 0.01 for the Yeast ETD and ETcaD data sets. These Venn plots are not drawn to scale.

At the peptide level, Mascot Percolator again shows large gains over the original Mascot search: 48% in the ETD and 34% in the ETcaD data sets (Table 3B). The prominent gain in the standard ETD experiment can be attributed to the lower number of doubly charged peptides in these data. The number of unique 2+ peptides identified from the ETD data increases from 818 for Mascot to 2479 for Mascot Percolator; for the ETcaD data it increases from 2065 to 3162. This represents an increase of 203% for ETD and 53% for ETcaD. For unique peptides with charge states above 2+, the improvement is more consistent, at 30% for ETD and 28% for ETcaD. The Venn diagrams in Figure 4 show that Mascot Percolator boosts the significance of unique peptides that were significant in the OMSSA search but not in the Mascot search, including many peptides at higher charge states. Moreover, Mascot Percolator gives confidence to a large number of unique peptides that are not reported as significant in either of the stand-alone searches at this q-value threshold. Fewer than 1.4% of the total unique peptide identifications at a 0.01 q-value threshold across the three analysis tools are not significant in the Mascot Percolator results; notably, these are observed by only one of the search engines. The number of protein clusters identified at a cluster-level FDR of 1% increases from 1176 (ETD) and 1183 (ETcaD) for OMSSA, to 1264 (ETD) and 1300 (ETcaD) for Mascot, and then to 1574 (ETD) and 1611 (ETcaD) for Mascot Percolator. Relative to Mascot, this corresponds to an increase of 25% and 24% for the ETD and ETcaD data sets, respectively. This implies that greater proteome coverage can be achieved with fewer experiments using Mascot Percolator.
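
The percentage gains quoted above can be checked from the reported counts. The following is a minimal sketch of that arithmetic only; the function name `pct_increase` and the dictionary labels are illustrative choices, not part of Mascot Percolator, and the counts are those given in the paragraph, with Mascot as the baseline.

```python
def pct_increase(before: int, after: int) -> float:
    """Relative gain of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


# (Mascot count, Mascot Percolator count), as reported in the text.
counts = {
    "unique 2+ peptides, ETD": (818, 2479),
    "unique 2+ peptides, ETcaD": (2065, 3162),
    "protein clusters, ETD": (1264, 1574),
    "protein clusters, ETcaD": (1300, 1611),
}

for label, (mascot, percolator) in counts.items():
    gain = pct_increase(mascot, percolator)
    print(f"{label}: {mascot} -> {percolator} (+{gain:.0f}%)")
```

Rounded to whole percentages, these reproduce the 203% and 53% gains for 2+ peptides and the 25% and 24% gains for protein clusters.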

