Universal database search tool for proteomics


ABSTRACT

Mass spectrometry (MS) instruments and experimental protocols are rapidly advancing, but the software tools to analyze tandem mass spectra are lagging behind. We present a database search tool, MS-GF+, that is sensitive (it identifies more peptides than most other database search tools) and universal (it works well for diverse types of spectra, different configurations of MS instruments and different experimental protocols). We benchmark MS-GF+ using diverse spectral datasets: (i) spectra of varying fragmentation methods; (ii) spectra of multiple enzyme digests; (iii) spectra of phosphorylated peptides; and (iv) spectra of peptides with unusual fragmentation propensities produced by a novel alpha-lytic protease. For all these datasets, MS-GF+ significantly increases the number of identified peptides compared to commonly used methods for peptide identification. We emphasize that although MS-GF+ is not designed for any particular experimental set-up, it improves upon the performance of tools designed specifically for these applications (e.g., specialized tools for phosphoproteomics).




Figure 2: Benchmarking MS-GF+ against Mascot+Percolator. Percent increases in the number of identified PSMs for MS-GF+ compared to Mascot+Percolator for all 19 datasets. Each bar represents a spectral dataset of a specified spectral type. For (CID, Low, Standard, Trypsin) and (ETD, Low, Standard, Trypsin), there are two corresponding datasets, one from human and the other from yeast; the yeast datasets are marked with ‘*’. For the (CID, Low, Phosphorylation, Trypsin) and (CID, Low, Ubiquitination, Trypsin) datasets, the numbers of phosphorylated and ubiquitinated PSMs, respectively, were counted instead of the number of all identified PSMs. For the (ETD, Low, Standard, αLP) dataset, Mascot+Percolator identified no PSMs.

For all 19 datasets, MS-GF+ identified significantly more PSMs than Mascot+Percolator (Figure 2). Figure 3(a) shows the benchmarking results for the five human datasets generated with varying fragmentation methods and instruments [20]. Percolator greatly increased the number of identifications compared to Mascot alone, but for all these datasets MS-GF+ identified significantly more PSMs (17–38%) than Mascot+Percolator (see Supplementary Fig. 1 for Venn diagrams of MS-GF+ and Mascot+Percolator identifications). We also compared our results against the numbers of identifications reported by the original study [20], which also used Mascot+Percolator along with in-house pre- and post-processing tools. In this comparison, MS-GF+ again performed better (identifying 16–55% more PSMs).
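
The percent increases quoted above are not given an explicit formula in this excerpt. The following is a minimal sketch assuming the usual definition of relative gain, 100 × (candidate − baseline) / baseline, where the baseline is the Mascot+Percolator PSM count and the candidate is the MS-GF+ count. The function name and the example counts are hypothetical and are not taken from the paper. The sketch also shows why a dataset where the baseline tool identifies no PSMs, as reported for (ETD, Low, Standard, αLP), has no well-defined percent increase.

```python
from typing import Optional


def percent_increase(baseline: int, candidate: int) -> Optional[float]:
    """Relative gain of `candidate` over `baseline`, in percent.

    Returns None when the baseline is zero, because the ratio is undefined.
    """
    if baseline < 0 or candidate < 0:
        raise ValueError("PSM counts must be non-negative")
    if baseline == 0:
        return None
    return 100.0 * (candidate - baseline) / baseline


# Hypothetical counts for illustration only; these are NOT values from the paper.
# Each entry is (baseline PSM count, candidate PSM count).
example_counts = {
    "dataset A": (1000, 1250),
    "dataset B": (0, 300),  # baseline tool identified nothing
}

for name, (baseline, candidate) in example_counts.items():
    gain = percent_increase(baseline, candidate)
    if gain is None:
        label = "undefined (baseline identified no PSMs)"
    else:
        label = f"{gain:.1f}%"
    print(f"{name}: {label}")
```

How the original figure draws the bar for a zero-baseline dataset is not stated here, so the sketch returns None rather than guessing at a convention.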

