A multi-center study benchmarks software tools for label-free proteome quantification


ABSTRACT

The consistent and accurate quantification of proteins by mass spectrometry (MS)-based proteomics depends on the performance of instruments, acquisition methods and data analysis software. In collaboration with the software developers, we evaluated OpenSWATH, SWATH2.0, Skyline, Spectronaut and DIA-Umpire, five of the most widely used software tools for processing data from SWATH-MS (sequential window acquisition of all theoretical fragment ion spectra), a method that uses data-independent acquisition (DIA) for label-free protein quantification. We analyzed high-complexity test datasets from hybrid proteome samples of defined quantitative composition, acquired on two different MS instruments using different SWATH isolation window setups. For consistent evaluation we developed LFQbench, an R package that calculates metrics of precision and accuracy in label-free quantitative MS, and we report the identification performance, robustness and specificity of each software tool. Our reference datasets enabled developers to improve their software tools. After optimization, all tools provided highly convergent identification and reliable quantification performance, underscoring their robustness for label-free quantitative proteomics.


Figure 3: Integrated analysis of the five software tools. (a) Overlap of quantified peptides and proteins for library-based tools. The font size of each element is proportional to the number of peptides or proteins displayed. (b) Overlap of quantified peptides and proteins by all software tools. The font size of each element is proportional to the number of peptides or proteins displayed. An asterisk indicates protein/peptide numbers below ten. (c) Protein abundance distribution of peptides and proteins detected by DIA-Umpire. Red: peptides or proteins shared with other software tools. Turquoise: peptides or proteins detected exclusively by DIA-Umpire.

Mentions: In the HYE124 dataset, the library-based tools identified between 35,489 and 42,517 peptides in iteration 2, mapping to between 3,673 and 4,692 proteins. Notably, we observed an exceptional overlap between these tools: 93% of all identified peptides and 95% of proteins were identified by at least three of the four library-based tools (Figure 3a). The overlap of all five tools was 22,407 peptides and 3,064 proteins (Figure 3b). At the peptide level, the results of the library-based tools covered 65% of the sequences reported by DIA-Umpire, which additionally identified 12,748 sequences not found by the library-based approaches (Figure 3b), in part due to slightly different search parameters. Only 288 of the sequences identified exclusively by DIA-Umpire were present in the assay library; these may be false negatives of the library-based workflow. The overlap at the protein level was considerably higher (86%), similar to the typical overlap between different DDA search engines (ref. 25). This indicates that DIA-Umpire may cover additional peptides not included in the assay library, e.g. singly charged peptide ions (726 peptides), which are usually not triggered for MS/MS in DDA experiments and are thus absent from the consensus library. The number of peptides per protein was similar for all library-based tools (Supplementary Figure 19).
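
The overlap figures quoted above (e.g. the share of peptides found by at least three of four tools, or the set shared by all tools) can be computed from per-tool identification lists with simple set operations. The following is a minimal sketch only, not the study's or LFQbench's actual code; the function names, tool labels and peptide strings are hypothetical placeholders for illustration.

```python
from collections import Counter


def fraction_identified_by_at_least(tool_ids, k):
    """Fraction of the union of identifications that is reported by >= k tools.

    tool_ids: dict mapping a tool name to an iterable of peptide (or protein) IDs.
    """
    counts = Counter()
    for ids in tool_ids.values():
        counts.update(set(ids))  # each ID counts at most once per tool
    if not counts:
        return 0.0
    hits = sum(1 for n in counts.values() if n >= k)
    return hits / len(counts)


def shared_by_all(tool_ids):
    """IDs reported by every tool (the full intersection)."""
    sets = [set(ids) for ids in tool_ids.values()]
    return set.intersection(*sets) if sets else set()


# Toy data (hypothetical tool labels and peptide IDs, not values from the study).
toy = {
    "tool_a": {"PEPA", "PEPB", "PEPC", "PEPD"},
    "tool_b": {"PEPA", "PEPB", "PEPC"},
    "tool_c": {"PEPA", "PEPB", "PEPD"},
    "tool_d": {"PEPA", "PEPE"},
}

frac = fraction_identified_by_at_least(toy, 3)
common = shared_by_all(toy)
print(f"identified by >= 3 tools: {frac:.0%}; shared by all: {sorted(common)}")
```

Note that the denominator here is the union of all identifications across the supplied tools; a different reference set (for example, one tool's identifications only, as in the 65% coverage of DIA-Umpire sequences) would require dividing by that set's size instead.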

