Accurate peak list extraction from proteomic mass spectra for identification and profiling studies.

Barbarini N, Magni P - BMC Bioinformatics (2010)

Bottom Line: In both cases, the main phase of the data analysis is the procedure to extract the significant features from a mass spectrum. It has been developed principally to improve the precision of peak extraction in comparison to other reference algorithms. It contains many innovative features, among which is a sophisticated method for managing overlapping isotopic distributions.


Affiliation: Dipartimento di Informatica e Sistemistica, Università degli Studi di Pavia, Pavia, Italy. nicola.barbarini@unipv.it

ABSTRACT

Background: Mass spectrometry is an essential technique in proteomics, used both to identify the proteins in a biological sample and to compare the proteomic profiles of different samples. In both cases, the main phase of the data analysis is the procedure that extracts the significant features from a mass spectrum. Its final output is the so-called peak list, which contains the mass, the charge and the intensity of every detected biomolecule. The main steps of the peak list extraction procedure are usually preprocessing, peak detection, peak selection, charge determination and monoisotoping.
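To make the charge determination step concrete, here is a minimal Python sketch. It is not the authors' algorithm. It relies only on the general fact that adjacent isotopic peaks of an ion with charge z are separated by roughly 1.00335/z on the m/z axis. The function name, the tolerance and the default maximum charge are hypothetical choices made for illustration.

```python
from statistics import median

# Approximate mass difference between 13C and 12C, in daltons.
ISOTOPE_SPACING_DA = 1.00335


def estimate_charge(mz_peaks, max_charge=2, tol=0.02):
    """Guess the charge of an isotopic cluster from its peak spacing.

    mz_peaks   -- m/z positions of peaks assumed to belong to one cluster
    max_charge -- largest charge state to test (hypothetical default)
    tol        -- accepted deviation in m/z units (hypothetical default)

    Returns the best matching charge, or None if no tested charge fits.
    """
    if len(mz_peaks) < 2:
        return None  # a single peak carries no spacing information

    peaks = sorted(mz_peaks)
    observed = median(b - a for a, b in zip(peaks, peaks[1:]))

    best_charge, best_error = None, None
    for z in range(1, max_charge + 1):
        error = abs(observed - ISOTOPE_SPACING_DA / z)
        if error <= tol and (best_error is None or error < best_error):
            best_charge, best_error = z, error
    return best_charge
```

Peaks spaced by about 1.0 in m/z are reported as singly charged and peaks spaced by about 0.5 as doubly charged. A spacing that matches no tested charge returns None rather than a forced guess.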

Results: This paper describes an original algorithm for peak list extraction from low and high resolution mass spectra. It has been developed principally to improve the precision of peak extraction in comparison to other reference algorithms. It contains many innovative features, among which is a sophisticated method for managing overlapping isotopic distributions.

Conclusions: The performance of the basic version of the algorithm and of its optional functionalities has been evaluated in this paper on SELDI-TOF, MALDI-TOF and ESI-FTICR ECD mass spectra. Executable files of MassSpec, a MATLAB implementation of the peak list extraction procedure for Windows and Linux systems, can be downloaded free of charge by nonprofit institutions from the following web site: http://aimed11.unipv.it/MassSpec.


Figure 8: Extraction of two overlapping IDs. The extraction of two overlapping IDs is helped by the application of the optional functionalities that exploit the correlation among replicates.

Mentions: Every spectrum is composed of 373401 different m/z values, which are the candidate biomarkers (or features of the classification problem). The procedure proposed in this paper treated the spectra in each group of subjects as replicates and was applied to the sum spectrum, with the maximum number of charges set to 2 and RPin set to 8000. Using only the basic version (without optional functionalities), good results in terms of feature reduction were obtained: 617 different isotopic distributions (IDs), both singly and doubly charged, were extracted. Applying the full procedure, 560 IDs (singly and doubly charged) were extracted. The optional functionalities, especially the one for managing ID overlapping, are very useful given the great complexity of the high resolution MS profile of the entire low molecular weight human serum (see Figure 8). Their application was feasible thanks to the large number of spectra (subjects) composing the dataset.
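The settings quoted above can be illustrated with a small Python sketch. It is not the MassSpec implementation, and it rests on two assumptions. First, all replicate spectra are assumed to share the same m/z axis, so the sum spectrum is a point-by-point sum of intensities. Second, RPin is assumed to be a resolving power in the usual sense of m/Δm at full width at half maximum (FWHM); the text does not define it here. All function names are hypothetical.

```python
# Approximate mass difference between 13C and 12C, in daltons.
ISOTOPE_SPACING_DA = 1.00335


def sum_spectrum(spectra):
    """Sum intensities point by point across replicate spectra.

    Assumes every spectrum is a list of intensities sampled on the
    same m/z axis.
    """
    length = len(spectra[0])
    if any(len(s) != length for s in spectra):
        raise ValueError("all spectra must share the same m/z axis")
    return [sum(values) for values in zip(*spectra)]


def fwhm_at(mz, resolving_power):
    """Expected peak width (FWHM) at a given m/z, assuming RP = m / FWHM."""
    return mz / resolving_power


def isotopes_resolved(mz, charge, resolving_power):
    """True if the isotopic spacing for this charge exceeds the peak FWHM."""
    return ISOTOPE_SPACING_DA / charge > fwhm_at(mz, resolving_power)
```

Under these assumptions, a resolving power of 8000 gives a peak width of 0.125 at m/z 1000. That is narrower than the roughly 0.5 spacing between the isotopic peaks of a doubly charged ion, so such peaks would be separable at that m/z.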

