Characterising phase variations in MALDI-TOF data and correcting them by peak alignment.

Lin SM, Haney RP, Campa MJ, Fitzgerald MC, Patz EF - Cancer Inform (2005)

Bottom Line: With the help of principal component analysis, we demonstrated that after peak alignment, the differences among replicates are reduced. We compared this approach to peak alignment with a model-based calibration approach in which there was known information about peaks in common among all spectra. Finally, we examined the potential value at each point in an analysis pipeline of having a set of methods available that includes parametric, semiparametric and nonparametric methods; among such methods are those that benefit from the use of prior information.


Affiliation: Robert H. Lurie Comprehensive Cancer Center, Northwestern University, Chicago, IL, USA. S-Lin2@northwestern.edu

ABSTRACT
The use of MALDI-TOF mass spectrometry as a means of analyzing the proteome has been evaluated extensively in recent years. One of the limitations of this technique that has impeded the development of robust data analysis algorithms is the variability in the location of protein ion signals along the x-axis. We studied technical variations of MALDI-TOF measurements in the context of proteomics profiling. By acquiring a benchmark data set with five replicates, we estimated that 76% to 85% of the total variance is due to phase variation. We devised a lobster plot, so named because of its resemblance to a lobster claw, to help detect the phase variation in replicates. We also investigated a peak alignment algorithm to remove the phase variation. This operation is analogous to the normalization step in microarray data analysis. Only after this critical step can features of biological interest be clearly revealed. With the help of principal component analysis, we demonstrated that after peak alignment, the differences among replicates are reduced. We compared this approach to peak alignment with a model-based calibration approach in which there was known information about peaks in common among all spectra. Finally, we examined the potential value at each point in an analysis pipeline of having a set of methods available that includes parametric, semiparametric and nonparametric methods; among such methods are those that benefit from the use of prior information.



Figure 3: Spectra from two different biological samples, A and B. (a) Replicates A1 to A5 of sample A. (b) Replicates B1 to B5 of sample B. (c) PCA plot of all spectra before (open symbols) and after (filled symbols) peak alignment. Note that one of the open triangles is hidden behind the filled triangles.

Mentions: The raw intensities were square-root transformed, then baseline-corrected by subtracting the 25th percentile of the intensity values, and finally rescaled to the (0, 1) range. Plots of the spectra after preprocessing can be found in Figures 3a and 3b. TOF data are usually collected by binning time into short intervals and then counting the intensities during each interval. To plot a spectrum, we simply plot the intensity against the bin number instead of the more common m/z value. The m/z value is related to the bin number by a monotonic function.
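
The preprocessing steps described above can be sketched in Python with NumPy. This is an illustrative reading of the description, not the authors' code: the function name `preprocess_spectrum` is hypothetical, the 25th percentile is assumed to be taken over the square-root-transformed values of the whole spectrum, and clipping negative values to zero and dividing by the maximum are assumptions about how the rescaling to the (0, 1) range might be done.

```python
import numpy as np


def preprocess_spectrum(raw_intensities):
    """Square-root transform, baseline-correct, and rescale one spectrum.

    Hypothetical helper; the details marked ASSUMPTION are not stated
    in the source text.
    """
    x = np.asarray(raw_intensities, dtype=float)

    # Step 1: square-root transform of the raw intensities (counts >= 0).
    x = np.sqrt(x)

    # Step 2: baseline correction by subtracting the 25th percentile.
    # ASSUMPTION: the percentile is computed over the whole transformed spectrum.
    x = x - np.percentile(x, 25)

    # ASSUMPTION: values that fall below the baseline are clipped to zero.
    x = np.clip(x, 0.0, None)

    # Step 3: rescale to the unit range.
    # ASSUMPTION: rescaling divides by the maximum; a flat spectrum maps to zeros.
    peak = x.max()
    if peak == 0.0:
        return np.zeros_like(x)
    return x / peak


# Usage: intensities are plotted against the bin number, not m/z.
raw = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 100.0])
processed = preprocess_spectrum(raw)
bin_numbers = np.arange(processed.size)
```

Because the m/z value is a monotonic function of the bin number, plotting against `bin_numbers` preserves the order of the peaks; only the spacing along the x-axis differs from an m/z plot.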

