Understanding the characteristics of mass spectrometry data through the use of simulation.

Coombes KR, Koomen JM, Baggerly KA, Morris JS, Kobayashi R - Cancer Inform (2005)

Bottom Line: We found that the relative mass error is affected by the time discretization of the detector (about 0.01%) and the spread of initial velocities (about 0.1%). Natural isotope distributions play a major role in broadening peaks associated with individual proteins. The area of a peak is a more accurate measure of its size than the height.


Affiliation: Departments of Biostatistics and Applied Mathematics, University of Texas M.D. Anderson Cancer Center, Houston TX 77030 USA. kcoombes@mdanderson.org

ABSTRACT

Background: Mass spectrometry is actively being used to discover disease-related proteomic patterns in complex mixtures of proteins derived from tissue samples or from easily obtained biological fluids. The potential importance of these clinical applications has made the development of better methods for processing and analyzing the data an active area of research. It is, however, difficult to determine which methods are better without knowing the true biochemical composition of the samples used in the experiments.

Methods: We developed a mathematical model based on the physics of a simple MALDI-TOF mass spectrometer with time-lag focusing. Using this model, we implemented a statistical simulation of mass spectra. We used the simulation to explore some of the basic operating characteristics of MALDI or SELDI instruments.

Results: The simulation reproduced several characteristics of actual instruments. We found that the relative mass error is affected by the time discretization of the detector (about 0.01%) and the spread of initial velocities (about 0.1%). The accuracy of calibration based on external standards decays rapidly outside the range spanned by the calibrants. Natural isotope distributions play a major role in broadening peaks associated with individual proteins. The area of a peak is a more accurate measure of its size than the height.

Conclusions: The model described here is capable of simulating realistic mass spectra. The simulation should become a useful tool for generating spectra where the true inputs are known, allowing researchers to evaluate the performance of new methods for processing and analyzing mass spectra.

Availability: http://bioinformatics.mdanderson.org/cromwell.html.


Figure 5: The effect of the isotope distribution on the size and shape of peaks. Peaks on a low resolution instrument are expected to be lower and broader after accounting for isotopes.

Mentions: Figure 5 illustrates how accounting for the isotope distribution of a peak at 2000 Daltons lowers and broadens the peak shape. This effect becomes more pronounced at higher masses because a larger molecule has more chances to incorporate different isotopes. We can estimate the magnitude of the effect using the same simplifications we have incorporated in our model. The number of heavier isotopes in a protein of mass m is approximated by a binomial distribution, Binom(m/15, 0.0111), so the expected number of heavier isotopes is $0.0111\,m/15 \approx 0.00074\,m$. There is still notable skew in the distribution in the mid-mass range. When m is large, however, the distribution is approximately normal with standard deviation $\sqrt{(m/15)(0.0111)(0.9889)} \approx 0.027\sqrt{m}$. To illustrate this result, when m = 20000 Daltons, we expect to see an average of about 15 heavier isotopes per molecule, with 99% of molecules containing between 3 and 27 heavier isotopes. This effect spreads the peak over a range of at least 24 Daltons, or about 0.12% of the nominal mass (24/20000). The offset of the center of the peak can also affect the calibration and the interpretation of the results.
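
The arithmetic in this paragraph can be checked with a short script. The following is a minimal sketch, not the authors' simulation code: it assumes only the Binom(m/15, 0.0111) approximation stated above, and the function names (`isotope_stats`, `binom_pmf`, `coverage`) are hypothetical helpers introduced for illustration. Rounding m/15 to an integer number of sites for the exact binomial sum is also an assumption made here.

```python
import math

P_HEAVY = 0.0111        # per-site probability of a heavier isotope (value from the text)
DALTONS_PER_SITE = 15   # the text uses m/15 as the number of binomial trials


def isotope_stats(mass):
    """Mean and standard deviation of the heavy-isotope count
    under the Binom(mass/15, 0.0111) approximation."""
    n = mass / DALTONS_PER_SITE
    mean = n * P_HEAVY
    sd = math.sqrt(n * P_HEAVY * (1.0 - P_HEAVY))
    return mean, sd


def binom_pmf(k, n, p):
    """Binomial probability mass, computed in log space so that
    large binomial coefficients do not overflow a float."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def coverage(mass, lo, hi):
    """Probability that the heavy-isotope count lies in [lo, hi].
    Assumption: the number of sites is mass/15 rounded to an integer."""
    n = round(mass / DALTONS_PER_SITE)
    return sum(binom_pmf(k, n, P_HEAVY) for k in range(lo, hi + 1))


if __name__ == "__main__":
    m = 20000
    mean, sd = isotope_stats(m)
    print(f"mean heavy isotopes: {mean:.2f}")
    print(f"standard deviation:  {sd:.2f} (sd / sqrt(m) = {sd / math.sqrt(m):.4f})")
    print(f"P(3 <= K <= 27):     {coverage(m, 3, 27):.4f}")
    print(f"24 Da spread as a percentage of m: {24 / m * 100:.2f}%")
```

Under these assumptions the interval from 3 to 27 is roughly the mean plus or minus three standard deviations, so it should cover somewhat more than 99% of molecules; the exact binomial sum in `coverage` lets a reader confirm this without relying on the normal approximation.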

