Modeling peptide fragmentation with dynamic Bayesian networks for peptide identification.

Klammer AA, Reynolds SM, Bilmes JA, MacCoss MJ, Noble WS - Bioinformatics (2008)

Bottom Line: We train a set of DBNs on high-confidence peptide-spectrum matches. Using Riptide in this way yields improved discrimination when compared to other state-of-the-art MS/MS identification algorithms, increasing the number of positive identifications by as much as 12% at a 1% false discovery rate. Python and C source code are available upon request from the authors.


Affiliation: Department of Genome Sciences, University of Washington, Seattle, WA, USA.

ABSTRACT

Motivation: Tandem mass spectrometry (MS/MS) is an indispensable technology for identification of proteins from complex mixtures. Proteins are digested to peptides that are then identified by their fragmentation patterns in the mass spectrometer. Thus, at its core, MS/MS protein identification relies on the relative predictability of peptide fragmentation. Unfortunately, peptide fragmentation is complex and not fully understood, and what is understood is not always exploited by peptide identification algorithms.

Results: We use a hybrid dynamic Bayesian network (DBN)/support vector machine (SVM) approach to address these two problems. We train a set of DBNs on high-confidence peptide-spectrum matches. These DBNs, known collectively as Riptide, comprise a probabilistic model of peptide fragmentation chemistry. Examination of the distributions learned by Riptide allows identification of new trends, such as prevalent a-ion fragmentation at peptide cleavage sites C-term to hydrophobic residues. In addition, Riptide can be used to produce likelihood scores that indicate whether a given peptide-spectrum match is correct. A vector of such scores is evaluated by an SVM, which produces a final score to be used in peptide identification. Using Riptide in this way yields improved discrimination when compared to other state-of-the-art MS/MS identification algorithms, increasing the number of positive identifications by as much as 12% at a 1% false discovery rate.

Availability: Python and C source code are available upon request from the authors. The curated training sets are available at http://noble.gs.washington.edu/proj/intense/. The Graphical Model Tool Kit (GMTK) is freely available at http://ssli.ee.washington.edu/bilmes/gmtk.



Figure 6: Learned parameters of the Riptide model. (A) Mean peak intensities for different residues and ion types, learned using the Riptide single-ion models. Each cell shows the mean normalized intensity value for a particular ion series and flanking residue. For the left heat map, the residues designated are those to the left of the amide bond fragmented to produce ions of that type (i.e. the amide bond is itself C-term to the residue), while for the right heat map, the residues designated are those to the right of the fragmented amide bond (i.e. the amide bond is itself N-term to the residue). The top image was created using matrix2png (Pavlidis and Noble, 2003). (B) 2D Gaussian distributions of peak intensities for pairs of ions, learned using the Riptide paired-ion models. Each plot shows the joint distribution of ion intensities resulting from the same amide bond cleavage; thus, for example, points on the b/y plot correspond to b_i/y_(n−i) pairs, for a peptide of length n. The color bar scale indicates natural log probability.

Mentions: An additional benefit of using DBNs in the Riptide model is that the probability distributions learned by the networks can be readily interpreted to produce scientific insights. We examine the probability distributions governing ion fragment intensities learned by the single-ion and paired-ion types of Riptide models in Figure 6.
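The pairing described in the Figure 6 caption can be illustrated with a short Python sketch. It is not the authors' code: the function names, the example intensities and the Gaussian parameters below are all hypothetical, chosen only to show how b_i/y_(n−i) pairs arise from a single amide bond cleavage and how a natural-log density under a 2D Gaussian could be evaluated for one such pair.

```python
import math


def paired_ion_indices(n):
    """Return (i, n - i) index pairs for a peptide of length n.

    Cleaving the amide bond after residue i yields the b_i ion and its
    complementary y_(n-i) ion, so there are n - 1 pairs.
    """
    return [(i, n - i) for i in range(1, n)]


def log_gaussian_2d(x, y, mean, cov):
    """Natural-log density of a bivariate Gaussian at the point (x, y).

    mean is (mx, my); cov is ((sxx, sxy), (syx, syy)).
    """
    mx, my = mean
    (sxx, sxy), (syx, syy) = cov
    det = sxx * syy - sxy * syx
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x - mx, y - my
    # Quadratic form d^T * inverse(cov) * d, using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - (sxy + syx) * dx * dy + sxx * dy * dy) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


if __name__ == "__main__":
    # Hypothetical normalized intensities for a peptide of length 5,
    # keyed by ion index (b_1..b_4 and y_1..y_4).
    b_intensity = {1: 0.1, 2: 0.6, 3: 0.4, 4: 0.2}
    y_intensity = {1: 0.3, 2: 0.5, 3: 0.7, 4: 0.1}
    # Hypothetical paired-ion parameters; a trained model would supply these.
    mean = (0.3, 0.4)
    cov = ((0.05, 0.01), (0.01, 0.06))

    total = 0.0
    for i, j in paired_ion_indices(5):
        total += log_gaussian_2d(b_intensity[i], y_intensity[j], mean, cov)
    print("summed log density over b/y pairs:", total)
```

Summing the per-pair log densities, as the usage section does, is only one simple way to turn such a distribution into a score. The source says Riptide produces likelihood scores that are passed as a vector to an SVM, but it does not spell out how the paired-ion densities are combined.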

