Artificial neural networks for the prediction of peptide drift time in ion mobility mass spectrometry.

Wang B, Valentine S, Plasencia M, Raghuraman S, Zhang X - BMC Bioinformatics (2010)

Bottom Line: For model training and testing, a 10-fold cross-validation strategy was applied to three datasets, each containing peptide ions of a different charge state. The results demonstrate the effectiveness and efficiency of the prediction model. Combined with current database search approaches, this work can enhance the confidence of protein identification.


Affiliation: Department of Electronics and Information Engineering, Anhui University of Technology, Ma'anshan, 243002, China. wangbing@ustc.edu

ABSTRACT

Background: Ion mobility-mass spectrometry (IMMS) is increasingly used in proteomics. IMMS combines the features of ion mobility spectrometry (IMS) and mass spectrometry (MS), and it separates and detects peptide ions on a millisecond time-scale. IMS separates peptide ions based on drift time, which is determined by the collision cross-section of each peptide ion under given experimental conditions. A peptide ion's collision cross-section is related to the ion's size and shape, which result from the peptide's amino acid sequence and its modifications. This inherent relation between the drift time of a peptide ion and its sequence indicates that drift time can be used to infer peptide sequence and, therefore, for peptide identification.

Results: This paper describes an artificial neural network (ANN) regression model for the prediction of peptide ion drift time in IMMS. Each peptide was represented using three descriptors: molecular weight, sequence length and a two-dimensional sequence index. An ANN predictor consisting of four input nodes, three hidden nodes and one output node was constructed for peptide ion drift time prediction. For model training and testing, a 10-fold cross-validation strategy was applied to three datasets, each containing peptide ions of a different charge state. Dataset one contains 212 singly-charged peptide ions, dataset two contains 306 doubly-charged peptide ions, and dataset three contains 77 triply-charged peptide ions. The proposed method achieved 94.4%, 93.6% and 74.2% prediction accuracy for singly-, doubly- and triply-charged peptide ions, respectively.
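The network shape and validation scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives only the layer sizes (four inputs, three hidden nodes, one output) and the 10-fold cross-validation. The tanh activation, full-batch gradient descent, per-fold standardisation, all function names and the synthetic data are assumptions made here for the example, and the paper's "prediction accuracy" measure is not reproduced because the abstract does not define it.

```python
import numpy as np

def kfold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and split them into n_folds nearly equal test folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)

def init_params(rng, n_in=4, n_hidden=3):
    """Small random weights for a 4-3-1 network (layer sizes from the abstract)."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    """Return hidden activations and one predicted value per row of X."""
    H = np.tanh(X @ params["W1"] + params["b1"])       # assumed activation
    y_hat = (H @ params["W2"] + params["b2"]).ravel()  # linear output for regression
    return H, y_hat

def train(X, y, epochs=2000, lr=0.05, seed=0):
    """Fit the network by full-batch gradient descent on half the mean squared error."""
    rng = np.random.default_rng(seed)
    p = init_params(rng, n_in=X.shape[1])
    n = len(y)
    for _ in range(epochs):
        H, y_hat = forward(p, X)
        err = (y_hat - y)[:, None]                 # shape (n, 1)
        dH = (err @ p["W2"].T) * (1.0 - H ** 2)    # backpropagate through tanh
        p["W2"] -= lr * (H.T @ err) / n
        p["b2"] -= lr * err.mean(axis=0)
        p["W1"] -= lr * (X.T @ dH) / n
        p["b1"] -= lr * dH.mean(axis=0)
    return p

def cross_validate(X, y, n_folds=10, epochs=2000, seed=0):
    """Return one held-out prediction per sample using k-fold cross-validation."""
    preds = np.empty(len(y))
    for test_idx in kfold_indices(len(y), n_folds, seed):
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Standardise with training-fold statistics only, to avoid leakage.
        mu, sd = X[train_idx].mean(axis=0), X[train_idx].std(axis=0)
        y_mu, y_sd = y[train_idx].mean(), y[train_idx].std()
        params = train((X[train_idx] - mu) / sd,
                       (y[train_idx] - y_mu) / y_sd,
                       epochs=epochs, seed=seed)
        _, z = forward(params, (X[test_idx] - mu) / sd)
        preds[test_idx] = z * y_sd + y_mu          # back to original units
    return preds

if __name__ == "__main__":
    # Synthetic stand-in data: 212 rows of 4 descriptor values (not real peptides).
    rng = np.random.default_rng(42)
    X = rng.normal(size=(212, 4))
    y = X @ np.array([1.0, -0.5, 0.3, 0.2]) + 0.05 * rng.normal(size=212)
    held_out = cross_validate(X, y)
    print("held-out RMSE:", np.sqrt(np.mean((held_out - y) ** 2)))
```

Each sample is predicted exactly once by a model that never saw it during training, which is the property that makes the cross-validated figures an estimate of performance on unseen peptides.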

Conclusions: An ANN-based method has been developed for predicting the drift time of peptide ions in IMMS. The results demonstrate the effectiveness and efficiency of the prediction model. Combined with current database search approaches, this work can enhance the confidence of protein identification.


Figure 1: Box plots of peptide molecular weight (A), sequence length (B) and drift time (C) in the three datasets. The central mark is the median, the edges of the box are the 25th and 75th percentiles (Q1 and Q3), and the whiskers extend to the most extreme data points that are not outliers. Crosses mark outliers, i.e., points larger than Q3 + 1.5*(Q3 - Q1) or smaller than Q1 - 1.5*(Q3 - Q1).
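The outlier rule in the caption can be written out as a short sketch. The function names and the sample values below are hypothetical, and NumPy's default (linear interpolation) percentile definition is assumed; other plotting tools may compute Q1 and Q3 slightly differently, so fences for small samples can differ from those in the published figure.

```python
import numpy as np

def tukey_fences(values, k=1.5):
    """Return (lower, upper) = (Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1))."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def find_outliers(values, k=1.5):
    """Return the values that fall strictly outside the fences."""
    values = np.asarray(values, dtype=float)
    lower, upper = tukey_fences(values, k)
    return values[(values < lower) | (values > upper)]

if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    print(tukey_fences(sample))   # Q1 = 3.25, Q3 = 7.75, so fences are -3.5 and 14.5
    print(find_outliers(sample))  # only 100 lies outside the fences
```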

Mentions: Figure 1 shows the distribution of peptide molecular weight, sequence length and drift time in each of the three datasets, i.e., C1, C2 and C3. The peptide molecular weights cover a relatively wide range, from 374.22 Da to 3503.71 Da. The average molecular weight of the singly-charged peptides (dataset C1) is 900.14 Da, while the average molecular weights of the doubly-charged (dataset C2) and triply-charged peptide ions (dataset C3) are 1470.39 Da and 2046.30 Da, respectively (Figure 1A). The number of amino acid residues in the 595 peptides ranges from 3 to 34. The average numbers of amino acid residues in datasets C1, C2 and C3 are 7.9, 13.2 and 18.3, respectively (Figure 1B). The distributions of peptide molecular weight and sequence length in each dataset indicate that large peptides, i.e., peptides with high molecular weight and long amino acid sequences, tend to have high charge states. The peptide ion drift time is also significantly related to the overall ion charge state. The mean drift time of the singly-charged peptide ions is 7.48 s, while the mean drift times of the doubly-charged and triply-charged peptide ions are 3.07 s and 2.28 s, respectively (Figure 1C).

