An efficient algorithmic approach for mass spectrometry-based disulfide connectivity determination using multi-ion analysis.

Murad W, Singh R, Yen TY - BMC Bioinformatics (2011)

Bottom Line: Each bond is associated with a confidence score, which aids in interpretation and assimilation of the results. The method was also compared with other state-of-the-art techniques. It was found to perform as well as or better than the competing techniques.


Affiliation: Department of Computer Science, San Francisco State University, 1600 Holloway Avenue, San Francisco, CA 94132, USA. whemurad@sfsu.edu

ABSTRACT

Background: Determining the disulfide (S-S) bond pattern in a protein is often crucial for understanding its structure and function. In recent research, mass spectrometry (MS) based analysis has been applied to this problem following protein digestion under both partial reduction and non-reduction conditions. However, this paradigm still awaits solutions to certain algorithmic problems, fundamental among which is the efficient matching of an exponentially growing set of putative S-S bonded structural alternatives to the large amounts of experimental spectrometric data. Current methods circumvent this challenge primarily through simplifications, such as assuming that only certain ion types (b-ions and y-ions) occur, namely those that predominate in the more popular dissociation methods, such as collision-induced dissociation (CID). Unfortunately, this can adversely impact the quality of results.

Method: We present an algorithmic approach to this problem that can, with high computational efficiency, analyze multiple ion types (a, b, bo, b*, c, x, y, yo, y*, and z) and deal with complex bonding topologies, such as inter/intra bonding involving more than two peptides. The proposed approach combines an approximation algorithm-based search formulation with data-driven parameter estimation. This formulation considers only those regions of the search space where the correct solution resides with a high likelihood. Putative disulfide bonds thus obtained are finally combined in a globally consistent pattern to yield the overall disulfide bonding topology of the molecule. Additionally, each bond is associated with a confidence score, which aids in interpretation and assimilation of the results.
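
To make the final step concrete, the following is a minimal sketch of what "combining putative bonds into a globally consistent pattern" can mean: each cysteine may take part in at most one bond, and the set of compatible bonds with the highest total confidence is selected. This is not the authors' implementation. The function name, the cysteine positions, and the scores are all hypothetical, and the exhaustive search used here is exponential in the number of candidates, so it only illustrates the constraint (a polynomial-time maximum-weight matching routine would be the natural choice for real data).

```python
def best_consistent_pattern(candidates):
    """Pick the highest-scoring set of bonds in which no cysteine is reused.

    candidates: dict mapping (cys_i, cys_j) -> confidence score.
    Returns (total_score, tuple_of_bonds).
    Illustrative exhaustive search only; names and data are assumptions.
    """
    bonds = sorted(candidates)
    best = (0.0, ())

    def extend(start, used, chosen, score):
        nonlocal best
        if score > best[0]:
            best = (score, tuple(chosen))
        for k in range(start, len(bonds)):
            i, j = bonds[k]
            if i in used or j in used:
                continue  # a cysteine can form only one disulfide bond
            extend(k + 1, used | {i, j}, chosen + [bonds[k]],
                   score + candidates[bonds[k]])

    extend(0, frozenset(), [], 0.0)
    return best


# Hypothetical candidate bonds (cysteine positions) with confidence scores.
candidates = {(10, 45): 0.9, (45, 80): 0.6, (80, 120): 0.8, (10, 120): 0.3}
total, pattern = best_consistent_pattern(candidates)
print(pattern, round(total, 2))
```

In this toy input, the bonds (10, 45) and (45, 80) compete for cysteine 45, so only one of them can appear in the final pattern.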

Results: The method was tested on nine different eukaryotic glycosyltransferases possessing disulfide bonding topologies of varying complexity. Its performance was found to be characterized by high efficiency (in terms of time and the fraction of search space considered), sensitivity, specificity, and accuracy. The method was also compared with other state-of-the-art techniques. It was found to perform as well as or better than the competing techniques. An implementation is available at: http://tintin.sfsu.edu/~whemurad/disulfidebond.

Conclusions: This research addresses some of the significant challenges in MS-based disulfide bond determination. To the best of our knowledge, this is the first algorithmic work that can consider multiple ion types in this problem setting while simultaneously ensuring polynomial time complexity and high accuracy of results.


Figure 2: Multiple-ion spectra analysis. This figure illustrates the presence of multiple ion types (in green) after CID. In the first spectrum, note the presence of bo and yo ions with high intensity in the fragmentation of the precursor ion with sequence FFLQGIQLNTILPDAR, for the protein Lysozyme [Swiss-Prot: P11279]. In the second spectrum, a, bo, b*, and yo ions (all with high intensity) can be observed after the fragmentation of a precursor ion from the protein Platelet glycoprotein 4 [Swiss-Prot: P16671].
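
As a rough illustration of how the ion types named in the caption relate to one another, the sketch below computes singly charged monoisotopic m/z values for one backbone cleavage of the precursor FFLQGIQLNTILPDAR. It uses the conventional textbook mass offsets relative to b- and y-ions. The function name is hypothetical and the mass conventions used by the authors are not stated in this excerpt, so these should be read as assumptions. Disulfide-linked fragments, which would carry the mass of a bonded partner peptide, are not handled.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, H2O, NH3 = 1.007276, 18.010565, 17.026549
CO, H2 = 27.994915, 2.015650


def ion_ladder(peptide, cut):
    """Singly charged m/z of each ion type for a cleavage after residue `cut`.

    `cut` is 1-based and must satisfy 0 < cut < len(peptide).
    Offsets follow common textbook conventions (an assumption here).
    """
    if not 0 < cut < len(peptide):
        raise ValueError("cut must fall strictly inside the peptide")
    n_term = sum(MONO[r] for r in peptide[:cut])
    c_term = sum(MONO[r] for r in peptide[cut:])
    b = n_term + PROTON          # N-terminal fragment
    y = c_term + H2O + PROTON    # C-terminal fragment
    return {
        "a": b - CO, "b": b, "bo": b - H2O, "b*": b - NH3, "c": b + NH3,
        "x": y + CO - H2, "y": y, "yo": y - H2O, "y*": y - NH3,
        # Under this simple convention z (y - NH3) is isobaric with y*.
        "z": y - NH3,
    }


for name, mz in ion_ladder("FFLQGIQLNTILPDAR", 8).items():
    print(f"{name:>2}: {mz:.4f}")
```

Each additional ion type adds one more theoretical peak per cleavage site that has to be compared with the experimental spectrum, which is why restricting the analysis to b- and y-ions is a tempting simplification.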

Mentions: MS-based methods generally outperform methods using sequence-based learning formulations, as shown by Lee and Singh [3]. However, a number of algorithmic challenges remain outstanding in realizing the potential of MS-based approaches. Salient among these are:

(1) Accounting for multiple ion types in the data [4,5]: To avoid an exponential increase in the search space, a common simplification is to limit the analysis to the spectra of b-ions and y-ions only [3,6,7]. However, this simplification may erroneously ignore the occurrence of other ions, such as a, bo, b*, c, x, yo, y*, and z. While the occurrence of non-b/y ions is minimized (though not eliminated) in collision-induced dissociation (CID), some of these ions can be present with greater likelihood in dissociation methods such as electron capture dissociation (ECD), electron transfer dissociation (ETD), and electron-detachment dissociation (EDD). In fact, these ion types should be considered even in CID, as illustrated by the example in Figure 2.

(2) Design of efficient search and matching algorithms: The search space of possible disulfide topologies increases rapidly not only with the number of ion types being analyzed but also with the number of cysteines and the types of connectivity patterns. Thus, it is imperative to have algorithms that can accommodate the richness of the entire problem domain.

(3) Automated data-driven determination of parameters: Many advanced algorithms in this area are intrinsically parametric. Determining the optimal values of these parameters automatically is often itself a complex problem, which places the practitioner at a significant disadvantage. Support for automated and data-driven estimation of key parameters is therefore crucial to the real-world success of a method in this problem domain.
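
The growth described in challenge (2) can be made tangible with a small counting sketch. Assuming the simplest setting, in which each cysteine is either free or paired with exactly one other cysteine, the number of possible connectivity patterns among n cysteines is the sum over k of C(n, 2k) · (2k − 1)!!. The function names below are hypothetical, and the count covers connectivity only; it does not include the additional factor contributed by matching every candidate against multiple ion types.

```python
from math import comb


def pairings(k):
    """Number of ways to pair up 2k cysteines completely: (2k - 1)!!."""
    result = 1
    for m in range(1, 2 * k, 2):
        result *= m
    return result


def count_patterns(n_cys):
    """Number of bonding patterns among n_cys cysteines.

    Any number of disulfide bonds (including none) is allowed, and each
    cysteine takes part in at most one bond. Simplified model (assumption).
    """
    return sum(comb(n_cys, 2 * k) * pairings(k)
               for k in range(n_cys // 2 + 1))


for n in (2, 4, 6, 8, 10, 20):
    print(n, count_patterns(n))
```

Even under this simplified model the count rises from 10 patterns for four cysteines to 9496 for ten, which is why pruning the search to high-likelihood regions matters.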

