Generating quantitative models describing the sequence specificity of biological processes with the stabilized matrix method.

Peters B, Sette A - BMC Bioinformatics (2005)

Bottom Line: First, they can provide a summary of experimental results, allowing for a deeper understanding of the mechanisms involved in sequence recognition. This method has been successfully applied to predicting peptide binding to MHC molecules, peptide transport by the transporter associated with antigen presentation (TAP) and proteasomal cleavage of protein sequences. Making the SMM method publicly available enables bioinformaticians and experimental biologists to easily access it, to compare its performance to other prediction methods, and to extend it to other applications.


Affiliation: La Jolla Institute for Allergy and Immunology, 3030 Bunker Hill Street, Suite 326, San Diego, CA 92109, USA. bjoern_peters@gmx.net

ABSTRACT

Background: Many processes in molecular biology involve the recognition of short sequences of nucleic or amino acids, such as the binding of immunogenic peptides to major histocompatibility complex (MHC) molecules. From experimental data, a model of the sequence specificity of these processes can be constructed, such as a sequence motif, a scoring matrix or an artificial neural network. The purpose of these models is two-fold. First, they can provide a summary of experimental results, allowing for a deeper understanding of the mechanisms involved in sequence recognition. Second, such models can be used to predict the experimental outcome for yet untested sequences. In the past we reported the development of a method to generate such models called the Stabilized Matrix Method (SMM). This method has been successfully applied to predicting peptide binding to MHC molecules, peptide transport by the transporter associated with antigen presentation (TAP) and proteasomal cleavage of protein sequences.

Results: Herein we report the implementation of the SMM algorithm as a publicly available software package. Specific features determining the type of problems the method is most appropriate for are discussed. Advantageous features of the package are: (1) the output generated is easy to interpret, (2) input and output are both quantitative, (3) specific computational strategies to handle experimental noise are built in, (4) the algorithm is designed to effectively handle bounded experimental data, (5) experimental data from randomized peptide libraries and conventional peptides can easily be combined, and (6) it is possible to incorporate pair interactions between positions of a sequence.

Conclusion: Making the SMM method publicly available enables bioinformaticians and experimental biologists to easily access it, to compare its performance to other prediction methods, and to extend it to other applications.


Figure 4: Iterative model fitting using the L2<> norm. In this example, the model is a linear function which is fitted to a set of paired values (x, ymeas). For two of the x values (x = 3 and x = 5), the measured values are thresholds (greater than 3). Fitting a linear function to paired values according to the L2 norm corresponds to standard linear regression. Panel A depicts the model fit (straight line) to the measured values (black boxes), ignoring any thresholds. For x = 5, the model value ypred taken from the regression line is 3.4, above the measured threshold value 3. Therefore, in the next iteration the ymeas* value is set to the model value 3.4. Panel B shows the new linear regression with the adjusted ymeas* values. This procedure is repeated until the ymeas* values no longer change (8 iterations, panel C).

Mentions: The SMM method is, to the best of our knowledge, the only method designed to extract information from such boundary values. This is done by means of the novel L2<> norm, illustrated in Table 1. For example, if an experimental measurement corresponds to an upper boundary ymeas > z, and the predicted value is greater than z, then the distance between ymeas and ypred is zero. This norm has the useful property that any analytical solution according to the L2 norm can be converted into a solution according to the L2<> norm through an iterative process: First, all measurements including boundary values are treated as normal values, and the solution using the L2 norm is found. In a second step, for each ypred, ymeas value pair for which the L2<> norm would be zero, the ymeas value is set to its corresponding ypred value. For these ymeas* values, the distance L2(ypred, ymeas*) = L2<> (ypred, ymeas). These ymeas* values are then used to solve again according to the L2 norm. This process is repeated until the ymeas* values no longer change, as illustrated in Figure 4.
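The iterative procedure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SMM package's implementation. It assumes a straight-line model fitted with NumPy, it handles only "greater than" thresholds (the "less than" case would be symmetric), and it assumes that a threshold that is not satisfied contributes the ordinary squared distance to the threshold value. The names l2_bounded_distance, fit_l2_bounded and is_lower_bound, the tolerance and the sample data are all hypothetical and not taken from the paper.

```python
import numpy as np


def l2_bounded_distance(y_pred, y_meas, is_lower_bound):
    """Squared distance in the spirit of the L2<> norm (assumed form).

    For a threshold measurement "true value > y_meas", the distance is zero
    whenever the prediction already lies above the threshold; otherwise the
    ordinary squared difference is used.
    """
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_meas, dtype=float)
    satisfied = np.asarray(is_lower_bound, dtype=bool) & (diff > 0)
    return np.where(satisfied, 0.0, diff) ** 2


def fit_l2_bounded(x, y_meas, is_lower_bound, max_iter=1000, tol=1e-8):
    """Fit y = slope * x + intercept by repeated ordinary least squares.

    After each L2 fit, every satisfied threshold value is replaced by its
    predicted value (the ymeas* of the text) and the fit is repeated until
    the ymeas* values stop changing.
    """
    x = np.asarray(x, dtype=float)
    y_meas = np.asarray(y_meas, dtype=float)
    mask = np.asarray(is_lower_bound, dtype=bool)
    design = np.column_stack([x, np.ones_like(x)])

    y_star = y_meas.copy()  # first pass: thresholds treated as normal values
    for iteration in range(1, max_iter + 1):
        coef, *_ = np.linalg.lstsq(design, y_star, rcond=None)
        y_pred = design @ coef

        # Start again from the raw measurements, then lift satisfied thresholds.
        y_new = y_meas.copy()
        satisfied = mask & (y_pred > y_meas)
        y_new[satisfied] = y_pred[satisfied]

        if np.max(np.abs(y_new - y_star)) < tol:
            break
        y_star = y_new
    return coef, y_star, iteration


# Hypothetical data: four exact points on y = x and one threshold "> 2" at x = 4.
x = [0, 1, 2, 3, 4]
y = [0, 1, 2, 3, 2]
bounded = [False, False, False, False, True]

(slope, intercept), y_star, n_iter = fit_l2_bounded(x, y, bounded)
print(f"slope={slope:.4f} intercept={intercept:.4f} iterations={n_iter}")
```

In this toy data set the plain regression is pulled down by the threshold point, while the iterated fit moves toward the line through the four exact measurements, because any prediction above 2 at x = 4 is compatible with the threshold. The number of iterations depends on the chosen tolerance and on how strongly the threshold points influence the fit.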

