Corra: Computational framework and tools for LC-MS discovery and targeted mass spectrometry-based proteomics.

Brusniak MY, Bodenmiller B, Campbell D, Cooke K, Eddes J, Garbutt A, Lau H, Letarte S, Mueller LN, Sharma V, Vitek O, Zhang N, Aebersold R, Watts JD - BMC Bioinformatics (2008)

Bottom Line: However, they are generally not comparable to each other in terms of functionality, user interfaces, information input/output, and do not readily facilitate appropriate statistical data analysis. The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze and visualize LC-MS data with what would otherwise be a complex and not user-friendly suite of tools. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field.


Affiliation: Institute for Systems Biology, 1441 North 34th Street, Seattle, WA 98103, USA. mbrusnia@systemsbiology.org

ABSTRACT

Background: Quantitative proteomics holds great promise for identifying proteins that are differentially abundant between populations representing different physiological or disease states. A range of computational tools is now available for both isotopically labeled and label-free liquid chromatography mass spectrometry (LC-MS) based quantitative proteomics. However, they are generally not comparable to each other in terms of functionality, user interfaces, information input/output, and do not readily facilitate appropriate statistical data analysis. These limitations, along with the array of choices, present a daunting prospect for biologists, and other researchers not trained in bioinformatics, who wish to use LC-MS-based quantitative proteomics.

Results: We have developed Corra, a computational framework and tools for discovery-based LC-MS proteomics. Corra extends and adapts existing algorithms used for LC-MS-based proteomics, as well as statistical algorithms originally developed for microarray data analysis, so that they are appropriate for LC-MS data analysis. Corra also adapts software engineering technologies (e.g. Google Web Toolkit, distributed processing) so that computationally intensive data processing and statistical analyses can run on a remote server, while the user controls and manages the process from their own computer via a simple web interface. Corra also allows the user to output significantly differentially abundant LC-MS-detected peptide features in a form compatible with subsequent sequence identification via tandem mass spectrometry (MS/MS). We present two case studies to illustrate the application of Corra to commonly performed LC-MS-based biological workflows: a pilot biomarker discovery study of glycoproteins isolated from human plasma samples relevant to type 2 diabetes, and a study in yeast to identify in vivo targets of the protein kinase Ark1 via phosphopeptide profiling.

Conclusion: The Corra computational framework leverages computational innovation to enable biologists or other researchers to process, analyze and visualize LC-MS data with what would otherwise be a complex and not user-friendly suite of tools. Corra enables appropriate statistical analyses, with controlled false-discovery rates, ultimately to inform subsequent targeted identification of differentially abundant peptides by MS/MS. For the user not trained in bioinformatics, Corra represents a complete, customizable, free and open source computational platform enabling LC-MS-based proteomic workflows, and as such, addresses an unmet need in the LC-MS proteomics field.
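To make "controlled false-discovery rates" concrete, the following is a minimal sketch of the Benjamini-Hochberg adjustment, one widely used FDR-controlling procedure. This excerpt does not say which procedure Corra applies, so the choice of method, the function name `benjamini_hochberg`, and the p-values below are all assumptions made purely for illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Illustrative only: not taken from the Corra code base.
    """
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four hypothetical peptide features.
toy_p_values = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(toy_p_values)

# Features whose adjusted p-value is at or below 0.05 are called at 5% FDR.
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted, significant)
```

Calling a feature significant when its adjusted value is at or below a chosen threshold keeps the expected proportion of false positives among the called features at or below that threshold, under the usual assumptions of the procedure.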



Figure 11: Verification of a Corra-identified Ark1 kinase substrate peptide/protein. Following targeted MS/MS identification of the top-ranked Corra-identified discriminatory features (see Figure 10 and Table 2), ion chromatograms were extracted from all LC-MS runs for the peptide RHS*LGLNEAKK (m/z = 444.895 [M+3H]³⁺), where S* represents phosphoserine. This peptide was derived from the protein YDR293C and was confirmed as present in all 3 control sample analyses but absent in all 3 Ark1 knockout analyses, as would be expected. For all six plots, a relative abundance of 100% was manually set to 10⁷ ion counts so that all were on the same scale.
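The m/z quoted in the caption can be checked with a short calculation. The sketch below sums standard monoisotopic residue masses for RHS*LGLNEAKK, adds water for the termini and one HPO3 group for the phosphoserine, and then adds three protons and divides by the charge. The function name `peptide_mz` and the layout of the mass table are illustrative assumptions, not part of Corra.

```python
# Monoisotopic residue masses (Da), listed only for the amino acids
# that occur in this peptide.
RESIDUE_MASS = {
    "R": 156.10111, "H": 137.05891, "S": 87.03203, "L": 113.08406,
    "G": 57.02146, "N": 114.04293, "E": 129.04259, "A": 71.03711,
    "K": 128.09496,
}
WATER = 18.01056    # H2O for the free N- and C-termini
PHOSPHO = 79.96633  # HPO3 added per phosphorylated residue
PROTON = 1.007276   # mass of one proton


def peptide_mz(sequence: str, charge: int) -> float:
    """Theoretical m/z of a peptide; '*' follows a phosphorylated residue."""
    n_phospho = sequence.count("*")
    residues = sequence.replace("*", "")
    neutral_mass = (
        sum(RESIDUE_MASS[aa] for aa in residues)
        + WATER
        + n_phospho * PHOSPHO
    )
    # Add one proton per charge, then divide by the charge state.
    return (neutral_mass + charge * PROTON) / charge


mz = peptide_mz("RHS*LGLNEAKK", charge=3)
print(f"{mz:.3f}")
```

The result lands within about 0.01 of the 444.895 reported for the [M+3H]³⁺ ion, which is consistent with the singly phosphorylated sequence given in the caption.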

Mentions: From these data analyses, as with the diabetes study above, we next made an inclusion list for targeted MS/MS, to try to identify some of the phosphopeptides lost in the Ark1 knockout yeast versus the control. Table 2 lists the top 12 most discriminatory peptides that had a log odds of ≥ 2.2 and that also matched a peptide sequence by MS/MS with a PeptideProphet score of ≥ 0.7 (representing a false discovery rate of 5%). Ark1 is known to be involved in endocytosis and actin reorganization, as are 4 other proteins from Table 2 (YOL109W, YBL037W, YMR109W, and YJR083C), demonstrating that Corra successfully enabled the generation of potentially biologically relevant information. Finally, for confirmation purposes, Figure 11 shows extracted ion chromatograms, for all 6 LC-MS runs, for the identified YDR293C peptide, RHS*LGLNEAKK (where S* represents phosphoserine) at m/z = 444.895 [M+3H]³⁺, confirming its detection in all 3 replicate analyses of the control strain, and its absence in all 3 replicate analyses of the Ark1 knockout strain.
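The two selection criteria described above (log odds ≥ 2.2 and PeptideProphet score ≥ 0.7) can be expressed as a simple filter. The sketch below is illustrative only: the `Feature` record, the function names, and every value in `toy_features` are hypothetical and are not taken from Table 2. The helper `posterior_probability` additionally assumes that the log odds are natural-log posterior odds, which this excerpt does not state.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feature:
    """Hypothetical record for one LC-MS peptide feature."""
    mz: float
    charge: int
    log_odds: float
    peptide_prophet: Optional[float] = None  # None: no MS/MS match


def posterior_probability(log_odds: float) -> float:
    """Convert log odds to a probability, assuming natural-log odds."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def select_features(
    features: List[Feature],
    min_log_odds: float = 2.2,
    min_peptide_prophet: float = 0.7,
) -> List[Feature]:
    """Keep features passing both thresholds, most discriminatory first."""
    kept = [
        f for f in features
        if f.log_odds >= min_log_odds
        and f.peptide_prophet is not None
        and f.peptide_prophet >= min_peptide_prophet
    ]
    return sorted(kept, key=lambda f: f.log_odds, reverse=True)


# Made-up values, chosen only to exercise each branch of the filter.
toy_features = [
    Feature(mz=500.25, charge=2, log_odds=3.1, peptide_prophet=0.95),
    Feature(mz=612.80, charge=2, log_odds=1.4, peptide_prophet=0.99),
    Feature(mz=433.10, charge=3, log_odds=4.0, peptide_prophet=0.40),
    Feature(mz=701.33, charge=3, log_odds=2.2, peptide_prophet=0.70),
    Feature(mz=389.70, charge=2, log_odds=5.2),
]
selected = select_features(toy_features)
for f in selected:
    print(f.mz, f.charge, f.log_odds, f.peptide_prophet)
```

Under the natural-log assumption, a log odds of 2.2 corresponds to a posterior probability of roughly 0.90 that a feature is differentially abundant, which gives some intuition for why that cutoff is a fairly strict one.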

