Time-series alignment by non-negative multiple generalized canonical correlation analysis.

Fischer B, Roth V, Buhmann JM - BMC Bioinformatics (2007)

Bottom Line: The alignment function is learned in a supervised fashion. We compare our approach with previously published methods for aligning mass spectrometry data on a large proteomics dataset. The proposed method significantly increases the number of proteins that are identified as being differentially expressed in different biological samples.


Affiliation: Institute of Computational Science, ETH Zurich, Switzerland. bernd.fischer@inf.ethz.ch

ABSTRACT

Background: Quantitative analysis of differential protein expression requires aligning temporal elution measurements from liquid chromatography coupled to mass spectrometry (LC/MS). We propose multiple canonical correlation analysis (mCCA) as a robust method for aligning the non-linearly distorted time scales of repeated LC/MS experiments.

Results: Multiple canonical correlation analysis is able to map several time series to a consensus time scale. The alignment function is learned in a supervised fashion. We compare our approach with previously published methods for aligning mass spectrometry data on a large proteomics dataset. The proposed method significantly increases the number of proteins that are identified as being differentially expressed in different biological samples.

Conclusion: Jointly aligning multiple liquid chromatography/mass spectrometry samples by mCCA substantially increases the detection rate of potential biomarkers, which significantly improves the interpretability of LC/MS data.


Figure 4: Ratio of the number of proteins classified as significantly over- or underexpressed, as a function of the estimated false discovery rate. The ratio compares multiple CCA with pair-wise ridge regression.

Mentions: To compare the sensitivities of the alignment methods, we compare the number of differentially abundant proteins detected by multiple CCA with the number detected by ridge regression. The ratio of these two detection counts is shown as a function of the false discovery rate in Figure 4. The gain from multiple CCA is between 2% and 22%, depending on the false discovery rate. The choice of a suitable false discovery rate depends on the proteomics application. In a biomarker discovery scenario, we are interested in a fairly small false discovery rate to reduce the amount of work for experimental validation. In high-throughput screening scenarios, bio-scientists are interested in finding potential biomarkers that are investigated further in a subsequent analysis, and they can therefore accept more false discoveries.
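The comparison described above can be illustrated with a minimal sketch. It assumes that each alignment method yields one p-value per protein and that the false discovery rate is controlled with the Benjamini-Hochberg procedure; the text here does not state how the false discovery rate was estimated, so that choice, the function names, and the p-values in the usage note are all hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: BH is used only for illustration; the source does not
    say which false discovery rate estimator was applied.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def detections_at(pvalues, fdr):
    """Number of proteins called significant at the given FDR level."""
    return sum(q <= fdr for q in bh_adjust(pvalues))


def detection_ratio(pvalues_a, pvalues_b, fdr_levels):
    """Ratio of detections (method A over method B) per FDR level."""
    ratios = {}
    for fdr in fdr_levels:
        n_a = detections_at(pvalues_a, fdr)
        n_b = detections_at(pvalues_b, fdr)
        # The ratio is undefined when method B detects nothing.
        ratios[fdr] = n_a / n_b if n_b else float("nan")
    return ratios
```

With made-up p-values, `detection_ratio([0.001, 0.002, 0.003, 0.9], [0.001, 0.002, 0.8, 0.9], [0.05])` compares three detections against two at a false discovery rate of 0.05. A ratio above 1 at a given level means the first method calls more proteins significant at that level, which is the quantity plotted in Figure 4.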

