A guide through the computational analysis of isotope-labeled mass spectrometry-based quantitative proteomics data: an application study.

Albaum SP, Hahne H, Otto A, Haußmann U, Becher D, Poetsch A, Goesmann A, Nattkemper TW - Proteome Sci (2011)

Bottom Line: This work provides guidance through the jungle of computational methods to analyze mass spectrometry-based isotope-labeled datasets and recommends an effective and easy-to-use evaluation strategy. Special focus is placed on the application and validation of cluster analysis methods. All applied methods were implemented within the rich internet application QuPE [4].


Affiliation: Computational Genomics, Center for Biotechnology (CeBiTec), Bielefeld University, Germany. alu@cebitec.uni-bielefeld.de.

ABSTRACT

Background: Mass spectrometry-based proteomics has reached a stage where it is possible to comprehensively analyze the whole proteome of a cell in one experiment. Here, the employment of stable isotopes has become a standard technique to yield relative abundance values of proteins. In recent times, more and more experiments are conducted that depict not only a static image of the up- or down-regulated proteins at a distinct time point but instead compare developmental stages of an organism or varying experimental conditions.

Results: Although the scientific questions behind these experiments are of course manifold, there are, nevertheless, two questions that commonly arise: 1) which proteins are differentially regulated regarding the selected experimental conditions, and 2) are there groups of proteins that show similar abundance ratios, indicating that they have a similar turnover? We give advice on how these two questions can be answered and comprehensively compare a variety of commonly applied computational methods and their outcomes.

Conclusions: This work provides guidance through the jungle of computational methods to analyze mass spectrometry-based isotope-labeled datasets and recommends an effective and easy-to-use evaluation strategy. We demonstrate our approach with three recently published datasets on Bacillus subtilis [1,2] and Corynebacterium glutamicum [3]. Special focus is placed on the application and validation of cluster analysis methods. All applied methods were implemented within the rich internet application QuPE [4]. Results can be found at http://qupe.cebitec.uni-bielefeld.de.




Figure 9: Davies and Bouldin. Instead of simply proposing a cluster index, Davies and Bouldin formulated a general framework for the evaluation of the outcomes of cluster algorithms. In contrast to other indexes, an optimal cluster solution is indicated by the minimal calculated index value. For instance, for the two cluster algorithms K-means and Neuralgas a local minimum can be located around the 30-cluster solution. A general interpretation of this index, however, seems to be difficult due to a strong tendency towards constantly decreasing index values with regard to large cluster numbers.

Mentions: Davies and Bouldin formulated a general framework for the evaluation of the outcomes of cluster algorithms [44]. An instance of their index provided by Halkidi et al. [28] follows the idea that an optimal solution to the clustering problem has been found as soon as, for each cluster, no other highly similar cluster can be identified, where similarity is judged by the intra-cluster error sum of squares as well as the distance between clusters. In contrast to other indexes, an optimal solution is therefore indicated by the minimal calculated index value (see Figure 9). In experiment A, for instance, a local minimum can be located around the 30-cluster solution for the two cluster algorithms K-means and Neuralgas. A general interpretation of this index, however, seems to be difficult because index values tend to decrease steadily as the number of clusters grows. The two correlation-based cluster algorithms (Average/Pearson correlation, Average/Uncentered Pearson) are an exception: at least for experiment C, their index values seem to increase constantly, which nevertheless provides no clear statement about an optimal clustering of the data.
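
To make the "lower is better" behavior of the index concrete, the following is a minimal sketch of one common formulation of the Davies-Bouldin index. It is not taken from the article or from QuPE: the function name davies_bouldin, the use of the mean Euclidean distance to the centroid as the within-cluster scatter, and the toy data are all assumptions made for illustration. The instance described by Halkidi et al. may define the scatter term differently (for example via the error sum of squares).

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index for a hard clustering (illustrative sketch).

    Assumed formulation: scatter s_i is the mean Euclidean distance of a
    cluster's members to its centroid, d_ij is the Euclidean distance
    between centroids, and the index is the mean over clusters of
    max_{j != i} (s_i + s_j) / d_ij. Lower values indicate compact,
    well-separated clusters. Coinciding centroids are not handled.
    """
    # Group the points by their cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids, scatters = [], []
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[k] for m in members) / len(members) for k in range(dim)]
        centroids.append(centroid)
        scatters.append(sum(math.dist(m, centroid) for m in members) / len(members))

    # For every cluster, find its most similar (worst-case) partner cluster.
    n = len(centroids)
    worst_ratios = [
        max(
            (scatters[i] + scatters[j]) / math.dist(centroids[i], centroids[j])
            for j in range(n)
            if j != i
        )
        for i in range(n)
    ]
    return sum(worst_ratios) / n


# Hypothetical toy data: two tight groups that lie far apart.
toy_points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = davies_bouldin(toy_points, [0, 0, 1, 1])   # groups match the labels
bad = davies_bouldin(toy_points, [0, 1, 0, 1])    # labels mix the two groups
print(good, bad)
```

On this toy data the clustering that matches the two groups receives the smaller value, which mirrors the statement above that the optimal solution is indicated by the minimum. In practice the index would be computed for a range of cluster numbers and inspected for local minima, as done for the 30-cluster solution in experiment A.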

