Analysis of protein complexes through model-based biclustering of label-free quantitative AP-MS data.

Choi H, Kim S, Gingras AC, Nesvizhskii AI - Mol. Syst. Biol. (2010)

Bottom Line: Nested clustering effectively addresses the overrepresentation of interactions involving bait proteins relative to proteins identified only as preys. The method does not require specification of the number of bait clusters, which is an advantage over existing model-based clustering methods. We also discuss general challenges of analyzing and interpreting clustering results in the context of AP-MS data.


Affiliation: Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA.

ABSTRACT
Affinity purification followed by mass spectrometry (AP-MS) has become a common approach for identifying protein-protein interactions (PPIs) and complexes. However, data analysis and visualization often rely on generic approaches that do not take advantage of the quantitative nature of AP-MS. We present a novel computational method, nested clustering, for biclustering of label-free quantitative AP-MS data. Our approach forms bait clusters based on the similarity of quantitative interaction profiles and identifies submatrices of prey proteins showing consistent quantitative association within bait clusters. In doing so, nested clustering effectively addresses the overrepresentation of interactions involving bait proteins relative to proteins identified only as preys. The method does not require specification of the number of bait clusters, which is an advantage over existing model-based clustering methods. We illustrate the performance of the algorithm using two published intermediate-scale human PPI data sets, which are representative of the AP-MS data generated from mammalian cells. We also discuss general challenges of analyzing and interpreting clustering results in the context of AP-MS data.

Figure 2: Application of nested clustering to the TIP49a/b data set. (A) Heatmap of the raw spectral count data organized using estimated mean values. (B) Heatmap of the estimated mean spectral counts. (C) Network visualization of the SRCAP, TRRAP, hINO80, and Prefoldin complexes in Sardiu et al. (2008). Green and brown nodes are baits and preys, respectively. Baits are shown as larger circles to indicate that they are the anchors of the protein complexes constructed by nested clustering. Red circles indicate large protein complexes identified in the form of submatrices.

Mentions: After identifying bait clusters, preys were assigned to mixture component distributions of abundance within each bait cluster, resulting in nested clustering of prey proteins, as represented by colored rectangles in the heatmap (estimated mean abundance in Figure 2B; raw spectral count data in Figure 2A). Notice that, due to the automatic clustering property of the Dirichlet process mixture (DPM) model, the estimated mean and variance of all boxes (nested prey clusters) were 'regularized' toward a small pool of common values, yielding a small number of submatrices. To assemble these boxes into protein complexes, all boxes sharing the same mean abundance value within each bait cluster were combined.
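
The assembly step described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `assemble_complexes`, the tuple layout of a box, and the toy values are all assumptions made for illustration. It also assumes that boxes regularized to the same value carry exactly equal estimated means, so the mean can be used directly as a grouping key.

```python
from collections import defaultdict


def assemble_complexes(boxes):
    """Merge nested prey clusters ("boxes") that share a mean within a bait cluster.

    Each box is a hypothetical tuple:
        (bait_cluster_id, estimated_mean_abundance, list_of_prey_names)
    Boxes with the same bait cluster and the same estimated mean are combined.
    """
    merged = defaultdict(set)
    for bait_cluster, mean_abundance, preys in boxes:
        # Assumption: regularized means are exactly equal when shared,
        # so the float value can serve as part of the dictionary key.
        merged[(bait_cluster, mean_abundance)].update(preys)
    # Sort prey names so the result is deterministic.
    return {key: sorted(preys) for key, preys in merged.items()}


# Made-up toy input, not data from the paper.
boxes = [
    (0, 12.0, ["P1", "P2"]),
    (0, 12.0, ["P3"]),
    (0, 3.5, ["P4"]),
    (1, 12.0, ["P1", "P5"]),
]
complexes = assemble_complexes(boxes)
for (bait_cluster, mean_abundance), preys in sorted(complexes.items()):
    print(bait_cluster, mean_abundance, preys)
```

In this sketch a prey can appear in more than one bait cluster, since grouping is done separately per bait cluster; whether and how the published method handles such overlaps is not stated in the text above.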

