Template-free detection of macromolecular complexes in cryo electron tomograms.

Xu M, Beck M, Alber F - Bioinformatics (2011)

Bottom Line: Cryo electron tomography (cryoET) produces 3D density maps of biological specimens in their near-native states. Applied to small cells, cryoET produces 3D snapshots of the cellular distributions of large complexes. However, so far only a small fraction of all protein complexes have been structurally resolved.

Affiliation: Program in Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA.

ABSTRACT

Motivation: Cryo electron tomography (cryoET) produces 3D density maps of biological specimens in their near-native states. Applied to small cells, cryoET produces 3D snapshots of the cellular distributions of large complexes. However, retrieving this information is non-trivial because of the low resolution and low signal-to-noise ratio of tomograms. Current pattern recognition methods identify complexes by matching known structures to the cryo electron tomogram. However, so far only a small fraction of all protein complexes have been structurally resolved. It is, therefore, of great importance to develop template-free methods for the discovery of previously unknown protein complexes in cryo electron tomograms.

Results: Here, we have developed an inference method for the template-free discovery of frequently occurring protein complexes in cryo electron tomograms. We provide a first proof-of-principle of the approach and assess its applicability using realistically simulated tomograms, allowing for the inclusion of noise and distortions due to missing wedge and electron optical factors. Our method is a step toward the template-free discovery of the shapes, abundance and spatial distributions of previously unknown macromolecular complexes in whole cell tomograms.

Contact: alber@usc.edu

Figure 6: (A) (Left panel) Initial classification for a density region that contains a proteasome complex (blue). The vicinity of the complex contains voxels that are falsely classified as part of another complex class (grey). (Middle panel) After GHMRF-based refinement, most of the voxels assigned to the second complex class have been removed. (Right panel) Original density map of the proteasome complex at 4 nm resolution, shown without noise, missing wedge, CTF and MTF distortions. (B) Classification for a tomogram of set 1. (Left panel) The initial density map of the sample collection of four different types of complexes, each with 10 copies. (Middle panel) A tomogram simulated from this sample with an SNR of 0.5. (Right panel) The GHMRF-based classification discovers several sets of recurrent density patterns that represent the different complexes in the sample. (C) (Top panel) The initial classification discovers five different classes of patterns, each containing several instances. (Middle panel) The GHMRF-based reclassification improves the predictions considerably. (Lower panel) The four different classes of complexes in the initial dataset. Complexes in class 3 have been divided into two classes in the GHMRF-based classification. However, all complexes assigned to the same class are identical. (The selected example shows an average classification performance.)

Mentions: To compare the influence of noise, we have generated 50 tomograms for each of four different SNR levels (Tables 1 and 2). For benchmark set 1, the average precision of the initial classification is 0.44, with an average recall of 0.54, for tomograms without noise. With increasing noise levels, the performance drops to 0.39 and 0.5 for precision and recall, respectively. As expected, the GHMRF model significantly improves both precision and recall. For tomograms with the highest noise level, the average precision improves from 0.39 to 0.44 relative to the initial classification (Table 1 and Fig. 6). These observations indicate that about 40% of all voxels can be predicted as members of the correct pattern class, even when significant noise and distortions are present in the tomogram. This performance is in a similar range to that of classifications based on template matching.
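The precision and recall values quoted above are voxel-level measures. As a minimal sketch of how such values could be computed for one pattern class, the following assumes that the true and predicted labels are available as flattened sequences of integers and that predicted classes have already been matched to true classes. The function name `voxel_precision_recall`, the label encoding and the toy data are illustrative assumptions, not the evaluation code used in the paper.

```python
def voxel_precision_recall(true_labels, pred_labels, cls):
    """Voxel-wise precision and recall for a single class label.

    Assumption: both inputs are flat sequences of integer labels of equal
    length, one entry per voxel, using the same label for the same class.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = fn = 0
    for t, p in zip(true_labels, pred_labels):
        if p == cls and t == cls:
            tp += 1  # voxel correctly assigned to the class
        elif p == cls:
            fp += 1  # voxel assigned to the class but belonging elsewhere
        elif t == cls:
            fn += 1  # voxel of the class that was missed

    # Return 0.0 when a denominator is empty to avoid division by zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy flattened volume (hypothetical): 0 = background, 1 and 2 = two classes.
true_labels = [0, 1, 1, 1, 1, 0, 2, 2, 0, 0]
pred_labels = [0, 1, 1, 0, 2, 1, 2, 2, 2, 0]

precision_1, recall_1 = voxel_precision_recall(true_labels, pred_labels, cls=1)
print(f"class 1: precision={precision_1:.2f}, recall={recall_1:.2f}")
```

In this toy case, two of the three voxels predicted as class 1 are correct, and two of the four true class 1 voxels are recovered. Averaging such per-class values over classes and tomograms would give summary figures of the kind reported in the text, although the exact averaging scheme used by the authors is not stated here.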

