An efficient protein complex mining algorithm based on Multistage Kernel Extension.

Shen X, Zhao Y, Li Y, He T, Yang J, Hu X - BMC Bioinformatics (2014)

Bottom Line: This process is repeated, extending the current kernel to form a protein complex. In the end, overlapping protein complexes are merged to form the final protein complex set. MKE also performs better than the classical clique percolation method on both Gene Ontology semantic similarity and co-localization enrichment, and can effectively identify protein complexes with biological significance in the PPI network.


ABSTRACT

Background: In recent years, many protein complex mining algorithms, such as the classical clique percolation method (CPM) and the Markov clustering (MCL) algorithm, have been developed for protein-protein interaction networks. However, most of the available algorithms concentrate primarily on mining dense protein subgraphs as protein complexes and fail to take into account the inherent organizational structure within protein complexes. Thus, there is a critical need to study the possibility of mining protein complexes using the topological information hidden in edges. Moreover, recent large-scale experimental analyses reveal that protein complexes have their own intrinsic organization.

Methods: Inspired by the formation process of cliques in complex social networks and by the centrality-lethality rule, we propose a new protein complex mining algorithm called the Multistage Kernel Extension (MKE) algorithm, which integrates the idea of critical protein recognition in the Protein-Protein Interaction (PPI) network. MKE first recognizes the nodes with high degree as the first-level kernel of a protein complex, and then adds the weighted best neighbour node of the first-level kernel to the current kernel to form the second-level kernel of the protein complex. This process is repeated, extending the current kernel to form a protein complex. In the end, overlapping protein complexes are merged to form the final protein complex set.
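The three stages described in the Methods (high-degree seeds, repeated kernel extension by the best neighbour, and merging of overlapping complexes) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the neighbour weighting or the merge criterion, so the sketch assumes that the "best" neighbour is the outside node with the most links into the current kernel and that two complexes merge when their shared nodes cover at least half of the smaller one. The names `build_graph`, `extend_kernel`, `merge_overlapping`, `mke_sketch` and the parameters `min_degree`, `levels` and `min_overlap` are all hypothetical.

```python
from itertools import combinations


def build_graph(edges):
    """Return an adjacency dict {node: set(neighbours)} for an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def extend_kernel(adj, seed, levels):
    """Grow a kernel from `seed` for at most `levels` extension steps.

    Assumption: the "best" neighbour is the outside node with the most
    links into the current kernel; ties go to the smallest node name.
    """
    kernel = {seed}
    for _ in range(levels):
        frontier = set().union(*(adj[n] for n in kernel)) - kernel
        if not frontier:
            break
        best = max(sorted(frontier), key=lambda n: len(adj[n] & kernel))
        kernel.add(best)
    return kernel


def merge_overlapping(complexes, min_overlap=0.5):
    """Merge pairs of complexes until no pair overlaps enough.

    Assumption: overlap score = |A & B| / min(|A|, |B|).
    """
    merged = [set(c) for c in complexes]
    changed = True
    while changed:
        changed = False
        for i, j in combinations(range(len(merged)), 2):
            a, b = merged[i], merged[j]
            if len(a & b) / min(len(a), len(b)) >= min_overlap:
                merged[i] = a | b
                del merged[j]
                changed = True
                break  # indices are stale after deletion, so rescan
    return merged


def mke_sketch(edges, min_degree=3, levels=2, min_overlap=0.5):
    """Seed on high-degree nodes, extend each kernel, then merge overlaps."""
    adj = build_graph(edges)
    seeds = sorted(n for n in adj if len(adj[n]) >= min_degree)
    complexes = [extend_kernel(adj, s, levels) for s in seeds]
    return merge_overlapping(complexes, min_overlap)
```

Calling `mke_sketch(edges)` on a list of `(protein, protein)` pairs returns a list of node sets. In this sketch each extension level adds a single node, which is a simplification chosen to keep the example short.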

Results: MKE achieves better accuracy than the classical clique percolation method and the Markov clustering algorithm. MKE also performs better than the classical clique percolation method on both Gene Ontology semantic similarity and co-localization enrichment, and can effectively identify protein complexes with biological significance in the PPI network.


Figure 2: The average size of complexes predicted under different values of the extended level parameter α. The impact that the extended level parameter α has on the Krogan and Collins datasets.

Mentions: The MKE algorithm generates protein complexes by extending kernel clusters, but the optimal value of the extended level parameter α differs between datasets. Figure 2 shows the impact of α on the Krogan and Collins datasets. On the Krogan dataset, the average size of the protein complexes discovered by the algorithm increases dramatically as α grows. Although the average size of the protein complexes is stable when α is 4, it is far from the average size of the reference protein complex set. On the Collins dataset, the average size of the protein complexes exhibits a similar trend. Figure 2 shows that the average size of the protein complexes becomes stable when α = 4 for the Krogan dataset and α = 7 for the Collins dataset. This indicates that all kernels in the Krogan and Collins datasets undergo at least 4 and 7 extensions, respectively, before meeting the condition that the next kernel extension adds fewer nodes than the previous kernel extension.
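The plateau described above follows from the stop condition: once α exceeds the number of extensions every kernel actually performs, raising α further cannot change the complex sizes. The Python sketch below illustrates this under stated assumptions. The per-kernel increment lists are made-up illustrative numbers, not data from the paper, and the function names are hypothetical. The sketch also assumes that the extension which would add fewer nodes than its predecessor is not applied, which the text does not specify.

```python
def extensions_until_stop(increments):
    """Count the extensions applied before the stop rule fires.

    increments[k] is the number of nodes extension k+1 would add.
    Stop rule (as read from the text): stop as soon as the next
    extension would add fewer nodes than the previous one did.
    """
    applied = 0
    for k, inc in enumerate(increments):
        if k > 0 and inc < increments[k - 1]:
            break
        applied += 1
    return applied


def size_at_alpha(increments, alpha, seed_size=1):
    """Complex size when at most `alpha` extension levels are allowed."""
    steps = min(alpha, extensions_until_stop(increments))
    return seed_size + sum(increments[:steps])


def average_size_by_alpha(all_increments, alphas):
    """Map each alpha to the average complex size over all kernels."""
    return {
        alpha: sum(size_at_alpha(inc, alpha) for inc in all_increments)
        / len(all_increments)
        for alpha in alphas
    }


# Hypothetical growth profiles for three kernels (not from the paper).
profiles = [[1, 2, 3, 1], [2, 2, 1, 5], [1, 1]]
curve = average_size_by_alpha(profiles, alphas=range(1, 6))
```

In this toy example the longest-running kernel stops after three extensions, so `curve` stops changing from α = 3 onward, which mirrors the stabilisation the text reports at α = 4 and α = 7.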

