Active semi-supervised community detection based on must-link and cannot-link constraints.

Cheng J, Leng M, Li L, Zhou H, Chen X - PLoS ONE (2014)

Bottom Line: Many community detection algorithms have been proposed, but how to incorporate prior knowledge into the detection process remains a challenging problem. Extensive experiments show that introducing active learning into community detection is successful. Our proposed method can extract high-quality community structures from networks and significantly outperforms the comparison methods.


Affiliation: School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu Province, China.

ABSTRACT
Community structure detection is of great importance because it helps reveal the relationship between the function and the topological structure of a network. Many community detection algorithms have been proposed, but how to incorporate prior knowledge into the detection process remains a challenging problem. In this paper, we propose a semi-supervised community detection algorithm that makes full use of must-link and cannot-link constraints to guide the detection process and thereby extracts high-quality community structures from networks. To acquire high-quality must-link and cannot-link constraints, we also propose a semi-supervised component generation algorithm based on active learning, which selects, step by step, the nodes with maximum utility for the proposed community detection algorithm and then generates the must-link and cannot-link constraints by querying a noiseless oracle. Extensive experiments show that introducing active learning into community detection is successful: our proposed method extracts high-quality community structures from networks and significantly outperforms the comparison methods.
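As an illustration of the constraint-generation step described in the abstract, the following is a minimal sketch, not the authors' algorithm. It assumes the noiseless oracle can be modeled as a lookup that returns the true community label of a node; the function name `generate_constraints` and the toy labels are hypothetical.

```python
from itertools import combinations


def generate_constraints(selected_nodes, oracle):
    """Build pairwise constraints for the selected nodes.

    `oracle` is a hypothetical callable that maps a node to its true
    community label. It is assumed to be noiseless, i.e. always correct.
    Returns (must_link, cannot_link) as sets of node pairs.
    """
    must_link, cannot_link = set(), set()
    # Sort first so that every pair is emitted in a deterministic order.
    for u, v in combinations(sorted(selected_nodes), 2):
        if oracle(u) == oracle(v):
            must_link.add((u, v))    # same community: must be grouped together
        else:
            cannot_link.add((u, v))  # different communities: must be separated
    return must_link, cannot_link


# Toy ground truth (an assumption for illustration only).
truth = {"a": 0, "b": 0, "c": 1, "d": 1}
must_link, cannot_link = generate_constraints(["a", "b", "c"], truth.get)
```

With these toy labels, the pair of nodes that share a community ends up in `must_link` and the two cross-community pairs end up in `cannot_link`.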


Figure 1 (pone-0110088-g001): A simple two-community network. If the nodes are selected according to their degree values, only node will be selected, and community will be ignored. However, using the score value in conjunction with the degree value of every node in the network as the condition, we will select at least node (or ) from the network, which means that the selected nodes can cover all of the ground truth communities. (The different node shapes and shades indicate different communities, the black lines are the edges within communities, and the light-gray connections represent the edges across different communities. This illustration style is also applied in the following figures.)

Mentions: Although Algorithm 1 needs nodes with larger degrees as community seeds to facilitate the expansion of the communities, if we select nodes using only their degrees as the condition, the nodes in small communities will necessarily be ignored. For example, in the simple two-community network illustrated in Figure 1, only node will be selected according to the values of the node degrees. Obviously, the selected nodes do not cover all of the ground truth communities. To solve this problem, we calculate a degree-related score for every node in the network.
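The sketch below does not implement the paper's score formula, which is not reproduced here. It only illustrates the coverage problem described above on a hypothetical toy network: a 5-node clique (community "A") joined by one edge to a triangle (community "B"). The graph, the helper names, and the tie-breaking rule are all assumptions made for illustration.

```python
def degrees(edges):
    """Count the degree of every node in an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def top_k_by_degree(edges, k):
    """Pick the k highest-degree nodes; ties are broken by node id."""
    deg = degrees(edges)
    return sorted(deg, key=lambda n: (-deg[n], n))[:k]


def covered_communities(nodes, community_of):
    """Return the set of ground-truth communities hit by the given nodes."""
    return {community_of[n] for n in nodes}


# Hypothetical toy network: clique on nodes 0..4, triangle on 5..7, bridge 4-5.
big = [(u, v) for u in range(5) for v in range(u + 1, 5)]
small = [(5, 6), (5, 7), (6, 7)]
edges = big + small + [(4, 5)]

community_of = {n: "A" for n in range(5)}
community_of.update({n: "B" for n in (5, 6, 7)})

seeds = top_k_by_degree(edges, 3)
covered = covered_communities(seeds, community_of)
```

In this toy network every seed chosen by degree alone falls in the large community, so the small community receives no seed at all. That is the gap a degree-related score is meant to close.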

