Limits...
The reconstruction of complex networks with community structure.

Zhang P, Wang F, Wang X, Zeng A, Xiao J - Sci Rep (2015)

Bottom Line: Link prediction is a fundamental problem with applications in many fields ranging from biology to computer science.In the literature, most effort has been devoted to estimate the likelihood of the existence of a link between two nodes, based on observed links and nodes' attributes in a network.We find that our method has high prediction accuracy and is very effective in reconstructing the inter-community links.

View Article: PubMed Central - PubMed

Affiliation: School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, P.R. China.

ABSTRACT
Link prediction is a fundamental problem with applications in many fields ranging from biology to computer science. In the literature, most effort has been devoted to estimate the likelihood of the existence of a link between two nodes, based on observed links and nodes' attributes in a network. In this paper, we apply several representative link prediction methods to reconstruct the network, namely to add the missing links with high likelihood of existence back to the network. We find that all these existing methods fail to identify the links connecting different communities, resulting in a poor reproduction of the topological and dynamical properties of the true network. To solve this problem, we propose a community-based link prediction method. We find that our method has high prediction accuracy and is very effective in reconstructing the inter-community links.

No MeSH data available.


The illustration of the community-based link prediction method.The network on the left is the network consisting of the links in the training set. The nodes within one community are marked by the same color. The solid links represent the observed links and the dashed links stand for the predicted links. When β = 0, the inter-community missing links are ranked higher than the intra-community missing links in the prediction list. Therefore, mainly inter-community links are added to the network by the link prediction method. When β = 1, the intra-community missing links are ranked higher than the inter-community missing links in the prediction list, and mainly intra-community links are added to the network. When β = 0.05, the results are mixed, both inter- and intra-community missing links are added to the network. The similarity measure used in this toy network is CN.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC4664866&req=5

f1: The illustration of the community-based link prediction method.The network on the left is the network consisting of the links in the training set. The nodes within one community are marked by the same color. The solid links represent the observed links and the dashed links stand for the predicted links. When β = 0, the inter-community missing links are ranked higher than the intra-community missing links in the prediction list. Therefore, mainly inter-community links are added to the network by the link prediction method. When β = 1, the intra-community missing links are ranked higher than the inter-community missing links in the prediction list, and mainly intra-community links are added to the network. When β = 0.05, the results are mixed, both inter- and intra-community missing links are added to the network. The similarity measure used in this toy network is CN.

Mentions: In this paper, we focus on the networks with community structure. According to the definition, the nodes within a community are densely connected while the nodes across communities are much more sparsely connected. In this kind of networks, the inter-community links are in general more difficult to be predicted. Without these inter-community links, the average shortest path length of the reconstructed networks would be much higher than the original networks, and the transportation dynamics40 in this network would be much slower and congested in the reconstructed networks. In order to solve this problem, we propose a community-based link prediction method. We first detect the communities by using the algorithm41 in the training set. Then the similarity scores between unconnected node pairs are computed by some classic local similarity measures (i.e. the CN or RA methods, see the Methods section for definitions). We also consider three global link prediction methods323942, the results are similar to those of CN and RA (see Supplementary Information (SI)). A tunable parameter β ∈[0, 1] is proposed to combine the information of communities and node similarity for link prediction. In practice, the node pairs are classified as intra-community pairs and inter-community pairs. Within each classification, the node pairs are ranked in descending order according to the similarity measures. controls the probability that the intra-community node pairs ranked higher than the inter-community node pairs (see the Methods section for details). This method is inspired by ref. 43 but used here for a different goal. For convenience, when the method is combined with common neighbor similarity, it is called community-based CN method (CBCN). Similarly, it is called community-based RA method (CBRA) when it is combined with the resource allocation similarity. The illustration of the method is shown in Fig. 1. Like previous works43, we adopt to evaluate the accuracy of the link prediction. In addition, we propose to monitor the average edge-betweenness of the predicted links (calculated by adding those predicted links to the network). If the average edge-betweenness is high, more inter-community links are predicted (For the solid evidences, see SI). In fact, measuring the average betweenness of the reconstructed network is also a good evaluation metric for this issue. Despite some quantitative difference, the results are qualitatively consistent with the results when is used (see results in SI).


The reconstruction of complex networks with community structure.

Zhang P, Wang F, Wang X, Zeng A, Xiao J - Sci Rep (2015)

The illustration of the community-based link prediction method.The network on the left is the network consisting of the links in the training set. The nodes within one community are marked by the same color. The solid links represent the observed links and the dashed links stand for the predicted links. When β = 0, the inter-community missing links are ranked higher than the intra-community missing links in the prediction list. Therefore, mainly inter-community links are added to the network by the link prediction method. When β = 1, the intra-community missing links are ranked higher than the inter-community missing links in the prediction list, and mainly intra-community links are added to the network. When β = 0.05, the results are mixed, both inter- and intra-community missing links are added to the network. The similarity measure used in this toy network is CN.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC4664866&req=5

f1: The illustration of the community-based link prediction method.The network on the left is the network consisting of the links in the training set. The nodes within one community are marked by the same color. The solid links represent the observed links and the dashed links stand for the predicted links. When β = 0, the inter-community missing links are ranked higher than the intra-community missing links in the prediction list. Therefore, mainly inter-community links are added to the network by the link prediction method. When β = 1, the intra-community missing links are ranked higher than the inter-community missing links in the prediction list, and mainly intra-community links are added to the network. When β = 0.05, the results are mixed, both inter- and intra-community missing links are added to the network. The similarity measure used in this toy network is CN.
Mentions: In this paper, we focus on the networks with community structure. According to the definition, the nodes within a community are densely connected while the nodes across communities are much more sparsely connected. In this kind of networks, the inter-community links are in general more difficult to be predicted. Without these inter-community links, the average shortest path length of the reconstructed networks would be much higher than the original networks, and the transportation dynamics40 in this network would be much slower and congested in the reconstructed networks. In order to solve this problem, we propose a community-based link prediction method. We first detect the communities by using the algorithm41 in the training set. Then the similarity scores between unconnected node pairs are computed by some classic local similarity measures (i.e. the CN or RA methods, see the Methods section for definitions). We also consider three global link prediction methods323942, the results are similar to those of CN and RA (see Supplementary Information (SI)). A tunable parameter β ∈[0, 1] is proposed to combine the information of communities and node similarity for link prediction. In practice, the node pairs are classified as intra-community pairs and inter-community pairs. Within each classification, the node pairs are ranked in descending order according to the similarity measures. controls the probability that the intra-community node pairs ranked higher than the inter-community node pairs (see the Methods section for details). This method is inspired by ref. 43 but used here for a different goal. For convenience, when the method is combined with common neighbor similarity, it is called community-based CN method (CBCN). Similarly, it is called community-based RA method (CBRA) when it is combined with the resource allocation similarity. The illustration of the method is shown in Fig. 1. Like previous works43, we adopt to evaluate the accuracy of the link prediction. In addition, we propose to monitor the average edge-betweenness of the predicted links (calculated by adding those predicted links to the network). If the average edge-betweenness is high, more inter-community links are predicted (For the solid evidences, see SI). In fact, measuring the average betweenness of the reconstructed network is also a good evaluation metric for this issue. Despite some quantitative difference, the results are qualitatively consistent with the results when is used (see results in SI).

Bottom Line: Link prediction is a fundamental problem with applications in many fields ranging from biology to computer science.In the literature, most effort has been devoted to estimate the likelihood of the existence of a link between two nodes, based on observed links and nodes' attributes in a network.We find that our method has high prediction accuracy and is very effective in reconstructing the inter-community links.

View Article: PubMed Central - PubMed

Affiliation: School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, P.R. China.

ABSTRACT
Link prediction is a fundamental problem with applications in many fields ranging from biology to computer science. In the literature, most effort has been devoted to estimate the likelihood of the existence of a link between two nodes, based on observed links and nodes' attributes in a network. In this paper, we apply several representative link prediction methods to reconstruct the network, namely to add the missing links with high likelihood of existence back to the network. We find that all these existing methods fail to identify the links connecting different communities, resulting in a poor reproduction of the topological and dynamical properties of the true network. To solve this problem, we propose a community-based link prediction method. We find that our method has high prediction accuracy and is very effective in reconstructing the inter-community links.

No MeSH data available.