Limits...
Identifying influential nodes in large-scale directed networks: the role of clustering.

Chen DB, Gao H, Lü L, Zhou T - PLoS ONE (2013)

Bottom Line: Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors' influences, but also the clustering coefficient.Subject to the susceptible-infected-recovered (SIR) spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that the ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank.Furthermore, ClusterRank can also be applied to undirected networks where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition.

View Article: PubMed Central - PubMed

Affiliation: Web Sciences Center, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, People's Republic of China ; Department of Physics, University of Fribourg, Fribourg, Switzerland.

ABSTRACT
Identifying influential nodes in very large-scale directed networks is a big challenge relevant to disparate applications, such as accelerating information propagation, controlling rumors and diseases, designing search engines, and understanding hierarchical organization of social and biological networks. Known methods range from node centralities, such as degree, closeness and betweenness, to diffusion-based processes, like PageRank and LeaderRank. Some of these methods already take into account the influences of a node's neighbors but do not directly make use of the interactions among it's neighbors. Local clustering is known to have negative impacts on the information spreading. We further show empirically that it also plays a negative role in generating local connections. Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors' influences, but also the clustering coefficient. Subject to the susceptible-infected-recovered (SIR) spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that the ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank. Furthermore, ClusterRank can also be applied to undirected networks where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition. In addition, ClusterRank, only making use of local information, is much more efficient than global methods: It takes only 191 seconds for a network with about [Formula: see text] nodes, more than 15 times faster than PageRank.

Show MeSH
An example network with 38 nodes and 110 directed edges.Although nodes 0 and 37 have the same out-degree, node 37 is of higher influence (subject to spreading dynamics) than node 0. The clustering coefficients of these two nodes are  and .
© Copyright Policy
Related In: Results  -  Collection


getmorefigures.php?uid=PMC3814409&req=5

pone-0077455-g003: An example network with 38 nodes and 110 directed edges.Although nodes 0 and 37 have the same out-degree, node 37 is of higher influence (subject to spreading dynamics) than node 0. The clustering coefficients of these two nodes are and .

Mentions: Based on the empirical observation, we here propose a local ranking index, named ClusterRank, to quantify the influence of a node by taking into account not only its direct influence (measured by the number of its followers) and influences of its neighbors, but also its clustering coefficient. Mathematically, the ClusterRank score of node is defined as:(4)where the term accounts for the effect of ’s local clustering and the term ‘+1’ results from the contribution of itself. Usually, the local clustering plays a negative role in spreading [28], [29], [40] since if ’s followers closely interact with each other rather than with other nodes, the spreading initiated from node is more likely to be confined in a local region. On the contrary, if ’s neighbors are mostly connected with nodes other than ’s neighbors, the information will quickly spread to a large scope. For example, in figure 3, although node 0 has the same out-degree with node 37, node 37, with lower clustering, is of higher influence than node 0, since most of node 37’s neighbors point to nodes other than themselves and thus can send the information to wide audiences. We here adopt a simple exponential function, namely , a decreasing function of . Actually, we can apply a more complicated form by introducing a new parameter, such as or . However, it adds little value to rank nodes but make the analysis more complicated. Indeed, the perspective and results of this paper are not limited by a very specific function of .


Identifying influential nodes in large-scale directed networks: the role of clustering.

Chen DB, Gao H, Lü L, Zhou T - PLoS ONE (2013)

An example network with 38 nodes and 110 directed edges.Although nodes 0 and 37 have the same out-degree, node 37 is of higher influence (subject to spreading dynamics) than node 0. The clustering coefficients of these two nodes are  and .
© Copyright Policy
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC3814409&req=5

pone-0077455-g003: An example network with 38 nodes and 110 directed edges.Although nodes 0 and 37 have the same out-degree, node 37 is of higher influence (subject to spreading dynamics) than node 0. The clustering coefficients of these two nodes are and .
Mentions: Based on the empirical observation, we here propose a local ranking index, named ClusterRank, to quantify the influence of a node by taking into account not only its direct influence (measured by the number of its followers) and influences of its neighbors, but also its clustering coefficient. Mathematically, the ClusterRank score of node is defined as:(4)where the term accounts for the effect of ’s local clustering and the term ‘+1’ results from the contribution of itself. Usually, the local clustering plays a negative role in spreading [28], [29], [40] since if ’s followers closely interact with each other rather than with other nodes, the spreading initiated from node is more likely to be confined in a local region. On the contrary, if ’s neighbors are mostly connected with nodes other than ’s neighbors, the information will quickly spread to a large scope. For example, in figure 3, although node 0 has the same out-degree with node 37, node 37, with lower clustering, is of higher influence than node 0, since most of node 37’s neighbors point to nodes other than themselves and thus can send the information to wide audiences. We here adopt a simple exponential function, namely , a decreasing function of . Actually, we can apply a more complicated form by introducing a new parameter, such as or . However, it adds little value to rank nodes but make the analysis more complicated. Indeed, the perspective and results of this paper are not limited by a very specific function of .

Bottom Line: Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors' influences, but also the clustering coefficient.Subject to the susceptible-infected-recovered (SIR) spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that the ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank.Furthermore, ClusterRank can also be applied to undirected networks where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition.

View Article: PubMed Central - PubMed

Affiliation: Web Sciences Center, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, People's Republic of China ; Department of Physics, University of Fribourg, Fribourg, Switzerland.

ABSTRACT
Identifying influential nodes in very large-scale directed networks is a big challenge relevant to disparate applications, such as accelerating information propagation, controlling rumors and diseases, designing search engines, and understanding hierarchical organization of social and biological networks. Known methods range from node centralities, such as degree, closeness and betweenness, to diffusion-based processes, like PageRank and LeaderRank. Some of these methods already take into account the influences of a node's neighbors but do not directly make use of the interactions among it's neighbors. Local clustering is known to have negative impacts on the information spreading. We further show empirically that it also plays a negative role in generating local connections. Inspired by these facts, we propose a local ranking algorithm named ClusterRank, which takes into account not only the number of neighbors and the neighbors' influences, but also the clustering coefficient. Subject to the susceptible-infected-recovered (SIR) spreading model with constant infectivity, experimental results on two directed networks, a social network extracted from delicious.com and a large-scale short-message communication network, demonstrate that the ClusterRank outperforms some benchmark algorithms such as PageRank and LeaderRank. Furthermore, ClusterRank can also be applied to undirected networks where the superiority of ClusterRank is significant compared with degree centrality and k-core decomposition. In addition, ClusterRank, only making use of local information, is much more efficient than global methods: It takes only 191 seconds for a network with about [Formula: see text] nodes, more than 15 times faster than PageRank.

Show MeSH