A Geometric Representation of Collective Attention Flows.

Shi P, Huang X, Wang J, Zhang J, Deng S, Wu Y - PLoS ONE (2015)

Bottom Line: Knowing how collective attention is distributed and flows among different websites is the first step toward understanding the underlying dynamics of attention on the WWW. The patterns are stable across different periods. Thus, the overall distribution and the dynamics of collective attention on websites are well captured by this geometric representation.


Affiliation: Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China; School of Systems Science, Beijing Normal University, Beijing, China.

ABSTRACT
With the fast development of the Internet and the WWW, "information overload" has become an overwhelming problem, and the collective attention of users plays an increasingly important role. As a result, knowing how collective attention is distributed and flows among different websites is the first step toward understanding the underlying dynamics of attention on the WWW. In this paper, we propose a method to embed a large number of websites into a high-dimensional Euclidean space according to the novel concept of flow distance, which considers both the connection topology between sites and the collective click behavior of users. With this geometric representation, we visualize the attention flow in the Indiana University clickstream data set over one day. It turns out that all the websites can be embedded into a 20-dimensional ball in which sites that are close to each other are always visited sequentially by users. The distributions of websites, attention flows, and dissipations can be divided into three spherical crowns (core, interim, and periphery). The most popular 20% of sites (Google.com, Myspace.com, Facebook.com, etc.), which attract 75% of attention flows but only 55% of dissipations (users logging off), are located in the central layer, within radius 4.1. Another 60% of sites, which attract only about 22% of traffic but almost 38% of dissipations, are located in the middle area, with radius between 4.1 and 6.3. The remaining 20% of sites are far from the central area. All the cumulative distributions of these variables are well fitted by "S"-shaped curves, and the patterns are stable across different periods. Thus, the overall distribution and the dynamics of collective attention on websites are well captured by this geometric representation.
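The three-layer split described in the abstract can be illustrated with a short sketch. It assumes that each site already has embedded coordinates plus a flow total and a dissipation total; the function name `layer_shares`, the input arrays, and the toy data are hypothetical and not taken from the paper. Only the two radii, 4.1 and 6.3, come from the abstract.

```python
import numpy as np

CORE_RADIUS = 4.1     # outer radius of the core layer (from the abstract)
INTERIM_RADIUS = 6.3  # outer radius of the interim layer (from the abstract)


def layer_shares(coords, flow, dissipation):
    """Share of sites, flow, and dissipation per layer (hypothetical helper).

    coords:      (n_sites, n_dims) embedded positions, centered on the origin
    flow:        (n_sites,) attention flow attracted by each site
    dissipation: (n_sites,) flow leaving the network at each site
    """
    coords = np.asarray(coords, dtype=float)
    flow = np.asarray(flow, dtype=float)
    dissipation = np.asarray(dissipation, dtype=float)

    # Distance of each site from the center of the ball.
    radius = np.linalg.norm(coords, axis=1)
    # 0 = core (r < 4.1), 1 = interim (4.1 <= r < 6.3), 2 = periphery.
    layer = np.digitize(radius, [CORE_RADIUS, INTERIM_RADIUS])

    shares = {}
    for k, name in enumerate(["core", "interim", "periphery"]):
        mask = layer == k
        shares[name] = {
            "sites": mask.mean(),
            "flow": flow[mask].sum() / flow.sum(),
            "dissipation": dissipation[mask].sum() / dissipation.sum(),
        }
    return shares


# Toy data only: four made-up sites in a 20-dimensional space.
toy_coords = np.zeros((4, 20))
toy_coords[0, 0] = 1.0
toy_coords[1, 0] = 5.0
toy_coords[2, 1] = 5.0
toy_coords[3, 2] = 7.0
print(layer_shares(toy_coords, flow=[6, 1, 1, 2], dissipation=[1, 1, 1, 1]))
```

Applied to real embedded coordinates, a summary of this kind would be the place to check the percentages quoted in the abstract; the toy values above are not meant to reproduce them.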




pone.0136243.g002: The distribution of flow distances (lij) on October 10, 2006, March 10, 2007, September 10, 2007, and February 10, 2008.

Mentions: We use the open flow network to model the clickstream data of October 10, 2006 (158232 nodes), March 10, 2007 (85080 nodes), September 10, 2007 (138047 nodes), and February 10, 2008 (111189 nodes) (see Method section), and we calculate the flow distances lij (see Method section) for all node pairs of the flow network. The distributions of all flow distances are shown in Fig 2. We find that they are similar at different times and that the average distances in the four snapshots are all close to 4.5, which exhibits the small-world effect. The notion of flow distance considers both the topological closeness of websites and the average real surfing behavior of users, which makes it clearly different from the traditional shortest path distance [30] and from the random walk distance [31–34] on closed flow networks (see the detailed discussion in the Method section).
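The exact definition of lij is given in the paper's Method section, which is not reproduced here. As a rough illustration of how a distance can depend on both topology and click behavior, the sketch below computes the expected number of steps to first reach a target site, conditional on reaching it at all, on a small open network in which users can leave at any site. This is a standard first-passage calculation on an absorbing Markov chain and is an assumption about the general idea, not the authors' formula; the function name and the toy click counts are hypothetical.

```python
import numpy as np


def conditional_first_passage_steps(flows, dissipation, target):
    """Mean steps to first reach `target`, given that it is reached.

    flows[i, j]:    observed clicks from site i to site j
    dissipation[i]: clicks at site i after which the user leaves (sink)
    Assumes every site has positive total outflow and that flow eventually
    leaks out of the network, so the linear systems below are solvable.
    """
    flows = np.asarray(flows, dtype=float)
    dissipation = np.asarray(dissipation, dtype=float)
    n = flows.shape[0]

    # Transition probabilities of the open network (rows may sum to < 1).
    out = flows.sum(axis=1) + dissipation
    M = flows / out[:, None]

    # Treat the target as absorbing: keep transitions among the other sites.
    others = np.array([i for i in range(n) if i != target])
    Q = M[np.ix_(others, others)]
    r = M[others, target]
    A = np.eye(len(others)) - Q

    hit = np.linalg.solve(A, r)         # probability of ever reaching target
    weighted = np.linalg.solve(A, hit)  # sum over k of k * P(first arrival at step k)

    steps = np.full(n, np.inf)          # inf marks sites that never reach target
    reachable = hit > 0
    steps[others[reachable]] = weighted[reachable] / hit[reachable]
    steps[target] = 0.0
    return steps


# Toy network: site 0 mostly sends users to site 1, rarely straight to site 2.
toy_flows = [[0, 9, 1],
             [0, 0, 5],
             [0, 0, 0]]
toy_dissipation = [0, 5, 5]
print(conditional_first_passage_steps(toy_flows, toy_dissipation, target=2))
```

In this toy network the shortest path from site 0 to site 2 is a single link, yet the behavior-weighted value is close to 2 because most users take the detour through site 1. That is the kind of difference between a topological distance and a flow-based one that the paragraph above points to.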

