A Geometric Representation of Collective Attention Flows.

Shi P, Huang X, Wang J, Zhang J, Deng S, Wu Y - PLoS ONE (2015)

Bottom Line: Knowing how collective attention is distributed and flows among different websites is the first step toward understanding the underlying dynamics of attention on the WWW. The patterns are stable across different periods. Thus, the overall distribution and dynamics of collective attention on websites are well captured by this geometric representation.


Affiliation: Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, China; School of Systems Science, Beijing Normal University, Beijing, China.

ABSTRACT
With the rapid development of the Internet and the WWW, "information overload" has become an overwhelming problem, and the collective attention of users plays an increasingly important role. As a result, knowing how collective attention is distributed and flows among different websites is the first step toward understanding the underlying dynamics of attention on the WWW. In this paper, we propose a method to embed a large number of websites into a high-dimensional Euclidean space according to the novel concept of flow distance, which considers both the connection topology between sites and the collective click behavior of users. With this geometric representation, we visualize the attention flow in the Indiana University clickstream data set over one day. It turns out that all the websites can be embedded into a 20-dimensional ball, in which sites that are close together are always visited by users sequentially. The distributions of websites, attention flows, and dissipations can be divided into three spherical crowns (core, interim, and periphery). The popular sites (Google.com, Myspace.com, Facebook.com, etc.), 20% of the total, attract 75% of the attention flow with only 55% of the dissipation (users logging off) and are located in the central layer with radius 4.1. Another 60% of the sites, which attract only about 22% of the traffic with almost 38% of the dissipation, are located in the middle area with radius between 4.1 and 6.3. The remaining 20% of the sites are far from the central area. All the cumulative distributions of these variables are well fitted by "S"-shaped curves, and the patterns are stable across different periods. Thus, the overall distribution and dynamics of collective attention on websites are well captured by this geometric representation.
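The three-layer split described in the abstract can be illustrated with a small sketch. It assumes the embedded coordinates are already available and that the ball is centred on the origin of the embedding; the function name `crown_shares` and its arguments are hypothetical, and only the radii 4.1 and 6.3 come from the abstract.

```python
import numpy as np

def crown_shares(coords, flow, dissipation, r_core=4.1, r_mid=6.3):
    """Return each layer's share of sites, attention flow, and dissipation."""
    coords = np.asarray(coords, dtype=float)
    flow = np.asarray(flow, dtype=float)
    dissipation = np.asarray(dissipation, dtype=float)
    # Assumption: the ball is centred on the origin of the embedding.
    radius = np.linalg.norm(coords, axis=1)
    # 0 = core (r < r_core), 1 = interim (r_core <= r < r_mid), 2 = periphery
    layer = np.digitize(radius, [r_core, r_mid])
    shares = {}
    for idx, name in enumerate(("core", "interim", "periphery")):
        mask = layer == idx
        shares[name] = {
            "sites": float(mask.mean()),
            "flow": float(flow[mask].sum() / flow.sum()),
            "dissipation": float(dissipation[mask].sum() / dissipation.sum()),
        }
    return shares
```

Applied to the embedded clickstream data, a function of this kind would be expected to return shares close to the percentages quoted in the abstract, provided the centre assumption holds.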


pone.0136243.g004 (Fig 4): The effectiveness of the embedding. (A) The average distortion of embedding algorithm decreases with the embedding dimension. (B) The average distortion of embedding algorithm in different iterations. The inset shows the enlarged lower left corner of the line. (C) shows the relationship between all the Euclidean distances and the flow distances as the final best results of the embedding.

Mentions: Using website category data provided by Blue Coat Systems, Inc., we classify websites into 6 classes according to their domain names. Sites in the same class may show similar content and are always visited sequentially by users. This phenomenon can be observed in Fig 3D for the News & Recreation class, in Fig 3C for Education, and in Fig 3B for Adults, because these classes are located in three distinct regions of the map. However, other sites, such as search engines and social networks, always provide synthetic content or services, so they are scattered throughout the space. Fig 4 shows the analysis of embedding effectiveness. Panel A demonstrates that the average distortion decreases with the embedding dimension. Panel B shows the variation of the average distortion over the iterations. Panel C compares the Euclidean distance with the flow distance Cij.
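The trend in Fig 4A, where distortion falls as the embedding dimension grows, can be reproduced on toy data. The sketch below is not the authors' algorithm: it swaps in classical multidimensional scaling, which requires a symmetric distance matrix, and it assumes a definition of average distortion (mean relative gap between embedded and target distances) that the excerpt does not state. The names `classical_mds` and `average_distortion` are hypothetical.

```python
import numpy as np

def classical_mds(dist, dim):
    """Embed points in `dim` dimensions from a symmetric distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip tiny negative eigenvalues caused by rounding before the square root.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def average_distortion(coords, dist):
    """Mean relative gap between embedded and target distances (assumed definition)."""
    diff = coords[:, None, :] - coords[None, :, :]
    embedded = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(dist.shape[0], k=1)
    return float(np.mean(np.abs(embedded[upper] - dist[upper]) / dist[upper]))

# Toy stand-in for the flow distances: pairwise distances of random 5-D points.
rng = np.random.default_rng(0)
points = rng.normal(size=(40, 5))
target = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
for dim in range(1, 6):
    print(dim, average_distortion(classical_mds(target, dim), target))
```

On this toy input the printed distortion shrinks with each added dimension and becomes negligible once the dimension matches that of the underlying points. The real flow distances need not be exactly Euclidean, which is presumably why the authors report a residual distortion and use an iterative procedure.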

