Network-Based and Binless Frequency Analyses.

Derrible S, Ahmad N - PLoS ONE (2015)

Bottom Line: The methodology is validated by sampling 12 typical distributions, and it is applied to a number of real-world data sets with both spatial and temporal components. The methodology can be applied to any data set and provides a robust means to uncover meaningful patterns and trends. A free Python script and a tutorial are also made available to facilitate the application of the method.

Affiliation: Complex and Sustainable Urban Networks (CSUN) Laboratory, University of Illinois at Chicago, Chicago, IL, United States of America.

ABSTRACT
We introduce and develop a new network-based and binless methodology to perform frequency analyses and produce histograms. In contrast with traditional frequency analysis techniques that use fixed intervals to bin values, we place a range ±ζ around each individual value in a data set and count the number of values within that range, which allows us to compare every value in the data set with every other value. In essence, the methodology is identical to the construction of a network, where two values are connected if they lie within a given range (±ζ) of each other. The value with the highest degree (i.e., the most connections) is therefore treated as the mode of the distribution. To select an optimal range, we look at the stability of the proportion of nodes in the largest cluster. The methodology is validated by sampling 12 typical distributions, and it is applied to a number of real-world data sets with both spatial and temporal components. The methodology can be applied to any data set and provides a robust means to uncover meaningful patterns and trends. A free Python script and a tutorial are also made available to facilitate the application of the method.
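The counting step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released script: the function names `binless_degrees` and `modal_value` are hypothetical, and it assumes that the range is inclusive (two values are connected when their absolute difference is at most ζ) and that a value is not counted as its own neighbour.

```python
from bisect import bisect_left, bisect_right


def binless_degrees(values, zeta):
    """For each value, count the OTHER values lying within +/- zeta of it.

    Assumptions (not confirmed by the source): the range is inclusive,
    and a value is not connected to itself.
    """
    ordered = sorted(values)
    degrees = []
    for v in values:
        lo = bisect_left(ordered, v - zeta)   # first index with value >= v - zeta
        hi = bisect_right(ordered, v + zeta)  # first index with value >  v + zeta
        degrees.append(hi - lo - 1)           # minus 1 removes the value itself
    return degrees


def modal_value(values, zeta):
    """Return (value, degree) for the value with the most connections.

    Ties are broken by taking the first such value in input order.
    """
    degrees = binless_degrees(values, zeta)
    best = max(range(len(values)), key=degrees.__getitem__)
    return values[best], degrees[best]


data = [1, 4, 5, 6, 7, 8, 12]
print(binless_degrees(data, 2))  # [0, 2, 3, 4, 3, 2, 0]
print(modal_value(data, 2))      # (6, 4)
```

Sorting once and using binary search keeps the count at O(n log n) rather than comparing every pair explicitly; the result is the same degree that a pairwise comparison would give.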

Fig 1 (pone.0142108.g001): Impact of ζ on NB Methodology and Network Properties. (A)-(C) Scatter plots of values and their degrees (i.e., number of connections) for ζ = 1, 10, and 20, respectively. (D) Histogram of the distribution, with a bin size of 14.70 calculated using Scott’s rule; the right shape is obtained but the bins are large. (E) Evolution of p_g with ζ. (F) Evolution of D and L_avg with ζ. In practice, we choose ζ as a low percentage of the median of the distribution, say 1%, and then increase it gradually until the value of p_g remains constant over several increases of ζ (the number of increases depends on the size of the increment between successive values of ζ); here, the network becomes stable for ζ = 10.
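The selection procedure described in the caption could look roughly like the sketch below. It reads p_g as the proportion of nodes in the largest cluster, as the abstract suggests. Everything else is an assumption made for illustration: the names `largest_cluster_share` and `select_zeta`, the use of the starting value as the increment, and the `patience` parameter standing in for "several increases". For one-dimensional data, a cluster is a run of sorted values whose consecutive gaps are all at most ζ, which is what the first function relies on.

```python
from statistics import median


def largest_cluster_share(values, zeta):
    """Proportion of values in the largest connected cluster for a given zeta.

    In one dimension, connected clusters are maximal runs of sorted values
    whose consecutive gaps are <= zeta (inclusive range is an assumption).
    """
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    best = run = 1
    for prev, cur in zip(ordered, ordered[1:]):
        run = run + 1 if cur - prev <= zeta else 1
        best = max(best, run)
    return best / len(ordered)


def select_zeta(values, start_fraction=0.01, patience=3, max_steps=10_000):
    """Increase zeta from start_fraction * median until the share is stable.

    Hypothetical parameters: `patience` is how many consecutive increases
    must leave the share unchanged; the increment equals the starting zeta.
    Returns (zeta, share) for the first zeta of the stable stretch.
    """
    step = start_fraction * abs(median(values))
    if step == 0:
        raise ValueError("median is zero; choose a starting zeta another way")
    stable_zeta = step
    stable_share = largest_cluster_share(values, stable_zeta)
    unchanged = 0
    for k in range(2, max_steps + 2):
        zeta = k * step  # multiply rather than accumulate to limit float drift
        share = largest_cluster_share(values, zeta)
        if share == stable_share:
            unchanged += 1
            if unchanged >= patience:
                return stable_zeta, stable_share
        else:
            stable_zeta, stable_share, unchanged = zeta, share, 0
    raise RuntimeError("share did not stabilise within max_steps")


data = [185, 188, 191, 194, 197, 200, 203, 206, 209, 212, 400]
zeta, share = select_zeta(data)
print(zeta, round(share, 3))  # 4.0 0.909
```

In this toy data set the ten values spaced 3 apart merge into one cluster once ζ reaches 4, and the share then stays at 10/11 over the next three increases, so the search stops there. A larger `patience` or a smaller increment makes the stopping rule stricter.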

Mentions: Using this network analogy, the frequency/count in Fig 1 (detailed later) simply takes the form of the degree d_i of a node i (i.e., the number of connections), defined from A as:

$$d_i = \sum_{j=1}^{n} A_{ij} \tag{2}$$
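As a worked illustration of Eq (2), the sketch below builds a matrix A explicitly and sums each row to obtain the degrees. The definition of A is not shown in this excerpt, so the one used here is an assumption: A_ij = 1 when i ≠ j and the two values differ by at most ζ, and 0 otherwise. The function names are hypothetical.

```python
def adjacency_matrix(values, zeta):
    """Build A with A[i][j] = 1 if values i and j lie within zeta of each other.

    Assumed definition (not given in this excerpt): inclusive range,
    zero diagonal so that a value is not connected to itself.
    """
    n = len(values)
    return [
        [1 if i != j and abs(values[i] - values[j]) <= zeta else 0
         for j in range(n)]
        for i in range(n)
    ]


def degrees_from_adjacency(A):
    """Eq (2): d_i is the sum of row i of A."""
    return [sum(row) for row in A]


data = [1, 4, 5, 6, 7, 8, 12]
A = adjacency_matrix(data, 2)
print(degrees_from_adjacency(A))  # [0, 2, 3, 4, 3, 2, 0]
```

Building A costs O(n²) time and memory, so this form is mainly useful for checking the equation on small data sets.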

