Limits...
SCOUT: simultaneous time segmentation and community detection in dynamic networks

View Article: PubMed Central - PubMed

ABSTRACT

Many evolving complex real-world systems can be modeled via dynamic networks. An important problem in dynamic network research is community detection, which finds groups of topologically related nodes. Typically, this problem is approached by assuming either that each time point has a distinct community organization or that all time points share a single community organization. The reality likely lies between these two extremes. To find the compromise, we consider community detection in the context of the problem of segment detection, which identifies contiguous time periods with consistent network structure. Consequently, we formulate a combined problem of segment community detection (SCD), which simultaneously partitions the network into contiguous time segments with consistent community organization and finds this community organization for each segment. To solve SCD, we introduce SCOUT, an optimization framework that explicitly considers both segmentation quality and partition quality. SCOUT addresses limitations of existing methods that can be adapted to solve SCD, which consider only one of segmentation quality or partition quality. In a thorough evaluation, SCOUT outperforms the existing methods in terms of both accuracy and computational complexity. We apply SCOUT to biological network data to study human aging.

No MeSH data available.


Method rankings with respect to SimB for synthetic network configurations with (a) 50–100 and (b) 500–1000 nodes per snapshot. For each configuration, we compare the four methods’ SimB scores (averages over all instances of the given configuration) to identify the first, second, third, and fourth best method; ties are allowed. We summarize these results over all considered configurations by measuring, how many times the given method (x-axis) is ranked as the first, second, third, and fourth best method (expressed as the percentage of all considered configurations; y-axis). “N/A” means that the given method (GHRG for the larger networks) could not be run. The darker the given bar, the better the method performance.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC5121586&req=5

f3: Method rankings with respect to SimB for synthetic network configurations with (a) 50–100 and (b) 500–1000 nodes per snapshot. For each configuration, we compare the four methods’ SimB scores (averages over all instances of the given configuration) to identify the first, second, third, and fourth best method; ties are allowed. We summarize these results over all considered configurations by measuring, how many times the given method (x-axis) is ranked as the first, second, third, and fourth best method (expressed as the percentage of all considered configurations; y-axis). “N/A” means that the given method (GHRG for the larger networks) could not be run. The darker the given bar, the better the method performance.

Mentions: For SimB, SCOUT again outperforms the other methods for all configurations (Fig. 3). The other methods are comparable to each other (Fig. 2(d) and Supplementary S16). Importantly, in most cases, SCOUT not only improves upon the existing methods but also its improvement is statistically significant. Namely, SCOUT statistically significantly improves upon the best existing method in 75%, 65%, and 55% of all cases at p-value threshold of 0.05, 0.01, and 0.001, respectively (Supplementary Table S1). Note that the above percentages could not be perfect, since for 20% of all configurations (namely, the four configurations with the minimum number of ground truth segments), in addition to SCOUT that achieves the perfect SimB, Multi-Step also (unfairly, per our above discussion) achieves the perfect SimB and is thus comparable to SCOUT.


SCOUT: simultaneous time segmentation and community detection in dynamic networks
Method rankings with respect to SimB for synthetic network configurations with (a) 50–100 and (b) 500–1000 nodes per snapshot. For each configuration, we compare the four methods’ SimB scores (averages over all instances of the given configuration) to identify the first, second, third, and fourth best method; ties are allowed. We summarize these results over all considered configurations by measuring, how many times the given method (x-axis) is ranked as the first, second, third, and fourth best method (expressed as the percentage of all considered configurations; y-axis). “N/A” means that the given method (GHRG for the larger networks) could not be run. The darker the given bar, the better the method performance.
© Copyright Policy - open-access
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC5121586&req=5

f3: Method rankings with respect to SimB for synthetic network configurations with (a) 50–100 and (b) 500–1000 nodes per snapshot. For each configuration, we compare the four methods’ SimB scores (averages over all instances of the given configuration) to identify the first, second, third, and fourth best method; ties are allowed. We summarize these results over all considered configurations by measuring, how many times the given method (x-axis) is ranked as the first, second, third, and fourth best method (expressed as the percentage of all considered configurations; y-axis). “N/A” means that the given method (GHRG for the larger networks) could not be run. The darker the given bar, the better the method performance.
Mentions: For SimB, SCOUT again outperforms the other methods for all configurations (Fig. 3). The other methods are comparable to each other (Fig. 2(d) and Supplementary S16). Importantly, in most cases, SCOUT not only improves upon the existing methods but also its improvement is statistically significant. Namely, SCOUT statistically significantly improves upon the best existing method in 75%, 65%, and 55% of all cases at p-value threshold of 0.05, 0.01, and 0.001, respectively (Supplementary Table S1). Note that the above percentages could not be perfect, since for 20% of all configurations (namely, the four configurations with the minimum number of ground truth segments), in addition to SCOUT that achieves the perfect SimB, Multi-Step also (unfairly, per our above discussion) achieves the perfect SimB and is thus comparable to SCOUT.

View Article: PubMed Central - PubMed

ABSTRACT

Many evolving complex real-world systems can be modeled via dynamic networks. An important problem in dynamic network research is community detection, which finds groups of topologically related nodes. Typically, this problem is approached by assuming either that each time point has a distinct community organization or that all time points share a single community organization. The reality likely lies between these two extremes. To find the compromise, we consider community detection in the context of the problem of segment detection, which identifies contiguous time periods with consistent network structure. Consequently, we formulate a combined problem of segment community detection (SCD), which simultaneously partitions the network into contiguous time segments with consistent community organization and finds this community organization for each segment. To solve SCD, we introduce SCOUT, an optimization framework that explicitly considers both segmentation quality and partition quality. SCOUT addresses limitations of existing methods that can be adapted to solve SCD, which consider only one of segmentation quality or partition quality. In a thorough evaluation, SCOUT outperforms the existing methods in terms of both accuracy and computational complexity. We apply SCOUT to biological network data to study human aging.

No MeSH data available.