Improving clustering by imposing network information.

Gerber S, Horenko I - Sci Adv (2015)

Bottom Line: Cluster analysis is one of the most popular data analysis tools in a wide range of applied disciplines. The introduced approach is illustrated on the problem of a noninvasive unsupervised brain signal classification. This task is faced with several challenging difficulties such as nonstationary noisy signals and a small sample size, combined with a high-dimensional feature space and huge noise-to-signal ratios.


Affiliation: Università della Svizzera Italiana, Via Giuseppe Buffi 13, 6900 Lugano, Switzerland.

ABSTRACT
Cluster analysis is one of the most popular data analysis tools in a wide range of applied disciplines. We propose and justify a computationally efficient and straightforward-to-implement way of imposing the available information from networks/graphs (a priori available in many application areas) on a broad family of clustering methods. The introduced approach is illustrated on the problem of a noninvasive unsupervised brain signal classification. This task is faced with several challenging difficulties such as nonstationary noisy signals and a small sample size, combined with a high-dimensional feature space and huge noise-to-signal ratios. Applying this approach results in an exact unsupervised classification of very short signals, opening new possibilities for clustering methods in the area of a noninvasive brain-computer interface.




Figure 1: An example of the imposed network and a cluster model discrimination. (A) Imposed (linear) graph: a priori persistency assumption for the underlying dynamics in time. (B) Comparing information content of EEG clusterings: graphs of the AIC values for K = 1 to 3 as a function of the regularization constant $\epsilon^2$.

Mentions: Then, inserting the clustering assumption from above into $\|\theta(\cdot)\|_G$, we obtain

$$\|\theta(\cdot)\|_G \le \sum_{i=1}^{K} \|\theta_i\|_2 \, (\gamma_i^{\alpha})^{T} D_G \, \gamma_i^{\alpha} \le \bar{C}_K < +\infty, \tag{3}$$

where $\bar{C}_K$ is some (unknown) constant, $\|\theta_i\|_2$ is the Euclidean norm of the cluster parameters $\theta_i$, and $D_G = P - 2W + Q$ (with diagonal matrices $P_{uu} \equiv \sum_{v \mid (v,u) \in E} W_{v,u}$ and $Q_{uu} \equiv \sum_{v \mid (u,v) \in E} W_{u,v}$; see chapter 1.2 of the Supplementary Text for a detailed derivation). As a concrete example, in problems of time series analysis the index $u$ denotes the time index of each data point, and the underlying graph $G$ is the linear graph shown in Fig. 1A. The kernel weight $W$ can then be defined as $W_{u,v} = 1$ for $|u - v| = 1$ and $W_{u,v} = 0$ for $|u - v| \neq 1$, and the resulting $D_G$ is a tridiagonal, symmetric, positive semidefinite Laplacian matrix. This case is particularly important in the context of the time series clustering methods considered below.
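The linear-graph case can be checked numerically. The following is a minimal sketch, not the authors' implementation: the function name `linear_graph_dg`, the series length of 6, and the example affiliation vector `gamma` are illustrative assumptions, and the edge set is assumed to contain both orientations (u, v) and (v, u) of each neighboring pair, so that W is symmetric. Under these assumptions the code builds D_G = P − 2W + Q and evaluates the quadratic form from Eq. 3 for one cluster.

```python
import numpy as np


def linear_graph_dg(T):
    """Build D_G = P - 2W + Q for a linear (chain) graph on T time points.

    Assumption: W[u, v] = 1 when |u - v| = 1 and 0 otherwise, with both
    edge orientations present in E (so W is symmetric).
    """
    W = np.zeros((T, T))
    idx = np.arange(T - 1)
    W[idx, idx + 1] = 1.0  # edge (u, u + 1)
    W[idx + 1, idx] = 1.0  # edge (u + 1, u)
    P = np.diag(W.sum(axis=0))  # P_uu = sum over v of W[v, u]
    Q = np.diag(W.sum(axis=1))  # Q_uu = sum over v of W[u, v]
    return P - 2.0 * W + Q


D_G = linear_graph_dg(6)

# Structural properties stated in the text.
is_symmetric = np.allclose(D_G, D_G.T)
is_tridiagonal = np.allclose(np.triu(D_G, k=2), 0.0)
min_eigenvalue = np.linalg.eigvalsh(D_G).min()  # >= 0 up to round-off

# Hypothetical hard cluster affiliation over time for a single cluster i:
# the series is in the cluster, leaves it once, and re-enters once.
gamma = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 1.0])
penalty = gamma @ D_G @ gamma

print(is_symmetric, is_tridiagonal, min_eigenvalue >= -1e-10, penalty)
```

With this symmetric W, D_G equals twice the standard chain-graph Laplacian, so for a 0/1 affiliation vector the quadratic form is twice the number of cluster switches along the time axis (here two switches), and it vanishes for a vector that is constant in time. This is one way to see why bounding the expression in Eq. 3 encodes the persistency assumption illustrated in Fig. 1A.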

