Optimized data fusion for K-means Laplacian clustering.

Yu S, Liu X, Tranchevent LC, Glänzel W, Suykens JA, De Moor B, Moreau Y - Bioinformatics (2010)

Bottom Line: The proposed Optimized Kernel Laplacian Clustering (OKLC) algorithms perform significantly better than other methods. Moreover, the coefficients of kernels and Laplacians optimized by OKLC show some correlation with the performance ranking of the individual data sources. Though the K values are predefined in our evaluation, in practical studies the optimal cluster number can be consistently estimated from the eigenspectrum of the combined kernel Laplacian matrix.


Affiliation: Signals, Identification, System Theory and Automation, Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven-Heverlee, Belgium. shiyu@uchicago.edu

ABSTRACT

Motivation: We propose a novel algorithm to combine multiple kernels and Laplacians for clustering analysis. The new algorithm is formulated on a Rayleigh quotient objective function and is solved as a bi-level alternating minimization procedure. Using the proposed algorithm, the coefficients of kernels and Laplacians can be optimized automatically.

Results: Three variants of the algorithm are proposed. The performance is systematically validated on two real-life data fusion applications. The proposed Optimized Kernel Laplacian Clustering (OKLC) algorithms perform significantly better than other methods. Moreover, the coefficients of kernels and Laplacians optimized by OKLC show some correlation with the performance ranking of the individual data sources. Though the K values are predefined in our evaluation, in practical studies the optimal cluster number can be consistently estimated from the eigenspectrum of the combined kernel Laplacian matrix.

Availability: The MATLAB code of algorithms implemented in this paper is downloadable from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/oklc.html.


Figure 3: Eigenvalue plots (A and B) of the optimal kernel-Laplacian combination obtained by all OKLC models. The parameter K is set equal to the number of reference labels.

Mentions: Because OKLC is a spectral clustering algorithm, its optimal cluster number can be estimated by inspecting the plot of eigenvalues (von Luxburg, 2007). To demonstrate this, we investigated the dominant eigenvalues of the optimized combination of kernels and Laplacians. In Figure 3, we compare the three OKLC models with the predefined K (set equal to the number of class labels). In practical research, one can predict the optimal cluster number by locating the ‘elbow’ of the eigenvalue plot. As shown in Figure 3, the ‘elbow’ in the disease data is quite obvious at 14. In the journal data, the ‘elbow’ more likely lies between 6 and 12. All three OKLC models show a similar trend in the eigenvalue plot. Moreover, in Supplementary Material 9 we also compare the eigenvalue curves obtained with different K values as input. As shown there, the eigenvalue plot is quite stable with respect to the different inputs of K, which means the optimized kernel and Laplacian coefficients are largely independent of the K value. This advantage enables a reliable prediction of the optimal cluster number by integrating multiple data sources.
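
The ‘elbow’ inspection described above can be approximated numerically. The following is a minimal sketch, not the authors' MATLAB implementation: it assumes the combined kernel-Laplacian matrix is already available as a symmetric NumPy array, and it swaps visual inspection for the largest-gap ("eigengap") rule over the leading eigenvalues. The function names, the toy affinity matrix and the `max_k` cutoff are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_num_clusters(combined, max_k=10):
    """Estimate the cluster number from the leading eigenvalues.

    `combined` is assumed to be a symmetric matrix, for example an
    optimized combination of kernels and Laplacians (hypothetical input).
    Returns (k, leading_eigenvalues) where k is the position of the
    largest drop between consecutive eigenvalues sorted in descending order.
    """
    sym = (combined + combined.T) / 2.0       # guard against round-off asymmetry
    eigvals = np.linalg.eigvalsh(sym)[::-1]   # eigvalsh is ascending; reverse it
    top = eigvals[: max_k + 1]
    gaps = top[:-1] - top[1:]                 # drop after the 1st, 2nd, ... eigenvalue
    k = int(np.argmax(gaps)) + 1
    return k, top


def toy_normalized_affinity(sizes, between=0.01):
    """Build a toy normalized affinity D^{-1/2} W D^{-1/2} with block structure.

    Illustrative data only: within-block similarity is 1, between-block
    similarity is `between`.
    """
    n = sum(sizes)
    w = np.full((n, n), between)
    start = 0
    for size in sizes:
        w[start:start + size, start:start + size] = 1.0
        start += size
    inv_sqrt_deg = 1.0 / np.sqrt(w.sum(axis=1))
    return w * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


toy_matrix = toy_normalized_affinity([5, 6, 7])
k_estimate, leading = estimate_num_clusters(toy_matrix)
print(k_estimate, np.round(leading, 3))
```

On real data the gap is rarely this clean. For a case like the journal data, where the ‘elbow’ spans a range of values, it is safer to plot the returned eigenvalues and read off a range than to trust a single argmax.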

