Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification.

Fong S - J. Biomed. Biotechnol. (2012)

Bottom Line: Voice classification has recently proved useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state. This paper proposes a collection of computational algorithms to support voice classification, combining hierarchical clustering, dynamic time warping, the discrete wavelet transform, and decision trees. The proposed algorithms are more transparent and interpretable than existing techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, which inherently function like black boxes and have been applied to voice verification and voice identification.


Affiliation: Department of Computer and Information Science, University of Macau, Taipa, Macau. ccfong@umac.mo

ABSTRACT
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Voice classification, which plays an important role in grouping unlabelled voice samples, has by contrast not been widely studied. It has recently proved useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms combine hierarchical clustering, dynamic time warping (DTW), the discrete wavelet transform, and decision trees. The proposed algorithms are more transparent and interpretable than existing techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, which inherently function like black boxes and have been applied to voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of the proposed voice classification algorithm.
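The abstract names the discrete wavelet transform as one component but gives no details here, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes the Haar wavelet and per-level energies as features; both choices, and the function names `haar_step` and `wavelet_energy_features`, are illustrative assumptions.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise scaled sums give the smooth part, scaled differences the detail.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_energy_features(signal, levels):
    """Energy of the detail coefficients at each level, then of the final approximation."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


# Hypothetical 8-sample frame; a real voice frame would be far longer.
frame = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
features = wavelet_energy_features(frame, levels=3)
```

A fixed-length feature vector like this is the kind of input a decision tree could consume. Because the Haar transform used here is orthonormal, the features sum to the energy of the original frame.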

fig8: A bird's-eye view comparing the dendrograms produced by the DTW and Euclidean similarity functions.

Mentions: Figure 8 gives a bird's-eye view of how effective the two similarity functions are in hierarchical time series clustering. DTW generally produces less chaotic groupings than the Euclidean similarity function. Rather than showing a dendrogram for every other similarity function, the article summarizes the grouping performance of each technique in a comparison table. Performance is estimated by counting the number of mislocated groups in the dendrogram produced by each similarity function. DTW performs consistently well, while Minkowski shows an optimum as the power increases from 1 to 10. This observation confirms that DTW is suitable for time series clustering, since time series vary along the time axis to some extent.
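The contrast between lock-step distances and DTW can be illustrated with a small sketch. It is not the paper's implementation: it assumes a textbook DTW with absolute-difference local cost, a naive single-linkage agglomeration, and four hypothetical toy series (two single-peak, two double-peak) that differ only in where the peaks occur.

```python
def minkowski_distance(a, b, p=2):
    """Lock-step Minkowski distance between equal-length series (p=2 is Euclidean)."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def dtw_distance(a, b):
    """Textbook dynamic time warping with absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerate(series, dist, target_clusters):
    """Naive single-linkage agglomerative clustering; returns lists of series indices."""
    clusters = [[i] for i in range(len(series))]
    while len(clusters) > target_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                link = min(dist(series[i], series[j])
                           for i in clusters[x] for j in clusters[y])
                if best is None or link < best[0]:
                    best = (link, x, y)
        _, x, y = best
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return sorted(clusters)


# Hypothetical toy data: indices 0 and 1 are single-peak, 2 and 3 are double-peak.
toy = [
    [0, 3, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 3, 0, 0],
    [0, 3, 0, 3, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 0, 3, 0, 0],
]

print(agglomerate(toy, dtw_distance, 2))        # [[0, 1], [2, 3]]
print(agglomerate(toy, minkowski_distance, 2))  # [[0, 2], [1, 3]]
```

On this toy data DTW groups the series by shape, whereas the Euclidean distance groups them by peak position and mixes the two shapes. This is only one constructed case; it does not reproduce the paper's mislocated-group counts or its Minkowski results.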

