Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification.

Fong S - J. Biomed. Biotechnol. (2012)

Bottom Line: Voice classification has recently proved useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state. This paper proposes a collection of computational algorithms to support voice classification, combining hierarchical clustering, dynamic time warping (DTW), the discrete wavelet transform, and a decision tree. The proposed algorithms are relatively more transparent and interpretable than existing techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, which inherently function like black boxes and have been applied to voice verification and voice identification.


Affiliation: Department of Computer and Information Science, University of Macau, Taipa, Macau. ccfong@umac.mo

ABSTRACT
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. A further application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Voice classification has recently proved useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state. This paper proposes a collection of computational algorithms to support voice classification, combining hierarchical clustering, dynamic time warping (DTW), the discrete wavelet transform, and a decision tree. The proposed algorithms are relatively more transparent and interpretable than existing techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, which inherently function like black boxes and have been applied to voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of the proposed voice classification algorithm.

© Copyright Policy - open-access

Figure 7: (a) Six characteristic groups in the dendrogram obtained with the Euclidean similarity function. (b) The corresponding row numbers of the dataset in the dendrogram obtained with the Euclidean similarity function.

Mentions: The hierarchical clustering algorithm applied in the experiment is implemented in R, a free software environment for statistical computing and graphics (http://www.r-project.org/). The synthetic data are first sampled at a ratio of 10% to produce the first iteration of data points and clusters. DTW, which serves as the similarity function, is embedded in the clustering algorithm and processes the time series data until convergence. The experiment is repeated with other similarity functions for comparison. A snapshot of the dendrogram produced with the DTW similarity function is shown in Figure 6(a). The DTW dendrogram effectively partitions the time series into six distinct groups that represent six speech types. The groupings in the DTW dendrogram, shown in Figure 6(b), can be mapped back to the actual row numbers of the dataset, which has 600 rows in total. In other words, the time series indexed by those row numbers are allocated to the six groups by the dendrogram as a result of time series clustering. As a counterexample, Euclidean distance is used as the similarity function in the same experiment; the groupings in the resulting dendrogram, shown in Figure 7(a), are clearly not well ordered. On this evidence, DTW outperforms Euclidean distance for this clustering task.
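The contrast between the two similarity functions can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the toy pulse and step series, the helper names (`pulse`, `step`, `cluster`), the absolute-difference DTW cost, and the use of average linkage are all assumptions made for illustration, and the 10% sampling and iteration scheme is not reproduced.

```python
import math

LENGTH = 16  # assumed length of each toy series


def pulse(start, width=3):
    """Toy series: a short pulse of ones beginning at `start` (hypothetical data)."""
    return [1.0 if start <= t < start + width else 0.0 for t in range(LENGTH)]


def step(start):
    """Toy series: zeros, then ones from `start` to the end (hypothetical data)."""
    return [1.0 if t >= start else 0.0 for t in range(LENGTH)]


def euclidean_distance(a, b):
    """Point-by-point distance; sensitive to shifts along the time axis."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic dynamic time warping with an absolute-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            table[i][j] = cost + min(
                table[i - 1][j],      # stretch b
                table[i][j - 1],      # stretch a
                table[i - 1][j - 1],  # advance both
            )
    return table[n][m]


def cluster(series, dist, k):
    """Agglomerative clustering (average linkage, assumed) down to k groups."""
    n = len(series)
    d = [[dist(series[i], series[j]) for j in range(n)] for i in range(n)]
    groups = [[i] for i in range(n)]
    while len(groups) > k:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                total = sum(d[i][j] for i in groups[a] for j in groups[b])
                link = total / (len(groups[a]) * len(groups[b]))
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return {frozenset(g) for g in groups}


# Rows 0-2 are time-shifted pulses, rows 3-5 are time-shifted steps.
series = [pulse(2), pulse(7), pulse(12), step(4), step(9), step(13)]
truth = {frozenset({0, 1, 2}), frozenset({3, 4, 5})}

dtw_groups = cluster(series, dtw_distance, k=2)
euclid_groups = cluster(series, euclidean_distance, k=2)

print("DTW recovers the two shapes:      ", dtw_groups == truth)
print("Euclidean recovers the two shapes:", euclid_groups == truth)
```

In this toy setting, shifted copies of the same shape sit at DTW distance zero, so the row indices fall into the two intended groups. Under Euclidean distance, the late pulse is closest to the late step, so the first merge already mixes the two shapes. The example only mirrors the qualitative behaviour described above and says nothing about the paper's 600-row dataset.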