Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification.

Fong S - J. Biomed. Biotechnol. (2012)

Bottom Line: Voice classification has lately proved useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state. This paper proposes a collection of computational algorithms to support voice classification, combining hierarchical clustering, dynamic time warping, the discrete wavelet transform, and decision trees. The proposed algorithms are more transparent and interpretable than existing techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, which inherently function like black boxes and have been applied to voice verification and voice identification.


Affiliation: Department of Computer and Information Science, University of Macau, Taipa, Macau. ccfong@umac.mo

ABSTRACT
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. A further application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Voice classification has lately proved useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state. This paper proposes a collection of computational algorithms to support voice classification, combining hierarchical clustering, dynamic time warping, the discrete wavelet transform, and decision trees. The proposed algorithms are more transparent and interpretable than existing techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, which inherently function like black boxes and have been applied to voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of the proposed voice classification algorithm.

© Copyright Policy - open-access

Figure 3: The proposed model of voice classification with hierarchical clustering.

Mentions: As shown in Figure 3, an example scenario for the proposed model is a surveillance eavesdropper that collects a total of n voice traces from a secret meeting. The traces may come from more than one speaker, with each trace carrying a single speaker at a time, and each trace can be encoded by m coefficient attributes regardless of how long the conversation is. The voices are assumed to be undistorted and not intermixed. The voice traces, which take the form of time series, can be submitted to hierarchical clustering for self-grouping. Hierarchical clustering is chosen over other methods because it yields a layered structure of groupings at different resolutions, which suits a setting where the groupings are not known in advance. After clustering, we know not only how the speakers' voices are distinctly grouped but also the number of unique voices, and hence the number of speakers. In essence, the groupings may make it possible to infer how many speakers were in the meeting and what characteristics each group shares. However, more detailed assertions, such as the gender, age, and emotions of the speakers, require further verification and probably extra information.
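The self-grouping step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes dynamic time warping as the dissimilarity between traces and average linkage for merging, and the toy traces, the function names `dtw_distance` and `agglomerative`, and the `threshold` value are all hypothetical choices made for the example.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def agglomerative(traces, threshold):
    """Average-linkage agglomerative clustering over DTW distances.

    Merging stops once the two closest clusters are farther apart than
    `threshold` (an assumed cut-off). Returns lists of trace indices.
    """
    n = len(traces)
    dist = {(i, j): dtw_distance(traces[i], traces[j])
            for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]

    def linkage(c1, c2):
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        if linkage(clusters[x], clusters[y]) > threshold:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Hypothetical toy traces: two "speakers", each recorded twice at
# different speaking rates (so the traces differ in length).
traces = [
    [0, 0, 1, 1, 0, 0],     # speaker A
    [0, 1, 1, 1, 0],        # speaker A, different timing
    [5, 5, 6, 6, 5, 5],     # speaker B
    [5, 6, 6, 5, 5, 5, 5],  # speaker B, different timing
]
groups = agglomerative(traces, threshold=1.0)
num_speakers = len(groups)
```

Because DTW absorbs differences in timing, the two recordings of each toy speaker end up at distance zero from one another, and the number of clusters left at the cut-off serves as the estimate of the number of speakers. On real voice data the cut-off would have to be chosen by inspecting the dendrogram rather than fixed in advance.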

