Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification.

Fong S - J. Biomed. Biotechnol. (2012)

Bottom Line: Lately, voice classification has been found useful in phone monitoring and in classifying speakers' gender, ethnicity, emotional state, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms combine hierarchical clustering, dynamic time warping transform, discrete wavelet transform, and decision trees. Although many techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models (which inherently function like black boxes) have been applied to voice verification and voice identification, the proposed algorithms are relatively more transparent and interpretable.


Affiliation: Department of Computer and Information Science, University of Macau, Taipa, Macau. ccfong@umac.mo

ABSTRACT
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Another application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Lately, voice classification has been found useful in phone monitoring and in classifying speakers' gender, ethnicity, emotional state, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms combine hierarchical clustering, dynamic time warping transform, discrete wavelet transform, and decision trees. Although many techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models (which inherently function like black boxes) have been applied to voice verification and voice identification, the proposed algorithms are relatively more transparent and interpretable. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of the proposed voice classification algorithm.



fig9: (a) Attributes of a voice time series; (b) transformed attributes, called Haar coefficients, of the wavelet representation of the time series.

Mentions: The subsequent experiment builds a decision tree after the groups have been formed by hierarchical time series clustering. Two choices of classifier are recommended. The first is RIPPER, which is suggested for generating comprehensible rules of the form IF-THEN-ELSE. The rules specify a sequence of conditions which, when met in order, lead to a predefined class label. When a new voiceprint is received, it is passed over the rules, and its coefficient values are checked to determine which class label the voiceprint fits. The other is the classical decision tree algorithm C5.0, or J48 with pruning mode on, in WEKA, an open-source collection of machine learning algorithms for solving data mining problems, implemented in Java and released under the GPL (http://archive.ics.uci.edu/ml). The time series data are, however, first converted to their corresponding frequency domain by the Discrete Wavelet Transform (DWT). The DWT applies the Haar wavelet transform, following the description by Kristian Sandberg of the University of Colorado at Boulder, USA, in 2000; the Haar wavelet itself was introduced by Alfréd Haar in 1909. In principle, DWT coefficients work better than raw time series points in classification because the DWT can find where the energy is concentrated in the frequency domain, and the significant coefficients, called Haar attributes, describe the characteristics of the time series well. Figure 9 compares the original attributes in the time domain with the transformed coefficients in the frequency domain: after the transformation, the wavelet coefficients have a sharper and narrower statistical distribution than the time series points. The DWT is implemented in the WEKA plug-in filter called “weka.filters.unsupervised.attribute.Wavelet”.
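
The two steps described above, transforming a time series into Haar coefficients and then checking those coefficients against ordered rules, can be illustrated with a minimal Python sketch. This is not the WEKA Wavelet filter or the RIPPER learner used in the paper. The function names, the toy series, and the rule threshold are all hypothetical, and the input length is assumed to be a power of two.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns the scaled pairwise sums (approximation) and the scaled
    pairwise differences (detail) of neighbouring samples.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal):
    """Full Haar decomposition (assumes the length is a power of two).

    Output layout: [final approximation, coarsest details, ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details = detail + details  # coarser levels go in front
    return approx + details


def classify(coeffs, rules, default):
    """Apply ordered IF-THEN rules; the first matching rule decides the label.

    Each rule is (coefficient index, threshold, label) and fires when the
    coefficient exceeds the threshold. The rules here are hand-written
    placeholders, not rules learned by RIPPER.
    """
    for index, threshold, label in rules:
        if coeffs[index] > threshold:
            return label
    return default


# Toy "voiceprint": a step-shaped series of eight samples (hypothetical data).
series = [4.0, 4.0, 4.0, 4.0, 1.0, 1.0, 1.0, 1.0]
coeffs = haar_dwt(series)

# For this series only the first two coefficients are non-zero, so the
# energy is concentrated in far fewer attributes than in the time domain.
energy_time = sum(x * x for x in series)
energy_haar = sum(c * c for c in coeffs)

rules = [(1, 1.0, "step_down")]  # hypothetical rule on the coarsest detail
label = classify(coeffs, rules, default="flat")
```

Because the transform is orthonormal, the total energy is the same in both representations. The difference is that a few Haar attributes carry almost all of it, which is the property the passage relies on when it feeds the coefficients, rather than the raw points, to the rule learner or decision tree.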

