Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification.

Fong S - J. Biomed. Biotechnol. (2012)

Affiliation: Department of Computer and Information Science, University of Macau, Taipa, Macau. ccfong@umac.mo

ABSTRACT
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Another application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Voice classification has lately been found useful in phone monitoring and in classifying speakers' gender, ethnicity, emotional state, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms combine hierarchical clustering, dynamic time warping transform, discrete wavelet transform, and a decision tree. Although many techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models (which inherently function like black boxes) have been applied to voice verification and voice identification, the proposed algorithms are relatively more transparent and interpretable than these existing ones. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm.

Algorithm 1: Pseudocode of the dynamic time warping (DTW) algorithm.

Many similarity measures are available, such as Manhattan, Euclidean, and Minkowski, to name a few. In our experiments, a range of popular similarity functions is compared to observe which one performs best. Table 2 lists the performance results, as the percentage of correctly clustered groups, obtained with the various similarity functions. Because the data points that we are working with are time series, we choose the Dynamic Time Warping (DTW) function as a distance measure that finds the optimal alignment between two sequences of time series data points. DTW performs a pairwise comparison of the feature (or attribute) vectors in each time series. It finds an optimal match between two sequences while allowing sections of the sequences to be stretched or compressed. In other words, it allows some flexibility in matching two sequences that may vary slightly in speed or time. The sequences are "warped" nonlinearly in the time dimension to determine a measure of their similarity that is independent of certain nonlinear variations in the time dimension. DTW is popular in signal processing applications where two signal patterns are to be matched for similarity. It is particularly suitable for matching sequences that may have missing information or differing lengths, on condition that the sequences are long enough for matching. In theory, DTW is most suitable for voice wave patterns because an exact match between such patterns often does not occur, and voice wave patterns may vary slightly in the time domain. A comparison will be given in our experiment to verify this hypothesis. The pseudocode of the DTW algorithm is given in Algorithm 1.
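
The alignment idea described above can be illustrated with a minimal Python sketch of the classic dynamic-programming formulation of DTW. This is not a transcription of the paper's Algorithm 1, which is not reproduced in this text; the function name dtw_distance, the use of scalar samples, and the absolute difference as the local cost are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences (illustrative sketch).

    Assumptions (not from the paper): samples are scalars and the local
    cost between two samples is their absolute difference.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = local + min(
                cost[i - 1][j],
                cost[i][j - 1],
                cost[i - 1][j - 1],
            )
    return cost[n][m]
```

For example, dtw_distance([1, 2, 3], [1, 2, 2, 3]) returns 0.0, because the repeated sample in the second sequence is absorbed by the warping. A point-by-point measure such as Euclidean distance cannot be applied directly to these two sequences, since their lengths differ.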

