Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification.

Fong S - J. Biomed. Biotechnol. (2012)

Bottom Line: Voice classification has lately been found useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state, among other uses. This paper proposes a collection of computational algorithms to support voice classification, combining hierarchical clustering, dynamic time warping, the discrete wavelet transform, and a decision tree. The proposed algorithms are more transparent and interpretable than existing ones; many techniques already applied to voice verification and voice identification, such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, inherently function like black boxes.


Affiliation: Department of Computer and Information Science, University of Macau, Taipa, Macau. ccfong@umac.mo

ABSTRACT
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. A third application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Voice classification has lately been found useful in phone monitoring and in classifying speakers' gender, ethnicity, and emotional state, among other uses. This paper proposes a collection of computational algorithms to support voice classification, combining hierarchical clustering, dynamic time warping, the discrete wavelet transform, and a decision tree. The proposed algorithms are more transparent and interpretable than existing ones; many techniques already applied to voice verification and voice identification, such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models, inherently function like black boxes. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of the proposed voice classification algorithm.
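
As a rough illustration of how two of the named components fit together, the sketch below clusters short numeric sequences by dynamic time warping distance using average-linkage agglomerative clustering. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `dtw_distance` and `agglomerative`, the absolute-difference local cost, the average-linkage rule, and the toy sequences are all illustrative choices that do not come from the paper, and the wavelet and decision tree stages are left out.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def agglomerative(series, n_clusters):
    """Merge the closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(series))]
    # Pairwise DTW distances, computed once and keyed by (smaller, larger) index.
    dist = {
        (i, j): dtw_distance(series[i], series[j])
        for i, j in combinations(range(len(series)), 2)
    }

    def linkage(c1, c2):
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected
    return [sorted(c) for c in clusters]


# Toy data (hypothetical): two time-shifted "bumps" and two near-flat traces.
series = [
    [0, 1, 2, 3, 2, 1, 0],
    [0, 0, 1, 2, 3, 2, 1, 0],
    [5, 5, 5, 5],
    [5, 5, 6, 5, 5],
]
groups = agglomerative(series, 2)
```

Because warping lets one sample align with several, the two bumps end up at distance zero despite their different lengths, which is the property that makes this distance suitable for comparing sequences that differ in timing.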


fig1: Workings of voice verification and voice identification systems.

Mentions: Both voice verification (VV) and voice identification (VI) require, as an a priori condition, that a set of voiceprints is already known before new samples can be matched. This is akin to a database query or to supervised learning, where previously known samples must first be used to train a decision model so that testing and matching of new samples can follow. A generic example is illustrated in Figure 1. But what if a handful of unknown voiceprints are collected and we wish to obtain some information about them? Such scenarios include, but are not limited to, security surveillance problems [3], where a list of voice traces is captured from a monitored area and we want to know how many unique speakers there are, their ages and genders, and, from their accents, which ethnic backgrounds they belong to; and customer-service applications, where callers are automatically classified from their tones into categories of needs and emotions. Only recently has voice classification (VC), which attempts to determine whether a speaker belongs to a particular characteristic group rather than matching a particular individual, gained popularity. VC can also complement the security of VV and VI systems. Figure 2 shows an example of a voice biometric system being compromised: through hacking, the content of a voiceprint B is modified to that of another voiceprint (say A) that has a higher access authority. This can be done by a replay attack or by injecting the vocal features of A into B. Because the database of voiceprints, like an encrypted list of passwords in a file system, is accessed one record at a time, each voiceprint is protected independently, so the existence of two identical voiceprints goes undetected. An imposter with the modified voiceprint B′ can therefore gain restricted access rights by matching B′ to A in a VI system. VC could be used to prevent this fraud by checking how many unique items there are in different groups. If extra voiceprints suddenly emerge in a group or go missing from it, the integrity of the voiceprints must have been altered.
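
The group-count check described above can be sketched as follows. This is a hypothetical illustration only: the tuple representation of a voiceprint, the `assign_group` rule based on the first feature, and the names `unique_counts_per_group` and `integrity_alerts` are assumptions made for the example and do not appear in the paper.

```python
from collections import Counter


def unique_counts_per_group(voiceprints, assign_group):
    """Count distinct voiceprints in each group, ignoring exact duplicates."""
    seen = set()
    counts = Counter()
    for vp in voiceprints:
        key = tuple(vp)
        if key in seen:
            continue  # an identical voiceprint was already counted
        seen.add(key)
        counts[assign_group(vp)] += 1
    return counts


def integrity_alerts(baseline, current):
    """Return {group: change in unique count} for every group that changed."""
    alerts = {}
    for group in set(baseline) | set(current):
        delta = current.get(group, 0) - baseline.get(group, 0)
        if delta != 0:
            alerts[group] = delta
    return alerts


def assign_group(vp):
    # Hypothetical classifier: split on the first feature value.
    return "high" if vp[0] >= 200 else "low"


# Hypothetical feature vectors for three enrolled voiceprints.
A = (220.0, 0.8)
B = (110.0, 0.3)
C = (120.0, 0.5)

baseline = unique_counts_per_group([A, B, C], assign_group)
# After tampering, B's record holds a copy of A's features.
tampered = unique_counts_per_group([A, A, C], assign_group)
alerts = integrity_alerts(baseline, tampered)
```

In this toy case the "low" group loses one unique item while the "high" group gains none, because the injected copy of A is an exact duplicate. A per-record check would not notice either fact, whereas the group-level count does.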

