Adaptive Multi-Rate Compression Effects on Vowel Analysis.

Ireland D, Knuepffer C, McBride SJ - Front Bioeng Biotechnol (2015)

Bottom Line: Signal processing on digitally sampled vowel sounds for the detection of pathological voices has been firmly established. This work examines compression artifacts on vowel speech samples that have been compressed using the adaptive multi-rate codec at various bit-rates. We believe this work will have potential impact for future research on remote monitoring, as the identification and exclusion of an ill-defined speech feature that has hitherto been used will ultimately increase the robustness of the system.


Affiliation: Computational Informatics, Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia.

ABSTRACT
Signal processing on digitally sampled vowel sounds for the detection of pathological voices has been firmly established. This work examines compression artifacts on vowel speech samples that have been compressed using the adaptive multi-rate codec at various bit-rates. Whereas previous work has used the sensitivity of machine learning algorithms to test for accuracy, this work examines the changes in the extracted speech features themselves and thus reports new findings on the usefulness of a particular feature. We believe this work will have potential impact for future research on remote monitoring, as the identification and exclusion of an ill-defined speech feature that has hitherto been used will ultimately increase the robustness of the system.




Figure 10: MFCC errors for each spoken vowel when the audio signal is compressed using AMR-WB codec at 23.85 kbps.

Mentions: Given the significant distortion of some speech features, it is desirable to examine whether these distortions are equal for each vowel or whether certain vowels are more sensitive to audio compression. To that end, the computed error values are further categorized by vowel rather than by gender. For brevity, this work considers only vowel signals compressed with the AMR-WB codec at 23.85 kbps, its highest bit-rate; thus, the results reflect the highest obtainable accuracy with the AMR-WB codec. Figures 8–10 show the error for each vowel and feature. Here, the vowels are ordered by the position of F1 in the frequency spectrum, where vowel oa has the lowest F1 and vowel iy has the highest; the remaining vowels are placed between them in ascending order of F1. Initially, it was suspected that the error would increase steadily along this order, but Figures 8–10 show this not to be entirely true. Table 4 gives the order of the vowels by ascending mean and SD.
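
The per-vowel grouping and ranking described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the record layout, the third vowel label (ae) and all numeric values are assumptions made purely for illustration, and none of the numbers are results from the paper.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_errors_by_vowel(records):
    """Group (vowel, error) pairs by vowel; return {vowel: (mean, SD)}."""
    grouped = defaultdict(list)
    for vowel, error in records:
        grouped[vowel].append(error)
    return {
        # Sample SD needs at least two values; fall back to 0.0 otherwise.
        vowel: (mean(errors), stdev(errors) if len(errors) > 1 else 0.0)
        for vowel, errors in grouped.items()
    }


def order_by(summary, index):
    """Vowel labels sorted ascending by mean (index 0) or SD (index 1)."""
    return sorted(summary, key=lambda vowel: summary[vowel][index])


# Placeholder error values only (hypothetical, NOT data from the paper).
records = [
    ("oa", 0.10), ("oa", 0.30),
    ("iy", 0.50), ("iy", 0.50),
    ("ae", 0.20), ("ae", 0.60),
]

summary = summarize_errors_by_vowel(records)
by_mean = order_by(summary, 0)  # ascending mean error
by_sd = order_by(summary, 1)    # ascending SD of the error
```

With real data, each record would hold one feature's error for one vowel recording, and the two orderings would correspond to the kind of ranking that Table 4 reports.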

