Adaptive Multi-Rate Compression Effects on Vowel Analysis.

Ireland D, Knuepffer C, McBride SJ - Front Bioeng Biotechnol (2015)

Bottom Line: Signal processing on digitally sampled vowel sounds for the detection of pathological voices has been firmly established. This work examines compression artifacts on vowel speech samples that have been compressed using the adaptive multi-rate codec at various bit-rates. We believe this work will have potential impact for future research on remote monitoring, as the identification and exclusion of an ill-defined speech feature that has hitherto been used will ultimately increase the robustness of the system.


Affiliation: Computational Informatics, Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia.

ABSTRACT
Signal processing on digitally sampled vowel sounds for the detection of pathological voices has been firmly established. This work examines compression artifacts on vowel speech samples that have been compressed using the adaptive multi-rate codec at various bit-rates. Whereas previous work has used the sensitivity of machine learning algorithms to test for accuracy, this work examines the changes in the extracted speech features themselves and thus reports new findings on the usefulness of a particular feature. We believe this work will have potential impact for future research on remote monitoring, as the identification and exclusion of an ill-defined speech feature that has hitherto been used will ultimately increase the robustness of the system.




Figure 5: Error for each speech feature when the audio signal is compressed using AMR-WB codec at 12.65 kbps.

Mentions: Table 3 shows the mean and SD (in brackets) of the resultant error when the audio is compressed using the AMR-WB codec at all possible bit-rates. Table elements in boldface represent metrics that show high significance under the Welch t-test. The complete data for bit-rates 12.65 kbps, 18.25 kbps, and 23.85 kbps are given in box-and-whisker form in Figures 5–7, respectively. These figures reflect the lowest and highest bit-rates currently possible using the AMR-WB codec. Referring to Table 3, jitter and shimmer still exhibit significant distortion when the audio signal is compressed. For shimmer, the Welch t-test shows significance for each gender group and bit-rate. The jitter metric showed no significance for bit-rates >8.85 kbps. The remaining features, however, showed a significant reduction in error, particularly as the bit-rate increased. As with AMR-NB, jitter and shimmer tended to be over-estimated while the MFCC were under-estimated. As expected, the AMR-WB codec is clearly superior owing to its higher bit-rate and frequency bandwidth.
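
The Welch t-test referred to above compares two sample means without assuming equal variances. The following is a minimal sketch of the statistic and its Welch-Satterthwaite degrees of freedom, not the authors' analysis code. It assumes the two groups are a feature measured on uncompressed and on compressed recordings; the function name `welch_t` and the shimmer values are hypothetical and chosen only for illustration.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Sample variances (n - 1 denominator).
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Squared standard error contributed by each group.
    se_a, se_b = var_a / n_a, var_b / n_b
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Made-up shimmer values (percent), for illustration only.
original = [3.1, 2.8, 3.4, 3.0, 2.9, 3.3]
compressed = [4.0, 3.7, 4.6, 3.9, 4.4, 4.1]

t_stat, dof = welch_t(compressed, original)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A positive `t` here would correspond to the feature being over-estimated after compression. Turning `t` and `df` into a p-value requires the Student t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.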

