Nonlinear feature based classification of speech under stress

Citation
Gj. Zhou et al., Nonlinear feature based classification of speech under stress, IEEE SPEECH, 9(3), 2001, pp. 201-216
Number of citations
57
Language
English
Article type
Article
Subject Categories
Electrical & Electronics Engineering
Journal title
IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING
ISSN journal
1063-6676
Volume
9
Issue
3
Year of publication
2001
Pages
201 - 216
Database
ISI
SICI code
1063-6676(200103)9:3<201:NFBCOS>2.0.ZU;2-5
Abstract
Studies have shown that variability introduced by stress or emotion can severely reduce speech recognition accuracy. Techniques for detecting or assessing the presence of stress could help improve the robustness of speech recognition systems. Although some acoustic variables derived from linear speech production theory have been investigated as indicators of stress, they are not always consistent. In this paper, three new features derived from the nonlinear Teager energy operator (TEO) are investigated for stress classification. It is believed that the TEO-based features are better able to reflect the nonlinear airflow structure of speech production under adverse stressful conditions. The features proposed include TEO-decomposed FM variation (TEO-FM-Var), normalized TEO autocorrelation envelope area (TEO-Auto-Env), and critical-band-based TEO autocorrelation envelope area (TEO-CB-Auto-Env). The proposed features are evaluated for the task of stress classification using simulated and actual stressed speech, and it is shown that the TEO-CB-Auto-Env feature substantially outperforms traditional pitch and mel-frequency cepstrum coefficients (MFCC). Performance of the TEO-based features is maintained in both text-dependent and text-independent models, while performance of traditional features degrades in text-independent models. Overall neutral versus stress classification rates are also shown to be more consistent across different stress styles.
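The abstract names the Teager energy operator without defining it. As an illustration only, the following minimal sketch uses the commonly cited discrete form of the operator, psi[n] = x[n]^2 - x[n-1]*x[n+1]. This form is an assumption here, since the record does not state which definition the paper uses, and the sketch does not implement any of the three proposed features. The function name `teager_energy` and the test tone parameters are hypothetical.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]**2 - x[n-1]*x[n+1].

    Returns one value per interior sample, so the output has len(x) - 2
    entries (empty if the input has fewer than three samples).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Hypothetical test tone (not from the paper): amplitude A, digital
# frequency omega in radians per sample, arbitrary phase.
A, omega, phase = 2.0, 0.3, 0.1
tone = [A * math.cos(omega * n + phase) for n in range(50)]
psi = teager_energy(tone)

# For a pure sinusoid the operator output is constant: A**2 * sin(omega)**2.
expected = A ** 2 * math.sin(omega) ** 2
print(max(abs(v - expected) for v in psi) < 1e-9)
```

For a pure tone the output depends on both amplitude and frequency, which is why the operator is often described as tracking the energy of the source rather than of the waveform alone.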