Language-Independent Automatic Evaluation of Intelligibility of Chronically Hoarse Persons
Haderlein T. (a, b) · Middag C. (c) · Martens J.-P. (c) · Döllinger M. (a, d) · Nöth E. (b, e)
(a) Phoniatrische und pädaudiologische Abteilung, Universitätsklinikum Erlangen, and (b) Lehrstuhl für Mustererkennung, Universität Erlangen-Nürnberg, Erlangen, Germany; (c) Vakgroep voor Elektronica en Informatiesystemen, Universiteit Gent, Gent, Belgium; (d) Communication Sciences and Disorders Department, Louisiana State University, Baton Rouge, La., USA; (e) Electrical and Computer Engineering Department, Faculty of Engineering, King Abdulaziz University Jeddah, Jeddah, Saudi Arabia
Objective: Automatic intelligibility assessment using automatic speech recognition is usually language specific. In this study, a language-independent approach is proposed. It uses models that are trained with Flemish speech, and it is applied to assess chronically hoarse German speakers. The research questions are: is it possible to construct suitable acoustic features that generalize to other languages and to a speech disorder, and is the resulting intelligibility model also suitable for specific subtypes of that disorder, i.e. functional and organic dysphonia?

Patients and Methods: 73 German-speaking persons with chronic hoarseness read the text ‘Der Nordwind und die Sonne’. Perceptual intelligibility scores were used as ground truth during the training of an automatic model that converts speaker-level acoustic measurements into intelligibility scores. Cross-validation was used to assess model performance.

Results: The interrater agreement for all patients (n = 73) and for the functional and organic dysphonia subgroups (n = 45 and n = 24) was r = 0.82, r = 0.83 and r = 0.75, respectively. The automatic assessment based on phonologically based acoustic models yielded correlations between perceptual and automatic intelligibility ratings of r = 0.79 (all patients), r = 0.78 (functional dysphonia) and r = 0.80 (organic dysphonia).

Conclusion: The automatic, objective measurement of intelligibility is a valuable instrument in an evidence-based clinical practice.
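The evaluation scheme described in the abstract (train a model on speaker-level measurements, predict held-out speakers by cross-validation, then correlate predicted with perceptual scores) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the abstract does not state the cross-validation scheme or the regression model, so leave-one-out and a one-feature least-squares line are used as stand-ins, the data are synthetic, and the names `pearson_r`, `fit_line` and `loo_predictions` are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_predictions(features, scores):
    """Leave-one-out cross-validation: each speaker is predicted by a
    model trained on all other speakers."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = scores[:i] + scores[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * features[i] + intercept)
    return predictions


# Synthetic stand-in data (NOT the study data): one acoustic feature per
# speaker and a noisy "perceptual" score for 73 hypothetical speakers.
rng = random.Random(0)
features = [rng.uniform(0.0, 1.0) for _ in range(73)]
scores = [1.0 + 4.0 * f + rng.gauss(0.0, 0.5) for f in features]

predicted = loo_predictions(features, scores)
r = pearson_r(scores, predicted)
print(f"correlation between perceptual and predicted scores: r = {r:.2f}")
```

Because every prediction comes from a model that never saw the corresponding speaker, the resulting correlation estimates how well the mapping generalizes to new speakers rather than how well it fits the training data.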
© 2015 S. Karger AG, Basel