TY - GEN
T1 - Identification of Nasalization and Nasal Assimilation from Children’s Speech
AU - Ramteke, Pravin Bhaskar
AU - Supanekar, Sujata
AU - Aithal, Venkataraja
AU - Koolagudi, Shashidhar G.
N1 - Funding Information:
The authors would like to thank the Cognitive Science Research Initiative (CSRI), Department of Science & Technology, Government of India, Grant no. SR/CSRI/49/2015, for its financial support on this work.
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - In children, nasalization is a commonly observed phonological process in which non-nasal sounds are substituted with nasal sounds. This work attempts to identify nasalization and nasal assimilation. The properties of nasal sounds and nasalized voiced sounds are explored using MFCCs extracted from the Hilbert envelope of the numerator of group delay (HNGD) spectrum. The HNGD spectrum highlights the formants in the speech and the extra nasal formant in the vicinity of the first formant in nasalized voiced sounds. Features extracted from correctly pronounced and mispronounced words are compared using the Dynamic Time Warping (DTW) algorithm. The deviation of the DTW comparison path from its diagonal behavior is analyzed to identify mispronunciation. The combination of FFT-based MFCCs and HNGD spectrum-based MFCCs is observed to achieve the highest accuracy, 82.22%, within a tolerance range of ±50 ms.
UR - http://www.scopus.com/inward/record.url?scp=85098289459&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85098289459&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-66187-8_23
DO - 10.1007/978-3-030-66187-8_23
M3 - Conference contribution
AN - SCOPUS:85098289459
SN - 9783030661861
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 244
EP - 253
BT - Mining Intelligence and Knowledge Exploration - 7th International Conference, MIKE 2019, Proceedings
A2 - B.R., P.
A2 - Thenkanidiyoor, Veena
A2 - Prasath, Rajendra
A2 - Vanga, Odelu
PB - Springer Science and Business Media Deutschland GmbH
T2 - 7th International Conference on Mining Intelligence and Knowledge Exploration, MIKE 2019
Y2 - 19 December 2019 through 22 December 2019
ER -