Automatic glottis localization and segmentation in stroboscopic videos using deep neural network

M. V. Achuth Rao, Rahul Krishnamurthy, Pebbili Gopikishore, Veeramani Priyadharshini, Prasanta Kumar Ghosh

Research output: Contribution to journal › Conference article

Abstract

Exact analysis of the glottal vibration pattern is vital for assessing voice pathologies. One of the primary steps in this analysis is automatic glottis segmentation, which, in turn, has two main parts, namely, glottis localization and glottis segmentation. In this paper, we propose a deep neural network (DNN) based automatic glottis localization and segmentation scheme. We pose the problem as a classification problem in which the colors of each pixel and its neighborhood are classified as belonging to either the inside or the outside of the glottis region. We further process the classification result to obtain the biggest cluster, which is declared as the segmented glottis. The proposed algorithm is evaluated on a dataset comprising stroboscopic videos from 18 subjects in which the glottis region is marked by three Speech-Language Pathologists (SLPs). On average, the proposed DNN based segmentation scheme achieves a localization performance of 65.33% and a segmentation DICE score of 0.74, which are better than those of the baseline scheme by 22.66% and 0.09 (absolute), respectively. We also find that the DICE score obtained by the DNN based segmentation scheme correlates well with the average DICE score computed between the annotations provided by any two SLPs, suggesting the robustness of the proposed glottis segmentation scheme.
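To make the pipeline described in the abstract concrete, the sketch below (Python, not the authors' code) illustrates the stages it names: build a color feature vector from each pixel and its neighborhood, classify every pixel as inside or outside the glottis with a trained model, keep the biggest connected cluster as the segmented glottis, and score the result against an SLP annotation with the DICE coefficient. The classifier interface (a hypothetical predict_proba-style wrapper around the trained DNN), the 5x5 neighborhood size, and the 0.5 decision threshold are assumptions for illustration; the paper's exact choices are not given here.

import numpy as np
from scipy import ndimage


def neighborhood_features(frame, size=5):
    # Stack the color values of the size x size neighborhood of every pixel,
    # giving one feature row per pixel (h*w rows, size*size*channels columns).
    h, w, c = frame.shape
    pad = size // 2
    padded = np.pad(frame, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    feats = np.empty((h * w, size * size * c), dtype=np.float32)
    col = 0
    for dy in range(size):
        for dx in range(size):
            patch = padded[dy:dy + h, dx:dx + w, :]
            feats[:, col:col + c] = patch.reshape(h * w, c)
            col += c
    return feats


def segment_glottis(frame, classifier, size=5, threshold=0.5):
    # Pixel-wise inside/outside classification followed by largest-cluster selection.
    # `classifier` is a hypothetical wrapper around the trained DNN exposing
    # predict_proba(features) -> probability of each pixel lying inside the glottis.
    h, w, _ = frame.shape
    probs = classifier.predict_proba(neighborhood_features(frame, size))
    mask = probs.reshape(h, w) >= threshold
    labels, n_clusters = ndimage.label(mask)        # connected "inside" clusters
    if n_clusters == 0:
        return np.zeros((h, w), dtype=bool)         # no glottis found in this frame
    sizes = ndimage.sum(mask, labels, range(1, n_clusters + 1))
    return labels == (int(np.argmax(sizes)) + 1)    # biggest cluster = glottis


def dice_score(pred, truth):
    # DICE = 2 * |A intersect B| / (|A| + |B|), computed on boolean masks.
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + 1e-8)

Any per-pixel binary classifier with this interface would fit the sketch; in the paper that role is played by the DNN trained on frames annotated by the SLPs.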

Original language: English
Pages (from-to): 3007-3011
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2018-September
ISSN: 2308-457X
DOI: 10.21437/Interspeech.2018-2572
Link to publication in Scopus: http://www.scopus.com/inward/record.url?scp=85054982171&partnerID=8YFLogxK
Publication status: Published - 01-01-2018
Externally published: Yes
Event: 19th Annual Conference of the International Speech Communication Association, INTERSPEECH 2018 - Hyderabad, India
Duration: 02-09-2018 to 06-09-2018

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

Achuth Rao, M. V. ; Krishnamurthy, Rahul ; Gopikishore, Pebbili ; Priyadharshini, Veeramani ; Ghosh, Prasanta Kumar. / Automatic glottis localization and segmentation in stroboscopic videos using deep neural network. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. 2018 ; Vol. 2018-September. pp. 3007-3011.

