Rethinking glottal midline detection

A healthy voice is crucial for verbal communication and hence for daily as well as professional life. The basis of a healthy voice is the sound-producing vocal folds in the larynx. A hallmark of healthy vocal fold oscillation is the symmetric motion of the left and right vocal folds. Clinically, vid...

Bibliographic Details
Main Authors:	Kist, Andreas M., Zilker, Julian, Gómez, Pablo, Schützenberger, Anne, Döllinger, Michael
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693305/ https://www.ncbi.nlm.nih.gov/pubmed/33244031 http://dx.doi.org/10.1038/s41598-020-77216-6

_version_	1783614713519669248
author	Kist, Andreas M. Zilker, Julian Gómez, Pablo Schützenberger, Anne Döllinger, Michael
author_facet	Kist, Andreas M. Zilker, Julian Gómez, Pablo Schützenberger, Anne Döllinger, Michael
author_sort	Kist, Andreas M.
collection	PubMed
description	A healthy voice is crucial for verbal communication and hence for daily as well as professional life. The basis of a healthy voice is the sound-producing vocal folds in the larynx. A hallmark of healthy vocal fold oscillation is the symmetric motion of the left and right vocal folds. Clinically, videoendoscopy is used to assess the symmetry of the oscillation, which is evaluated subjectively. High-speed videoendoscopy, an emerging method that allows quantification of the vocal fold oscillation, is more commonly employed in research due to the amount of data and the complex, semi-automatic analysis. In this study, we provide a comprehensive evaluation of methods that detect the glottal midline fully automatically. We used a biophysical model to simulate different vocal fold oscillations, extended the openly available BAGLS dataset with manual annotations, used both simulations and annotated endoscopic images to train deep neural networks at different stages of the analysis workflow, and compared these to established computer vision algorithms. We found that classical computer vision algorithms perform well at detecting the glottal midline in glottis segmentation data but are outperformed by deep neural networks on this task. We further propose GlottisNet, a multi-task neural architecture that simultaneously predicts both the opening between the vocal folds and the symmetry axis. By fully automating segmentation and midline detection, it is a major step towards the clinical applicability of quantitative, deep learning-assisted laryngeal endoscopy.
format	Online Article Text
id	pubmed-7693305
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-7693305 2020-11-30 Rethinking glottal midline detection Kist, Andreas M. Zilker, Julian Gómez, Pablo Schützenberger, Anne Döllinger, Michael Sci Rep Article A healthy voice is crucial for verbal communication and hence for daily as well as professional life. The basis of a healthy voice is the sound-producing vocal folds in the larynx. A hallmark of healthy vocal fold oscillation is the symmetric motion of the left and right vocal folds. Clinically, videoendoscopy is used to assess the symmetry of the oscillation, which is evaluated subjectively. High-speed videoendoscopy, an emerging method that allows quantification of the vocal fold oscillation, is more commonly employed in research due to the amount of data and the complex, semi-automatic analysis. In this study, we provide a comprehensive evaluation of methods that detect the glottal midline fully automatically. We used a biophysical model to simulate different vocal fold oscillations, extended the openly available BAGLS dataset with manual annotations, used both simulations and annotated endoscopic images to train deep neural networks at different stages of the analysis workflow, and compared these to established computer vision algorithms. We found that classical computer vision algorithms perform well at detecting the glottal midline in glottis segmentation data but are outperformed by deep neural networks on this task. We further propose GlottisNet, a multi-task neural architecture that simultaneously predicts both the opening between the vocal folds and the symmetry axis. By fully automating segmentation and midline detection, it is a major step towards the clinical applicability of quantitative, deep learning-assisted laryngeal endoscopy.
Nature Publishing Group UK 2020-11-26 /pmc/articles/PMC7693305/ /pubmed/33244031 http://dx.doi.org/10.1038/s41598-020-77216-6 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle	Article Kist, Andreas M. Zilker, Julian Gómez, Pablo Schützenberger, Anne Döllinger, Michael Rethinking glottal midline detection
title	Rethinking glottal midline detection
title_full	Rethinking glottal midline detection
title_fullStr	Rethinking glottal midline detection
title_full_unstemmed	Rethinking glottal midline detection
title_short	Rethinking glottal midline detection
title_sort	rethinking glottal midline detection
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7693305/ https://www.ncbi.nlm.nih.gov/pubmed/33244031 http://dx.doi.org/10.1038/s41598-020-77216-6
work_keys_str_mv	AT kistandreasm rethinkingglottalmidlinedetection AT zilkerjulian rethinkingglottalmidlinedetection AT gomezpablo rethinkingglottalmidlinedetection AT schutzenbergeranne rethinkingglottalmidlinedetection AT dollingermichael rethinkingglottalmidlinedetection
