
Retrospective Analysis of Clinical Performance of an Estonian Speech Recognition System for Radiology: Effects of Different Acoustic and Language Models

The aim of this study was to analyze retrospectively the influence of different acoustic and language models in order to determine the most important effects on the clinical performance of an Estonian language-based, non-commercial, radiology-oriented automatic speech recognition (ASR) system. An ASR...


Bibliographic Details
Main Authors:	Paats, A., Alumäe, T., Meister, E., Fridolin, I.
Format:	Online Article Text
Language:	English
Published:	Springer International Publishing 2018
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6148813/ https://www.ncbi.nlm.nih.gov/pubmed/29713836 http://dx.doi.org/10.1007/s10278-018-0085-8

_version_	1783356782886780928
author	Paats, A. Alumäe, T. Meister, E. Fridolin, I.
author_facet	Paats, A. Alumäe, T. Meister, E. Fridolin, I.
author_sort	Paats, A.
collection	PubMed
description	The aim of this study was to analyze retrospectively the influence of different acoustic and language models in order to determine the most important effects on the clinical performance of an Estonian language-based, non-commercial, radiology-oriented automatic speech recognition (ASR) system. An ASR system was developed for the Estonian language in the radiology domain using open-source software components (Kaldi toolkit, Thrax). The ASR system was trained with real radiology text reports and dictations collected during the development phases. The final version of the ASR system was tested by 11 radiologists, who dictated 219 reports in total in a spontaneous manner in a real clinical environment. The audio files collected in the final phase were used to measure the performance of different versions of the ASR system retrospectively. ASR system versions were evaluated by word error rate (WER) for each speaker and modality and by the WER difference between the first and the last version of the ASR system. The total average WER across all material improved from 18.4% for the first version (v1) to 5.8% for the last version (v8), which corresponds to a relative improvement of 68.5%. WER improvement was strongly related to modality and radiologist. In summary, the performance of the final ASR system version was close to optimal, delivering similar results for all modalities and being independent of the user, the complexity of the radiology reports, user experience, and speech characteristics.
format	Online Article Text
id	pubmed-6148813
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Springer International Publishing
record_format	MEDLINE/PubMed
spelling	pubmed-6148813 2018-09-26 J Digit Imaging Article
Springer International Publishing 2018-04-30 2018-10 /pmc/articles/PMC6148813/ /pubmed/29713836 http://dx.doi.org/10.1007/s10278-018-0085-8 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title	Retrospective Analysis of Clinical Performance of an Estonian Speech Recognition System for Radiology: Effects of Different Acoustic and Language Models
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6148813/ https://www.ncbi.nlm.nih.gov/pubmed/29713836 http://dx.doi.org/10.1007/s10278-018-0085-8
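
The description field above reports a word error rate (WER) of 18.4% for v1 and 5.8% for v8, and a relative improvement of 68.5%. The following is a minimal sketch of how such figures are conventionally computed, assuming the standard definition of WER as word-level edit distance divided by the number of reference words. The record does not describe the authors' scoring procedure, so the function names and the sample sentences here are hypothetical illustrations, not the study's method or data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumes whitespace tokenization; a real evaluation would normally also
    normalize case, punctuation, and numerals before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(wer_first: float, wer_last: float) -> float:
    """Fraction by which WER dropped, relative to the first version."""
    return (wer_first - wer_last) / wer_first


# Hypothetical sentence pair: one substitution and one deletion over 4 words.
sample_wer = word_error_rate("the lungs are clear", "the lung are")

# Figures quoted in the description field: 18.4% (v1) and 5.8% (v8).
improvement_pct = round(relative_improvement(18.4, 5.8) * 100, 1)
```

With the figures from the description, (18.4 - 5.8) / 18.4 rounds to 68.5%, which agrees with the relative improvement stated in the record.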
