Benchmarking Clinical Speech Recognition and Information Extraction: New Data, Methods, and Evaluations

BACKGROUND: Over a tenth of preventable adverse events in health care are caused by failures in information flow. These failures are tangible in clinical handover; regardless of good verbal handover, from two-thirds to all of this information is lost after 3-5 shifts if notes are taken by hand, or n...

Bibliographic Details
Main Authors:	Suominen, Hanna; Zhou, Liyuan; Hanlen, Leif; Ferraro, Gabriela
Format:	Online Article Text
Language:	English
Published:	Gunther Eysenbach 2015
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4427705/ https://www.ncbi.nlm.nih.gov/pubmed/25917752 http://dx.doi.org/10.2196/medinform.4321

author	Suominen, Hanna; Zhou, Liyuan; Hanlen, Leif; Ferraro, Gabriela
collection	PubMed
description	BACKGROUND: Over a tenth of preventable adverse events in health care are caused by failures in information flow. These failures are tangible in clinical handover; regardless of good verbal handover, from two-thirds to all of this information is lost after 3-5 shifts if notes are taken by hand, or not at all. Speech recognition and information extraction provide a way to fill out a handover form for clinical proofing and sign-off. OBJECTIVE: The objective of the study was to provide a recorded spoken handover, annotated verbatim transcriptions, and evaluations to support research in spoken and written natural language processing for filling out a clinical handover form. This dataset is based on synthetic patient profiles, thereby avoiding ethical and legal restrictions, while maintaining efficacy for research in speech-to-text conversion and information extraction, based on realistic clinical scenarios. We also introduce a Web app to demonstrate the system design and workflow. METHODS: We experiment with Dragon Medical 11.0 for speech recognition and CRF++ for information extraction. To compute features for information extraction, we also apply CoreNLP, MetaMap, and Ontoserver. Our evaluation uses cross-validation techniques to measure processing correctness. RESULTS: The data provided were a simulation of nursing handover, as recorded using a mobile device, built from simulated patient records and handover scripts, spoken by an Australian registered nurse. Speech recognition recognized 5276 of 7277 words in our 100 test documents correctly. We considered 50 mutually exclusive categories in information extraction and achieved the F1 (ie, the harmonic mean of Precision and Recall) of 0.86 in the category for irrelevant text and the macro-averaged F1 of 0.70 over the remaining 35 nonempty categories of the form in our 101 test documents. CONCLUSIONS: The significance of this study hinges on opening our data, together with the related performance benchmarks and some processing software, to the research and development community for studying clinical documentation and language-processing. The data are used in the CLEFeHealth 2015 evaluation laboratory for a shared task on speech recognition.
format	Online Article Text
id	pubmed-4427705
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	Gunther Eysenbach
record_format	MEDLINE/PubMed
spelling	pubmed-4427705 2015-05-26 JMIR Med Inform Original Paper Gunther Eysenbach 2015-04-27 /pmc/articles/PMC4427705/ /pubmed/25917752 http://dx.doi.org/10.2196/medinform.4321 Text en ©Hanna Suominen, Liyuan Zhou, Leif Hanlen, Gabriela Ferraro. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 27.04.2015. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Benchmarking Clinical Speech Recognition and Information Extraction: New Data, Methods, and Evaluations
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4427705/ https://www.ncbi.nlm.nih.gov/pubmed/25917752 http://dx.doi.org/10.2196/medinform.4321
