Using natural language processing to extract structured epilepsy data from unstructured clinic letters: development and validation of the ExECT (extraction of epilepsy clinical text) system

OBJECTIVE: Routinely collected healthcare data are a powerful research resource but often lack detailed disease-specific information that is collected in clinical free text, for example, clinic letters. We aim to use natural language processing techniques to extract detailed clinical information fro...

Bibliographic Details
Main Authors:	Fonferko-Shadrach, Beata; Lacey, Arron S; Roberts, Angus; Akbari, Ashley; Thompson, Simon; Ford, David V; Lyons, Ronan A; Rees, Mark I; Pickrell, William Owen
Format:	Online Article Text
Language:	English
Published:	BMJ Publishing Group 2019
Subjects:	Neurology
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6500195/ https://www.ncbi.nlm.nih.gov/pubmed/30940752 http://dx.doi.org/10.1136/bmjopen-2018-023232

author	Fonferko-Shadrach, Beata; Lacey, Arron S; Roberts, Angus; Akbari, Ashley; Thompson, Simon; Ford, David V; Lyons, Ronan A; Rees, Mark I; Pickrell, William Owen
collection	PubMed
description	OBJECTIVE: Routinely collected healthcare data are a powerful research resource but often lack detailed disease-specific information that is collected in clinical free text, for example, clinic letters. We aim to use natural language processing techniques to extract detailed clinical information from epilepsy clinic letters to enrich routinely collected data. DESIGN: We used the general architecture for text engineering (GATE) framework to build an information extraction system, ExECT (extraction of epilepsy clinical text), combining rule-based and statistical techniques. We extracted nine categories of epilepsy information in addition to clinic date and date of birth across 200 clinic letters. We compared the results of our algorithm with a manual review of the letters by an epilepsy clinician. SETTING: De-identified and pseudonymised epilepsy clinic letters from a Health Board serving half a million residents in Wales, UK. RESULTS: We identified 1925 items of information with overall precision, recall and F1 score of 91.4%, 81.4% and 86.1%, respectively. Precision and recall for epilepsy-specific categories were: epilepsy diagnosis (88.1%, 89.0%), epilepsy type (89.8%, 79.8%), focal seizures (96.2%, 69.7%), generalised seizures (88.8%, 52.3%), seizure frequency (86.3%, 53.6%), medication (96.1%, 94.0%), CT (55.6%, 58.8%), MRI (82.4%, 68.8%) and electroencephalogram (81.5%, 75.3%). CONCLUSIONS: We have built an automated clinical text extraction system that can accurately extract epilepsy information from free text in clinic letters. This can enhance routinely collected data for research in the UK. The information extracted with ExECT, such as epilepsy type, seizure frequency and neurological investigations, is often missing from routinely collected data. We propose that our algorithm can bridge this data gap, enabling further epilepsy research opportunities. While many of the rules in our pipeline were tailored to extract epilepsy-specific information, our methods can be applied to other diseases and can also be used in clinical practice to record patient information in a structured manner.
format	Online Article Text
id	pubmed-6500195
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BMJ Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-6500195 2019-05-21. Using natural language processing to extract structured epilepsy data from unstructured clinic letters: development and validation of the ExECT (extraction of epilepsy clinical text) system. Fonferko-Shadrach, Beata; Lacey, Arron S; Roberts, Angus; Akbari, Ashley; Thompson, Simon; Ford, David V; Lyons, Ronan A; Rees, Mark I; Pickrell, William Owen. BMJ Open. Neurology. BMJ Publishing Group 2019-04-01. /pmc/articles/PMC6500195/ /pubmed/30940752 http://dx.doi.org/10.1136/bmjopen-2018-023232 Text en. © Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY. Published by BMJ. This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
title	Using natural language processing to extract structured epilepsy data from unstructured clinic letters: development and validation of the ExECT (extraction of epilepsy clinical text) system
topic	Neurology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6500195/ https://www.ncbi.nlm.nih.gov/pubmed/30940752 http://dx.doi.org/10.1136/bmjopen-2018-023232
