Assessing the similarity of surface linguistic features related to epilepsy across pediatric hospitals
OBJECTIVE: The constant progress in computational linguistic methods provides amazing opportunities for discovering information in clinical text and enables the clinical scientist to explore novel approaches to care. However, these new approaches need evaluation. We describe an automated system to c...
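Main Authors: | Connolly, Brian, Matykiewicz, Pawel, Bretonnel Cohen, K, Standridge, Shannon M, Glauser, Tracy A, Dlugos, Dennis J, Koh, Susan, Tham, Eric, Pestian, John |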
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BMJ Publishing Group 2014 |
Subjects: | Focus on Biomedical Natural Language Processing and Data Modeling |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147613/ https://www.ncbi.nlm.nih.gov/pubmed/24692393 http://dx.doi.org/10.1136/amiajnl-2013-002601 |
_version_ | 1782332481574273024 |
---|---|
author | Connolly, Brian Matykiewicz, Pawel Bretonnel Cohen, K Standridge, Shannon M Glauser, Tracy A Dlugos, Dennis J Koh, Susan Tham, Eric Pestian, John |
author_sort | Connolly, Brian |
collection | PubMed |
description | OBJECTIVE: The constant progress in computational linguistic methods provides amazing opportunities for discovering information in clinical text and enables the clinical scientist to explore novel approaches to care. However, these new approaches need evaluation. We describe an automated system to compare descriptions of epilepsy patients at three different organizations: Cincinnati Children’s Hospital, the Children’s Hospital Colorado, and the Children’s Hospital of Philadelphia. To our knowledge, there have been no similar previous studies. MATERIALS AND METHODS: In this work, a support vector machine (SVM)-based natural language processing (NLP) algorithm is trained to classify epilepsy progress notes as belonging to a patient with a specific type of epilepsy from a particular hospital. The same SVM is then used to classify notes from another hospital. Our null hypothesis is that an NLP algorithm cannot be trained using epilepsy-specific notes from one hospital and subsequently used to classify notes from another hospital better than a random baseline classifier. The hypothesis is tested using epilepsy progress notes from the three hospitals. RESULTS: We are able to reject the null hypothesis at the 95% level. It is also found that classification was improved by including notes from a second hospital in the SVM training sample. DISCUSSION AND CONCLUSION: With a reasonably uniform epilepsy vocabulary and an NLP-based algorithm able to use this uniformity to classify epilepsy progress notes across different hospitals, we can pursue automated comparisons of patient conditions, treatments, and diagnoses across different healthcare settings. |
format | Online Article Text |
id | pubmed-4147613 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BMJ Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-41476132015-09-01 Assessing the similarity of surface linguistic features related to epilepsy across pediatric hospitals Connolly, Brian Matykiewicz, Pawel Bretonnel Cohen, K Standridge, Shannon M Glauser, Tracy A Dlugos, Dennis J Koh, Susan Tham, Eric Pestian, John J Am Med Inform Assoc Focus on Biomedical Natural Language Processing and Data Modeling OBJECTIVE: The constant progress in computational linguistic methods provides amazing opportunities for discovering information in clinical text and enables the clinical scientist to explore novel approaches to care. However, these new approaches need evaluation. We describe an automated system to compare descriptions of epilepsy patients at three different organizations: Cincinnati Children’s Hospital, the Children’s Hospital Colorado, and the Children’s Hospital of Philadelphia. To our knowledge, there have been no similar previous studies. MATERIALS AND METHODS: In this work, a support vector machine (SVM)-based natural language processing (NLP) algorithm is trained to classify epilepsy progress notes as belonging to a patient with a specific type of epilepsy from a particular hospital. The same SVM is then used to classify notes from another hospital. Our null hypothesis is that an NLP algorithm cannot be trained using epilepsy-specific notes from one hospital and subsequently used to classify notes from another hospital better than a random baseline classifier. The hypothesis is tested using epilepsy progress notes from the three hospitals. RESULTS: We are able to reject the null hypothesis at the 95% level. It is also found that classification was improved by including notes from a second hospital in the SVM training sample. DISCUSSION AND CONCLUSION: With a reasonably uniform epilepsy vocabulary and an NLP-based algorithm able to use this uniformity to classify epilepsy progress notes across different hospitals, we can pursue automated comparisons of patient conditions, treatments, and diagnoses across different healthcare settings. BMJ Publishing Group 2014-09 2014-04-01 /pmc/articles/PMC4147613/ /pubmed/24692393 http://dx.doi.org/10.1136/amiajnl-2013-002601 Text en Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/ |
spellingShingle | Focus on Biomedical Natural Language Processing and Data Modeling Connolly, Brian Matykiewicz, Pawel Bretonnel Cohen, K Standridge, Shannon M Glauser, Tracy A Dlugos, Dennis J Koh, Susan Tham, Eric Pestian, John Assessing the similarity of surface linguistic features related to epilepsy across pediatric hospitals |
title | Assessing the similarity of surface linguistic features related to epilepsy across pediatric hospitals |
topic | Focus on Biomedical Natural Language Processing and Data Modeling |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4147613/ https://www.ncbi.nlm.nih.gov/pubmed/24692393 http://dx.doi.org/10.1136/amiajnl-2013-002601 |
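
The abstract's null hypothesis (a classifier trained at one hospital does no better than a random baseline at another) can be illustrated with a small significance check. The sketch below is not the authors' procedure: the abstract does not say which statistical test was used, so a one-sided exact binomial test is assumed here, the 0.5 chance rate assumes a balanced two-class task, and the function names and note counts are hypothetical.

```python
from math import comb


def binomial_p_value(correct: int, total: int, chance: float = 0.5) -> float:
    """One-sided exact p-value: P(X >= correct) for X ~ Binomial(total, chance).

    `chance` is the assumed accuracy of the random baseline classifier.
    """
    return sum(
        comb(total, k) * chance**k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


def beats_random_baseline(
    correct: int, total: int, chance: float = 0.5, alpha: float = 0.05
) -> bool:
    """Reject the null hypothesis at the (1 - alpha) level if the p-value is below alpha."""
    return binomial_p_value(correct, total, chance) < alpha


if __name__ == "__main__":
    # Hypothetical counts for notes from a second hospital; not results from the article.
    for correct in (55, 60):
        p = binomial_p_value(correct, 100)
        print(correct, round(p, 4), beats_random_baseline(correct, 100))
```

With `alpha=0.05`, rejecting the null hypothesis corresponds to the "95% level" mentioned in the abstract; a multi-class task or an unbalanced sample would need a different `chance` value.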