Developing a portable natural language processing based phenotyping system
BACKGROUND: This paper presents a portable phenotyping system that is capable of integrating both rule-based and statistical machine learning based approaches. METHODS: Our system utilizes UMLS to extract clinically relevant features from the unstructured text and then facilitates portability across...
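Main authors: | Sharma, Himanshu; Mao, Chengsheng; Zhang, Yizhen; Vatani, Haleh; Yao, Liang; Zhong, Yizhen; Rasmussen, Luke; Jiang, Guoqian; Pathak, Jyotishman; Luo, Yuan |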
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2019 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6448187/ https://www.ncbi.nlm.nih.gov/pubmed/30943974 http://dx.doi.org/10.1186/s12911-019-0786-z |
_version_ | 1783408647852785664 |
---|---|
author | Sharma, Himanshu; Mao, Chengsheng; Zhang, Yizhen; Vatani, Haleh; Yao, Liang; Zhong, Yizhen; Rasmussen, Luke; Jiang, Guoqian; Pathak, Jyotishman; Luo, Yuan |
author_sort | Sharma, Himanshu |
collection | PubMed |
description | BACKGROUND: This paper presents a portable phenotyping system that is capable of integrating both rule-based and statistical machine learning based approaches. METHODS: Our system utilizes UMLS to extract clinically relevant features from the unstructured text and then facilitates portability across different institutions and data systems by incorporating OHDSI’s OMOP Common Data Model (CDM) to standardize necessary data elements. Our system can also store the key components of rule-based systems (e.g., regular expression matches) in the format of OMOP CDM, thus enabling the reuse, adaptation and extension of many existing rule-based clinical NLP systems. We experimented with our system on the corpus from i2b2’s Obesity Challenge as a pilot study. RESULTS: Our system facilitates portable phenotyping of obesity and its 15 comorbidities based on the unstructured patient discharge summaries, while achieving a performance that often ranked among the top 10 of the challenge participants. CONCLUSION: Our system of standardization enables a consistent application of numerous rule-based and machine learning based classification techniques downstream across disparate datasets which may originate across different institutions and data systems. |
format | Online Article Text |
id | pubmed-6448187 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-64481872019-04-15 Developing a portable natural language processing based phenotyping system Sharma, Himanshu Mao, Chengsheng Zhang, Yizhen Vatani, Haleh Yao, Liang Zhong, Yizhen Rasmussen, Luke Jiang, Guoqian Pathak, Jyotishman Luo, Yuan BMC Med Inform Decis Mak Research BACKGROUND: This paper presents a portable phenotyping system that is capable of integrating both rule-based and statistical machine learning based approaches. METHODS: Our system utilizes UMLS to extract clinically relevant features from the unstructured text and then facilitates portability across different institutions and data systems by incorporating OHDSI’s OMOP Common Data Model (CDM) to standardize necessary data elements. Our system can also store the key components of rule-based systems (e.g., regular expression matches) in the format of OMOP CDM, thus enabling the reuse, adaptation and extension of many existing rule-based clinical NLP systems. We experimented with our system on the corpus from i2b2’s Obesity Challenge as a pilot study. RESULTS: Our system facilitates portable phenotyping of obesity and its 15 comorbidities based on the unstructured patient discharge summaries, while achieving a performance that often ranked among the top 10 of the challenge participants. CONCLUSION: Our system of standardization enables a consistent application of numerous rule-based and machine learning based classification techniques downstream across disparate datasets which may originate across different institutions and data systems. BioMed Central 2019-04-04 /pmc/articles/PMC6448187/ /pubmed/30943974 http://dx.doi.org/10.1186/s12911-019-0786-z Text en © The Author(s). 
2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Developing a portable natural language processing based phenotyping system |
topic | Research |
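
The description above mentions storing key components of rule-based systems (e.g., regular expression matches) in the format of OMOP CDM. The following is only a minimal sketch of what that could look like, not the authors' implementation. It assumes the target is the OMOP CDM `NOTE_NLP` table (the abstract does not name a table) and populates a subset of its columns. The `NoteNlpRow` class, the `regex_matches_to_rows` helper, the `regex-demo` system name, and the sample pattern are all hypothetical and introduced here for illustration.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class NoteNlpRow:
    """Subset of OMOP CDM NOTE_NLP columns (assumed target table)."""
    note_nlp_id: int
    note_id: int
    snippet: str          # text surrounding the match
    offset: str           # character offset of the match (varchar in the CDM)
    lexical_variant: str  # the matched text itself
    nlp_system: str       # name of the system that produced the row
    nlp_date: date
    term_exists: str      # "Y"/"N"; this sketch does no negation detection


def regex_matches_to_rows(
    note_id: int,
    text: str,
    pattern: str,
    nlp_system: str = "regex-demo",  # hypothetical system name
    window: int = 20,
    run_date: Optional[date] = None,
    start_id: int = 1,
) -> List[NoteNlpRow]:
    """Turn each case-insensitive regex match in a note into one row."""
    run_date = run_date or date.today()
    rows = []
    for i, m in enumerate(re.finditer(pattern, text, flags=re.IGNORECASE)):
        lo = max(0, m.start() - window)
        hi = min(len(text), m.end() + window)
        rows.append(
            NoteNlpRow(
                note_nlp_id=start_id + i,
                note_id=note_id,
                snippet=text[lo:hi],
                offset=str(m.start()),
                lexical_variant=m.group(0),
                nlp_system=nlp_system,
                nlp_date=run_date,
                term_exists="Y",  # assumption: every match is an assertion
            )
        )
    return rows


if __name__ == "__main__":
    note = "Patient has morbid obesity and sleep apnea. Obesity noted."
    for row in regex_matches_to_rows(42, note, r"\bobes\w*"):
        print(row)
```

Keeping matches in a shared tabular shape like this is what would let downstream rule-based or machine learning classifiers read them the same way at different institutions; a real pipeline would also map each match to a standard concept identifier, which this sketch omits.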