
Developing a portable natural language processing based phenotyping system

BACKGROUND: This paper presents a portable phenotyping system that is capable of integrating both rule-based and statistical machine learning based approaches. METHODS: Our system utilizes UMLS to extract clinically relevant features from the unstructured text and then facilitates portability across...

Bibliographic Details
Main authors: Sharma, Himanshu; Mao, Chengsheng; Zhang, Yizhen; Vatani, Haleh; Yao, Liang; Zhong, Yizhen; Rasmussen, Luke; Jiang, Guoqian; Pathak, Jyotishman; Luo, Yuan
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6448187/
https://www.ncbi.nlm.nih.gov/pubmed/30943974
http://dx.doi.org/10.1186/s12911-019-0786-z
author Sharma, Himanshu
Mao, Chengsheng
Zhang, Yizhen
Vatani, Haleh
Yao, Liang
Zhong, Yizhen
Rasmussen, Luke
Jiang, Guoqian
Pathak, Jyotishman
Luo, Yuan
author_sort Sharma, Himanshu
collection PubMed
description BACKGROUND: This paper presents a portable phenotyping system that is capable of integrating both rule-based and statistical machine learning based approaches. METHODS: Our system utilizes UMLS to extract clinically relevant features from the unstructured text and then facilitates portability across different institutions and data systems by incorporating OHDSI’s OMOP Common Data Model (CDM) to standardize necessary data elements. Our system can also store the key components of rule-based systems (e.g., regular expression matches) in the format of OMOP CDM, thus enabling the reuse, adaptation and extension of many existing rule-based clinical NLP systems. We experimented with our system on the corpus from i2b2’s Obesity Challenge as a pilot study. RESULTS: Our system facilitates portable phenotyping of obesity and its 15 comorbidities based on the unstructured patient discharge summaries, while achieving a performance that often ranked among the top 10 of the challenge participants. CONCLUSION: Our system of standardization enables a consistent application of numerous rule-based and machine learning based classification techniques downstream across disparate datasets which may originate across different institutions and data systems.
format Online
Article
Text
id pubmed-6448187
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6448187 2019-04-15 Developing a portable natural language processing based phenotyping system. BMC Med Inform Decis Mak. Research. BioMed Central, 2019-04-04. /pmc/articles/PMC6448187/ /pubmed/30943974 http://dx.doi.org/10.1186/s12911-019-0786-z Text en © The Author(s) 2019. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Developing a portable natural language processing based phenotyping system
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6448187/
https://www.ncbi.nlm.nih.gov/pubmed/30943974
http://dx.doi.org/10.1186/s12911-019-0786-z