Distant supervision for medical concept normalization
We consider the task of Medical Concept Normalization (MCN), which aims to map informal medical phrases such as “loosing weight” to formal medical concepts such as “Weight loss”. Deep learning models have shown high performance across various MCN datasets containing a small number of target concepts along with an adequate number of training examples per concept…
Main Authors: | Pattisapu, Nikhil; Anand, Vivek; Patil, Sangameshwar; Palshikar, Girish; Varma, Vasudeva |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Inc., 2020 |
Subjects: | Original Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7415240/ https://www.ncbi.nlm.nih.gov/pubmed/32783923 http://dx.doi.org/10.1016/j.jbi.2020.103522 |
_version_ | 1783569130772758528 |
---|---|
author | Pattisapu, Nikhil; Anand, Vivek; Patil, Sangameshwar; Palshikar, Girish; Varma, Vasudeva |
author_facet | Pattisapu, Nikhil; Anand, Vivek; Patil, Sangameshwar; Palshikar, Girish; Varma, Vasudeva |
author_sort | Pattisapu, Nikhil |
collection | PubMed |
description | We consider the task of Medical Concept Normalization (MCN), which aims to map informal medical phrases such as “loosing weight” to formal medical concepts such as “Weight loss”. Deep learning models have shown high performance across various MCN datasets containing a small number of target concepts along with an adequate number of training examples per concept. However, scaling these models to millions of medical concepts entails the creation of much larger datasets, which is cost- and effort-intensive. Recent works have shown that training MCN models using automatically labeled examples extracted from medical knowledge bases partially alleviates this problem. We extend this idea by computationally creating a distant dataset from patient discussion forums. We extract informal medical phrases and medical concepts from these forums using a synthetically trained classifier and an off-the-shelf medical entity linker, respectively. We use pretrained sentence encoding models to find the k-nearest phrases corresponding to each medical concept. These mappings are used in combination with the examples obtained from medical knowledge bases to train an MCN model. Our approach outperforms the previous state of the art by 15.9% and 17.1% in classification accuracy across two datasets while avoiding manual labeling. (A minimal illustrative sketch of the k-nearest phrase retrieval step follows the record fields below.) |
format | Online Article Text |
id | pubmed-7415240 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Elsevier Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7415240 2020-08-10 Distant supervision for medical concept normalization Pattisapu, Nikhil; Anand, Vivek; Patil, Sangameshwar; Palshikar, Girish; Varma, Vasudeva J Biomed Inform Original Research We consider the task of Medical Concept Normalization (MCN), which aims to map informal medical phrases such as “loosing weight” to formal medical concepts such as “Weight loss”. Deep learning models have shown high performance across various MCN datasets containing a small number of target concepts along with an adequate number of training examples per concept. However, scaling these models to millions of medical concepts entails the creation of much larger datasets, which is cost- and effort-intensive. Recent works have shown that training MCN models using automatically labeled examples extracted from medical knowledge bases partially alleviates this problem. We extend this idea by computationally creating a distant dataset from patient discussion forums. We extract informal medical phrases and medical concepts from these forums using a synthetically trained classifier and an off-the-shelf medical entity linker, respectively. We use pretrained sentence encoding models to find the k-nearest phrases corresponding to each medical concept. These mappings are used in combination with the examples obtained from medical knowledge bases to train an MCN model. Our approach outperforms the previous state of the art by 15.9% and 17.1% in classification accuracy across two datasets while avoiding manual labeling. Elsevier Inc. 2020-09 2020-08-09 /pmc/articles/PMC7415240/ /pubmed/32783923 http://dx.doi.org/10.1016/j.jbi.2020.103522 Text en © 2020 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Original Research; Pattisapu, Nikhil; Anand, Vivek; Patil, Sangameshwar; Palshikar, Girish; Varma, Vasudeva; Distant supervision for medical concept normalization |
title | Distant supervision for medical concept normalization |
title_full | Distant supervision for medical concept normalization |
title_fullStr | Distant supervision for medical concept normalization |
title_full_unstemmed | Distant supervision for medical concept normalization |
title_short | Distant supervision for medical concept normalization |
title_sort | distant supervision for medical concept normalization |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7415240/ https://www.ncbi.nlm.nih.gov/pubmed/32783923 http://dx.doi.org/10.1016/j.jbi.2020.103522 |
work_keys_str_mv | AT pattisapunikhil distantsupervisionformedicalconceptnormalization AT anandvivek distantsupervisionformedicalconceptnormalization AT patilsangameshwar distantsupervisionformedicalconceptnormalization AT palshikargirish distantsupervisionformedicalconceptnormalization AT varmavasudeva distantsupervisionformedicalconceptnormalization |
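The record description above states that pretrained sentence encoders are used to find the k-nearest informal phrases for each medical concept, and that these distant mappings are then combined with knowledge-base examples to train the MCN model. The snippet below is a minimal illustrative sketch of that retrieval step only, assuming the `sentence-transformers` library; the encoder name (`all-MiniLM-L6-v2`), the value of k, the example phrases, and the helper name `k_nearest_phrases` are assumptions for illustration and are not the paper's actual choices of encoder, entity linker, or training setup.

```python
# Hypothetical sketch of the distant-mapping step described in the abstract:
# embed each extracted informal phrase and each candidate medical concept with a
# pretrained sentence encoder, then pair every concept with its k nearest phrases
# by cosine similarity. Model name and k are illustrative, not the paper's choices.
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed encoder library


def k_nearest_phrases(concepts, phrases, k=5, model_name="all-MiniLM-L6-v2"):
    """Return {concept: [k most similar informal phrases]} by cosine similarity."""
    encoder = SentenceTransformer(model_name)
    # Unit-normalized embeddings make the dot product equal to cosine similarity.
    c_vecs = encoder.encode(concepts, normalize_embeddings=True)
    p_vecs = encoder.encode(phrases, normalize_embeddings=True)
    sims = c_vecs @ p_vecs.T                   # concept-by-phrase similarity matrix
    top_k = np.argsort(-sims, axis=1)[:, :k]   # indices of the k highest-scoring phrases
    return {c: [phrases[j] for j in top_k[i]] for i, c in enumerate(concepts)}


# Example: build distant (phrase -> concept) training pairs from forum phrases.
mapping = k_nearest_phrases(
    concepts=["Weight loss", "Hypertension"],
    phrases=["loosing weight", "shedding pounds", "high bp", "bp through the roof"],
    k=2,
)
distant_pairs = [(p, c) for c, ps in mapping.items() for p in ps]
```

The sketch stops at producing distant (phrase, concept) pairs; in the paper these pairs, derived from patient discussion forums, are merged with automatically labeled examples from medical knowledge bases before the MCN model is trained.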