Named entity recognition on bio-medical literature documents using hybrid based approach

There have been many changes in the medical field due to technological advances. Advances in technology provide many opportunities to extract valuable insights from huge amounts of unstructured data. The literature published by researchers in the medical domain contains an enormous...

Bibliographic Details
Main Authors: Ramachandran, R., Arutchelvan, K.
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7947151/
https://www.ncbi.nlm.nih.gov/pubmed/33723489
http://dx.doi.org/10.1007/s12652-021-03078-z
_version_ 1783663165428465664
author Ramachandran, R.
Arutchelvan, K.
author_facet Ramachandran, R.
Arutchelvan, K.
author_sort Ramachandran, R.
collection PubMed
description There have been many changes in the medical field due to technological advances. Advances in technology provide many opportunities to extract valuable insights from huge amounts of unstructured data. The literature published by researchers in the medical domain contains an enormous amount of knowledge. Many organizations are involved in retrieving the hidden information from this literature. Extracting drug names, diseases, symptoms, routes of administration, species and dosage forms from text documents is an easy task owing to innovations in Natural Language Processing. In this article, a new hybrid approach is proposed to identify named entities in medical literature documents. A new dictionary has been built for routes of administration, dosage forms and symptoms to annotate the entities in the medical documents. A blank spaCy machine learning model is trained on the annotated entities. The trained model provides decent accuracy compared with the existing model. The hybrid model is validated against the dictionary and, optionally, by a human to calculate the confusion matrix. It is able to identify more entities than the prevailing model. The average F1 score of the proposed hybrid approach over five entities is 73.79%.
format Online
Article
Text
id pubmed-7947151
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-79471512021-03-11 Named entity recognition on bio-medical literature documents using hybrid based approach Ramachandran, R. Arutchelvan, K. J Ambient Intell Humaniz Comput Original Research There have been many changes in the medical field due to technological advances. Advances in technology provide many opportunities to extract valuable insights from huge amounts of unstructured data. The literature published by researchers in the medical domain contains an enormous amount of knowledge. Many organizations are involved in retrieving the hidden information from this literature. Extracting drug names, diseases, symptoms, routes of administration, species and dosage forms from text documents is an easy task owing to innovations in Natural Language Processing. In this article, a new hybrid approach is proposed to identify named entities in medical literature documents. A new dictionary has been built for routes of administration, dosage forms and symptoms to annotate the entities in the medical documents. A blank spaCy machine learning model is trained on the annotated entities. The trained model provides decent accuracy compared with the existing model. The hybrid model is validated against the dictionary and, optionally, by a human to calculate the confusion matrix. It is able to identify more entities than the prevailing model. The average F1 score of the proposed hybrid approach over five entities is 73.79%. Springer Berlin Heidelberg 2021-03-11 /pmc/articles/PMC7947151/ /pubmed/33723489 http://dx.doi.org/10.1007/s12652-021-03078-z Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source.
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Original Research
Ramachandran, R.
Arutchelvan, K.
Named entity recognition on bio-medical literature documents using hybrid based approach
title Named entity recognition on bio-medical literature documents using hybrid based approach
title_full Named entity recognition on bio-medical literature documents using hybrid based approach
title_fullStr Named entity recognition on bio-medical literature documents using hybrid based approach
title_full_unstemmed Named entity recognition on bio-medical literature documents using hybrid based approach
title_short Named entity recognition on bio-medical literature documents using hybrid based approach
title_sort named entity recognition on bio-medical literature documents using hybrid based approach
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7947151/
https://www.ncbi.nlm.nih.gov/pubmed/33723489
http://dx.doi.org/10.1007/s12652-021-03078-z
work_keys_str_mv AT ramachandranr namedentityrecognitiononbiomedicalliteraturedocumentsusinghybridbasedapproach
AT arutchelvank namedentityrecognitiononbiomedicalliteraturedocumentsusinghybridbasedapproach