Named entity recognition on bio-medical literature documents using hybrid based approach
Technological advances have brought many changes to the medical field. These advances provide many opportunities to extract valuable insights from huge amounts of unstructured data. The literature published by researchers in the medical domain contains an enormous...
Main Authors: | Ramachandran, R.; Arutchelvan, K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Berlin Heidelberg 2021 |
Subjects: | Original Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7947151/ https://www.ncbi.nlm.nih.gov/pubmed/33723489 http://dx.doi.org/10.1007/s12652-021-03078-z |
_version_ | 1783663165428465664 |
---|---|
author | Ramachandran, R. Arutchelvan, K. |
author_facet | Ramachandran, R. Arutchelvan, K. |
author_sort | Ramachandran, R. |
collection | PubMed |
description | Technological advances have brought many changes to the medical field. These advances provide many opportunities to extract valuable insights from huge amounts of unstructured data. The literature published by researchers in the medical domain contains an enormous amount of knowledge. Many organizations are involved in retrieving hidden information from this literature. Extracting drug names, diseases, symptoms, routes of administration, species, and dosage forms from textual documents has become an easy task thanks to innovations in Natural Language Processing. In this article, a new hybrid approach is proposed to identify named entities in medical literature documents. A new dictionary has been built for routes of administration, dosage forms, and symptoms to annotate the entities in the medical documents. A blank spaCy machine learning model is trained on the annotated entities. The trained model provides decent accuracy when compared with the existing model. The hybrid model is validated against the dictionary and, optionally, by a human in order to calculate the confusion matrix. It is able to identify more entities than the prevailing model. The average F1 score of the proposed hybrid approach over five entities is 73.79%. |
format | Online Article Text |
id | pubmed-7947151 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer Berlin Heidelberg |
record_format | MEDLINE/PubMed |
spelling | pubmed-7947151 2021-03-11 Named entity recognition on bio-medical literature documents using hybrid based approach Ramachandran, R. Arutchelvan, K. J Ambient Intell Humaniz Comput Original Research Technological advances have brought many changes to the medical field. These advances provide many opportunities to extract valuable insights from huge amounts of unstructured data. The literature published by researchers in the medical domain contains an enormous amount of knowledge. Many organizations are involved in retrieving hidden information from this literature. Extracting drug names, diseases, symptoms, routes of administration, species, and dosage forms from textual documents has become an easy task thanks to innovations in Natural Language Processing. In this article, a new hybrid approach is proposed to identify named entities in medical literature documents. A new dictionary has been built for routes of administration, dosage forms, and symptoms to annotate the entities in the medical documents. A blank spaCy machine learning model is trained on the annotated entities. The trained model provides decent accuracy when compared with the existing model. The hybrid model is validated against the dictionary and, optionally, by a human in order to calculate the confusion matrix. It is able to identify more entities than the prevailing model. The average F1 score of the proposed hybrid approach over five entities is 73.79%. Springer Berlin Heidelberg 2021-03-11 /pmc/articles/PMC7947151/ /pubmed/33723489 http://dx.doi.org/10.1007/s12652-021-03078-z Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Research Ramachandran, R. Arutchelvan, K. Named entity recognition on bio-medical literature documents using hybrid based approach |
title | Named entity recognition on bio-medical literature documents using hybrid based approach |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7947151/ https://www.ncbi.nlm.nih.gov/pubmed/33723489 http://dx.doi.org/10.1007/s12652-021-03078-z |
work_keys_str_mv | AT ramachandranr namedentityrecognitiononbiomedicalliteraturedocumentsusinghybridbasedapproach AT arutchelvank namedentityrecognitiononbiomedicalliteraturedocumentsusinghybridbasedapproach |
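
The abstract describes annotating text with a dictionary and then validating predictions through a confusion matrix and F1 score. A minimal sketch of those two steps follows, assuming a toy dictionary and exact-match span scoring. The names `DICTIONARY`, `annotate`, and `prf1`, the labels, and the sample sentence are all hypothetical illustrations and are not taken from the article; the spaCy training step is deliberately left out.

```python
import re
from typing import Dict, List, Set, Tuple

# A span is (start_char, end_char, label), the usual character-offset
# form for NER training data.
Span = Tuple[int, int, str]

# Hypothetical toy dictionary; the article's actual dictionaries are not shown here.
DICTIONARY: Dict[str, List[str]] = {
    "ROUTE": ["oral", "intravenous"],
    "DOSAGE_FORM": ["tablet", "capsule"],
    "SYMPTOM": ["fever", "headache"],
}


def annotate(text: str, dictionary: Dict[str, List[str]]) -> List[Span]:
    """Return sorted character spans for every whole-word dictionary hit."""
    spans: List[Span] = []
    for label, terms in dictionary.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


def prf1(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 from confusion counts."""
    tp = len(predicted & gold)   # spans found and correct
    fp = len(predicted - gold)   # spans found but not in the reference
    fn = len(gold - predicted)   # reference spans that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


text = "Patients with fever received an oral tablet twice daily."
predicted = annotate(text, DICTIONARY)

# Hypothetical reference annotation that omits the dosage form,
# so one prediction counts as a false positive.
gold = {span for span in predicted if span[2] != "DOSAGE_FORM"}
precision, recall, f1 = prf1(set(predicted), gold)
print(predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

In a hybrid setup of the kind the abstract outlines, spans produced this way could serve as training annotations for a statistical model, and the same scoring function could be applied per entity type before averaging.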