Comparison of named entity recognition methodologies in biomedical documents
BACKGROUND: Biomedical named entity recognition (Bio-NER) is a fundamental task in handling biomedical text terms, such as RNA, protein, cell type, cell line, and DNA. Bio-NER is one of the most elementary and core tasks in biomedical knowledge discovery from texts. The system described here is deve...
Main authors: | Song, Hye-Jeong; Jo, Byeong-Cheol; Park, Chan-Young; Kim, Jong-Dae; Kim, Yu-Seop |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6219049/ https://www.ncbi.nlm.nih.gov/pubmed/30396340 http://dx.doi.org/10.1186/s12938-018-0573-6 |
_version_ | 1783368573950885888 |
---|---|
author | Song, Hye-Jeong Jo, Byeong-Cheol Park, Chan-Young Kim, Jong-Dae Kim, Yu-Seop |
author_facet | Song, Hye-Jeong Jo, Byeong-Cheol Park, Chan-Young Kim, Jong-Dae Kim, Yu-Seop |
author_sort | Song, Hye-Jeong |
collection | PubMed |
description | BACKGROUND: Biomedical named entity recognition (Bio-NER) is a fundamental task in handling biomedical text terms, such as RNA, protein, cell type, cell line, and DNA. Bio-NER is one of the most elementary and core tasks in biomedical knowledge discovery from texts. The system described here is developed by using the BioNLP/NLPBA 2004 shared task. Experiments are conducted on a training and evaluation set provided by the task organizers. RESULTS: Our results show that, compared with a baseline having a 70.09% F1 score, the RNN Jordan- and Elman-type algorithms have F1 scores of approximately 60.53% and 58.80%, respectively. When we use CRF as a machine learning algorithm, CCA, GloVe, and Word2Vec have F1 scores of 72.73%, 72.74%, and 72.82%, respectively. CONCLUSIONS: By using the word embedding constructed through the unsupervised learning, the time and cost required to construct the learning data can be saved. |
format | Online Article Text |
id | pubmed-6219049 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6219049 2018-11-08 Comparison of named entity recognition methodologies in biomedical documents Song, Hye-Jeong Jo, Byeong-Cheol Park, Chan-Young Kim, Jong-Dae Kim, Yu-Seop Biomed Eng Online Research BACKGROUND: Biomedical named entity recognition (Bio-NER) is a fundamental task in handling biomedical text terms, such as RNA, protein, cell type, cell line, and DNA. Bio-NER is one of the most elementary and core tasks in biomedical knowledge discovery from texts. The system described here is developed by using the BioNLP/NLPBA 2004 shared task. Experiments are conducted on a training and evaluation set provided by the task organizers. RESULTS: Our results show that, compared with a baseline having a 70.09% F1 score, the RNN Jordan- and Elman-type algorithms have F1 scores of approximately 60.53% and 58.80%, respectively. When we use CRF as a machine learning algorithm, CCA, GloVe, and Word2Vec have F1 scores of 72.73%, 72.74%, and 72.82%, respectively. CONCLUSIONS: By using the word embedding constructed through the unsupervised learning, the time and cost required to construct the learning data can be saved. BioMed Central 2018-11-06 /pmc/articles/PMC6219049/ /pubmed/30396340 http://dx.doi.org/10.1186/s12938-018-0573-6 Text en © The Author(s) 2018. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Song, Hye-Jeong Jo, Byeong-Cheol Park, Chan-Young Kim, Jong-Dae Kim, Yu-Seop Comparison of named entity recognition methodologies in biomedical documents |
title | Comparison of named entity recognition methodologies in biomedical documents |
title_full | Comparison of named entity recognition methodologies in biomedical documents |
title_fullStr | Comparison of named entity recognition methodologies in biomedical documents |
title_full_unstemmed | Comparison of named entity recognition methodologies in biomedical documents |
title_short | Comparison of named entity recognition methodologies in biomedical documents |
title_sort | comparison of named entity recognition methodologies in biomedical documents |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6219049/ https://www.ncbi.nlm.nih.gov/pubmed/30396340 http://dx.doi.org/10.1186/s12938-018-0573-6 |
work_keys_str_mv | AT songhyejeong comparisonofnamedentityrecognitionmethodologiesinbiomedicaldocuments AT jobyeongcheol comparisonofnamedentityrecognitionmethodologiesinbiomedicaldocuments AT parkchanyoung comparisonofnamedentityrecognitionmethodologiesinbiomedicaldocuments AT kimjongdae comparisonofnamedentityrecognitionmethodologiesinbiomedicaldocuments AT kimyuseop comparisonofnamedentityrecognitionmethodologiesinbiomedicaldocuments |
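
The description field above compares systems by F1 score. As a rough illustration of what such a score measures for NER, the sketch below computes an entity-level F1 from BIO-style tag sequences, counting a prediction as correct only when both its boundaries and its type match the gold entity. The record does not state the exact matching rules used in the paper or the shared task, so the exact-match criterion, the BIO tag format, and the function names `extract_spans` and `f1_score` are assumptions made for illustration, not the authors' evaluation code.

```python
def extract_spans(tags):
    """Return a set of (start, end, type) spans from a BIO tag sequence.

    Assumption: tags look like "B-protein", "I-protein", or "O".
    """
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if etype is not None:
                spans.add((start, i, etype))
            start, etype = None, None
            if tag.startswith("B-") or tag.startswith("I-"):
                # A stray "I-" tag is treated as the start of a new entity.
                start, etype = i, tag[2:]
    return spans


def f1_score(gold_tags, pred_tags):
    """Entity-level F1: harmonic mean of precision and recall (exact match)."""
    gold = extract_spans(gold_tags)
    pred = extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical toy sentence: one of two gold entities is recovered.
    gold = ["B-protein", "I-protein", "O", "B-DNA", "O"]
    pred = ["B-protein", "I-protein", "O", "O", "B-RNA"]
    print(f1_score(gold, pred))
```

In the toy example, one of the two predicted entities matches one of the two gold entities, so precision and recall are both 0.5. Exact-match scoring is strict: a prediction with the right type but a boundary off by one token counts as both a false positive and a false negative.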