
A neural network approach to chemical and gene/protein entity recognition in patents

In biomedical research, patents contain a significant amount of information, and biomedical text mining of patents has recently received much attention. To accelerate the development of biomedical text mining for patents, the BioCreative V.5 challenge organized three tracks, i.e., chemical entity...


Bibliographic Details
Main Authors: Luo, Ling; Yang, Zhihao; Yang, Pei; Zhang, Yin; Wang, Lei; Wang, Jian; Lin, Hongfei
Format: Online Article Text
Language: English
Published: Springer International Publishing 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6755562/
https://www.ncbi.nlm.nih.gov/pubmed/30564940
http://dx.doi.org/10.1186/s13321-018-0318-3
author Luo, Ling
Yang, Zhihao
Yang, Pei
Zhang, Yin
Wang, Lei
Wang, Jian
Lin, Hongfei
author_sort Luo, Ling
collection PubMed
description In biomedical research, patents contain a significant amount of information, and biomedical text mining of patents has recently received much attention. To accelerate the development of biomedical text mining for patents, the BioCreative V.5 challenge organized three tracks focused on biomedical entity recognition in patents: chemical entity mention recognition (CEMP), gene and protein related object recognition (GPRO), and technical interoperability and performance of annotation servers. This paper describes our neural network approach for the CEMP and GPRO tracks. In this approach, a bidirectional long short-term memory network with a conditional random field layer is employed to recognize biomedical entities in patents. To improve performance, we explored the effect of additional features (i.e., part-of-speech, chunking and named entity recognition features generated by the GENIA tagger) on the neural network model. In the official results, our best runs achieved the highest performance among all participating teams in both tracks: a precision of 88.32%, a recall of 92.62%, and an F-score of 90.42% in the CEMP track, and a precision of 76.65%, a recall of 81.91%, and an F-score of 79.19% in the GPRO track.
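The F-scores quoted in the description can be cross-checked from the quoted precision and recall values. The sketch below assumes the balanced F-score (the harmonic mean of precision and recall); the record itself does not state which F-measure variant was used, and the function name `f_score` is only illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall.

    Assumption: the record does not say which F-measure variant
    the challenge used; the balanced form is assumed here.
    Inputs and output share the same unit (percent in this case).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) as quoted in the description, in percent.
reported = {
    "CEMP": (88.32, 92.62, 90.42),
    "GPRO": (76.65, 81.91, 79.19),
}

for track, (p, r, f_reported) in reported.items():
    f_computed = round(f_score(p, r), 2)
    print(f"{track}: computed F = {f_computed}, reported F = {f_reported}")
```

Under that assumption, both recomputed values agree with the reported figures to two decimal places.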
format Online
Article
Text
id pubmed-6755562
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-6755562 2019-09-26 A neural network approach to chemical and gene/protein entity recognition in patents Luo, Ling; Yang, Zhihao; Yang, Pei; Zhang, Yin; Wang, Lei; Wang, Jian; Lin, Hongfei J Cheminform Research Article
Springer International Publishing 2018-12-18 /pmc/articles/PMC6755562/ /pubmed/30564940 http://dx.doi.org/10.1186/s13321-018-0318-3 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A neural network approach to chemical and gene/protein entity recognition in patents
topic Research Article