
A set of domain rules and a deep network for protein coreference resolution

Current research in bio-text mining mainly focuses on event extraction. Biological networks present much richer and more meaningful information to biologists than events. Bio-entity coreference resolution (CR) is a very important method to complete a bio-event’s attributes and interconnect events into b...

Bibliographic Details
Main Authors: Li, Chen, Rao, Zhiqiang, Zheng, Qinghua, Zhang, Xiangrong
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6041745/
https://www.ncbi.nlm.nih.gov/pubmed/30010737
http://dx.doi.org/10.1093/database/bay065
_version_ 1783339040263634944
author Li, Chen
Rao, Zhiqiang
Zheng, Qinghua
Zhang, Xiangrong
author_facet Li, Chen
Rao, Zhiqiang
Zheng, Qinghua
Zhang, Xiangrong
author_sort Li, Chen
collection PubMed
description Current research in bio-text mining mainly focuses on event extraction. Biological networks present much richer and more meaningful information to biologists than events. Bio-entity coreference resolution (CR) is a very important method to complete a bio-event’s attributes and interconnect events into bio-networks. Though general CR methods have been studied for a long time, they could not produce a practically useful result when applied to a special domain. Therefore, bio-entity CR needs attention to better assist biological network extraction. In this article, we present two methods for bio-entity CR. The first is a rule-based method, which creates a set of syntactic rules or semantic constraints for CR. It obtains state-of-the-art performance (an F1-score of 62.0%) on the community-supported dataset. We also present a machine learning-based method, which makes use of a recurrent neural network model, a long short-term memory network. It automatically learns global discriminative representations of all kinds of coreferences without hand-crafted features. The model outperforms the previously best machine learning-based method.
format Online
Article
Text
id pubmed-6041745
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-60417452018-07-17 A set of domain rules and a deep network for protein coreference resolution Li, Chen Rao, Zhiqiang Zheng, Qinghua Zhang, Xiangrong Database (Oxford) Original Article Current research in bio-text mining mainly focuses on event extraction. Biological networks present much richer and more meaningful information to biologists than events. Bio-entity coreference resolution (CR) is a very important method to complete a bio-event’s attributes and interconnect events into bio-networks. Though general CR methods have been studied for a long time, they could not produce a practically useful result when applied to a special domain. Therefore, bio-entity CR needs attention to better assist biological network extraction. In this article, we present two methods for bio-entity CR. The first is a rule-based method, which creates a set of syntactic rules or semantic constraints for CR. It obtains state-of-the-art performance (an F1-score of 62.0%) on the community-supported dataset. We also present a machine learning-based method, which makes use of a recurrent neural network model, a long short-term memory network. It automatically learns global discriminative representations of all kinds of coreferences without hand-crafted features. The model outperforms the previously best machine learning-based method. Oxford University Press 2018-07-11 /pmc/articles/PMC6041745/ /pubmed/30010737 http://dx.doi.org/10.1093/database/bay065 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Original Article
Li, Chen
Rao, Zhiqiang
Zheng, Qinghua
Zhang, Xiangrong
A set of domain rules and a deep network for protein coreference resolution
title A set of domain rules and a deep network for protein coreference resolution
title_full A set of domain rules and a deep network for protein coreference resolution
title_fullStr A set of domain rules and a deep network for protein coreference resolution
title_full_unstemmed A set of domain rules and a deep network for protein coreference resolution
title_short A set of domain rules and a deep network for protein coreference resolution
title_sort set of domain rules and a deep network for protein coreference resolution
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6041745/
https://www.ncbi.nlm.nih.gov/pubmed/30010737
http://dx.doi.org/10.1093/database/bay065
work_keys_str_mv AT lichen asetofdomainrulesandadeepnetworkforproteincoreferenceresolution
AT raozhiqiang asetofdomainrulesandadeepnetworkforproteincoreferenceresolution
AT zhengqinghua asetofdomainrulesandadeepnetworkforproteincoreferenceresolution
AT zhangxiangrong asetofdomainrulesandadeepnetworkforproteincoreferenceresolution
AT lichen setofdomainrulesandadeepnetworkforproteincoreferenceresolution
AT raozhiqiang setofdomainrulesandadeepnetworkforproteincoreferenceresolution
AT zhengqinghua setofdomainrulesandadeepnetworkforproteincoreferenceresolution
AT zhangxiangrong setofdomainrulesandadeepnetworkforproteincoreferenceresolution