
Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model

The large-scale neuroscience literature calls for effective methods to mine knowledge from a species perspective to link the brain and neuroscience communities, neurorobotics, computing devices, and AI research communities. Structured knowledge can motivate researchers to better understand the function...


Bibliographic Details
Main Authors: Zhu, Hongyin, Zeng, Yi, Wang, Dongsheng, Huangfu, Cunqing
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7187631/
https://www.ncbi.nlm.nih.gov/pubmed/32372933
http://dx.doi.org/10.3389/fnhum.2020.00128
author Zhu, Hongyin
Zeng, Yi
Wang, Dongsheng
Huangfu, Cunqing
collection PubMed
description The large-scale neuroscience literature calls for effective methods to mine knowledge from a species perspective to link the brain and neuroscience communities, neurorobotics, computing devices, and AI research communities. Structured knowledge can motivate researchers to better understand the functionality and structure of the brain and link the related resources and components. However, the abstracts of a large number of scientific works do not explicitly mention the species. Therefore, in addition to dictionary-based methods, we need to mine species using cognitive computing models that are more like the human reading process, and these methods can take advantage of the rich information in the literature. We also enable the model to automatically distinguish whether the mentioned species is the main research subject. Distinguishing the two situations can generate value at different levels of knowledge management. We propose the SpecExplorer project, which is used to explore the knowledge associations of different species for brain and neuroscience. This project frees humans from the tedious task of classifying neuroscience literature by species. The species classification task is a multi-label classification problem, which is more complex than single-label classification because of the correlation between labels. To resolve this problem, we present a sequence-to-sequence classification framework to adaptively assign multiple species to the literature. To model the structural information of documents, we propose hierarchical attentive decoding (HAD) to extract the span of interest (SOI) for predicting each species. We create three datasets from the PubMed and PMC corpora. We present two versions of annotation criteria (mention-based annotation and semantic-based annotation) for species research. Experiments demonstrate that our approach achieves improvements in the final results. Finally, we perform species-based analysis of brain diseases, brain cognitive functions, and proteins related to the hippocampus and provide potential research directions for certain species.
format Online
Article
Text
id pubmed-7187631
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7187631 2020-05-05 Front Hum Neurosci Human Neuroscience Frontiers Media S.A. 2020-04-21 /pmc/articles/PMC7187631/ /pubmed/32372933 http://dx.doi.org/10.3389/fnhum.2020.00128 Text en Copyright © 2020 Zhu, Zeng, Wang and Huangfu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Species Classification for Neuroscience Literature Based on Span of Interest Using Sequence-to-Sequence Learning Model
topic Human Neuroscience