Biomedical literature classification with a CNNs-based hybrid learning network

Bibliographic Details
Main authors: Yan, Yan; Yin, Xu-Cheng; Yang, Chun; Li, Sujian; Zhang, Bo-Wen
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6061982/
https://www.ncbi.nlm.nih.gov/pubmed/30048461
http://dx.doi.org/10.1371/journal.pone.0197933
_version_ 1783342317672857600
author Yan, Yan
Yin, Xu-Cheng
Yang, Chun
Li, Sujian
Zhang, Bo-Wen
author_sort Yan, Yan
collection PubMed
description Deep learning techniques, e.g., Convolutional Neural Networks (CNNs), have been widely applied in information retrieval and natural language processing research. However, few research efforts have addressed semantic indexing with deep learning. The use of semantic indexing in the biomedical literature has been limited for several reasons. For instance, MEDLINE citations contain a large number of semantic labels from automatically annotated MeSH terms, and for a great deal of the literature, only the title and the abstract are readily available. In this paper, we propose a Boltzmann Convolutional neural network framework (B-CNN) for biomedical semantic indexing. In our hybrid learning framework, the CNN adaptively handles document features that have sequence relationships and captures context information accordingly; the Deep Boltzmann Machine (DBM) merges global (the entity in each document) and local information through its training with undirected connections. Additionally, we have designed a hierarchical coarse-to-fine indexing structure for learning and classifying documents, and a novel feature extension approach with word sequence embedding and Wikipedia categorization. Comparative experiments on semantic indexing of biomedical abstract documents verified the encouraging performance of our B-CNN model.
format Online
Article
Text
id pubmed-6061982
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6061982 2018-08-03 Biomedical literature classification with a CNNs-based hybrid learning network Yan, Yan; Yin, Xu-Cheng; Yang, Chun; Li, Sujian; Zhang, Bo-Wen PLoS One Research Article
Public Library of Science 2018-07-26 /pmc/articles/PMC6061982/ /pubmed/30048461 http://dx.doi.org/10.1371/journal.pone.0197933 Text en © 2018 Yan et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Biomedical literature classification with a CNNs-based hybrid learning network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6061982/
https://www.ncbi.nlm.nih.gov/pubmed/30048461
http://dx.doi.org/10.1371/journal.pone.0197933