Classification of Biomedical Texts for Cardiovascular Diseases with Deep Neural Network Using a Weighted Feature Representation Method

Bibliographic Details
Main Authors: Ahmed, Nizar; Dilmaç, Fatih; Alpkocak, Adil
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712354/
https://www.ncbi.nlm.nih.gov/pubmed/33050399
http://dx.doi.org/10.3390/healthcare8040392
author Ahmed, Nizar
Dilmaç, Fatih
Alpkocak, Adil
collection PubMed
description This study aims to improve the performance of multiclass classification of biomedical texts for cardiovascular diseases by combining two feature representation methods: bag-of-words (BoW) and word embeddings (WE). To hybridize the two representations, we investigated a set of possible statistical weighting schemes to combine with each element of the WE vectors: term frequency (TF), inverse document frequency (IDF) and class probability (CP). We then built a multiclass classification model using a bidirectional long short-term memory (BLSTM) network with deep neural networks for all investigated feature vector combination operations. We used MIMIC III and the PubMed dataset to develop the language model. To evaluate our weighted feature representation approaches, we conducted a set of experiments examining multiclass classification performance with the deep neural network model and with other state-of-the-art machine learning (ML) approaches. In all experiments, we used the OHSUMED-400 dataset, which consists of PubMed abstracts, each related to exactly one of 23 cardiovascular disease categories. We then present the experimental results and compare them with related research in the literature. The results show that our BLSTM model with the weighting techniques outperformed the baseline and the other machine learning approaches in terms of validation accuracy, and also exceeded the scores reported in related studies. This study shows that weighted feature representation improves multiclass classification performance.
format Online
Article
Text
id pubmed-7712354
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7712354 2020-12-04 Healthcare (Basel) Article
MDPI 2020-10-10 /pmc/articles/PMC7712354/ /pubmed/33050399 http://dx.doi.org/10.3390/healthcare8040392 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Classification of Biomedical Texts for Cardiovascular Diseases with Deep Neural Network Using a Weighted Feature Representation Method
topic Article