Enhancing Taxonomic Categorization of DNA Sequences with Deep Learning: A Multi-Label Approach
The application of deep learning for taxonomic categorization of DNA sequences is investigated in this study. Two deep learning architectures, namely the Stacked Convolutional Autoencoder (SCAE) with Multilabel Extreme Learning Machine (MLELM) and the Variational Convolutional Autoencoder (VCAE) wit...
Main authors: | Hossain, Prommy Sultana; Kim, Kyungsup; Uddin, Jia; Samad, Md Abdus; Choi, Kwonhue |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10669241/ https://www.ncbi.nlm.nih.gov/pubmed/38002417 http://dx.doi.org/10.3390/bioengineering10111293 |
_version_ | 1785139649459519488 |
---|---|
author | Hossain, Prommy Sultana Kim, Kyungsup Uddin, Jia Samad, Md Abdus Choi, Kwonhue |
collection | PubMed |
description | The application of deep learning for taxonomic categorization of DNA sequences is investigated in this study. Two deep learning architectures, namely the Stacked Convolutional Autoencoder (SCAE) with Multilabel Extreme Learning Machine (MLELM) and the Variational Convolutional Autoencoder (VCAE) with MLELM, have been proposed. These designs provide precise feature maps for individual and inter-label interactions within DNA sequences, capturing their spatial and temporal properties. The collected features are subsequently fed into MLELM networks, which yield soft classification scores and hard labels. The proposed algorithms underwent thorough training and testing on unsupervised data, whereby one or more labels were concurrently taken into account. The introduction of the clade label resulted in improved accuracy for both models compared to the class or genus labels, probably owing to the occurrence of large clusters of similar nucleotides inside a DNA strand. In all circumstances, the VCAE-MLELM model consistently outperformed the SCAE-MLELM model. The best accuracy attained by the VCAE-MLELM model when the clade and family labels were combined was 94%. However, accuracy ratings for single-label categorization using either approach were less than 65%. The approach’s effectiveness is based on MLELM networks, which record connected patterns across classes for accurate label categorization. This study advances deep learning in biological taxonomy by emphasizing the significance of combining numerous labels for increased classification accuracy. |
format | Online Article Text |
id | pubmed-10669241 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10669241 2023-11-08 Enhancing Taxonomic Categorization of DNA Sequences with Deep Learning: A Multi-Label Approach Hossain, Prommy Sultana Kim, Kyungsup Uddin, Jia Samad, Md Abdus Choi, Kwonhue Bioengineering (Basel) Article The application of deep learning for taxonomic categorization of DNA sequences is investigated in this study. Two deep learning architectures, namely the Stacked Convolutional Autoencoder (SCAE) with Multilabel Extreme Learning Machine (MLELM) and the Variational Convolutional Autoencoder (VCAE) with MLELM, have been proposed. These designs provide precise feature maps for individual and inter-label interactions within DNA sequences, capturing their spatial and temporal properties. The collected features are subsequently fed into MLELM networks, which yield soft classification scores and hard labels. The proposed algorithms underwent thorough training and testing on unsupervised data, whereby one or more labels were concurrently taken into account. The introduction of the clade label resulted in improved accuracy for both models compared to the class or genus labels, probably owing to the occurrence of large clusters of similar nucleotides inside a DNA strand. In all circumstances, the VCAE-MLELM model consistently outperformed the SCAE-MLELM model. The best accuracy attained by the VCAE-MLELM model when the clade and family labels were combined was 94%. However, accuracy ratings for single-label categorization using either approach were less than 65%. The approach’s effectiveness is based on MLELM networks, which record connected patterns across classes for accurate label categorization. This study advances deep learning in biological taxonomy by emphasizing the significance of combining numerous labels for increased classification accuracy. MDPI 2023-11-08 /pmc/articles/PMC10669241/ /pubmed/38002417 http://dx.doi.org/10.3390/bioengineering10111293 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Enhancing Taxonomic Categorization of DNA Sequences with Deep Learning: A Multi-Label Approach |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10669241/ https://www.ncbi.nlm.nih.gov/pubmed/38002417 http://dx.doi.org/10.3390/bioengineering10111293 |
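
Note on the description field: the abstract states that the MLELM networks yield soft classification scores and hard labels, but the record does not say how one is derived from the other. The sketch below shows one common way to do it, thresholding per-label scores, purely as an illustration. The function name `to_hard_labels`, the 0.5 threshold, and the example scores are assumptions made for this note and are not taken from the article; only the label names (clade, family, genus) come from the abstract.

```python
from typing import Dict, List


def to_hard_labels(soft_scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the labels whose soft score meets the threshold.

    The result is ordered by descending score. The threshold is an
    assumed cut-off, not a value reported in the article.
    """
    picked = [(label, score) for label, score in soft_scores.items() if score >= threshold]
    picked.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in picked]


# Hypothetical soft scores for one DNA sequence (illustrative values only).
example_scores = {"clade": 0.91, "family": 0.78, "genus": 0.32}
print(to_hard_labels(example_scores))
```

In a multi-label setting each label is decided independently, so a single sequence can receive several hard labels at once (here, both clade and family), or none if no score clears the threshold.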