KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites

This paper addresses the crucial task of identifying DNA/RNA binding sites, which has implications in drug/vaccine design, protein engineering, and cancer research. Existing methods utilize complex neural network structures, diverse input types, and machine learning techniques for feature extraction...

Bibliographic Details
Main authors: Akbari Rokn Abadi, Saeedeh; Tabatabaei, SeyedehFatemeh; Koohi, Somayyeh
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10580661/
https://www.ncbi.nlm.nih.gov/pubmed/37845681
http://dx.doi.org/10.1186/s12967-023-04593-7
_version_ 1785121991082115072
author Akbari Rokn Abadi, Saeedeh
Tabatabaei, SeyedehFatemeh
Koohi, Somayyeh
author_facet Akbari Rokn Abadi, Saeedeh
Tabatabaei, SeyedehFatemeh
Koohi, Somayyeh
author_sort Akbari Rokn Abadi, Saeedeh
collection PubMed
description This paper addresses the crucial task of identifying DNA/RNA binding sites, which has implications in drug/vaccine design, protein engineering, and cancer research. Existing methods utilize complex neural network structures, diverse input types, and machine learning techniques for feature extraction. However, the growing volume of sequences poses processing challenges. This study introduces KDeep, employing a CNN-LSTM architecture with a novel encoding method called 2Lk. 2Lk enhances prediction accuracy, reduces memory consumption by up to 84%, reduces trainable parameters, and improves interpretability by approximately 79% compared to state-of-the-art approaches. KDeep offers a promising solution for accurate and efficient binding site prediction. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12967-023-04593-7.
format Online
Article
Text
id pubmed-10580661
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10580661 2023-10-18 KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites Akbari Rokn Abadi, Saeedeh Tabatabaei, SeyedehFatemeh Koohi, Somayyeh J Transl Med Methodology This paper addresses the crucial task of identifying DNA/RNA binding sites, which has implications in drug/vaccine design, protein engineering, and cancer research. Existing methods utilize complex neural network structures, diverse input types, and machine learning techniques for feature extraction. However, the growing volume of sequences poses processing challenges. This study introduces KDeep, employing a CNN-LSTM architecture with a novel encoding method called 2Lk. 2Lk enhances prediction accuracy, reduces memory consumption by up to 84%, reduces trainable parameters, and improves interpretability by approximately 79% compared to state-of-the-art approaches. KDeep offers a promising solution for accurate and efficient binding site prediction. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12967-023-04593-7. BioMed Central 2023-10-16 /pmc/articles/PMC10580661/ /pubmed/37845681 http://dx.doi.org/10.1186/s12967-023-04593-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Methodology
Akbari Rokn Abadi, Saeedeh
Tabatabaei, SeyedehFatemeh
Koohi, Somayyeh
KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites
title KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites
title_full KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites
title_fullStr KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites
title_full_unstemmed KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites
title_short KDeep: a new memory-efficient data extraction method for accurately predicting DNA/RNA transcription factor binding sites
title_sort kdeep: a new memory-efficient data extraction method for accurately predicting dna/rna transcription factor binding sites
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10580661/
https://www.ncbi.nlm.nih.gov/pubmed/37845681
http://dx.doi.org/10.1186/s12967-023-04593-7
work_keys_str_mv AT akbariroknabadisaeedeh kdeepanewmemoryefficientdataextractionmethodforaccuratelypredictingdnarnatranscriptionfactorbindingsites
AT tabatabaeiseyedehfatemeh kdeepanewmemoryefficientdataextractionmethodforaccuratelypredictingdnarnatranscriptionfactorbindingsites
AT koohisomayyeh kdeepanewmemoryefficientdataextractionmethodforaccuratelypredictingdnarnatranscriptionfactorbindingsites