Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks

Transmembrane transport proteins (transporters) play a crucial role in the fundamental cellular processes of all organisms by facilitating the transport of hydrophilic substrates across hydrophobic membranes. Despite the availability of numerous membrane protein sequences, their structures and funct...

Bibliographic Details
Main authors: Ghazikhani, Hamed; Butler, Gregory
Format: Online Article Text
Language: English
Published: De Gruyter 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10389051/
https://www.ncbi.nlm.nih.gov/pubmed/37497772
http://dx.doi.org/10.1515/jib-2022-0055
author Ghazikhani, Hamed
Butler, Gregory
collection PubMed
description Transmembrane transport proteins (transporters) play a crucial role in the fundamental cellular processes of all organisms by facilitating the transport of hydrophilic substrates across hydrophobic membranes. Despite the availability of numerous membrane protein sequences, their structures and functions remain largely elusive. Recently, natural language processing (NLP) techniques have shown promise in the analysis of protein sequences. Bidirectional Encoder Representations from Transformers (BERT) is an NLP technique that has been adapted to proteins to learn contextual embeddings of individual amino acids within a protein sequence. Our previous strategy, TooT-BERT-T, differentiated transporters from non-transporters by employing a logistic regression classifier with fine-tuned representations from ProtBERT-BFD. In this study, we expand upon this approach by utilizing representations from ProtBERT, ProtBERT-BFD, and MembraneBERT in combination with classical classifiers. Additionally, we introduce TooT-BERT-CNN-T, a novel method that fine-tunes ProtBERT-BFD and discriminates transporters using a Convolutional Neural Network (CNN). Our experimental results show that the CNN surpasses traditional classifiers in discriminating transporters from non-transporters, achieving an MCC of 0.89 and an accuracy of 95.1% on the independent test set. Compared to TooT-BERT-T, this is an improvement of 0.03 in MCC and 1.11 percentage points in accuracy.
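The abstract reports results as a Matthews correlation coefficient (MCC) and an accuracy. As a minimal sketch of how these two metrics are computed from a binary confusion matrix, the Python below uses the standard definitions. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, and this is not the authors' evaluation code.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is reported as 0 when a marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (transporter = positive class).
    tp, tn, fp, fn = 90, 100, 5, 5
    print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative when the transporter and non-transporter classes are imbalanced, whereas accuracy alone can be inflated by the majority class.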
format Online
Article
Text
id pubmed-10389051
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher De Gruyter
record_format MEDLINE/PubMed
spelling pubmed-10389051 2023-08-01 Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks Ghazikhani, Hamed; Butler, Gregory J Integr Bioinform Workshop De Gruyter 2023-07-28 /pmc/articles/PMC10389051/ /pubmed/37497772 http://dx.doi.org/10.1515/jib-2022-0055 Text en © 2023 the author(s), published by De Gruyter, Berlin/Boston https://creativecommons.org/licenses/by/4.0/ This work is licensed under the Creative Commons Attribution 4.0 International License.
title Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks
topic Workshop