Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks
Transmembrane transport proteins (transporters) play a crucial role in the fundamental cellular processes of all organisms by facilitating the transport of hydrophilic substrates across hydrophobic membranes. Despite the availability of numerous membrane protein sequences, their structures and funct...
Main authors: | Ghazikhani, Hamed; Butler, Gregory |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | De Gruyter 2023 |
Subjects: | Workshop |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10389051/ https://www.ncbi.nlm.nih.gov/pubmed/37497772 http://dx.doi.org/10.1515/jib-2022-0055 |
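
The abstract of this record refers to ProtBERT-BFD representations of protein sequences. The record itself does not describe input preparation, so the following is only a minimal sketch based on the convention documented for the public ProtBERT models: residues are upper-cased and separated by spaces, and the rare residues U, Z, O and B are mapped to X. The function name `to_protbert_input` and the example sequence are hypothetical.

```python
import re


def to_protbert_input(sequence: str) -> str:
    """Format a raw amino-acid string for a ProtBERT-style tokenizer.

    Assumption: the tokenizer expects upper-case, space-separated residues
    with the rare residues U, Z, O, B replaced by X.
    """
    # Drop any whitespace or line breaks and normalise the case.
    cleaned = re.sub(r"\s+", "", sequence).upper()
    # Map rare or ambiguous residues to the unknown-residue symbol X.
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    # One token per residue, separated by single spaces.
    return " ".join(cleaned)


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the paper's dataset.
    print(to_protbert_input("mktuiial"))
```

The resulting string could then be passed to whichever tokenizer accompanies the chosen pretrained model; that step is omitted here because it depends on an external library.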
_version_ | 1785082213633622016 |
---|---|
author | Ghazikhani, Hamed; Butler, Gregory |
author_facet | Ghazikhani, Hamed Butler, Gregory |
author_sort | Ghazikhani, Hamed |
collection | PubMed |
description | Transmembrane transport proteins (transporters) play a crucial role in the fundamental cellular processes of all organisms by facilitating the transport of hydrophilic substrates across hydrophobic membranes. Despite the availability of numerous membrane protein sequences, their structures and functions remain largely elusive. Recently, natural language processing (NLP) techniques have shown promise in the analysis of protein sequences. Bidirectional Encoder Representations from Transformers (BERT) is an NLP technique that has been adapted for proteins to learn contextual embeddings of individual amino acids within a protein sequence. Our previous strategy, TooT-BERT-T, differentiated transporters from non-transporters by employing a logistic regression classifier with fine-tuned representations from ProtBERT-BFD. In this study, we expand upon this approach by utilizing representations from ProtBERT, ProtBERT-BFD, and MembraneBERT in combination with classical classifiers. Additionally, we introduce TooT-BERT-CNN-T, a novel method that fine-tunes ProtBERT-BFD and discriminates transporters using a Convolutional Neural Network (CNN). Our experimental results reveal that the CNN surpasses traditional classifiers in discriminating transporters from non-transporters, achieving an MCC of 0.89 and an accuracy of 95.1% on the independent test set. Compared to TooT-BERT-T, this is an improvement of 0.03 in MCC and 1.11 percentage points in accuracy. |
format | Online Article Text |
id | pubmed-10389051 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | De Gruyter |
record_format | MEDLINE/PubMed |
spelling | pubmed-10389051 2023-08-01 Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks Ghazikhani, Hamed Butler, Gregory J Integr Bioinform Workshop De Gruyter 2023-07-28 /pmc/articles/PMC10389051/ /pubmed/37497772 http://dx.doi.org/10.1515/jib-2022-0055 Text en © 2023 the author(s), published by De Gruyter, Berlin/Boston https://creativecommons.org/licenses/by/4.0/ This work is licensed under the Creative Commons Attribution 4.0 International License. |
title | Enhanced identification of membrane transport proteins: a hybrid approach combining ProtBERT-BFD and convolutional neural networks |
topic | Workshop |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10389051/ https://www.ncbi.nlm.nih.gov/pubmed/37497772 http://dx.doi.org/10.1515/jib-2022-0055 |
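
The description field above reports results as a Matthews correlation coefficient (MCC) and an accuracy. As a reminder of how these two metrics are derived from a binary confusion matrix, here is a minimal sketch. The confusion-matrix counts are hypothetical and chosen only for illustration; they are not the paper's data and do not reproduce its reported figures.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention,
    adopted here as an assumption).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts: transporter = positive, non-transporter = negative.
    tp, tn, fp, fn = 90, 95, 5, 10
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the two classes differ in size.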