Predicting ATP-Binding Cassette Transporters Using the Random Forest Method

ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC trans...

Bibliographic Details
Main Authors: Hou, Ruiyan; Wang, Lida; Wu, Yi-Jun
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7109328/
https://www.ncbi.nlm.nih.gov/pubmed/32269586
http://dx.doi.org/10.3389/fgene.2020.00156
_version_ 1783512933046681600
author Hou, Ruiyan
Wang, Lida
Wu, Yi-Jun
author_facet Hou, Ruiyan
Wang, Lida
Wu, Yi-Jun
author_sort Hou, Ruiyan
collection PubMed
description ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC transporters is an urgent task. The present study used 188D, a feature extraction method based on sequence information and physicochemical properties. The extracted features were visualized with t-Distributed Stochastic Neighbor Embedding (t-SNE), which suggested that the samples may be separable on the basis of the 188D features. Random forest (RF), an efficient classifier, was used to identify the proteins. Under 10-fold cross-validation on the training set, the average accuracy over the 10 folds was 89.54%, with values of 0.87 for specificity, 0.92 for sensitivity, and 0.79 for MCC. On the testing set, the accuracy was 89%. These results suggest that the model combining 188D with RF is an optimal tool to identify ABC transporters.
format Online
Article
Text
id pubmed-7109328
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-71093282020-04-08 Predicting ATP-Binding Cassette Transporters Using the Random Forest Method Hou, Ruiyan Wang, Lida Wu, Yi-Jun Front Genet Genetics ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC transporters is an urgent task. The present study used 188D as the feature extraction method, which is based on sequence information and physicochemical properties. We also visualized the feature extracted by t-Distributed Stochastic Neighbor Embedding (t-SNE). The sample based on the features extracted by 188D may be separated. Further, random forest (RF) is an efficient classifier to identify proteins. Under the 10-fold cross-validation of the model proposed here for a training set, the average accuracy rate of 10 training sets was 89.54%. We obtained values of 0.87 for specificity, 0.92 for sensitivity, and 0.79 for MCC. In the testing set, the accuracy achieved was 89%. These results suggest that the model combining 188D with RF is an optimal tool to identify ABC transporters. Frontiers Media S.A. 2020-03-25 /pmc/articles/PMC7109328/ /pubmed/32269586 http://dx.doi.org/10.3389/fgene.2020.00156 Text en Copyright © 2020 Hou, Wang and Wu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Hou, Ruiyan
Wang, Lida
Wu, Yi-Jun
Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
title Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
title_full Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
title_fullStr Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
title_full_unstemmed Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
title_short Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
title_sort predicting atp-binding cassette transporters using the random forest method
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7109328/
https://www.ncbi.nlm.nih.gov/pubmed/32269586
http://dx.doi.org/10.3389/fgene.2020.00156
work_keys_str_mv AT houruiyan predictingatpbindingcassettetransportersusingtherandomforestmethod
AT wanglida predictingatpbindingcassettetransportersusingtherandomforestmethod
AT wuyijun predictingatpbindingcassettetransportersusingtherandomforestmethod
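
Note on the metrics in the description: the record reports accuracy, specificity, sensitivity, and MCC (Matthews correlation coefficient) for a binary classifier. The sketch below shows how these four quantities are conventionally derived from confusion-matrix counts. The function name binary_metrics and the counts in the usage lines are hypothetical and chosen only for illustration; they are not the study's data, and the study's own evaluation code is not part of this record.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positives predicted correctly/incorrectly.
    tn/fp: negatives predicted correctly/incorrectly.
    Ratios with an empty denominator are returned as 0.0.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    # Denominator of the Matthews correlation coefficient.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts for illustration only (not taken from the study).
example = binary_metrics(tp=92, fp=13, tn=87, fn=8)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

MCC is useful alongside accuracy because it uses all four cells of the confusion matrix and stays informative when the two classes differ in size.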