Predicting ATP-Binding Cassette Transporters Using the Random Forest Method
ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC trans...
Main authors: | Hou, Ruiyan; Wang, Lida; Wu, Yi-Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7109328/ https://www.ncbi.nlm.nih.gov/pubmed/32269586 http://dx.doi.org/10.3389/fgene.2020.00156 |
_version_ | 1783512933046681600 |
---|---|
author | Hou, Ruiyan Wang, Lida Wu, Yi-Jun |
author_facet | Hou, Ruiyan Wang, Lida Wu, Yi-Jun |
author_sort | Hou, Ruiyan |
collection | PubMed |
description | ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC transporters is an urgent task. The present study used 188D as the feature extraction method, which is based on sequence information and physicochemical properties. We also visualized the extracted features using t-Distributed Stochastic Neighbor Embedding (t-SNE); the visualization suggests that samples represented by 188D features may be separable. Further, random forest (RF) is an efficient classifier for identifying proteins. Under 10-fold cross-validation of the proposed model on the training set, the average accuracy across the 10 folds was 89.54%. We obtained values of 0.87 for specificity, 0.92 for sensitivity, and 0.79 for the Matthews correlation coefficient (MCC). On the testing set, the accuracy was 89%. These results suggest that the model combining 188D with RF is an optimal tool for identifying ABC transporters. |
format | Online Article Text |
id | pubmed-7109328 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-71093282020-04-08 Predicting ATP-Binding Cassette Transporters Using the Random Forest Method Hou, Ruiyan Wang, Lida Wu, Yi-Jun Front Genet Genetics ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC transporters is an urgent task. The present study used 188D as the feature extraction method, which is based on sequence information and physicochemical properties. We also visualized the feature extracted by t-Distributed Stochastic Neighbor Embedding (t-SNE). The sample based on the features extracted by 188D may be separated. Further, random forest (RF) is an efficient classifier to identify proteins. Under the 10-fold cross-validation of the model proposed here for a training set, the average accuracy rate of 10 training sets was 89.54%. We obtained values of 0.87 for specificity, 0.92 for sensitivity, and 0.79 for MCC. In the testing set, the accuracy achieved was 89%. These results suggest that the model combining 188D with RF is an optimal tool to identify ABC transporters. Frontiers Media S.A. 2020-03-25 /pmc/articles/PMC7109328/ /pubmed/32269586 http://dx.doi.org/10.3389/fgene.2020.00156 Text en Copyright © 2020 Hou, Wang and Wu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Hou, Ruiyan Wang, Lida Wu, Yi-Jun Predicting ATP-Binding Cassette Transporters Using the Random Forest Method |
title | Predicting ATP-Binding Cassette Transporters Using the Random Forest Method |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7109328/ https://www.ncbi.nlm.nih.gov/pubmed/32269586 http://dx.doi.org/10.3389/fgene.2020.00156 |
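
The description reports specificity, sensitivity, accuracy, and MCC for a binary classifier. As a reading aid, the sketch below shows how these four measures are computed from confusion-matrix counts. The function name `binary_metrics` and the counts passed to it are hypothetical: they are not taken from the article, and were chosen only so that the results round to values similar to those quoted in the description (sensitivity 0.92, specificity 0.87, MCC about 0.79).

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for a balanced set of 100 positives and 100 negatives;
# these are illustrative only and are NOT the article's data.
metrics = binary_metrics(tp=92, fn=8, tn=87, fp=13)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC is useful alongside accuracy because it uses all four cells of the confusion matrix, so it stays informative when the positive and negative classes differ in size.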