Cargando…

Predicting ATP-Binding Cassette Transporters Using the Random Forest Method

ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC trans...

Descripción completa

Detalles Bibliográficos
Autores principales: Hou, Ruiyan, Wang, Lida, Wu, Yi-Jun
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Frontiers Media S.A. 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7109328/
https://www.ncbi.nlm.nih.gov/pubmed/32269586
http://dx.doi.org/10.3389/fgene.2020.00156
Descripción
Sumario:ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC transporters is an urgent task. The present study used 188D as the feature extraction method, which is based on sequence information and physicochemical properties. We also visualized the feature extracted by t-Distributed Stochastic Neighbor Embedding (t-SNE). The sample based on the features extracted by 188D may be separated. Further, random forest (RF) is an efficient classifier to identify proteins. Under the 10-fold cross-validation of the model proposed here for a training set, the average accuracy rate of 10 training sets was 89.54%. We obtained values of 0.87 for specificity, 0.92 for sensitivity, and 0.79 for MCC. In the testing set, the accuracy achieved was 89%. These results suggest that the model combining 188D with RF is an optimal tool to identify ABC transporters.