SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features
Protein Hot-Spots (HS) are experimentally determined amino acids, key to small ligand binding and tend to be structural landmarks on protein–protein interactions. As such, they were extensively approached by structure-based Machine Learning (ML) prediction methods. However, the availability of a muc...
Main authors: | Preto, A. J., Moreira, Irina S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7582262/ https://www.ncbi.nlm.nih.gov/pubmed/33019775 http://dx.doi.org/10.3390/ijms21197281 |
_version_ | 1783599150352302080 |
---|---|
author | Preto, A. J. Moreira, Irina S. |
author_facet | Preto, A. J. Moreira, Irina S. |
author_sort | Preto, A. J. |
collection | PubMed |
description | Protein Hot-Spots (HS) are experimentally determined amino acids, key to small ligand binding and tend to be structural landmarks on protein–protein interactions. As such, they were extensively approached by structure-based Machine Learning (ML) prediction methods. However, the availability of a much larger array of protein sequences in comparison to determined three-dimensional structures indicates that a sequence-based HS predictor has the potential to be more useful for the scientific community. Herein, we present SPOTONE, a new ML predictor able to accurately classify protein HS via sequence-only features. This algorithm shows accuracy, AUROC, precision, recall and F1-score of 0.82, 0.83, 0.91, 0.82 and 0.85, respectively, on an independent testing set. The algorithm is deployed within a free-to-use webserver, only requiring the user to submit a FASTA file with one or more protein sequences. |
format | Online Article Text |
id | pubmed-7582262 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-75822622020-10-28 SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features Preto, A. J. Moreira, Irina S. Int J Mol Sci Article Protein Hot-Spots (HS) are experimentally determined amino acids, key to small ligand binding and tend to be structural landmarks on protein–protein interactions. As such, they were extensively approached by structure-based Machine Learning (ML) prediction methods. However, the availability of a much larger array of protein sequences in comparison to determined three-dimensional structures indicates that a sequence-based HS predictor has the potential to be more useful for the scientific community. Herein, we present SPOTONE, a new ML predictor able to accurately classify protein HS via sequence-only features. This algorithm shows accuracy, AUROC, precision, recall and F1-score of 0.82, 0.83, 0.91, 0.82 and 0.85, respectively, on an independent testing set. The algorithm is deployed within a free-to-use webserver, only requiring the user to submit a FASTA file with one or more protein sequences. MDPI 2020-10-01 /pmc/articles/PMC7582262/ /pubmed/33019775 http://dx.doi.org/10.3390/ijms21197281 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Preto, A. J. Moreira, Irina S. SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features |
title | SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features |
title_full | SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features |
title_fullStr | SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features |
title_full_unstemmed | SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features |
title_short | SPOTONE: Hot Spots on Protein Complexes with Extremely Randomized Trees via Sequence-Only Features |
title_sort | spotone: hot spots on protein complexes with extremely randomized trees via sequence-only features |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7582262/ https://www.ncbi.nlm.nih.gov/pubmed/33019775 http://dx.doi.org/10.3390/ijms21197281 |
work_keys_str_mv | AT pretoaj spotonehotspotsonproteincomplexeswithextremelyrandomizedtreesviasequenceonlyfeatures AT moreirairinas spotonehotspotsonproteincomplexeswithextremelyrandomizedtreesviasequenceonlyfeatures |