Cargando…

An automatic representation of peptides for effective antimicrobial activity classification

Antimicrobial peptides (AMPs) are a promising alternative to small-molecules-based antibiotics. These peptides are part of most living organisms’ innate defense system. In order to computationally identify new AMPs within the peptides these organisms produce, an automatic AMP/non-AMP classifier is r...

Descripción completa

Detalles Bibliográficos
Autores principales: Beltran, Jesus A., Del Rio, Gabriel, Brizuela, Carlos A.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Research Network of Computational and Structural Biotechnology 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7063200/
https://www.ncbi.nlm.nih.gov/pubmed/32180904
http://dx.doi.org/10.1016/j.csbj.2020.02.002
Descripción
Sumario:Antimicrobial peptides (AMPs) are a promising alternative to small-molecules-based antibiotics. These peptides are part of most living organisms’ innate defense system. In order to computationally identify new AMPs within the peptides these organisms produce, an automatic AMP/non-AMP classifier is required. In order to have an efficient classifier, a set of robust features that can capture what differentiates an AMP from another that is not, has to be selected. However, the number of candidate descriptors is large (in the order of thousands) to allow for an exhaustive search of all possible combinations. Therefore, efficient and effective feature selection techniques are required. In this work, we propose an efficient wrapper technique to solve the feature selection problem for AMPs identification. The method is based on a Genetic Algorithm that uses a variable-length chromosome for representing the selected features and uses an objective function that considers the Mathew Correlation Coefficient and the number of selected features. Computational experiments show that the proposed method can produce competitive results regarding sensitivity, specificity, and MCC. Furthermore, the best classification results are achieved by using only 39 out of 272 molecular descriptors.