Organism-specific training improves performance of linear B-cell epitope prediction

Bibliographic Details
Main authors: Ashford, Jodie; Reis-Cunha, João; Lobo, Igor; Lobo, Francisco; Campelo, Felipe
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8665745/
https://www.ncbi.nlm.nih.gov/pubmed/34289025
http://dx.doi.org/10.1093/bioinformatics/btab536
Description
Summary:
MOTIVATION: In silico identification of linear B-cell epitopes represents an important step in the development of diagnostic tests and vaccine candidates, by providing potential high-probability targets for experimental investigation. Current predictive tools were developed under a generalist approach, training models with heterogeneous datasets to develop predictors that can be deployed for a wide variety of pathogens. However, continuous advances in processing power and the increasing amount of epitope data for a broad range of pathogens indicate that training organism- or taxon-specific models may become a feasible alternative, with unexplored potential gains in predictive performance.
RESULTS: This article shows how organism-specific training of epitope prediction models can yield substantial performance gains across several quality metrics when compared to models trained with heterogeneous and hybrid data, and with a variety of widely used predictors from the literature. These results suggest a promising alternative for the development of custom-tailored predictive models with high predictive power, which can be easily implemented and deployed for the investigation of specific pathogens.
AVAILABILITY AND IMPLEMENTATION: The data underlying this article, as well as the full reproducibility scripts, are available at https://github.com/fcampelo/OrgSpec-paper. The R package that implements the organism-specific pipeline functions is available at https://github.com/fcampelo/epitopes.
SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.