Cell fishing: A similarity based approach and machine learning strategy for multiple cell lines-compound sensitivity prediction

The prediction of cell-lines sensitivity to a given set of compounds is a very important factor in the optimization of in-vitro assays. To date, the most common prediction strategies are based upon machine learning or other quantitative structure-activity relationships (QSAR) based approaches. In th...

Bibliographic Details
Main authors: Tejera, E., Carrera, I., Jimenes-Vargas, Karina, Armijos-Jaramillo, V., Sánchez-Rodríguez, A., Cruz-Monteagudo, M., Perez-Castillo, Y.
Format: Online Article Text
Language: English
Published: Public Library of Science 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6779297/
https://www.ncbi.nlm.nih.gov/pubmed/31589649
http://dx.doi.org/10.1371/journal.pone.0223276
_version_ 1783456912661020672
author Tejera, E.
Carrera, I.
Jimenes-Vargas, Karina
Armijos-Jaramillo, V.
Sánchez-Rodríguez, A.
Cruz-Monteagudo, M.
Perez-Castillo, Y.
collection PubMed
description Predicting the sensitivity of cell lines to a given set of compounds is an important factor in optimizing in vitro assays. To date, the most common prediction strategies rely on machine learning or other quantitative structure-activity relationship (QSAR) approaches. In the present research, we propose and discuss a straightforward strategy that involves no model learning and relies exclusively on the chemical similarity of a query compound to reference compounds with annotated activity against cell lines. We also compare the performance of the proposed method with machine learning predictions on the same problem. A curated database of compound-cell line associations derived from ChEMBL version 22 was created for algorithm construction and cross-validation. Validation was done using 10-fold cross-validation and by testing the models on new data obtained from ChEMBL version 25. In terms of accuracy, both methods perform similarly, with values around 0.65 across 750 cell lines in 10-fold cross-validation experiments. By combining both methods, it is possible to achieve a correct classification rate of 66% on more than 26,000 newly reported interactions comprising 11,000 new compounds. A web service implementing the described approaches (both similarity-based and machine learning-based models) is freely available at: http://bioquimio.udla.edu.ec/cellfishing.
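The similarity-based idea in the abstract can be illustrated with a minimal sketch. This record does not state which fingerprints, similarity threshold, or decision rule the authors use, so everything below is an assumption made for illustration: compounds are represented as sets of fingerprint bits, similarity is the Tanimoto coefficient, and each cell line takes the annotation of the most similar reference compound above a cutoff. The names `tanimoto`, `predict_sensitivity`, `REFERENCES`, and the example data are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_sensitivity(query, references, threshold=0.5):
    """Predict per-cell-line activity for a query compound.

    For each cell line, the prediction is the annotation of the most
    similar reference compound whose similarity to the query is at
    least `threshold` (an assumed rule, not the paper's). Cell lines
    with no sufficiently similar annotated reference get no prediction.
    """
    best = {}  # cell line -> (similarity, annotated activity)
    for fingerprint, annotations in references:
        sim = tanimoto(query, fingerprint)
        if sim < threshold:
            continue
        for cell_line, active in annotations.items():
            if cell_line not in best or sim > best[cell_line][0]:
                best[cell_line] = (sim, active)
    return {cell_line: active for cell_line, (_, active) in best.items()}


# Hypothetical reference set: (fingerprint bits, {cell line: active?}).
REFERENCES = [
    (frozenset({1, 2, 3, 4}), {"MCF7": True, "A549": False}),
    (frozenset({1, 2, 3, 9}), {"MCF7": False}),
    (frozenset({7, 8}), {"HeLa": True}),
]

QUERY = frozenset({1, 2, 3, 4, 5})
PREDICTIONS = predict_sensitivity(QUERY, REFERENCES)
```

In this toy data the first reference is the nearest neighbour of the query for both MCF7 and A549, and no HeLa prediction is made because the only HeLa-annotated compound shares no bits with the query. A real pipeline would compute fingerprints with a cheminformatics toolkit rather than hand-written bit sets.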
format Online
Article
Text
id pubmed-6779297
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6779297 2019-10-19 PLoS One Research Article
Public Library of Science 2019-10-07 /pmc/articles/PMC6779297/ /pubmed/31589649 http://dx.doi.org/10.1371/journal.pone.0223276 Text en © 2019 Tejera et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Cell fishing: A similarity based approach and machine learning strategy for multiple cell lines-compound sensitivity prediction
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6779297/
https://www.ncbi.nlm.nih.gov/pubmed/31589649
http://dx.doi.org/10.1371/journal.pone.0223276