
Improving deconvolution methods in biology through open innovation competitions: an application to the connectivity map

MOTIVATION: Do machine learning methods improve standard deconvolution techniques for gene expression data? This article uses a unique new dataset combined with an open innovation competition to evaluate a wide range of approaches developed by 294 competitors from 20 countries. The competition’s obj...


Bibliographic Details
Main authors: Blasco, Andrea, Natoli, Ted, Endres, Michael G, Sergeev, Rinat A, Randazzo, Steven, Paik, Jin H, Macaluso, N J Maximilian, Narayan, Rajiv, Lu, Xiaodong, Peck, David, Lakhani, Karim R, Subramanian, Aravind
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479655/
https://www.ncbi.nlm.nih.gov/pubmed/33824954
http://dx.doi.org/10.1093/bioinformatics/btab192
_version_ 1784576305886396416
author Blasco, Andrea
Natoli, Ted
Endres, Michael G
Sergeev, Rinat A
Randazzo, Steven
Paik, Jin H
Macaluso, N J Maximilian
Narayan, Rajiv
Lu, Xiaodong
Peck, David
Lakhani, Karim R
Subramanian, Aravind
collection PubMed
description MOTIVATION: Do machine learning methods improve standard deconvolution techniques for gene expression data? This article uses a unique new dataset combined with an open innovation competition to evaluate a wide range of approaches developed by 294 competitors from 20 countries. The competition’s objective was to address a deconvolution problem critical to analyzing genetic perturbations from the Connectivity Map. The issue consists of separating gene expression of individual genes from raw measurements obtained from gene pairs. We evaluated the outcomes using ground-truth data (direct measurements for single genes) obtained from the same samples. RESULTS: We find that the top-ranked algorithm, based on random forest regression, beat the other methods in accuracy and reproducibility; more traditional gaussian-mixture methods performed well and tended to be faster, and the best deep learning approach yielded outcomes slightly inferior to the above methods. We anticipate researchers in the field will find the dataset and algorithms developed in this study to be a powerful research tool for benchmarking their deconvolution methods and a resource useful for multiple applications. AVAILABILITY AND IMPLEMENTATION: The data is freely available at clue.io/data (section Contests) and the software is on GitHub at https://github.com/cmap/gene_deconvolution_challenge SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-8479655
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-84796552021-09-30 Improving deconvolution methods in biology through open innovation competitions: an application to the connectivity map Blasco, Andrea Natoli, Ted Endres, Michael G Sergeev, Rinat A Randazzo, Steven Paik, Jin H Macaluso, N J Maximilian Narayan, Rajiv Lu, Xiaodong Peck, David Lakhani, Karim R Subramanian, Aravind Bioinformatics Original Papers MOTIVATION: Do machine learning methods improve standard deconvolution techniques for gene expression data? This article uses a unique new dataset combined with an open innovation competition to evaluate a wide range of approaches developed by 294 competitors from 20 countries. The competition’s objective was to address a deconvolution problem critical to analyzing genetic perturbations from the Connectivity Map. The issue consists of separating gene expression of individual genes from raw measurements obtained from gene pairs. We evaluated the outcomes using ground-truth data (direct measurements for single genes) obtained from the same samples. RESULTS: We find that the top-ranked algorithm, based on random forest regression, beat the other methods in accuracy and reproducibility; more traditional gaussian-mixture methods performed well and tended to be faster, and the best deep learning approach yielded outcomes slightly inferior to the above methods. We anticipate researchers in the field will find the dataset and algorithms developed in this study to be a powerful research tool for benchmarking their deconvolution methods and a resource useful for multiple applications. AVAILABILITY AND IMPLEMENTATION: The data is freely available at clue.io/data (section Contests) and the software is on GitHub at https://github.com/cmap/gene_deconvolution_challenge SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2021-03-22 /pmc/articles/PMC8479655/ /pubmed/33824954 http://dx.doi.org/10.1093/bioinformatics/btab192 Text en © The Author(s) 2021. 
Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Improving deconvolution methods in biology through open innovation competitions: an application to the connectivity map
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479655/
https://www.ncbi.nlm.nih.gov/pubmed/33824954
http://dx.doi.org/10.1093/bioinformatics/btab192