Improving deconvolution methods in biology through open innovation competitions: an application to the connectivity map
MOTIVATION: Do machine learning methods improve standard deconvolution techniques for gene expression data? This article uses a unique new dataset combined with an open innovation competition to evaluate a wide range of approaches developed by 294 competitors from 20 countries. The competition’s obj...
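Main authors: | Blasco, Andrea; Natoli, Ted; Endres, Michael G; Sergeev, Rinat A; Randazzo, Steven; Paik, Jin H; Macaluso, N J Maximilian; Narayan, Rajiv; Lu, Xiaodong; Peck, David; Lakhani, Karim R; Subramanian, Aravind |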
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2021 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479655/ https://www.ncbi.nlm.nih.gov/pubmed/33824954 http://dx.doi.org/10.1093/bioinformatics/btab192 |
_version_ | 1784576305886396416 |
---|---|
author | Blasco, Andrea; Natoli, Ted; Endres, Michael G; Sergeev, Rinat A; Randazzo, Steven; Paik, Jin H; Macaluso, N J Maximilian; Narayan, Rajiv; Lu, Xiaodong; Peck, David; Lakhani, Karim R; Subramanian, Aravind |
author_sort | Blasco, Andrea |
collection | PubMed |
description | MOTIVATION: Do machine learning methods improve standard deconvolution techniques for gene expression data? This article uses a unique new dataset combined with an open innovation competition to evaluate a wide range of approaches developed by 294 competitors from 20 countries. The competition’s objective was to address a deconvolution problem critical to analyzing genetic perturbations from the Connectivity Map. The issue consists of separating gene expression of individual genes from raw measurements obtained from gene pairs. We evaluated the outcomes using ground-truth data (direct measurements for single genes) obtained from the same samples. RESULTS: We find that the top-ranked algorithm, based on random forest regression, beat the other methods in accuracy and reproducibility; more traditional Gaussian-mixture methods performed well and tended to be faster, and the best deep learning approach yielded outcomes slightly inferior to the above methods. We anticipate researchers in the field will find the dataset and algorithms developed in this study to be a powerful research tool for benchmarking their deconvolution methods and a resource useful for multiple applications. AVAILABILITY AND IMPLEMENTATION: The data is freely available at clue.io/data (section Contests) and the software is on GitHub at https://github.com/cmap/gene_deconvolution_challenge SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-8479655 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8479655 2021-09-30 Bioinformatics Original Papers Oxford University Press 2021-03-22 /pmc/articles/PMC8479655/ /pubmed/33824954 http://dx.doi.org/10.1093/bioinformatics/btab192 Text en © The Author(s) 2021. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Improving deconvolution methods in biology through open innovation competitions: an application to the connectivity map |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8479655/ https://www.ncbi.nlm.nih.gov/pubmed/33824954 http://dx.doi.org/10.1093/bioinformatics/btab192 |
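
The description above frames deconvolution as separating the expression of two genes from one raw measurement taken on a gene pair, and mentions Gaussian-mixture methods as the traditional approach. The following is a minimal sketch of that idea, not the authors' or any competitor's implementation: it fits a two-component one-dimensional Gaussian mixture by expectation-maximization to a simulated set of intensities. The function name `fit_two_gaussians`, the simulated values, and the 2:1 mixing proportion are all assumptions made for illustration.

```python
import math
import random


def fit_two_gaussians(values, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM (illustrative)."""
    # Initialise the two means at the extremes of the data.
    mu = [min(values), max(values)]
    spread = (max(values) - min(values)) / 4 or 1.0
    sigma = [spread, spread]
    weight = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of component 0 for each value.
        resp = []
        for x in values:
            p = [
                weight[k]
                * math.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                / (sigma[k] * math.sqrt(2 * math.pi))
                for k in range(2)
            ]
            total = p[0] + p[1]
            resp.append(p[0] / total if total > 0 else 0.5)

        # M-step: update weight, mean and standard deviation per component.
        for k in range(2):
            r = resp if k == 0 else [1 - v for v in resp]
            n_k = sum(r)
            if n_k < 1e-9:
                continue  # leave an empty component unchanged
            weight[k] = n_k / len(values)
            mu[k] = sum(ri * x for ri, x in zip(r, values)) / n_k
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, values)) / n_k
            sigma[k] = max(math.sqrt(var), 1e-3)  # floor avoids collapse

    return mu, sigma, weight


# Simulated measurements for one hypothetical gene pair: two genes share a
# single readout and are mixed in an assumed 2:1 proportion.
rng = random.Random(0)
signal = [rng.gauss(6.0, 0.3) for _ in range(200)]
signal += [rng.gauss(10.0, 0.3) for _ in range(100)]

mu, sigma, weight = fit_two_gaussians(signal)

# The component with the larger weight is attributed to the more abundant gene.
major = 0 if weight[0] >= weight[1] else 1
print(f"major gene estimate: {mu[major]:.2f}, minor gene estimate: {mu[1 - major]:.2f}")
```

In this sketch the two fitted means serve as the per-gene expression estimates, and the fitted weights indicate which peak belongs to which gene. A benchmark of the kind the description outlines would compare such estimates against direct single-gene measurements from the same samples.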