A Bayesian approach to accurate and robust signature detection on LINCS L1000 data
MOTIVATION: The LINCS L1000 dataset contains a large volume of cellular expression data induced by large sets of perturbagens. Although it provides an invaluable resource for drug discovery and for understanding disease mechanisms, the existing peak deconvolution algorithms cannot recover the accurate expres...
Main authors: | Qiu, Yue; Lu, Tianhuan; Lim, Hansaim; Xie, Lei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2020 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7203754/ https://www.ncbi.nlm.nih.gov/pubmed/32003771 http://dx.doi.org/10.1093/bioinformatics/btaa064 |
_version_ | 1783529928071839744 |
---|---|
author | Qiu, Yue; Lu, Tianhuan; Lim, Hansaim; Xie, Lei |
collection | PubMed |
description | MOTIVATION: The LINCS L1000 dataset contains a large volume of cellular expression data induced by large sets of perturbagens. Although it provides an invaluable resource for drug discovery and for understanding disease mechanisms, the existing peak deconvolution algorithms cannot recover the accurate expression level of genes in many cases, which introduces severe noise into the dataset and limits its applications in biomedical studies. RESULTS: Here, we present a novel Bayesian-based peak deconvolution algorithm that gives unbiased likelihood estimations for peak locations and characterizes the peaks with probability-based z-scores. Based on this algorithm, we build a pipeline that processes raw data from the L1000 assay into signatures representing the features of perturbagens. The performance of the proposed pipeline is evaluated using the similarity between the signatures of bio-replicates and between those of drugs with shared targets. The results show that signatures derived from our pipeline give a substantially more reliable and informative representation of perturbagens than existing methods. Thus, the new pipeline may significantly boost the performance of L1000 data in downstream applications such as drug repurposing, disease modeling and gene function prediction. AVAILABILITY AND IMPLEMENTATION: The code and the precomputed data for LINCS L1000 Phase II (GSE 70138) are available at https://github.com/njpipeorgan/L1000-bayesian. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-7203754 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7203754 2020-05-11 A Bayesian approach to accurate and robust signature detection on LINCS L1000 data Qiu, Yue; Lu, Tianhuan; Lim, Hansaim; Xie, Lei Bioinformatics Original Papers Oxford University Press 2020-05-01 2020-01-31 /pmc/articles/PMC7203754/ /pubmed/32003771 http://dx.doi.org/10.1093/bioinformatics/btaa064 Text en © The Author(s) 2020. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | A Bayesian approach to accurate and robust signature detection on LINCS L1000 data |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7203754/ https://www.ncbi.nlm.nih.gov/pubmed/32003771 http://dx.doi.org/10.1093/bioinformatics/btaa064 |