
Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis

Many tools and algorithms are available for analyzing transcriptomics data. These include algorithms for performing sequence alignment, data normalization and imputation, clustering, identifying differentially expressed genes, and performing gene set enrichment analysis. To make the best choice abou...


Bibliographic Details
Main Authors: Xie, Zhuorui; Chen, Clara; Ma’ayan, Avi
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638921/
https://www.ncbi.nlm.nih.gov/pubmed/37953774
http://dx.doi.org/10.7717/peerj.16351
_version_ 1785133700926668800
author Xie, Zhuorui
Chen, Clara
Ma’ayan, Avi
author_facet Xie, Zhuorui
Chen, Clara
Ma’ayan, Avi
author_sort Xie, Zhuorui
collection PubMed
description Many tools and algorithms are available for analyzing transcriptomics data. These include algorithms for performing sequence alignment, data normalization and imputation, clustering, identifying differentially expressed genes, and performing gene set enrichment analysis. To make the best choice about which tools to use, objective benchmarks can be developed to compare the quality of different algorithms to extract biological knowledge maximally and accurately from these data. The Dexamethasone Benchmark (Dex-Benchmark) resource aims to fill this need by providing the community with datasets and code templates for benchmarking different gene expression analysis tools and algorithms. The resource provides access to a collection of curated RNA-seq, L1000, and ChIP-seq data from dexamethasone treatment as well as genetic perturbations of its known targets. In addition, the website provides Jupyter Notebooks that use these pre-processed curated datasets to demonstrate how to benchmark the different steps in gene expression analysis. By comparing two independent data sources and data types with some expected concordance, we can assess which tools and algorithms best recover such associations. To demonstrate the usefulness of the resource for discovering novel drug targets, we applied it to optimize data processing strategies for the chemical perturbations and CRISPR single gene knockouts from the L1000 transcriptomics data from the Library of Integrated Network Cellular Signatures (LINCS) program, with a focus on understudied proteins from the Illuminating the Druggable Genome (IDG) program. Overall, the Dex-Benchmark resource can be utilized to assess the quality of transcriptomics and other related bioinformatics data analysis workflows. The resource is available from: https://maayanlab.github.io/dex-benchmark.
format Online
Article
Text
id pubmed-10638921
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-106389212023-11-11 Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis Xie, Zhuorui Chen, Clara Ma’ayan, Avi PeerJ Bioinformatics Many tools and algorithms are available for analyzing transcriptomics data. These include algorithms for performing sequence alignment, data normalization and imputation, clustering, identifying differentially expressed genes, and performing gene set enrichment analysis. To make the best choice about which tools to use, objective benchmarks can be developed to compare the quality of different algorithms to extract biological knowledge maximally and accurately from these data. The Dexamethasone Benchmark (Dex-Benchmark) resource aims to fill this need by providing the community with datasets and code templates for benchmarking different gene expression analysis tools and algorithms. The resource provides access to a collection of curated RNA-seq, L1000, and ChIP-seq data from dexamethasone treatment as well as genetic perturbations of its known targets. In addition, the website provides Jupyter Notebooks that use these pre-processed curated datasets to demonstrate how to benchmark the different steps in gene expression analysis. By comparing two independent data sources and data types with some expected concordance, we can assess which tools and algorithms best recover such associations. To demonstrate the usefulness of the resource for discovering novel drug targets, we applied it to optimize data processing strategies for the chemical perturbations and CRISPR single gene knockouts from the L1000 transcriptomics data from the Library of Integrated Network Cellular Signatures (LINCS) program, with a focus on understudied proteins from the Illuminating the Druggable Genome (IDG) program. Overall, the Dex-Benchmark resource can be utilized to assess the quality of transcriptomics and other related bioinformatics data analysis workflows. 
The resource is available from: https://maayanlab.github.io/dex-benchmark. PeerJ Inc. 2023-11-08 /pmc/articles/PMC10638921/ /pubmed/37953774 http://dx.doi.org/10.7717/peerj.16351 Text en © 2023 Xie et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Xie, Zhuorui
Chen, Clara
Ma’ayan, Avi
Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis
title Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis
title_full Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis
title_fullStr Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis
title_full_unstemmed Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis
title_short Dex-Benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis
title_sort dex-benchmark: datasets and code to evaluate algorithms for transcriptomics data analysis
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638921/
https://www.ncbi.nlm.nih.gov/pubmed/37953774
http://dx.doi.org/10.7717/peerj.16351
work_keys_str_mv AT xiezhuorui dexbenchmarkdatasetsandcodetoevaluatealgorithmsfortranscriptomicsdataanalysis
AT chenclara dexbenchmarkdatasetsandcodetoevaluatealgorithmsfortranscriptomicsdataanalysis
AT maayanavi dexbenchmarkdatasetsandcodetoevaluatealgorithmsfortranscriptomicsdataanalysis