Fluent genomics with plyranges and tximeta

We construct a simple workflow for fluent genomics data analysis using the R/Bioconductor ecosystem. This involves three core steps: import the data into an appropriate abstraction, model the data with respect to the biological questions of interest, and integrate the results with respect to their u...

Bibliographic Details
Main Authors:	Lee, Stuart, Lawrence, Michael, Love, Michael I.
Format:	Online Article Text
Language:	English
Published:	F1000 Research Limited 2020
Subjects:	Method Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7243206/ https://www.ncbi.nlm.nih.gov/pubmed/32528659 http://dx.doi.org/10.12688/f1000research.22259.1

_version_	1783537384272429056
author	Lee, Stuart Lawrence, Michael Love, Michael I.
author_facet	Lee, Stuart Lawrence, Michael Love, Michael I.
author_sort	Lee, Stuart
collection	PubMed
description	We construct a simple workflow for fluent genomics data analysis using the R/Bioconductor ecosystem. This involves three core steps: import the data into an appropriate abstraction, model the data with respect to the biological questions of interest, and integrate the results with respect to their underlying genomic coordinates. Here we show how to implement these steps to integrate published RNA-seq and ATAC-seq experiments on macrophage cell lines. Using tximeta, we import RNA-seq transcript quantifications into an analysis-ready data structure, called the SummarizedExperiment, that contains the ranges of the reference transcripts and metadata on their provenance. Using SummarizedExperiments to represent the ATAC-seq and RNA-seq data, we model differentially accessible (DA) chromatin peaks and differentially expressed (DE) genes with existing Bioconductor packages. Using plyranges we then integrate the results to see if there is an enrichment of DA peaks near DE genes by finding overlaps and aggregating over log-fold change thresholds. The combination of these packages and their integration with the Bioconductor ecosystem provide a coherent framework for analysts to iteratively and reproducibly explore their biological data.
format	Online Article Text
id	pubmed-7243206
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	F1000 Research Limited
record_format	MEDLINE/PubMed
spelling	pubmed-7243206 2020-06-10 Fluent genomics with plyranges and tximeta Lee, Stuart Lawrence, Michael Love, Michael I. F1000Res Method Article We construct a simple workflow for fluent genomics data analysis using the R/Bioconductor ecosystem. This involves three core steps: import the data into an appropriate abstraction, model the data with respect to the biological questions of interest, and integrate the results with respect to their underlying genomic coordinates. Here we show how to implement these steps to integrate published RNA-seq and ATAC-seq experiments on macrophage cell lines. Using tximeta, we import RNA-seq transcript quantifications into an analysis-ready data structure, called the SummarizedExperiment, that contains the ranges of the reference transcripts and metadata on their provenance. Using SummarizedExperiments to represent the ATAC-seq and RNA-seq data, we model differentially accessible (DA) chromatin peaks and differentially expressed (DE) genes with existing Bioconductor packages. Using plyranges we then integrate the results to see if there is an enrichment of DA peaks near DE genes by finding overlaps and aggregating over log-fold change thresholds. The combination of these packages and their integration with the Bioconductor ecosystem provide a coherent framework for analysts to iteratively and reproducibly explore their biological data. F1000 Research Limited 2020-02-12 /pmc/articles/PMC7243206/ /pubmed/32528659 http://dx.doi.org/10.12688/f1000research.22259.1 Text en Copyright: © 2020 Lee S et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Method Article Lee, Stuart Lawrence, Michael Love, Michael I. Fluent genomics with plyranges and tximeta
title	Fluent genomics with plyranges and tximeta
title_full	Fluent genomics with plyranges and tximeta
title_fullStr	Fluent genomics with plyranges and tximeta
title_full_unstemmed	Fluent genomics with plyranges and tximeta
title_short	Fluent genomics with plyranges and tximeta
title_sort	fluent genomics with plyranges and tximeta
topic	Method Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7243206/ https://www.ncbi.nlm.nih.gov/pubmed/32528659 http://dx.doi.org/10.12688/f1000research.22259.1
work_keys_str_mv	AT leestuart fluentgenomicswithplyrangesandtximeta AT lawrencemichael fluentgenomicswithplyrangesandtximeta AT lovemichaeli fluentgenomicswithplyrangesandtximeta
