The seeker R package: simplified fetching and processing of transcriptome data
Transcriptome data have become invaluable for interrogating biological systems. Preparing a transcriptome dataset for analysis, particularly an RNA-seq dataset, entails multiple steps and software programs, each with its own command-line interface (CLI). Although these CLIs are powerful, they often...
Main authors: | Schoenbachler, Joshua L.; Hughey, Jacob J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2022 |
Subjects: | Bioinformatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9648347/ https://www.ncbi.nlm.nih.gov/pubmed/36389425 http://dx.doi.org/10.7717/peerj.14372 |
_version_ | 1784827563977211904 |
---|---|
author | Schoenbachler, Joshua L. Hughey, Jacob J. |
collection | PubMed |
description | Transcriptome data have become invaluable for interrogating biological systems. Preparing a transcriptome dataset for analysis, particularly an RNA-seq dataset, entails multiple steps and software programs, each with its own command-line interface (CLI). Although these CLIs are powerful, they often require shell scripting for automation and parallelization, which can have a high learning curve, especially when the details of the CLIs vary from one tool to another. However, many individuals working with transcriptome data are already familiar with R due to the plethora and popularity of R-based tools for analyzing biological data. Thus, we developed an R package called seeker for simplified fetching and processing of RNA-seq and microarray data. Seeker is a wrapper around various existing tools, and provides a standard interface, simple parallelization, and detailed logging. Seeker’s primary output—sample metadata and gene expression values based on Entrez or Ensembl Gene IDs—can be directly plugged into a differential expression analysis. To maximize reproducibility, seeker is available as a standalone R package and in a Docker image that includes all dependencies, both of which are accessible at https://seeker.hugheylab.org. |
format | Online Article Text |
id | pubmed-9648347 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9648347 2022-11-15 The seeker R package: simplified fetching and processing of transcriptome data Schoenbachler, Joshua L. Hughey, Jacob J. PeerJ Bioinformatics PeerJ Inc. 2022-11-07 /pmc/articles/PMC9648347/ /pubmed/36389425 http://dx.doi.org/10.7717/peerj.14372 Text en © 2022 Schoenbachler and Hughey https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
title | The seeker R package: simplified fetching and processing of transcriptome data |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9648347/ https://www.ncbi.nlm.nih.gov/pubmed/36389425 http://dx.doi.org/10.7717/peerj.14372 |