fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool

Bibliographic Details
Main Authors: Hubbard, Allen; Bomhoff, Matthew; Schmidt, Carl J.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2020
Subjects: Bioinformatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7231498/
https://www.ncbi.nlm.nih.gov/pubmed/32461821
http://dx.doi.org/10.7717/peerj.8592
_version_ 1783535203384295424
author Hubbard, Allen
Bomhoff, Matthew
Schmidt, Carl J.
author_facet Hubbard, Allen
Bomhoff, Matthew
Schmidt, Carl J.
author_sort Hubbard, Allen
collection PubMed
description BACKGROUND: Decreasing costs make RNA sequencing technologies increasingly affordable for biologists. However, many researchers who can now afford sequencing lack access to resources necessary for downstream analysis. This means that even as algorithms to process RNA-Seq data improve, many biologists still struggle to manage the sheer volume of data produced by next generation sequencing (NGS) technologies. Scalable bioinformatics tools that exploit multiple platforms are needed to democratize bioinformatics resources in the sequencing era. This is essential for equipping many research groups in the life sciences with the tools to process the increasingly unwieldy datasets they produce. METHODS: One strategy to address this challenge is to develop a modern generation of sequence analysis tools capable of seamless data sharing and communication. Such tools will provide interoperability through offerings of interlinked resources. Systems of interlinked, scalable resources, which often incorporate cloud data storage, are broadly referred to as cyberinfrastructure. Cyberinfrastructure integrated tools will help researchers to robustly analyze large scale datasets by efficiently sharing data burdens across a distributed architecture. Additionally, interoperability will allow emerging tools to cross-adapt features of existing tools. It is important that these tools are designed to be easy to use for biologists. RESULTS: We introduce fRNAkenseq, a powered-by-CyVerse RNA sequencing analysis tool that exhibits interoperability with other resources and meets the needs of biologists for comprehensive, easy to use RNA sequencing analysis. fRNAkenseq leverages a complex set of Application Programming Interfaces (APIs) associated with the NSF-funded cyberinfrastructure project, CyVerse, to execute FASTQ-to-differential expression RNA-Seq analyses. 
Integrating across bioinformatics platforms, fRNAkenseq also exploits cloud integration and cross-talk with another CyVerse-associated tool, CoGe. fRNAkenseq offers novel features for the biologist, such as more robust and comprehensive pipelines for enrichment than those currently available by default in a single tool, whether cloud-based or locally installed. Importantly, cross-talk with CoGe allows fRNAkenseq users to execute RNA-Seq pipelines on an inventory of 47,000 archived genomes stored in CoGe or to upload their own draft genome.
format Online
Article
Text
id pubmed-7231498
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-7231498 2020-05-26 fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool Hubbard, Allen Bomhoff, Matthew Schmidt, Carl J. PeerJ Bioinformatics PeerJ Inc. 2020-05-14 /pmc/articles/PMC7231498/ /pubmed/32461821 http://dx.doi.org/10.7717/peerj.8592 Text en © 2020 Hubbard et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
spellingShingle Bioinformatics
Hubbard, Allen
Bomhoff, Matthew
Schmidt, Carl J.
fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool
title fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool
title_full fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool
title_fullStr fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool
title_full_unstemmed fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool
title_short fRNAkenseq: a fully powered-by-CyVerse cloud integrated RNA-sequencing analysis tool
title_sort frnakenseq: a fully powered-by-cyverse cloud integrated rna-sequencing analysis tool
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7231498/
https://www.ncbi.nlm.nih.gov/pubmed/32461821
http://dx.doi.org/10.7717/peerj.8592
work_keys_str_mv AT hubbardallen frnakenseqafullypoweredbycyversecloudintegratedrnasequencinganalysistool
AT bomhoffmatthew frnakenseqafullypoweredbycyversecloudintegratedrnasequencinganalysistool
AT schmidtcarlj frnakenseqafullypoweredbycyversecloudintegratedrnasequencinganalysistool