
CAMPAREE: a robust and configurable RNA expression simulator

BACKGROUND: The accurate interpretation of RNA-Seq data presents a moving target as scientists continue to introduce new experimental techniques and analysis algorithms. Simulated datasets are an invaluable tool to accurately assess the performance of RNA-Seq analysis methods. However, existing RNA-...


Bibliographic Details
Main authors: Lahens, Nicholas F., Brooks, Thomas G., Sarantopoulou, Dimitra, Nayak, Soumyashant, Lawrence, Cris, Mrčela, Antonijo, Srinivasan, Anand, Schug, Jonathan, Hogenesch, John B., Barash, Yoseph, Grant, Gregory R.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8467241/
https://www.ncbi.nlm.nih.gov/pubmed/34563123
http://dx.doi.org/10.1186/s12864-021-07934-2
_version_ 1784573346964307968
author Lahens, Nicholas F.
Brooks, Thomas G.
Sarantopoulou, Dimitra
Nayak, Soumyashant
Lawrence, Cris
Mrčela, Antonijo
Srinivasan, Anand
Schug, Jonathan
Hogenesch, John B.
Barash, Yoseph
Grant, Gregory R.
author_facet Lahens, Nicholas F.
Brooks, Thomas G.
Sarantopoulou, Dimitra
Nayak, Soumyashant
Lawrence, Cris
Mrčela, Antonijo
Srinivasan, Anand
Schug, Jonathan
Hogenesch, John B.
Barash, Yoseph
Grant, Gregory R.
author_sort Lahens, Nicholas F.
collection PubMed
description BACKGROUND: The accurate interpretation of RNA-Seq data presents a moving target as scientists continue to introduce new experimental techniques and analysis algorithms. Simulated datasets are an invaluable tool to accurately assess the performance of RNA-Seq analysis methods. However, existing RNA-Seq simulators focus on modeling the technical biases and artifacts of sequencing, rather than on simulating the original RNA samples. A first step in simulating RNA-Seq is to simulate RNA. RESULTS: To fill this need, we developed the Configurable And Modular Program Allowing RNA Expression Emulation (CAMPAREE), a simulator using empirical data to simulate diploid RNA samples at the level of individual molecules. We demonstrated CAMPAREE’s use for generating idealized coverage plots from real data, and for adding the ability to generate allele-specific data to existing RNA-Seq simulators that do not natively support this feature. CONCLUSIONS: Separating input sample modeling from library preparation/sequencing offers added flexibility for both users and developers to mix-and-match different sample and sequencing simulators to suit their specific needs. Furthermore, the ability to maintain sample and sequencing simulators independently provides greater agility to incorporate new biological findings about transcriptomics and new developments in sequencing technologies. Additionally, by simulating at the level of individual molecules, CAMPAREE has the potential to model molecules transcribed from the same genes as a heterogeneous population of transcripts with different states of degradation and processing (splicing, editing, etc.). CAMPAREE was developed in Python, is open source, and freely available at https://github.com/itmat/CAMPAREE. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07934-2.
format Online
Article
Text
id pubmed-8467241
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-84672412021-09-28 CAMPAREE: a robust and configurable RNA expression simulator Lahens, Nicholas F. Brooks, Thomas G. Sarantopoulou, Dimitra Nayak, Soumyashant Lawrence, Cris Mrčela, Antonijo Srinivasan, Anand Schug, Jonathan Hogenesch, John B. Barash, Yoseph Grant, Gregory R. BMC Genomics Software BACKGROUND: The accurate interpretation of RNA-Seq data presents a moving target as scientists continue to introduce new experimental techniques and analysis algorithms. Simulated datasets are an invaluable tool to accurately assess the performance of RNA-Seq analysis methods. However, existing RNA-Seq simulators focus on modeling the technical biases and artifacts of sequencing, rather than on simulating the original RNA samples. A first step in simulating RNA-Seq is to simulate RNA. RESULTS: To fill this need, we developed the Configurable And Modular Program Allowing RNA Expression Emulation (CAMPAREE), a simulator using empirical data to simulate diploid RNA samples at the level of individual molecules. We demonstrated CAMPAREE’s use for generating idealized coverage plots from real data, and for adding the ability to generate allele-specific data to existing RNA-Seq simulators that do not natively support this feature. CONCLUSIONS: Separating input sample modeling from library preparation/sequencing offers added flexibility for both users and developers to mix-and-match different sample and sequencing simulators to suit their specific needs. Furthermore, the ability to maintain sample and sequencing simulators independently provides greater agility to incorporate new biological findings about transcriptomics and new developments in sequencing technologies. Additionally, by simulating at the level of individual molecules, CAMPAREE has the potential to model molecules transcribed from the same genes as a heterogeneous population of transcripts with different states of degradation and processing (splicing, editing, etc.). 
CAMPAREE was developed in Python, is open source, and freely available at https://github.com/itmat/CAMPAREE. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-021-07934-2. BioMed Central 2021-09-25 /pmc/articles/PMC8467241/ /pubmed/34563123 http://dx.doi.org/10.1186/s12864-021-07934-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Software
Lahens, Nicholas F.
Brooks, Thomas G.
Sarantopoulou, Dimitra
Nayak, Soumyashant
Lawrence, Cris
Mrčela, Antonijo
Srinivasan, Anand
Schug, Jonathan
Hogenesch, John B.
Barash, Yoseph
Grant, Gregory R.
CAMPAREE: a robust and configurable RNA expression simulator
title CAMPAREE: a robust and configurable RNA expression simulator
title_full CAMPAREE: a robust and configurable RNA expression simulator
title_fullStr CAMPAREE: a robust and configurable RNA expression simulator
title_full_unstemmed CAMPAREE: a robust and configurable RNA expression simulator
title_short CAMPAREE: a robust and configurable RNA expression simulator
title_sort camparee: a robust and configurable rna expression simulator
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8467241/
https://www.ncbi.nlm.nih.gov/pubmed/34563123
http://dx.doi.org/10.1186/s12864-021-07934-2
work_keys_str_mv AT lahensnicholasf campareearobustandconfigurablernaexpressionsimulator
AT brooksthomasg campareearobustandconfigurablernaexpressionsimulator
AT sarantopouloudimitra campareearobustandconfigurablernaexpressionsimulator
AT nayaksoumyashant campareearobustandconfigurablernaexpressionsimulator
AT lawrencecris campareearobustandconfigurablernaexpressionsimulator
AT mrcelaantonijo campareearobustandconfigurablernaexpressionsimulator
AT srinivasananand campareearobustandconfigurablernaexpressionsimulator
AT schugjonathan campareearobustandconfigurablernaexpressionsimulator
AT hogeneschjohnb campareearobustandconfigurablernaexpressionsimulator
AT barashyoseph campareearobustandconfigurablernaexpressionsimulator
AT grantgregoryr campareearobustandconfigurablernaexpressionsimulator