NGSNGS: next-generation simulator for next-generation sequencing data

SUMMARY: With the rapid expansion of the capabilities of the DNA sequencers throughout the different sequencing generations, the quantity of generated data has likewise increased. This evolution has also led to new bioinformatical methods, for which in silico data have become crucial when verifying...

Bibliographic Details
Main Authors: Henriksen, Rasmus Amund, Zhao, Lei, Korneliussen, Thorfinn Sand
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects: Applications Note
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9891242/
https://www.ncbi.nlm.nih.gov/pubmed/36661298
http://dx.doi.org/10.1093/bioinformatics/btad041
author Henriksen, Rasmus Amund
Zhao, Lei
Korneliussen, Thorfinn Sand
collection PubMed
description SUMMARY: With the rapid expansion of the capabilities of the DNA sequencers throughout the different sequencing generations, the quantity of generated data has likewise increased. This evolution has also led to new bioinformatical methods, for which in silico data have become crucial when verifying the accuracy of a model or the robustness of a genomic analysis pipeline. Here, we present a multithreaded next-generation simulator for next-generation sequencing data (NGSNGS), which simulates reads faster than currently available methods and programs. NGSNGS can simulate reads with platform-specific characteristics based on nucleotide quality score profiles as well as including a post-mortem damage model which is relevant for simulating ancient DNA. The simulated sequences are sampled (with replacement) from a reference DNA genome, which can represent a haploid genome, polyploid assemblies or even population haplotypes and allows the user to simulate known variable sites directly. The program is implemented in a multithreading framework and is factors faster than currently available tools while extending their feature set and possible output formats. AVAILABILITY AND IMPLEMENTATION: The method and associated programs are released as open-source software, code and user manual are available at https://github.com/RAHenriksen/NGSNGS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9891242
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-98912422023-02-02 NGSNGS: next-generation simulator for next-generation sequencing data Henriksen, Rasmus Amund Zhao, Lei Korneliussen, Thorfinn Sand Bioinformatics Applications Note SUMMARY: With the rapid expansion of the capabilities of the DNA sequencers throughout the different sequencing generations, the quantity of generated data has likewise increased. This evolution has also led to new bioinformatical methods, for which in silico data have become crucial when verifying the accuracy of a model or the robustness of a genomic analysis pipeline. Here, we present a multithreaded next-generation simulator for next-generation sequencing data (NGSNGS), which simulates reads faster than currently available methods and programs. NGSNGS can simulate reads with platform-specific characteristics based on nucleotide quality score profiles as well as including a post-mortem damage model which is relevant for simulating ancient DNA. The simulated sequences are sampled (with replacement) from a reference DNA genome, which can represent a haploid genome, polyploid assemblies or even population haplotypes and allows the user to simulate known variable sites directly. The program is implemented in a multithreading framework and is factors faster than currently available tools while extending their feature set and possible output formats. AVAILABILITY AND IMPLEMENTATION: The method and associated programs are released as open-source software, code and user manual are available at https://github.com/RAHenriksen/NGSNGS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2023-01-20 /pmc/articles/PMC9891242/ /pubmed/36661298 http://dx.doi.org/10.1093/bioinformatics/btad041 Text en © The Author(s) 2023. Published by Oxford University Press. 
https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title NGSNGS: next-generation simulator for next-generation sequencing data
topic Applications Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9891242/
https://www.ncbi.nlm.nih.gov/pubmed/36661298
http://dx.doi.org/10.1093/bioinformatics/btad041