Pysim-sv: a package for simulating structural variation data with GC-biases

BACKGROUND: Structural variations (SVs) are wide-spread in human genomes and may have important implications in disease-related and evolutionary studies. High-throughput sequencing (HTS) has become a major platform for SV detection and simulation serves as a powerful and cost-effective approach for...

Bibliographic Details
Main Authors: Xia, Yuchao, Liu, Yun, Deng, Minghua, Xi, Ruibin
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374556/
https://www.ncbi.nlm.nih.gov/pubmed/28361688
http://dx.doi.org/10.1186/s12859-017-1464-8
_version_ 1782518911489540096
author Xia, Yuchao
Liu, Yun
Deng, Minghua
Xi, Ruibin
author_facet Xia, Yuchao
Liu, Yun
Deng, Minghua
Xi, Ruibin
author_sort Xia, Yuchao
collection PubMed
description BACKGROUND: Structural variations (SVs) are widespread in human genomes and may have important implications in disease-related and evolutionary studies. High-throughput sequencing (HTS) has become a major platform for SV detection, and simulation serves as a powerful and cost-effective approach for benchmarking SV detection algorithms. Accurate performance assessment by simulation requires a simulator capable of generating data with all important features of real data, such as GC biases in HTS data and various complexities in tumor data. However, no available package has systematically addressed all issues in data simulation for SV benchmarking. RESULTS: Pysim-sv is a package for simulating HTS data to evaluate the performance of SV detection algorithms. Pysim-sv can introduce a wide spectrum of germline and somatic genomic variations. The package contains functionalities to simulate tumor data with aneuploidy and heterogeneous subclones, which is very useful in assessing algorithm performance in tumor studies. Furthermore, Pysim-sv can introduce GC-bias, the most important and prevalent bias in HTS data, into the simulated HTS data. CONCLUSIONS: Pysim-sv provides an unbiased toolkit for evaluating HTS-based SV detection algorithms.
format Online
Article
Text
id pubmed-5374556
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-53745562017-03-31 Pysim-sv: a package for simulating structural variation data with GC-biases Xia, Yuchao Liu, Yun Deng, Minghua Xi, Ruibin BMC Bioinformatics Research BACKGROUND: Structural variations (SVs) are widespread in human genomes and may have important implications in disease-related and evolutionary studies. High-throughput sequencing (HTS) has become a major platform for SV detection, and simulation serves as a powerful and cost-effective approach for benchmarking SV detection algorithms. Accurate performance assessment by simulation requires a simulator capable of generating data with all important features of real data, such as GC biases in HTS data and various complexities in tumor data. However, no available package has systematically addressed all issues in data simulation for SV benchmarking. RESULTS: Pysim-sv is a package for simulating HTS data to evaluate the performance of SV detection algorithms. Pysim-sv can introduce a wide spectrum of germline and somatic genomic variations. The package contains functionalities to simulate tumor data with aneuploidy and heterogeneous subclones, which is very useful in assessing algorithm performance in tumor studies. Furthermore, Pysim-sv can introduce GC-bias, the most important and prevalent bias in HTS data, into the simulated HTS data. CONCLUSIONS: Pysim-sv provides an unbiased toolkit for evaluating HTS-based SV detection algorithms. BioMed Central 2017-03-14 /pmc/articles/PMC5374556/ /pubmed/28361688 http://dx.doi.org/10.1186/s12859-017-1464-8 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research
Xia, Yuchao
Liu, Yun
Deng, Minghua
Xi, Ruibin
Pysim-sv: a package for simulating structural variation data with GC-biases
title Pysim-sv: a package for simulating structural variation data with GC-biases
title_full Pysim-sv: a package for simulating structural variation data with GC-biases
title_fullStr Pysim-sv: a package for simulating structural variation data with GC-biases
title_full_unstemmed Pysim-sv: a package for simulating structural variation data with GC-biases
title_short Pysim-sv: a package for simulating structural variation data with GC-biases
title_sort pysim-sv: a package for simulating structural variation data with gc-biases
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5374556/
https://www.ncbi.nlm.nih.gov/pubmed/28361688
http://dx.doi.org/10.1186/s12859-017-1464-8
work_keys_str_mv AT xiayuchao pysimsvapackageforsimulatingstructuralvariationdatawithgcbiases
AT liuyun pysimsvapackageforsimulatingstructuralvariationdatawithgcbiases
AT dengminghua pysimsvapackageforsimulatingstructuralvariationdatawithgcbiases
AT xiruibin pysimsvapackageforsimulatingstructuralvariationdatawithgcbiases