An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data
BACKGROUND: Recent advances in sequencing technologies make it possible to comprehensively study structural variations (SVs) using sequence data of large-scale populations. More effort is now being devoted to developing methods that call SVs with exact breakpoints. Among these approaches, split-...
Main authors: | Zhang, Jin; Wang, Jiayin; Wu, Yufeng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3358659/ https://www.ncbi.nlm.nih.gov/pubmed/22537045 http://dx.doi.org/10.1186/1471-2105-13-S6-S6 |
_version_ | 1782233795282337792 |
---|---|
author | Zhang, Jin Wang, Jiayin Wu, Yufeng |
author_facet | Zhang, Jin Wang, Jiayin Wu, Yufeng |
author_sort | Zhang, Jin |
collection | PubMed |
description | BACKGROUND: Recent advances in sequencing technologies make it possible to comprehensively study structural variations (SVs) using sequence data of large-scale populations. More effort is now being devoted to developing methods that call SVs with exact breakpoints. Among these approaches, split-read mapping methods can be applied to low-coverage sequence data. With the increasing amount of data being generated, more efficient split-read mapping methods are still needed. Also, since sequence errors cannot be avoided with current sequencing technologies, more accurate split-read mapping methods are still needed to better handle sequence errors. RESULTS: In this paper, we present a split-read mapping method implemented in the program SVseq2, which improves on our previous work, SVseq1. Similar to SVseq1, SVseq2 calls deletions (and insertions) with exact breakpoints. SVseq2 achieves more accurate calling through split-read mapping within focal regions. SVseq2 also has a much-desired feature: there is no need to specify the maximum deletion size, while some existing split-read mapping methods need more memory and longer running time when a larger maximum deletion size is chosen. SVseq2 is also much faster because it only needs to examine a small number of ways of splitting the reads. Moreover, SVseq2 supports insertion calling from low-coverage sequence data, while SVseq1 only supports deletion finding. The program SVseq2 can be downloaded at http://www.engr.uconn.edu/~jiz08001/. CONCLUSIONS: SVseq2 enables accurate and efficient SV calling through split-read mapping within focal regions using paired-end reads. On many simulated and real sequence data sets, SVseq2 outperforms some other existing approaches in accuracy and efficiency, especially when sequence coverage is low. |
format | Online Article Text |
id | pubmed-3358659 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3358659 2012-05-24 An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data Zhang, Jin Wang, Jiayin Wu, Yufeng BMC Bioinformatics Proceedings BACKGROUND: Recent advances in sequencing technologies make it possible to comprehensively study structural variations (SVs) using sequence data of large-scale populations. More effort is now being devoted to developing methods that call SVs with exact breakpoints. Among these approaches, split-read mapping methods can be applied to low-coverage sequence data. With the increasing amount of data being generated, more efficient split-read mapping methods are still needed. Also, since sequence errors cannot be avoided with current sequencing technologies, more accurate split-read mapping methods are still needed to better handle sequence errors. RESULTS: In this paper, we present a split-read mapping method implemented in the program SVseq2, which improves on our previous work, SVseq1. Similar to SVseq1, SVseq2 calls deletions (and insertions) with exact breakpoints. SVseq2 achieves more accurate calling through split-read mapping within focal regions. SVseq2 also has a much-desired feature: there is no need to specify the maximum deletion size, while some existing split-read mapping methods need more memory and longer running time when a larger maximum deletion size is chosen. SVseq2 is also much faster because it only needs to examine a small number of ways of splitting the reads. Moreover, SVseq2 supports insertion calling from low-coverage sequence data, while SVseq1 only supports deletion finding. The program SVseq2 can be downloaded at http://www.engr.uconn.edu/~jiz08001/. CONCLUSIONS: SVseq2 enables accurate and efficient SV calling through split-read mapping within focal regions using paired-end reads. On many simulated and real sequence data sets, SVseq2 outperforms some other existing approaches in accuracy and efficiency, especially when sequence coverage is low. BioMed Central 2012-04-19 /pmc/articles/PMC3358659/ /pubmed/22537045 http://dx.doi.org/10.1186/1471-2105-13-S6-S6 Text en Copyright ©2012 Zhang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Zhang, Jin Wang, Jiayin Wu, Yufeng An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data |
title | An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data |
title_full | An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data |
title_fullStr | An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data |
title_full_unstemmed | An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data |
title_short | An improved approach for accurate and efficient calling of structural variations with low-coverage sequence data |
title_sort | improved approach for accurate and efficient calling of structural variations with low-coverage sequence data |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3358659/ https://www.ncbi.nlm.nih.gov/pubmed/22537045 http://dx.doi.org/10.1186/1471-2105-13-S6-S6 |
work_keys_str_mv | AT zhangjin animprovedapproachforaccurateandefficientcallingofstructuralvariationswithlowcoveragesequencedata AT wangjiayin animprovedapproachforaccurateandefficientcallingofstructuralvariationswithlowcoveragesequencedata AT wuyufeng animprovedapproachforaccurateandefficientcallingofstructuralvariationswithlowcoveragesequencedata AT zhangjin improvedapproachforaccurateandefficientcallingofstructuralvariationswithlowcoveragesequencedata AT wangjiayin improvedapproachforaccurateandefficientcallingofstructuralvariationswithlowcoveragesequencedata AT wuyufeng improvedapproachforaccurateandefficientcallingofstructuralvariationswithlowcoveragesequencedata |
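
The abstract describes calling deletions with exact breakpoints by split-read mapping. As a rough illustration of that general idea only, the toy sketch below splits a read into a prefix and a suffix, maps both halves to a reference by exact string matching, and reports the gap between them as a deletion. It is not the SVseq2 algorithm: the function name `find_deletion`, the `min_anchor` parameter, and the example sequences are all hypothetical, and the sketch ignores sequence errors, paired-end anchoring, and focal regions.

```python
from typing import Optional, Tuple


def find_deletion(reference: str, read: str,
                  min_anchor: int = 4) -> Optional[Tuple[int, int]]:
    """Toy split-read mapping with exact matches only (hypothetical helper).

    Tries every split of the read that leaves at least `min_anchor` bases
    on each side. Returns (left_breakpoint, deletion_length) for the first
    split whose prefix and suffix both map to the reference with a gap
    between them, or None if no such split exists.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        prefix, suffix = read[:split], read[split:]
        left = reference.find(prefix)
        while left != -1:
            left_end = left + len(prefix)
            # Search strictly downstream so the gap is at least one base.
            right = reference.find(suffix, left_end + 1)
            if right != -1:
                return left_end, right - left_end
            left = reference.find(prefix, left + 1)
    return None


# Hypothetical reference and a read that skips reference[8:16].
reference = "GATTACAGGCTTAACCGTAGCTAGGATCC"
read_with_deletion = reference[2:8] + reference[16:22]
read_without_deletion = reference[2:14]

print(find_deletion(reference, read_with_deletion))
print(find_deletion(reference, read_without_deletion))
```

Because the sketch returns the first split that works, a breakpoint flanked by identical bases is reported at its leftmost possible position. Note also that the sketch never needs a maximum deletion size, but only because it searches the whole reference naively, which would not scale to real genomes.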