SVachra: a tool to identify genomic structural variation in mate pair sequencing data containing inward and outward facing reads

Bibliographic Details
Main Authors: Hampton, Oliver A., English, Adam C., Wang, Mark, Salerno, William J., Liu, Yue, Muzny, Donna M., Han, Yi, Wheeler, David A., Worley, Kim C., Lupski, James R., Gibbs, Richard A.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5629590/
https://www.ncbi.nlm.nih.gov/pubmed/28984202
http://dx.doi.org/10.1186/s12864-017-4021-y
Description
Summary: BACKGROUND: Characterization of genomic structural variation (SV) is essential to expanding the research and clinical applications of genome sequencing. Reliance upon short DNA fragment paired end sequencing has yielded a wealth of single nucleotide variants and internal sequencing read insertions-deletions, at the cost of limited SV detection. Multi-kilobase DNA fragment mate pair sequencing has helped fill the void in SV detection, but has introduced new analytic challenges that require SV detection tools specifically designed for mate pair sequencing data. Here, we introduce SVachra (Structural Variation Assessment of CHRomosomal Aberrations), a breakpoint calling program that identifies large insertions-deletions, inversions, and inter- and intra-chromosomal translocations by utilizing both the inward and outward facing read types generated by mate pair sequencing. RESULTS: We demonstrate SVachra’s utility by executing the program on large-insert (Illumina Nextera) mate pair sequencing data from the personal genome of a single subject (HS1011). An additional long-read data set (Pacific BioSciences RSII) was also generated to validate SV calls from SVachra and from the other SV calling programs used for comparison. SVachra exhibited the highest validation rate and reported the widest distribution of SV types and size ranges among the SV callers compared. CONCLUSIONS: SVachra is a highly specific breakpoint calling program whose SV detection methodology is less biased than that of other callers. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-017-4021-y) contains supplementary material, which is available to authorized users.