FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods

Bibliographic Details
Main authors: Becker, Timothy; Lee, Wan-Ping; Leone, Joseph; Zhu, Qihui; Zhang, Chengsheng; Liu, Silvia; Sargent, Jack; Shanker, Kritika; Mil-homens, Adam; Cerveira, Eliza; Ryan, Mallory; Cha, Jane; Navarro, Fabio C. P.; Galeev, Timur; Gerstein, Mark; Mills, Ryan E.; Shin, Dong-Guk; Lee, Charles; Malhotra, Ankit
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859555/
https://www.ncbi.nlm.nih.gov/pubmed/29559002
http://dx.doi.org/10.1186/s13059-018-1404-6
Description
Summary: Comprehensive and accurate identification of structural variations (SVs) from next-generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1404-6) contains supplementary material, which is available to authorized users.