
FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods

Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion mode...


Bibliographic Details
Main Authors: Becker, Timothy, Lee, Wan-Ping, Leone, Joseph, Zhu, Qihui, Zhang, Chengsheng, Liu, Silvia, Sargent, Jack, Shanker, Kritika, Mil-homens, Adam, Cerveira, Eliza, Ryan, Mallory, Cha, Jane, Navarro, Fabio C. P., Galeev, Timur, Gerstein, Mark, Mills, Ryan E., Shin, Dong-Guk, Lee, Charles, Malhotra, Ankit
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects: Method
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859555/
https://www.ncbi.nlm.nih.gov/pubmed/29559002
http://dx.doi.org/10.1186/s13059-018-1404-6
_version_ 1783307847519436800
author Becker, Timothy
Lee, Wan-Ping
Leone, Joseph
Zhu, Qihui
Zhang, Chengsheng
Liu, Silvia
Sargent, Jack
Shanker, Kritika
Mil-homens, Adam
Cerveira, Eliza
Ryan, Mallory
Cha, Jane
Navarro, Fabio C. P.
Galeev, Timur
Gerstein, Mark
Mills, Ryan E.
Shin, Dong-Guk
Lee, Charles
Malhotra, Ankit
author_sort Becker, Timothy
collection PubMed
description Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1404-6) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5859555
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-58595552018-03-22 FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods Becker, Timothy Lee, Wan-Ping Leone, Joseph Zhu, Qihui Zhang, Chengsheng Liu, Silvia Sargent, Jack Shanker, Kritika Mil-homens, Adam Cerveira, Eliza Ryan, Mallory Cha, Jane Navarro, Fabio C. P. Galeev, Timur Gerstein, Mark Mills, Ryan E. Shin, Dong-Guk Lee, Charles Malhotra, Ankit Genome Biol Method Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1404-6) contains supplementary material, which is available to authorized users. BioMed Central 2018-03-20 /pmc/articles/PMC5859555/ /pubmed/29559002 http://dx.doi.org/10.1186/s13059-018-1404-6 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859555/
https://www.ncbi.nlm.nih.gov/pubmed/29559002
http://dx.doi.org/10.1186/s13059-018-1404-6