FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods
Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion mode...
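Main Authors: | Becker, Timothy; Lee, Wan-Ping; Leone, Joseph; Zhu, Qihui; Zhang, Chengsheng; Liu, Silvia; Sargent, Jack; Shanker, Kritika; Mil-homens, Adam; Cerveira, Eliza; Ryan, Mallory; Cha, Jane; Navarro, Fabio C. P.; Galeev, Timur; Gerstein, Mark; Mills, Ryan E.; Shin, Dong-Guk; Lee, Charles; Malhotra, Ankit |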
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859555/ https://www.ncbi.nlm.nih.gov/pubmed/29559002 http://dx.doi.org/10.1186/s13059-018-1404-6 |
_version_ | 1783307847519436800 |
---|---|
author | Becker, Timothy Lee, Wan-Ping Leone, Joseph Zhu, Qihui Zhang, Chengsheng Liu, Silvia Sargent, Jack Shanker, Kritika Mil-homens, Adam Cerveira, Eliza Ryan, Mallory Cha, Jane Navarro, Fabio C. P. Galeev, Timur Gerstein, Mark Mills, Ryan E. Shin, Dong-Guk Lee, Charles Malhotra, Ankit |
author_sort | Becker, Timothy |
collection | PubMed |
description | Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1404-6) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5859555 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-58595552018-03-22 FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods Becker, Timothy Lee, Wan-Ping Leone, Joseph Zhu, Qihui Zhang, Chengsheng Liu, Silvia Sargent, Jack Shanker, Kritika Mil-homens, Adam Cerveira, Eliza Ryan, Mallory Cha, Jane Navarro, Fabio C. P. Galeev, Timur Gerstein, Mark Mills, Ryan E. Shin, Dong-Guk Lee, Charles Malhotra, Ankit Genome Biol Method Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s13059-018-1404-6) contains supplementary material, which is available to authorized users. BioMed Central 2018-03-20 /pmc/articles/PMC5859555/ /pubmed/29559002 http://dx.doi.org/10.1186/s13059-018-1404-6 Text en © The Author(s). 2018 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Method Becker, Timothy Lee, Wan-Ping Leone, Joseph Zhu, Qihui Zhang, Chengsheng Liu, Silvia Sargent, Jack Shanker, Kritika Mil-homens, Adam Cerveira, Eliza Ryan, Mallory Cha, Jane Navarro, Fabio C. P. Galeev, Timur Gerstein, Mark Mills, Ryan E. Shin, Dong-Guk Lee, Charles Malhotra, Ankit FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods |
title | FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods |
topic | Method |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5859555/ https://www.ncbi.nlm.nih.gov/pubmed/29559002 http://dx.doi.org/10.1186/s13059-018-1404-6 |