Parliament2: Accurate structural variant calling at scale

BACKGROUND: Structural variants (SVs) are critical contributors to genetic diversity and genomic disease. To predict the phenotypic impact of SVs, there is a need for better estimates of both the occurrence and frequency of SVs, preferably from large, ethnically diverse cohorts. Thus, the current st...

Bibliographic Details
Main Authors: Zarate, Samantha, Carroll, Andrew, Mahmoud, Medhat, Krasheninina, Olga, Jun, Goo, Salerno, William J, Schatz, Michael C, Boerwinkle, Eric, Gibbs, Richard A, Sedlazeck, Fritz J
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Technical Note
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7751401/
https://www.ncbi.nlm.nih.gov/pubmed/33347570
http://dx.doi.org/10.1093/gigascience/giaa145
author Zarate, Samantha
Carroll, Andrew
Mahmoud, Medhat
Krasheninina, Olga
Jun, Goo
Salerno, William J
Schatz, Michael C
Boerwinkle, Eric
Gibbs, Richard A
Sedlazeck, Fritz J
author_sort Zarate, Samantha
collection PubMed
description BACKGROUND: Structural variants (SVs) are critical contributors to genetic diversity and genomic disease. To predict the phenotypic impact of SVs, there is a need for better estimates of both the occurrence and frequency of SVs, preferably from large, ethnically diverse cohorts. Thus, the current standard approach requires the use of short paired-end reads, from which SVs remain challenging to detect, especially at the scale of hundreds to thousands of samples. FINDINGS: We present Parliament2, a consensus SV framework that leverages multiple best-in-class methods to identify high-quality SVs from short-read DNA sequence data at scale. Parliament2 incorporates pre-installed SV callers that are optimized for efficient execution in parallel to reduce the overall runtime and costs. We demonstrate the accuracy of Parliament2 when applied to data from NovaSeq and HiSeq X platforms with the Genome in a Bottle (GIAB) SV call set across all size classes. The reported quality score per SV is calibrated across different SV types and size classes. Parliament2 has the highest F1 score (74.27%) measured across the independent gold standard from GIAB. We illustrate the compute performance by processing all 1000 Genomes samples (2,691 samples) in <1 day on GRCh38. Parliament2 improves the runtime performance of individual methods and is open source (https://github.com/slzarate/parliament2), and a Docker image, as well as a WDL implementation, is available. CONCLUSION: Parliament2 both provides a highly accurate single-sample SV call set from short-read DNA sequence data and enables cost-efficient application over cloud or cluster environments, processing thousands of samples.
format Online
Article
Text
id pubmed-7751401
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7751401 2020-12-29 Parliament2: Accurate structural variant calling at scale Zarate, Samantha Carroll, Andrew Mahmoud, Medhat Krasheninina, Olga Jun, Goo Salerno, William J Schatz, Michael C Boerwinkle, Eric Gibbs, Richard A Sedlazeck, Fritz J Gigascience Technical Note
Oxford University Press 2020-12-21 /pmc/articles/PMC7751401/ /pubmed/33347570 http://dx.doi.org/10.1093/gigascience/giaa145 Text en © The Author(s) 2020. Published by Oxford University Press GigaScience. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Parliament2: Accurate structural variant calling at scale
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7751401/
https://www.ncbi.nlm.nih.gov/pubmed/33347570
http://dx.doi.org/10.1093/gigascience/giaa145