Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads
Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data has demonstrated a long read length, but current interpretatio...
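Main authors: | Shafin, Kishwar; Pesout, Trevor; Chang, Pi-Chuan; Nattestad, Maria; Kolesnikov, Alexey; Goel, Sidharth; Baid, Gunjan; Kolmogorov, Mikhail; Eizenga, Jordan M.; Miga, Karen H.; Carnevali, Paolo; Jain, Miten; Carroll, Andrew; Paten, Benedict |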
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8571015/ https://www.ncbi.nlm.nih.gov/pubmed/34725481 http://dx.doi.org/10.1038/s41592-021-01299-w |
_version_ | 1784594935191699456 |
---|---|
author | Shafin, Kishwar Pesout, Trevor Chang, Pi-Chuan Nattestad, Maria Kolesnikov, Alexey Goel, Sidharth Baid, Gunjan Kolmogorov, Mikhail Eizenga, Jordan M. Miga, Karen H. Carnevali, Paolo Jain, Miten Carroll, Andrew Paten, Benedict |
author_sort | Shafin, Kishwar |
collection | PubMed |
description | Long-read sequencing has the potential to transform variant detection by reaching currently difficult-to-map regions and routinely linking together adjacent variations to enable read-based phasing. Third-generation nanopore sequence data has demonstrated a long read length, but current interpretation methods for its novel pore-based signal have unique error profiles, making accurate analysis challenging. Here, we introduce a haplotype-aware variant calling pipeline, PEPPER-Margin-DeepVariant, that produces state-of-the-art variant calling results with nanopore data. We show that our nanopore-based method outperforms the short-read-based single nucleotide variant identification method at whole-genome scale and produces high-quality single nucleotide variants in segmental duplications and low-mappability regions where short-read-based genotyping fails. We show that our pipeline can provide highly contiguous phase blocks across the genome with nanopore reads, contiguously spanning between 85% and 92% of annotated genes across six samples. We also extend PEPPER-Margin-DeepVariant to PacBio HiFi data, providing an efficient solution with performance superior to the current WhatsHap-DeepVariant standard. Finally, we demonstrate de novo assembly polishing methods that use nanopore and PacBio HiFi reads to produce diploid assemblies with high accuracy (Q35+ nanopore-polished and Q40+ PacBio-HiFi-polished). |
format | Online Article Text |
id | pubmed-8571015 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-8571015 2022-05-01 Nat Methods Article 2021-11-01 2021-11 /pmc/articles/PMC8571015/ /pubmed/34725481 http://dx.doi.org/10.1038/s41592-021-01299-w Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms |
title | Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8571015/ https://www.ncbi.nlm.nih.gov/pubmed/34725481 http://dx.doi.org/10.1038/s41592-021-01299-w |