A complete pedigree-based graph workflow for rare candidate variant analysis

Methods that use a linear genome reference for genome sequencing data analysis are reference-biased. In the field of clinical genetics for rare diseases, a resulting reduction in genotyping accuracy in some regions has likely prevented the resolution of some cases. Pangenome graphs embed population...

Bibliographic Details
Main authors: Markello, Charles; Huang, Charles; Rodriguez, Alex; Carroll, Andrew; Chang, Pi-Chuan; Eizenga, Jordan; Markello, Thomas; Haussler, David; Paten, Benedict
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9104704/
https://www.ncbi.nlm.nih.gov/pubmed/35483961
http://dx.doi.org/10.1101/gr.276387.121
author Markello, Charles
Huang, Charles
Rodriguez, Alex
Carroll, Andrew
Chang, Pi-Chuan
Eizenga, Jordan
Markello, Thomas
Haussler, David
Paten, Benedict
author_sort Markello, Charles
collection PubMed
description Methods that use a linear genome reference for genome sequencing data analysis are reference-biased. In the field of clinical genetics for rare diseases, a resulting reduction in genotyping accuracy in some regions has likely prevented the resolution of some cases. Pangenome graphs embed population variation into a reference structure. Although pangenome graphs have helped to reduce reference mapping bias, further performance improvements are possible. We introduce VG-Pedigree, a pedigree-aware workflow based on the pangenome-mapping tool Giraffe and the variant-calling tool DeepTrio, using a model specially trained for Giraffe-based alignments. We demonstrate mapping and variant calling improvements in both single-nucleotide variants (SNVs) and insertion and deletion (indel) variants over those produced by alignments created using BWA-MEM to a linear reference and Giraffe mapping to a pangenome graph containing data from the 1000 Genomes Project. We have also adapted and upgraded deleterious-variant (DV) detecting methods and programs into a streamlined workflow. We used these workflows in combination to detect small lists of candidate DVs among 15 family quartets and quintets of the Undiagnosed Diseases Program (UDP). All candidate DVs that were previously diagnosed using the Mendelian models covered by the previously published methods were recapitulated by these workflows. The results of these experiments indicate that a slightly greater absolute count of DVs is detected in the proband population than in their matched unaffected siblings.
format Online
Article
Text
id pubmed-9104704
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-9104704 2022-11-01 A complete pedigree-based graph workflow for rare candidate variant analysis Markello, Charles Huang, Charles Rodriguez, Alex Carroll, Andrew Chang, Pi-Chuan Eizenga, Jordan Markello, Thomas Haussler, David Paten, Benedict Genome Res Method
Cold Spring Harbor Laboratory Press 2022-05 /pmc/articles/PMC9104704/ /pubmed/35483961 http://dx.doi.org/10.1101/gr.276387.121 Text en © 2022 Markello et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
title A complete pedigree-based graph workflow for rare candidate variant analysis
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9104704/
https://www.ncbi.nlm.nih.gov/pubmed/35483961
http://dx.doi.org/10.1101/gr.276387.121