Mitochondrial DNA variation across 56,434 individuals in gnomAD

Genomic databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, databases such as the Genome Aggregation Database (gnomAD) have focused on nuclear DNA and have ignored the mitochondrial genome (mtDNA). Here, we present a pipel...

Bibliographic Details
Main Authors: Laricchia, Kristen M., Lake, Nicole J., Watts, Nicholas A., Shand, Megan, Haessly, Andrea, Gauthier, Laura, Benjamin, David, Banks, Eric, Soto, Jose, Garimella, Kiran, Emery, James, Rehm, Heidi L., MacArthur, Daniel G., Tiao, Grace, Lek, Monkol, Mootha, Vamsi K., Calvo, Sarah E.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8896463/
https://www.ncbi.nlm.nih.gov/pubmed/35074858
http://dx.doi.org/10.1101/gr.276013.121
_version_ 1784663171401777152
author Laricchia, Kristen M.
Lake, Nicole J.
Watts, Nicholas A.
Shand, Megan
Haessly, Andrea
Gauthier, Laura
Benjamin, David
Banks, Eric
Soto, Jose
Garimella, Kiran
Emery, James
Rehm, Heidi L.
MacArthur, Daniel G.
Tiao, Grace
Lek, Monkol
Mootha, Vamsi K.
Calvo, Sarah E.
author_sort Laricchia, Kristen M.
collection PubMed
description Genomic databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, databases such as the Genome Aggregation Database (gnomAD) have focused on nuclear DNA and have ignored the mitochondrial genome (mtDNA). Here, we present a pipeline to call mtDNA variants that addresses three technical challenges: (1) detecting homoplasmic and heteroplasmic variants, present, respectively, in all or a fraction of mtDNA molecules; (2) circular mtDNA genome; and (3) misalignment of nuclear sequences of mitochondrial origin (NUMTs). We observed that mtDNA copy number per cell varied across gnomAD cohorts and influenced the fraction of NUMT-derived false-positive variant calls, which can account for the majority of putative heteroplasmies. To avoid false positives, we excluded contaminated samples, cell lines, and samples prone to NUMT misalignment due to few mtDNA copies. Furthermore, we report variants with heteroplasmy ≥10%. We applied this pipeline to 56,434 whole-genome sequences in the gnomAD v3.1 database that includes individuals of European (58%), African (25%), Latino (10%), and Asian (5%) ancestry. Our gnomAD v3.1 release contains population frequencies for 10,850 unique mtDNA variants at more than half of all mtDNA bases. Importantly, we report frequencies within each nuclear ancestral population and mitochondrial haplogroup. Homoplasmic variants account for most variant calls (98%) and unique variants (85%). We observed that 1/250 individuals carry a pathogenic mtDNA variant with heteroplasmy above 10%. These mtDNA population allele frequencies are freely accessible and will aid in diagnostic interpretation and research studies.
format Online
Article
Text
id pubmed-8896463
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-88964632022-03-22 Mitochondrial DNA variation across 56,434 individuals in gnomAD Laricchia, Kristen M. Lake, Nicole J. Watts, Nicholas A. Shand, Megan Haessly, Andrea Gauthier, Laura Benjamin, David Banks, Eric Soto, Jose Garimella, Kiran Emery, James Rehm, Heidi L. MacArthur, Daniel G. Tiao, Grace Lek, Monkol Mootha, Vamsi K. Calvo, Sarah E. Genome Res Resource Genomic databases of allele frequency are extremely helpful for evaluating clinical variants of unknown significance; however, until now, databases such as the Genome Aggregation Database (gnomAD) have focused on nuclear DNA and have ignored the mitochondrial genome (mtDNA). Here, we present a pipeline to call mtDNA variants that addresses three technical challenges: (1) detecting homoplasmic and heteroplasmic variants, present, respectively, in all or a fraction of mtDNA molecules; (2) circular mtDNA genome; and (3) misalignment of nuclear sequences of mitochondrial origin (NUMTs). We observed that mtDNA copy number per cell varied across gnomAD cohorts and influenced the fraction of NUMT-derived false-positive variant calls, which can account for the majority of putative heteroplasmies. To avoid false positives, we excluded contaminated samples, cell lines, and samples prone to NUMT misalignment due to few mtDNA copies. Furthermore, we report variants with heteroplasmy ≥10%. We applied this pipeline to 56,434 whole-genome sequences in the gnomAD v3.1 database that includes individuals of European (58%), African (25%), Latino (10%), and Asian (5%) ancestry. Our gnomAD v3.1 release contains population frequencies for 10,850 unique mtDNA variants at more than half of all mtDNA bases. Importantly, we report frequencies within each nuclear ancestral population and mitochondrial haplogroup. Homoplasmic variants account for most variant calls (98%) and unique variants (85%). We observed that 1/250 individuals carry a pathogenic mtDNA variant with heteroplasmy above 10%. 
These mtDNA population allele frequencies are freely accessible and will aid in diagnostic interpretation and research studies. Cold Spring Harbor Laboratory Press 2022-03 /pmc/articles/PMC8896463/ /pubmed/35074858 http://dx.doi.org/10.1101/gr.276013.121 Text en © 2022 Laricchia et al.; Published by Cold Spring Harbor Laboratory Press https://creativecommons.org/licenses/by-nc/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/ (https://creativecommons.org/licenses/by-nc/4.0/).
title Mitochondrial DNA variation across 56,434 individuals in gnomAD
topic Resource
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8896463/
https://www.ncbi.nlm.nih.gov/pubmed/35074858
http://dx.doi.org/10.1101/gr.276013.121