A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population

The ethnic composition of a country's population contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding nature of ethnic bias. This work is a representative whole genome sequencing effort of...

Bibliographic Details
Main authors: Daw Elbait, Gihan; Henschel, Andreas; Tay, Guan K.; Al Safar, Habiba S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8102833/
https://www.ncbi.nlm.nih.gov/pubmed/33968136
http://dx.doi.org/10.3389/fgene.2021.660428
author Daw Elbait, Gihan
Henschel, Andreas
Tay, Guan K.
Al Safar, Habiba S.
collection PubMed
description The ethnic composition of a country's population contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding nature of ethnic bias. This work is a representative whole genome sequencing effort of an understudied population. Specifically, high-coverage consensus sequences from 120 whole genomes and 33 whole exomes were used to construct the first population-specific major allele reference genome for the United Arab Emirates (UAE). When this reference was applied and compared to the archetype hg19 reference, the number of variant calls in local Emirati genomes was reduced by ∼19% (i.e., some 1 million fewer calls). In compiling the United Arab Emirates Reference Genome (UAERG), 23,038,090 short variants (1,790,171 novel) and 137,713 structural variants (8,462 novel) were annotated, and their allele frequencies (AFs) and distribution across the genome were identified. Population-specific genetic characteristics, including loss-of-function variants, admixture, and ancestral haplogroup distribution, were identified and are reported here. We also detect a strong correlation between F(ST) and admixture components in the UAE. This baseline study was conceived to establish a high-quality reference genome and a genetic variation resource to enable the development of regional population-specific initiatives and thus inform the application of population studies and precision medicine in the UAE.
format Online
Article
Text
id pubmed-8102833
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8102833 2021-05-08 A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population Daw Elbait, Gihan Henschel, Andreas Tay, Guan K. Al Safar, Habiba S. Front Genet Genetics The ethnic composition of a country's population contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding nature of ethnic bias. This work is a representative whole genome sequencing effort of an understudied population. Specifically, high-coverage consensus sequences from 120 whole genomes and 33 whole exomes were used to construct the first population-specific major allele reference genome for the United Arab Emirates (UAE). When this reference was applied and compared to the archetype hg19 reference, the number of variant calls in local Emirati genomes was reduced by ∼19% (i.e., some 1 million fewer calls). In compiling the United Arab Emirates Reference Genome (UAERG), 23,038,090 short variants (1,790,171 novel) and 137,713 structural variants (8,462 novel) were annotated, and their allele frequencies (AFs) and distribution across the genome were identified. Population-specific genetic characteristics, including loss-of-function variants, admixture, and ancestral haplogroup distribution, were identified and are reported here. We also detect a strong correlation between F(ST) and admixture components in the UAE. This baseline study was conceived to establish a high-quality reference genome and a genetic variation resource to enable the development of regional population-specific initiatives and thus inform the application of population studies and precision medicine in the UAE. Frontiers Media S.A. 2021-04-23 /pmc/articles/PMC8102833/ /pubmed/33968136 http://dx.doi.org/10.3389/fgene.2021.660428 Text en Copyright © 2021 Daw Elbait, Henschel, Tay and Al Safar.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8102833/
https://www.ncbi.nlm.nih.gov/pubmed/33968136
http://dx.doi.org/10.3389/fgene.2021.660428
work_keys_str_mv AT dawelbaitgihan apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
AT henschelandreas apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
AT tayguank apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
AT alsafarhabibas apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
AT dawelbaitgihan populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
AT henschelandreas populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
AT tayguank populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
AT alsafarhabibas populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation
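
The idea of a major allele reference genome mentioned in the description can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a toy reference string, single-nucleotide variants only, and a hypothetical `variants` mapping of 0-based position to (reference base, alternate base, alternate allele frequency). Wherever the alternate allele is the major allele in the population (AF > 0.5), it replaces the reference base, so a typical sample differs from the new reference at fewer sites.

```python
from typing import Dict, Tuple

# Hypothetical toy data structure: 0-based position -> (ref base, alt base, alt AF).
Variants = Dict[int, Tuple[str, str, float]]


def build_major_allele_reference(reference: str, variants: Variants) -> str:
    """Return a copy of `reference` with each SNV site set to its major allele."""
    bases = list(reference)
    for pos, (ref, alt, alt_af) in variants.items():
        if bases[pos] != ref:
            raise ValueError(f"reference base mismatch at position {pos}")
        # The alternate allele is the major allele when its frequency exceeds 0.5.
        if alt_af > 0.5:
            bases[pos] = alt
    return "".join(bases)


def count_differences(sample: str, reference: str) -> int:
    """Count positions where a sample sequence differs from a reference."""
    return sum(1 for s, r in zip(sample, reference) if s != r)


# Toy example (all values are illustrative assumptions, not data from the study).
reference = "ACGTACGTAC"
variants: Variants = {
    1: ("C", "T", 0.80),  # alternate allele is the major allele
    4: ("A", "G", 0.30),  # reference allele stays the major allele
    7: ("T", "C", 0.65),  # alternate allele is the major allele
}
major_ref = build_major_allele_reference(reference, variants)

# A sample carrying both major alleles plus the minor allele at position 4.
sample = "ATGTGCGCAC"
calls_vs_original = count_differences(sample, reference)
calls_vs_major = count_differences(sample, major_ref)
```

In this toy case the sample differs from the original reference at three sites but from the major allele reference at only one, which mirrors, in miniature, the reduction in variant calls reported in the description. Indels and structural variants, which the study also covers, are outside the scope of this sketch.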