A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population
The ethnic composition of the population of a country contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding nature of ethnic bias. This work represents a representative Whole Genome Sequencing effort of...
Main authors: | Daw Elbait, Gihan; Henschel, Andreas; Tay, Guan K.; Al Safar, Habiba S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2021 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8102833/ https://www.ncbi.nlm.nih.gov/pubmed/33968136 http://dx.doi.org/10.3389/fgene.2021.660428 |
_version_ | 1783689186114535424 |
---|---|
author | Daw Elbait, Gihan; Henschel, Andreas; Tay, Guan K.; Al Safar, Habiba S. |
author_facet | Daw Elbait, Gihan; Henschel, Andreas; Tay, Guan K.; Al Safar, Habiba S. |
author_sort | Daw Elbait, Gihan |
collection | PubMed |
description | The ethnic composition of the population of a country contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding nature of ethnic bias. This work represents a representative Whole Genome Sequencing effort of an understudied population. Specifically, high coverage consensus sequences from 120 whole genomes and 33 whole exomes were used to construct the first ever population specific major allele reference genome for the United Arab Emirates (UAE). When this was applied and compared to the archetype hg19 reference, assembly of local Emirati genomes was reduced by ∼19% (i.e., some 1 million fewer calls). In compiling the United Arab Emirates Reference Genome (UAERG), sets of annotated 23,038,090 short (novel: 1,790,171) and 137,713 structural (novel: 8,462) variants; their allele frequencies (AFs) and distribution across the genome were identified. Population-specific genetic characteristics including loss-of-function variants, admixture, and ancestral haplogroup distribution were identified and reported here. We also detect a strong correlation between F(ST) and admixture components in the UAE. This baseline study was conceived to establish a high-quality reference genome and a genetic variations resource to enable the development of regional population specific initiatives and thus inform the application of population studies and precision medicine in the UAE. |
format | Online Article Text |
id | pubmed-8102833 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-81028332021-05-08 A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population Daw Elbait, Gihan Henschel, Andreas Tay, Guan K. Al Safar, Habiba S. Front Genet Genetics The ethnic composition of the population of a country contributes to the uniqueness of each national DNA sequencing project and, ideally, individual reference genomes are required to reduce the confounding nature of ethnic bias. This work represents a representative Whole Genome Sequencing effort of an understudied population. Specifically, high coverage consensus sequences from 120 whole genomes and 33 whole exomes were used to construct the first ever population specific major allele reference genome for the United Arab Emirates (UAE). When this was applied and compared to the archetype hg19 reference, assembly of local Emirati genomes was reduced by ∼19% (i.e., some 1 million fewer calls). In compiling the United Arab Emirates Reference Genome (UAERG), sets of annotated 23,038,090 short (novel: 1,790,171) and 137,713 structural (novel: 8,462) variants; their allele frequencies (AFs) and distribution across the genome were identified. Population-specific genetic characteristics including loss-of-function variants, admixture, and ancestral haplogroup distribution were identified and reported here. We also detect a strong correlation between F(ST) and admixture components in the UAE. This baseline study was conceived to establish a high-quality reference genome and a genetic variations resource to enable the development of regional population specific initiatives and thus inform the application of population studies and precision medicine in the UAE. Frontiers Media S.A. 2021-04-23 /pmc/articles/PMC8102833/ /pubmed/33968136 http://dx.doi.org/10.3389/fgene.2021.660428 Text en Copyright © 2021 Daw Elbait, Henschel, Tay and Al Safar. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Daw Elbait, Gihan Henschel, Andreas Tay, Guan K. Al Safar, Habiba S. A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population |
title | A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population |
title_full | A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population |
title_fullStr | A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population |
title_full_unstemmed | A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population |
title_short | A Population-Specific Major Allele Reference Genome From The United Arab Emirates Population |
title_sort | population-specific major allele reference genome from the united arab emirates population |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8102833/ https://www.ncbi.nlm.nih.gov/pubmed/33968136 http://dx.doi.org/10.3389/fgene.2021.660428 |
work_keys_str_mv | AT dawelbaitgihan apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation AT henschelandreas apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation AT tayguank apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation AT alsafarhabibas apopulationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation AT dawelbaitgihan populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation AT henschelandreas populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation AT tayguank populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation AT alsafarhabibas populationspecificmajorallelereferencegenomefromtheunitedarabemiratespopulation |
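
The description field refers to a "major allele reference genome" without saying how one is built. As a rough illustration of the general idea only, the sketch below swaps a reference base for the cohort's alternate allele wherever that allele is carried by more than half of the sampled chromosomes. Everything in it is an assumption made for illustration: the `Snv` record, the `major_allele_reference` function, the 0-based positions, the 0.5 threshold, the restriction to single-nucleotide variants, and the toy data. It is not the authors' pipeline, which this record does not describe.

```python
from typing import NamedTuple


class Snv(NamedTuple):
    """Hypothetical record for one single-nucleotide variant."""
    pos: int        # 0-based position in the reference (assumed convention)
    ref: str        # base present in the existing reference
    alt: str        # alternate base observed in the cohort
    alt_af: float   # alternate allele frequency in the cohort


def major_allele_reference(reference: str, snvs: list[Snv],
                           threshold: float = 0.5) -> str:
    """Return a copy of `reference` with major alternate alleles substituted."""
    seq = list(reference)
    for v in snvs:
        # This sketch handles only single-base substitutions; indels and
        # structural variants would shift coordinates and are skipped.
        if len(v.ref) != 1 or len(v.alt) != 1:
            continue
        # Guard against a variant list that does not match this reference.
        if reference[v.pos] != v.ref:
            raise ValueError(f"reference mismatch at position {v.pos}")
        # Substitute only when the alternate allele is the majority allele.
        if v.alt_af > threshold:
            seq[v.pos] = v.alt
    return "".join(seq)


# Toy data, invented for illustration.
toy_reference = "ACGTACGT"
toy_snvs = [
    Snv(pos=1, ref="C", alt="T", alt_af=0.72),  # major allele: substituted
    Snv(pos=4, ref="A", alt="G", alt_af=0.31),  # minor allele: kept as-is
    Snv(pos=6, ref="G", alt="A", alt_af=0.55),  # major allele: substituted
]
result = major_allele_reference(toy_reference, toy_snvs)
print(result)  # ATGTACAT
```

Aligning a sample from the same population against such a sequence would be expected to report fewer differences at the substituted sites, which is consistent with the reduction in calls mentioned in the description.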