Whole Genome Sequencing of Four Representatives From the Admixed Population of the United Arab Emirates

Whole genome sequences (WGS) of four nationals of the United Arab Emirates (UAE) at an average coverage of 33X have been completed and described. The selection of suitable subpopulation representatives was informed by a preceding comprehensive population structure analysis. Representatives were chos...

Bibliographic Details
Main Authors: Daw Elbait, Gihan; Henschel, Andreas; Tay, Guan K.; Al Safar, Habiba S.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7367215/
https://www.ncbi.nlm.nih.gov/pubmed/32754195
http://dx.doi.org/10.3389/fgene.2020.00681
author Daw Elbait, Gihan
Henschel, Andreas
Tay, Guan K.
Al Safar, Habiba S.
collection PubMed
description Whole genome sequences (WGS) of four nationals of the United Arab Emirates (UAE) at an average coverage of 33X have been completed and described. The selection of suitable subpopulation representatives was informed by a preceding comprehensive population structure analysis. Representatives were chosen based on their central location within the subpopulation on a principal component analysis (PCA) and the degree to which they were admixed. Novel genomic variations among the different subgroups of the UAE population are reported here. Specifically, the WGS analysis identified 4,161,067–4,798,806 variants in the four individual samples, where approximately 80% were single nucleotide polymorphisms (SNPs) and 20% were insertions or deletions (indels). An average of 2.75% was found to be novel variants according to dbSNP (build 151). This is the first report of structural variants (SV) from WGS data from UAE nationals. There were 15,677–20,339 called SVs, of which around 13.5% were novel. The four samples shared 1,399,178 variants, each with distinct variants as follows: 1,085,524 (for the individual denoted as UAE S011), 1,228,559 (UAE S012), 791,072 (UAE S013), and 906,818 (UAE S014). These results show a previously unappreciated population diversity in the region. The synergy of WGS and genotype array data was demonstrated through variant annotation of the former using 2.3 million allele frequencies for the local population derived from the latter technology platform. This novel approach of combining breadth and depth of array and WGS technologies has guided the choice of population genetic representatives and provides complementary, regionalized allele frequency annotation to new genomes comprising millions of loci.
format Online
Article
Text
id pubmed-7367215
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7367215 2020-08-03 Front Genet Genetics Frontiers Media S.A.
2020-07-09 /pmc/articles/PMC7367215/ /pubmed/32754195 http://dx.doi.org/10.3389/fgene.2020.00681 Text en Copyright © 2020 Daw Elbait, Henschel, Tay and Al Safar. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Whole Genome Sequencing of Four Representatives From the Admixed Population of the United Arab Emirates
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7367215/
https://www.ncbi.nlm.nih.gov/pubmed/32754195
http://dx.doi.org/10.3389/fgene.2020.00681