
Differentiation of Hispanic biogeographic ancestry with 80 ancestry informative markers

Ancestry informative single nucleotide polymorphisms (SNPs) can identify biogeographic ancestry (BGA); however, population substructure and relatively recent admixture can make differentiation difficult in heterogeneous Hispanic populations. Utilizing unrelated individuals from the Genomic Origins a...


Bibliographic Details
Main authors: Setser, Casandra H., Planz, John V., Barber, Robert C., Phillips, Nicole R., Chakraborty, Ranajit, Cross, Deanna S.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7210943/
https://www.ncbi.nlm.nih.gov/pubmed/32385290
http://dx.doi.org/10.1038/s41598-020-64245-4
author Setser, Casandra H.
Planz, John V.
Barber, Robert C.
Phillips, Nicole R.
Chakraborty, Ranajit
Cross, Deanna S.
collection PubMed
description Ancestry informative single nucleotide polymorphisms (SNPs) can identify biogeographic ancestry (BGA); however, population substructure and relatively recent admixture can make differentiation difficult in heterogeneous Hispanic populations. Utilizing unrelated individuals from the Genomic Origins and Admixture in Latinos dataset (GOAL, n = 160), we designed an 80 SNP panel (Setser80) that accurately depicts BGA through STRUCTURE and PCA. We compared our Setser80 to the Seldin and Kidd panels via resampling simulations, which models data based on allele frequencies. We incorporated Admixed American 1000 Genomes populations (1000 G, n = 347), into a combined populations dataset to determine robustness. Using multinomial logistic regression (MLR), we compared the 3 panels on the combined dataset and found overall MLR classification accuracies: 93.2% Setser80, 87.9% Seldin panel, 71.4% Kidd panel. Naïve Bayesian classification had similar results on the combined dataset: 91.5% Setser80, 84.7% Seldin panel, 71.1% Kidd panel. Although Peru and Mexico were absent from panel design, we achieved high classification accuracy on the combined populations for Peru (MLR = 100%, naïve Bayes = 98%), and Mexico (MLR = 90%, naïve Bayes = 83.4%) as evidence of the portability of the Setser80. Our results indicate the Setser80 SNP panel can reliably classify BGA for individuals of presumed Hispanic origin.
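The description above mentions resampling simulations that model data from allele frequencies and a naïve Bayesian classifier. The following is only a minimal sketch of that general idea, not the authors' pipeline: it assumes Hardy-Weinberg genotype proportions, independent SNPs, and equal priors, and the population labels (`pop_A`, `pop_B`, `pop_C`) and allele frequencies are invented for illustration. All function names are hypothetical.

```python
import math
import random


def genotype_prob(g, p):
    """Hardy-Weinberg probability of carrying g (0, 1 or 2) copies of an allele with frequency p."""
    q = 1.0 - p
    return (q * q, 2.0 * p * q, p * p)[g]


def simulate_individual(freqs, rng):
    """Simulate one individual: draw two alleles independently at each SNP."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]


def classify(genotypes, pop_freqs):
    """Naive Bayes with equal priors: pick the population with the highest log-likelihood."""
    best_pop, best_ll = None, -math.inf
    for pop, freqs in pop_freqs.items():
        ll = sum(math.log(genotype_prob(g, p)) for g, p in zip(genotypes, freqs))
        if ll > best_ll:
            best_pop, best_ll = pop, ll
    return best_pop


def accuracy(pop_freqs, n_per_pop=200, seed=0):
    """Fraction of simulated individuals assigned back to their source population."""
    rng = random.Random(seed)
    correct = total = 0
    for pop, freqs in pop_freqs.items():
        for _ in range(n_per_pop):
            correct += classify(simulate_individual(freqs, rng), pop_freqs) == pop
            total += 1
    return correct / total


# Invented allele frequencies for an 80-SNP panel (NOT the Setser80 markers).
# Frequencies stay strictly inside (0, 1) so that log() is always defined.
_freq_rng = random.Random(42)
N_SNPS = 80
POP_FREQS = {
    name: [_freq_rng.uniform(0.05, 0.95) for _ in range(N_SNPS)]
    for name in ("pop_A", "pop_B", "pop_C")
}

if __name__ == "__main__":
    print(f"simulated classification accuracy: {accuracy(POP_FREQS):.3f}")
```

Because the invented frequencies differ far more between populations than real Hispanic populations would, the accuracy this sketch prints is not comparable to the percentages reported in the description.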
format Online
Article
Text
id pubmed-7210943
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-7210943 2020-05-15 Differentiation of Hispanic biogeographic ancestry with 80 ancestry informative markers Setser, Casandra H. Planz, John V. Barber, Robert C. Phillips, Nicole R. Chakraborty, Ranajit Cross, Deanna S. Sci Rep Article
Nature Publishing Group UK 2020-05-08 /pmc/articles/PMC7210943/ /pubmed/32385290 http://dx.doi.org/10.1038/s41598-020-64245-4 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title Differentiation of Hispanic biogeographic ancestry with 80 ancestry informative markers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7210943/
https://www.ncbi.nlm.nih.gov/pubmed/32385290
http://dx.doi.org/10.1038/s41598-020-64245-4