
Ancestral Spectrum Analysis With Population-Specific Variants

Bibliographic Details
Main Authors: Shi, Gang, Kuang, Qingmin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8503515/
https://www.ncbi.nlm.nih.gov/pubmed/34646302
http://dx.doi.org/10.3389/fgene.2021.724638
_version_ 1784581139917176832
author Shi, Gang
Kuang, Qingmin
author_facet Shi, Gang
Kuang, Qingmin
author_sort Shi, Gang
collection PubMed
description With the advance of sequencing technology, an increasing number of populations have been sequenced to study the histories of worldwide populations, including their divergence, admixtures, migration, and effective sizes. The variants detected in sequencing studies are largely rare and mostly population specific. Population-specific variants are often recent mutations and are informative for revealing substructures and admixtures in populations; however, computational methods and tools to analyze them are still lacking. In this work, we propose using reference populations and single nucleotide polymorphisms (SNPs) specific to the reference populations. Ancestral information, the best linear unbiased estimator (BLUE) of the ancestral proportion, is proposed, which can be used to infer ancestral proportions in recently admixed target populations and measure the extent to which reference populations serve as good proxies for the admixing sources. Based on the same panel of SNPs, the ancestral information is comparable across samples from different studies and is not affected by genetic outliers, related samples, or the sample sizes of the admixed target populations. In addition, ancestral spectrum is useful for detecting genetic outliers or exploring co-ancestry between study samples and the reference populations. The methods are implemented in a program, Ancestral Spectrum Analyzer (ASA), and are applied in analyzing high-coverage sequencing data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP). In the analyses of American populations from the 1000 Genomes Project, we demonstrate that recent admixtures can be dissected from ancient admixtures by comparing ancestral spectra with and without indigenous Americans being included in the reference populations.
format Online
Article
Text
id pubmed-8503515
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-85035152021-10-12 Ancestral Spectrum Analysis With Population-Specific Variants Shi, Gang Kuang, Qingmin Front Genet Genetics With the advance of sequencing technology, an increasing number of populations have been sequenced to study the histories of worldwide populations, including their divergence, admixtures, migration, and effective sizes. The variants detected in sequencing studies are largely rare and mostly population specific. Population-specific variants are often recent mutations and are informative for revealing substructures and admixtures in populations; however, computational methods and tools to analyze them are still lacking. In this work, we propose using reference populations and single nucleotide polymorphisms (SNPs) specific to the reference populations. Ancestral information, the best linear unbiased estimator (BLUE) of the ancestral proportion, is proposed, which can be used to infer ancestral proportions in recently admixed target populations and measure the extent to which reference populations serve as good proxies for the admixing sources. Based on the same panel of SNPs, the ancestral information is comparable across samples from different studies and is not affected by genetic outliers, related samples, or the sample sizes of the admixed target populations. In addition, ancestral spectrum is useful for detecting genetic outliers or exploring co-ancestry between study samples and the reference populations. The methods are implemented in a program, Ancestral Spectrum Analyzer (ASA), and are applied in analyzing high-coverage sequencing data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP). In the analyses of American populations from the 1000 Genomes Project, we demonstrate that recent admixtures can be dissected from ancient admixtures by comparing ancestral spectra with and without indigenous Americans being included in the reference populations. Frontiers Media S.A. 
2021-09-27 /pmc/articles/PMC8503515/ /pubmed/34646302 http://dx.doi.org/10.3389/fgene.2021.724638 Text en Copyright © 2021 Shi and Kuang. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Shi, Gang
Kuang, Qingmin
Ancestral Spectrum Analysis With Population-Specific Variants
title Ancestral Spectrum Analysis With Population-Specific Variants
title_full Ancestral Spectrum Analysis With Population-Specific Variants
title_fullStr Ancestral Spectrum Analysis With Population-Specific Variants
title_full_unstemmed Ancestral Spectrum Analysis With Population-Specific Variants
title_short Ancestral Spectrum Analysis With Population-Specific Variants
title_sort ancestral spectrum analysis with population-specific variants
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8503515/
https://www.ncbi.nlm.nih.gov/pubmed/34646302
http://dx.doi.org/10.3389/fgene.2021.724638
work_keys_str_mv AT shigang ancestralspectrumanalysiswithpopulationspecificvariants
AT kuangqingmin ancestralspectrumanalysiswithpopulationspecificvariants