
Inferring population structure in biobank-scale genomic data

Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing mi...


Bibliographic Details
Main authors: Chiu, Alec M., Molloy, Erin K., Tan, Zilong, Talwalkar, Ameet, Sankararaman, Sriram
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9069078/
https://www.ncbi.nlm.nih.gov/pubmed/35298920
http://dx.doi.org/10.1016/j.ajhg.2022.02.015
author Chiu, Alec M.
Molloy, Erin K.
Tan, Zilong
Talwalkar, Ameet
Sankararaman, Sriram
collection PubMed
description Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing millions of individuals and genetic variants. We introduce SCOPE, a method for population structure inference that is orders of magnitude faster than existing methods while achieving comparable accuracy. SCOPE infers population structure in about a day on a dataset containing one million individuals and variants as well as on the UK Biobank dataset containing 488,363 individuals and 569,346 variants. Furthermore, SCOPE can leverage allele frequencies from previous studies to improve the interpretability of population structure estimates.
format Online
Article
Text
id pubmed-9069078
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9069078 2022-05-05 Inferring population structure in biobank-scale genomic data Chiu, Alec M. Molloy, Erin K. Tan, Zilong Talwalkar, Ameet Sankararaman, Sriram Am J Hum Genet Article Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing millions of individuals and genetic variants. We introduce SCOPE, a method for population structure inference that is orders of magnitude faster than existing methods while achieving comparable accuracy. SCOPE infers population structure in about a day on a dataset containing one million individuals and variants as well as on the UK Biobank dataset containing 488,363 individuals and 569,346 variants. Furthermore, SCOPE can leverage allele frequencies from previous studies to improve the interpretability of population structure estimates. Elsevier 2022-04-07 2022-03-16 /pmc/articles/PMC9069078/ /pubmed/35298920 http://dx.doi.org/10.1016/j.ajhg.2022.02.015 Text en © 2022 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Inferring population structure in biobank-scale genomic data
topic Article