Inferring population structure in biobank-scale genomic data
Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing mi...
Main authors: | Chiu, Alec M., Molloy, Erin K., Tan, Zilong, Talwalkar, Ameet, Sankararaman, Sriram |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9069078/ https://www.ncbi.nlm.nih.gov/pubmed/35298920 http://dx.doi.org/10.1016/j.ajhg.2022.02.015 |
_version_ | 1784700351695290368 |
---|---|
author | Chiu, Alec M. Molloy, Erin K. Tan, Zilong Talwalkar, Ameet Sankararaman, Sriram |
author_sort | Chiu, Alec M. |
collection | PubMed |
description | Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing millions of individuals and genetic variants. We introduce SCOPE, a method for population structure inference that is orders of magnitude faster than existing methods while achieving comparable accuracy. SCOPE infers population structure in about a day on a dataset containing one million individuals and variants as well as on the UK Biobank dataset containing 488,363 individuals and 569,346 variants. Furthermore, SCOPE can leverage allele frequencies from previous studies to improve the interpretability of population structure estimates. |
format | Online Article Text |
id | pubmed-9069078 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-9069078 2022-05-05 Inferring population structure in biobank-scale genomic data Chiu, Alec M. Molloy, Erin K. Tan, Zilong Talwalkar, Ameet Sankararaman, Sriram Am J Hum Genet Article Inferring the structure of human populations from genetic variation data is a key task in population and medical genomic studies. Although a number of methods for population structure inference have been proposed, current methods are impractical to run on biobank-scale genomic datasets containing millions of individuals and genetic variants. We introduce SCOPE, a method for population structure inference that is orders of magnitude faster than existing methods while achieving comparable accuracy. SCOPE infers population structure in about a day on a dataset containing one million individuals and variants as well as on the UK Biobank dataset containing 488,363 individuals and 569,346 variants. Furthermore, SCOPE can leverage allele frequencies from previous studies to improve the interpretability of population structure estimates. Elsevier 2022-04-07 2022-03-16 /pmc/articles/PMC9069078/ /pubmed/35298920 http://dx.doi.org/10.1016/j.ajhg.2022.02.015 Text en © 2022 The Authors https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | Inferring population structure in biobank-scale genomic data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9069078/ https://www.ncbi.nlm.nih.gov/pubmed/35298920 http://dx.doi.org/10.1016/j.ajhg.2022.02.015 |