Estimating variance components in population scale family trees
The rapid digitization of genealogical and medical records enables the assembly of extremely large pedigree records spanning millions of individuals and trillions of pairs of relatives. Such pedigrees provide the opportunity to investigate the sociological and epidemiological history of human popula...
Main authors: | Shor, Tal; Kalka, Iris; Geiger, Dan; Erlich, Yaniv; Weissbrod, Omer |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2019 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6529016/ https://www.ncbi.nlm.nih.gov/pubmed/31071088 http://dx.doi.org/10.1371/journal.pgen.1008124 |
_version_ | 1783420323406807040 |
---|---|
author | Shor, Tal Kalka, Iris Geiger, Dan Erlich, Yaniv Weissbrod, Omer |
author_facet | Shor, Tal Kalka, Iris Geiger, Dan Erlich, Yaniv Weissbrod, Omer |
author_sort | Shor, Tal |
collection | PubMed |
description | The rapid digitization of genealogical and medical records enables the assembly of extremely large pedigree records spanning millions of individuals and trillions of pairs of relatives. Such pedigrees provide the opportunity to investigate the sociological and epidemiological history of human populations in scales much larger than previously possible. Linear mixed models (LMMs) are routinely used to analyze extremely large animal and plant pedigrees for the purposes of selective breeding. However, LMMs have not been previously applied to analyze population-scale human family trees. Here, we present Sparse Cholesky factorIzation LMM (Sci-LMM), a modeling framework for studying population-scale family trees that combines techniques from the animal and plant breeding literature and from human genetics literature. The proposed framework can construct a matrix of relationships between trillions of pairs of individuals and fit the corresponding LMM in several hours. We demonstrate the capabilities of Sci-LMM via simulation studies and by estimating the heritability of longevity and of reproductive fitness (quantified via number of children) in a large pedigree spanning millions of individuals and over five centuries of human history. Sci-LMM provides a unified framework for investigating the epidemiological history of human populations via genealogical records. |
format | Online Article Text |
id | pubmed-6529016 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-65290162019-05-31 Estimating variance components in population scale family trees Shor, Tal Kalka, Iris Geiger, Dan Erlich, Yaniv Weissbrod, Omer PLoS Genet Research Article The rapid digitization of genealogical and medical records enables the assembly of extremely large pedigree records spanning millions of individuals and trillions of pairs of relatives. Such pedigrees provide the opportunity to investigate the sociological and epidemiological history of human populations in scales much larger than previously possible. Linear mixed models (LMMs) are routinely used to analyze extremely large animal and plant pedigrees for the purposes of selective breeding. However, LMMs have not been previously applied to analyze population-scale human family trees. Here, we present Sparse Cholesky factorIzation LMM (Sci-LMM), a modeling framework for studying population-scale family trees that combines techniques from the animal and plant breeding literature and from human genetics literature. The proposed framework can construct a matrix of relationships between trillions of pairs of individuals and fit the corresponding LMM in several hours. We demonstrate the capabilities of Sci-LMM via simulation studies and by estimating the heritability of longevity and of reproductive fitness (quantified via number of children) in a large pedigree spanning millions of individuals and over five centuries of human history. Sci-LMM provides a unified framework for investigating the epidemiological history of human populations via genealogical records. 
Public Library of Science 2019-05-09 /pmc/articles/PMC6529016/ /pubmed/31071088 http://dx.doi.org/10.1371/journal.pgen.1008124 Text en © 2019 Shor et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Shor, Tal Kalka, Iris Geiger, Dan Erlich, Yaniv Weissbrod, Omer Estimating variance components in population scale family trees |
title | Estimating variance components in population scale family trees |
title_full | Estimating variance components in population scale family trees |
title_fullStr | Estimating variance components in population scale family trees |
title_full_unstemmed | Estimating variance components in population scale family trees |
title_short | Estimating variance components in population scale family trees |
title_sort | estimating variance components in population scale family trees |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6529016/ https://www.ncbi.nlm.nih.gov/pubmed/31071088 http://dx.doi.org/10.1371/journal.pgen.1008124 |
work_keys_str_mv | AT shortal estimatingvariancecomponentsinpopulationscalefamilytrees AT kalkairis estimatingvariancecomponentsinpopulationscalefamilytrees AT geigerdan estimatingvariancecomponentsinpopulationscalefamilytrees AT erlichyaniv estimatingvariancecomponentsinpopulationscalefamilytrees AT weissbrodomer estimatingvariancecomponentsinpopulationscalefamilytrees |
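
The description in this record mentions building a matrix of relationships between pairs of individuals and fitting an LMM through a Cholesky factorization. As a rough illustration only, and not the Sci-LMM implementation, the sketch below builds an additive relationship matrix for a six-person toy pedigree with the classical tabular method and evaluates a zero-mean Gaussian LMM log-likelihood with a dense Cholesky factor. The function names, the `-1` convention for an unknown parent, the toy pedigree, and the variance values are all assumptions made for this example. A population-scale analysis would need sparse factorization, which this sketch does not attempt.

```python
import numpy as np


def relationship_matrix(sires, dams):
    """Additive relationship matrix via the tabular method.

    sires[i] and dams[i] hold the parent indices of individual i, or -1
    when a parent is unknown (an assumed convention for this sketch).
    Parents must appear before their children.
    """
    n = len(sires)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sires[i], dams[i]
        # Diagonal: 1 plus the inbreeding coefficient of individual i.
        A[i, i] = 1.0 + (0.5 * A[s, d] if s >= 0 and d >= 0 else 0.0)
        for j in range(i):
            # Relationship to an earlier individual j is the average of
            # j's relationships to the known parents of i.
            a = 0.0
            if s >= 0:
                a += 0.5 * A[j, s]
            if d >= 0:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
    return A


def lmm_loglik(y, A, sigma2_g, sigma2_e):
    """Log-likelihood of y ~ N(0, sigma2_g * A + sigma2_e * I)."""
    n = len(y)
    V = sigma2_g * A + sigma2_e * np.eye(n)
    L = np.linalg.cholesky(V)          # dense factor, V = L @ L.T
    z = np.linalg.solve(L, y)          # z @ z equals y.T @ inv(V) @ y
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + z @ z)


# Toy pedigree: 0 and 1 are founders, 2 and 3 are their children,
# 4 is an unrelated founder, 5 is the child of 3 and 4.
sires = [-1, -1, 0, 0, -1, 3]
dams = [-1, -1, 1, 1, -1, 4]
A = relationship_matrix(sires, dams)

# Hypothetical variance components; heritability is the genetic share.
sigma2_g, sigma2_e = 0.4, 0.6
h2 = sigma2_g / (sigma2_g + sigma2_e)

y = np.array([0.3, -1.2, 0.5, 0.1, 0.9, -0.4])  # made-up phenotypes
ll = lmm_loglik(y, A, sigma2_g, sigma2_e)
```

In this toy pedigree the full siblings 2 and 3 have a relationship coefficient of 0.5, and the grandchild 5 has 0.25 with grandparent 0. Maximizing `lmm_loglik` over the two variance components would give the heritability estimate for this simplified model.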