Inferring Demography from Runs of Homozygosity in Whole-Genome Sequence, with Correction for Sequence Errors
Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using...
Main Authors: | MacLeod, Iona M.; Larkin, Denis M.; Lewin, Harris A.; Hayes, Ben J.; Goddard, Mike E. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2013 |
Subjects: | Methods |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3748359/ https://www.ncbi.nlm.nih.gov/pubmed/23842528 http://dx.doi.org/10.1093/molbev/mst125 |
_version_ | 1782281058175156224 |
---|---|
author | MacLeod, Iona M.; Larkin, Denis M.; Lewin, Harris A.; Hayes, Ben J.; Goddard, Mike E. |
author_sort | MacLeod, Iona M. |
collection | PubMed |
description | Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using whole-genome sequence data from two Holstein cattle, we demonstrate a new method to correct for bias caused by hidden errors and then infer stepwise changes in ancestral demography up to present. There was a strong upward bias in estimates of recent effective population size (N(e)) if the correction method was not applied to the data, both for our method and the Li and Durbin (Inference of human population history from individual whole-genome sequences. Nature 475:493–496) pairwise sequentially Markovian coalescent method. To infer demography, we use an analytical predictor of multiloci linkage disequilibrium (LD) based on a simple coalescent model that allows for changes in N(e). The LD statistic summarizes the distribution of runs of homozygosity for any given demography. We infer a best fit demography as one that predicts a match with the observed distribution of runs of homozygosity in the corrected sequence data. We use multiloci LD because it potentially holds more information about ancestral demography than pairwise LD. The inferred demography indicates a strong reduction in the N(e) around 170,000 years ago, possibly related to the divergence of African and European Bos taurus cattle. This is followed by a further reduction coinciding with the period of cattle domestication, with N(e) of between 3,500 and 6,000. The most recent reduction of N(e) to approximately 100 in the Holstein breed agrees well with estimates from pedigrees. Our approach can be applied to whole-genome sequence from any diploid species and can be scaled up to use sequence from multiple individuals. |
format | Online Article Text |
id | pubmed-3748359 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3748359 2013-08-21 Mol Biol Evol Methods Oxford University Press 2013-09 2013-07-10 /pmc/articles/PMC3748359/ /pubmed/23842528 http://dx.doi.org/10.1093/molbev/mst125 Text en © The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by-nc/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Inferring Demography from Runs of Homozygosity in Whole-Genome Sequence, with Correction for Sequence Errors |
topic | Methods |
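
The description above says the inferred demography is the one whose prediction matches the observed distribution of runs of homozygosity. As a rough illustration of what such an observed distribution might look like in practice, the following minimal Python sketch measures homozygous runs as the stretches between consecutive heterozygous sites on one chromosome and counts them by length bin. This is not the authors' implementation: the function names `roh_lengths` and `bin_run_lengths`, the input layout, and the bin edges are all assumptions made for illustration, and the sketch includes neither the error correction nor the LD predictor described in the record.

```python
from bisect import bisect_right
from collections import Counter


def roh_lengths(het_positions, chrom_length):
    """Return lengths (bp) of homozygous runs between heterozygous sites.

    het_positions: 1-based positions of heterozygous sites (hypothetical input).
    chrom_length: total chromosome length in bp.
    """
    lengths = []
    previous = 0  # sentinel: position just before the first base
    for pos in sorted(het_positions):
        run = pos - previous - 1  # homozygous bases strictly between two sites
        if run > 0:
            lengths.append(run)
        previous = pos
    tail = chrom_length - previous  # run after the last heterozygous site
    if tail > 0:
        lengths.append(tail)
    return lengths


def bin_run_lengths(lengths, bin_edges):
    """Count runs per bin; bin i holds bin_edges[i] <= length < bin_edges[i + 1].

    Runs outside the outermost edges are ignored.
    """
    counts = Counter()
    for length in lengths:
        i = bisect_right(bin_edges, length) - 1
        if 0 <= i < len(bin_edges) - 1:
            counts[i] += 1
    return [counts[i] for i in range(len(bin_edges) - 1)]


if __name__ == "__main__":
    runs = roh_lengths([5, 6, 20], chrom_length=30)
    print(runs)
    print(bin_run_lengths(runs, [1, 5, 11, 100]))
```

In a real analysis the heterozygous positions would come from genotype calls for a single diploid individual, and undetected sequencing errors would break long runs into shorter ones, which is the kind of bias the record says the authors correct for.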