
Imputation of Ancient Whole Genome Sus scrofa DNA Introduces Biases Toward Main Population Components in the Reference Panel

Sequencing ancient DNA to high coverage is often limited by sample quality and cost. Imputing missing genotypes can potentially increase information content and quality of ancient data, but requires different computational approaches than modern DNA imputation. Ancient imputation beyond humans has n...


Bibliographic Details
Main Authors: Erven, J. A. M., Çakirlar, C., Bradley, D. G., Raemaekers, D. C. M., Madsen, O.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9315352/
https://www.ncbi.nlm.nih.gov/pubmed/35903348
http://dx.doi.org/10.3389/fgene.2022.872486
author Erven, J. A. M.
Çakirlar, C.
Bradley, D. G.
Raemaekers, D. C. M.
Madsen, O.
collection PubMed
description Sequencing ancient DNA to high coverage is often limited by sample quality and cost. Imputing missing genotypes can potentially increase the information content and quality of ancient data, but requires different computational approaches than modern DNA imputation. Ancient imputation beyond humans has not been investigated. In this study we report the results of a systematic evaluation of imputation of three whole-genome ancient Sus scrofa samples from the Early and Late Neolithic (∼7,100–4,500 BP), to test the utility of imputation. We show how factors such as genetic architecture and reference panel divergence, composition, and size affect imputation accuracy. We evaluate a variety of imputation methods, including Beagle5, GLIMPSE, and Impute5, with varying filters, pipelines, and variant calling methods. Genotype concordance exceeded 90% in most cases; the highest was 98%, with ∼2,000,000 variants recovered using GLIMPSE. Despite this high concordance, the sources of diversity present in the genotypes called in the original high-coverage genomes were not equally imputed, leading to biases in downstream analyses; a trend toward the genotypes most common in the reference panel is observed. This demonstrates that the current reference panel does not possess the full diversity needed for accurate imputation of ancient Sus, because it is missing variation from Near Eastern and Mesolithic wild boar. Imputation of ancient Sus scrofa holds potential but should be approached with caution because of these biases; our results suggest that there is no universal approach for imputation of non-human ancient species.
format Online
Article
Text
id pubmed-9315352
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9315352 2022-07-27 Front Genet Genetics Frontiers Media S.A. 2022-07-12 /pmc/articles/PMC9315352/ /pubmed/35903348 http://dx.doi.org/10.3389/fgene.2022.872486 Text en Copyright © 2022 Erven, Çakirlar, Bradley, Raemaekers and Madsen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Imputation of Ancient Whole Genome Sus scrofa DNA Introduces Biases Toward Main Population Components in the Reference Panel
topic Genetics