
Estimating inbreeding coefficients from NGS data: Impact on genotype calling and allele frequency estimation

Most methods for next-generation sequencing (NGS) data analyses incorporate information regarding allele frequencies using the assumption of Hardy–Weinberg equilibrium (HWE) as a prior. However, many organisms including those that are domesticated, partially selfing, or with asexual life cycles show...
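The HWE prior mentioned in the abstract, and the way inbreeding changes it, can be illustrated with the textbook genotype-frequency parameterization for a biallelic site. The sketch below is not the authors' EM estimation procedure; it only shows how a known inbreeding coefficient F would shift the genotype prior and, through it, a maximum-posterior genotype call. The function names, the example likelihoods, and the restriction of F to the range [0, 1] are assumptions made for illustration.

```python
def genotype_priors(f, F=0.0):
    """Prior probabilities of genotypes (AA, Aa, aa) at a biallelic site.

    f: frequency of allele a (hypothetical input, assumed known).
    F: inbreeding coefficient; F = 0 recovers Hardy-Weinberg proportions.
    This sketch assumes 0 <= F <= 1.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("allele frequency must be in [0, 1]")
    if not 0.0 <= F <= 1.0:
        raise ValueError("this sketch assumes F in [0, 1]")
    p = 1.0 - f
    hom_major = p * p + p * f * F       # excess homozygotes under inbreeding
    het = 2.0 * p * f * (1.0 - F)       # heterozygote deficit
    hom_minor = f * f + p * f * F
    return hom_major, het, hom_minor


def call_genotype(likelihoods, f, F=0.0):
    """Return (index of most probable genotype, posterior probabilities).

    likelihoods: P(data | genotype) for (AA, Aa, aa), e.g. from sequencing reads.
    The posterior is proportional to likelihood times prior.
    """
    priors = genotype_priors(f, F)
    weighted = [lik * pri for lik, pri in zip(likelihoods, priors)]
    total = sum(weighted)
    if total <= 0.0:
        raise ValueError("all genotypes have zero posterior weight")
    posteriors = [w / total for w in weighted]
    best = max(range(3), key=posteriors.__getitem__)
    return best, posteriors


if __name__ == "__main__":
    # Weak evidence, as might arise from low-coverage data (made-up numbers).
    liks = (0.2, 0.5, 0.3)
    print(call_genotype(liks, f=0.5, F=0.0))  # prior under HWE
    print(call_genotype(liks, f=0.5, F=0.9))  # prior for a highly inbred sample
```

With these made-up likelihoods, the HWE prior favors the heterozygote, while the prior for a highly inbred individual favors a homozygote. This is the kind of effect the abstract refers to for low-coverage, highly inbred samples.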


Bibliographic Details
Main authors: Vieira, Filipe G., Fumagalli, Matteo, Albrechtsen, Anders, Nielsen, Rasmus
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3814885/
https://www.ncbi.nlm.nih.gov/pubmed/23950147
http://dx.doi.org/10.1101/gr.157388.113
_version_ 1782289317534629888
author Vieira, Filipe G.
Fumagalli, Matteo
Albrechtsen, Anders
Nielsen, Rasmus
collection PubMed
description Most methods for next-generation sequencing (NGS) data analyses incorporate information regarding allele frequencies using the assumption of Hardy–Weinberg equilibrium (HWE) as a prior. However, many organisms, including those that are domesticated, partially selfing, or with asexual life cycles, show strong deviations from HWE. For such species, and especially for low-coverage data, it is necessary to obtain estimates of inbreeding coefficients (F) for each individual before calling genotypes. Here, we present two methods for estimating inbreeding coefficients from NGS data based on an expectation-maximization (EM) algorithm. We assess the impact of taking inbreeding into account when calling genotypes or estimating the site frequency spectrum (SFS), and demonstrate a marked increase in accuracy on low-coverage highly inbred samples. We demonstrate the applicability and efficacy of these methods in both simulated and real data sets.
format Online
Article
Text
id pubmed-3814885
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-3814885 2014-05-01 Estimating inbreeding coefficients from NGS data: Impact on genotype calling and allele frequency estimation Vieira, Filipe G. Fumagalli, Matteo Albrechtsen, Anders Nielsen, Rasmus Genome Res Method Most methods for next-generation sequencing (NGS) data analyses incorporate information regarding allele frequencies using the assumption of Hardy–Weinberg equilibrium (HWE) as a prior. However, many organisms, including those that are domesticated, partially selfing, or with asexual life cycles, show strong deviations from HWE. For such species, and especially for low-coverage data, it is necessary to obtain estimates of inbreeding coefficients (F) for each individual before calling genotypes. Here, we present two methods for estimating inbreeding coefficients from NGS data based on an expectation-maximization (EM) algorithm. We assess the impact of taking inbreeding into account when calling genotypes or estimating the site frequency spectrum (SFS), and demonstrate a marked increase in accuracy on low-coverage highly inbred samples. We demonstrate the applicability and efficacy of these methods in both simulated and real data sets. Cold Spring Harbor Laboratory Press 2013-11 /pmc/articles/PMC3814885/ /pubmed/23950147 http://dx.doi.org/10.1101/gr.157388.113 Text en © 2013 Vieira et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported), as described at http://creativecommons.org/licenses/by-nc/3.0/.
title Estimating inbreeding coefficients from NGS data: Impact on genotype calling and allele frequency estimation
topic Method
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3814885/
https://www.ncbi.nlm.nih.gov/pubmed/23950147
http://dx.doi.org/10.1101/gr.157388.113