
Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution



Bibliographic Details
Main Authors: Chiang, Charleston W. K.; Ralph, Peter; Novembre, John
Format: Online Article Text
Language: English
Published: Genetics Society of America 2016
Subjects: Investigations
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4856080/
https://www.ncbi.nlm.nih.gov/pubmed/26935417
http://dx.doi.org/10.1534/g3.116.027581
author Chiang, Charleston W. K.
Ralph, Peter
Novembre, John
collection PubMed
description Identity-by-descent (IBD) is a fundamental concept in genetics with many applications. In a common definition, two haplotypes are said to share an IBD segment if that segment is inherited from a recent shared common ancestor without intervening recombination. Segments several cM long can be efficiently detected by a number of algorithms using high-density SNP array data from a population sample, and there are currently efforts to detect shorter segments from sequencing. Here, we study a problem of identifiability: because existing approaches detect IBD based on contiguous segments of identity-by-state, inferred long segments of IBD may arise from the conflation of smaller, nearby IBD segments. We quantified this effect using coalescent simulations, finding that, for all programs tested and under demographic scenarios typical for modern humans, significant proportions of inferred segments 1–2 cM long result from the conflation of two or more shorter segments, each at least 0.2 cM long. The impact of such conflation is much smaller for longer (> 2 cM) segments. This biases the inferred IBD segment length distribution, and so can affect downstream inferences that depend on the assumption that each segment of IBD derives from a single common ancestor. As an example, we present and analyze an estimator of the de novo mutation rate using IBD segments, and demonstrate that unmodeled conflation leads to underestimates of the ages of the common ancestors on these segments, and hence a significant overestimate of the mutation rate. Understanding the conflation effect in detail will make its correction in future methods more tractable.
format Online
Article
Text
id pubmed-4856080
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-4856080 2016-05-05 Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution Chiang, Charleston W. K.; Ralph, Peter; Novembre, John G3 (Bethesda) Investigations
Genetics Society of America 2016-03-01 /pmc/articles/PMC4856080/ /pubmed/26935417 http://dx.doi.org/10.1534/g3.116.027581 Text en Copyright © 2016 Chiang et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Conflation of Short Identity-by-Descent Segments Bias Their Inferred Length Distribution
topic Investigations
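
The conflation effect described in the abstract can be illustrated with a toy sketch. This is not the authors' simulation pipeline or any IBD caller's algorithm: the function names, the segment coordinates, and the `max_gap_cm` threshold are all hypothetical, chosen only to show how nearby short segments can be reported as one longer segment.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) positions in cM; illustrative only


def conflate(segments: List[Segment], max_gap_cm: float) -> List[Segment]:
    """Merge segments whose separating gap is at most max_gap_cm.

    This mimics, very roughly, a detector that cannot resolve a short
    break between two neighbouring IBD segments (an assumption, not a
    model of any specific program).
    """
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap_cm:
            # Gap too small to resolve: extend the previous segment.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def lengths(segments: List[Segment]) -> List[float]:
    """Segment lengths in cM, rounded to suppress floating-point noise."""
    return [round(end - start, 6) for start, end in segments]


# Hypothetical "true" segments: two short neighbours and one long segment.
true_segments = [(0.0, 0.6), (0.65, 1.3), (5.0, 7.5)]
inferred = conflate(true_segments, max_gap_cm=0.1)

print(lengths(true_segments))
print(lengths(inferred))
```

In this toy input, the two short segments are reported as a single segment in the 1–2 cM range, while the segment longer than 2 cM is unchanged. An analysis that treats the merged segment as deriving from a single common ancestor would therefore work from a length distribution shifted toward longer segments, which is the bias the abstract describes.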