Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification

Despite decades of methods development for classifying relatives in genetic studies, pairwise relatedness methods’ recalls are above 90% only for first through third-degree relatives. The top-performing approaches, which leverage identity-by-descent segments, often use only kinship coefficients, whi...

Bibliographic Details
Main authors: Smith, Jesse; Qiao, Ying; Williams, Amy L
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9157175/
https://www.ncbi.nlm.nih.gov/pubmed/35348675
http://dx.doi.org/10.1093/g3journal/jkac072
author Smith, Jesse
Qiao, Ying
Williams, Amy L
collection PubMed
description Despite decades of methods development for classifying relatives in genetic studies, pairwise relatedness methods’ recalls are above 90% only for first through third-degree relatives. The top-performing approaches, which leverage identity-by-descent segments, often use only kinship coefficients, while others, including estimation of recent shared ancestry (ERSA), use the number of segments relatives share. To quantify the potential for using segment numbers in relatedness inference, we leveraged information theory measures to analyze exact (i.e. produced by a simulator) identity-by-descent segments from simulated relatives. Over a range of settings, we found that the mutual information between the relatives’ degree of relatedness and a tuple of their kinship coefficient and segment number is on average 4.6% larger than between the degree and the kinship coefficient alone. We further evaluated identity-by-descent segment number utility by building a Bayes classifier to predict first through sixth-degree relationships using different feature sets. When trained and tested with exact segments, the inclusion of segment numbers improves the recall by between 0.28% and 3% for second through sixth-degree relatives. However, the recalls improve by less than 1.8% per degree when using inferred segments, suggesting limitations due to identity-by-descent detection accuracy. Last, we compared our Bayes classifier that includes segment numbers with both ERSA and IBIS and found comparable recalls, with the Bayes classifier and ERSA slightly outperforming each other across different degrees. Overall, this study shows that identity-by-descent segment numbers can improve relatedness inference, but errors from current SNP array-based detection methods yield dampened signals in practice.
format Online
Article
Text
id pubmed-9157175
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9157175 2022-06-04 G3 (Bethesda) Investigation Oxford University Press 2022-03-28 /pmc/articles/PMC9157175/ /pubmed/35348675 http://dx.doi.org/10.1093/g3journal/jkac072 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification
topic Investigation
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9157175/
https://www.ncbi.nlm.nih.gov/pubmed/35348675
http://dx.doi.org/10.1093/g3journal/jkac072