Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification
Despite decades of methods development for classifying relatives in genetic studies, pairwise relatedness methods’ recalls are above 90% only for first through third-degree relatives. The top-performing approaches, which leverage identity-by-descent segments, often use only kinship coefficients, whi...
Main authors: | Smith, Jesse; Qiao, Ying; Williams, Amy L |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9157175/ https://www.ncbi.nlm.nih.gov/pubmed/35348675 http://dx.doi.org/10.1093/g3journal/jkac072 |
_version_ | 1784718583866064896 |
---|---|
author | Smith, Jesse Qiao, Ying Williams, Amy L |
author_facet | Smith, Jesse Qiao, Ying Williams, Amy L |
author_sort | Smith, Jesse |
collection | PubMed |
description | Despite decades of methods development for classifying relatives in genetic studies, pairwise relatedness methods’ recalls are above 90% only for first through third-degree relatives. The top-performing approaches, which leverage identity-by-descent segments, often use only kinship coefficients, while others, including estimation of recent shared ancestry (ERSA), use the number of segments relatives share. To quantify the potential for using segment numbers in relatedness inference, we leveraged information theory measures to analyze exact (i.e. produced by a simulator) identity-by-descent segments from simulated relatives. Over a range of settings, we found that the mutual information between the relatives’ degree of relatedness and a tuple of their kinship coefficient and segment number is on average 4.6% larger than between the degree and the kinship coefficient alone. We further evaluated identity-by-descent segment number utility by building a Bayes classifier to predict first through sixth-degree relationships using different feature sets. When trained and tested with exact segments, the inclusion of segment numbers improves the recall by between 0.28% and 3% for second through sixth-degree relatives. However, the recalls improve by less than 1.8% per degree when using inferred segments, suggesting limitations due to identity-by-descent detection accuracy. Last, we compared our Bayes classifier that includes segment numbers with both ERSA and IBIS and found comparable recalls, with the Bayes classifier and ERSA slightly outperforming each other across different degrees. Overall, this study shows that identity-by-descent segment numbers can improve relatedness inference, but errors from current SNP array-based detection methods yield dampened signals in practice. |
format | Online Article Text |
id | pubmed-9157175 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-91571752022-06-04 Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification Smith, Jesse Qiao, Ying Williams, Amy L G3 (Bethesda) Investigation Despite decades of methods development for classifying relatives in genetic studies, pairwise relatedness methods’ recalls are above 90% only for first through third-degree relatives. The top-performing approaches, which leverage identity-by-descent segments, often use only kinship coefficients, while others, including estimation of recent shared ancestry (ERSA), use the number of segments relatives share. To quantify the potential for using segment numbers in relatedness inference, we leveraged information theory measures to analyze exact (i.e. produced by a simulator) identity-by-descent segments from simulated relatives. Over a range of settings, we found that the mutual information between the relatives’ degree of relatedness and a tuple of their kinship coefficient and segment number is on average 4.6% larger than between the degree and the kinship coefficient alone. We further evaluated identity-by-descent segment number utility by building a Bayes classifier to predict first through sixth-degree relationships using different feature sets. When trained and tested with exact segments, the inclusion of segment numbers improves the recall by between 0.28% and 3% for second through sixth-degree relatives. However, the recalls improve by less than 1.8% per degree when using inferred segments, suggesting limitations due to identity-by-descent detection accuracy. Last, we compared our Bayes classifier that includes segment numbers with both ERSA and IBIS and found comparable recalls, with the Bayes classifier and ERSA slightly outperforming each other across different degrees. Overall, this study shows that identity-by-descent segment numbers can improve relatedness inference, but errors from current SNP array-based detection methods yield dampened signals in practice. Oxford University Press 2022-03-28 /pmc/articles/PMC9157175/ /pubmed/35348675 http://dx.doi.org/10.1093/g3journal/jkac072 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Genetics Society of America. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Investigation Smith, Jesse Qiao, Ying Williams, Amy L Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification |
title | Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification |
title_full | Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification |
title_fullStr | Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification |
title_full_unstemmed | Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification |
title_short | Evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification |
title_sort | evaluating the utility of identity-by-descent segment numbers for relatedness inference via information theory and classification |
topic | Investigation |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9157175/ https://www.ncbi.nlm.nih.gov/pubmed/35348675 http://dx.doi.org/10.1093/g3journal/jkac072 |
work_keys_str_mv | AT smithjesse evaluatingtheutilityofidentitybydescentsegmentnumbersforrelatednessinferenceviainformationtheoryandclassification AT qiaoying evaluatingtheutilityofidentitybydescentsegmentnumbersforrelatednessinferenceviainformationtheoryandclassification AT williamsamyl evaluatingtheutilityofidentitybydescentsegmentnumbersforrelatednessinferenceviainformationtheoryandclassification |
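
The description field above compares the mutual information between the degree of relatedness and the kinship coefficient alone with the mutual information between the degree and a (kinship coefficient, segment number) tuple. The following is a minimal sketch of that kind of comparison, not the authors' implementation: it uses a simple plug-in estimator on discrete labels, and the records in `toy_pairs`, the bin labels, and the function name `mutual_information` are all hypothetical values invented purely for illustration, not data or results from the study.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy records: (degree, kinship bin, segment-count bin).
# These values are made up for illustration only.
toy_pairs = [
    (2, "k_high", "n_a"),
    (2, "k_high", "n_a"),
    (2, "k_mid", "n_a"),
    (3, "k_mid", "n_b"),
    (3, "k_mid", "n_b"),
    (3, "k_low", "n_b"),
]

degrees = [d for d, _, _ in toy_pairs]
kinship = [k for _, k, _ in toy_pairs]
kinship_and_count = [(k, n) for _, k, n in toy_pairs]

mi_kinship = mutual_information(degrees, kinship)
mi_tuple = mutual_information(degrees, kinship_and_count)

print(f"I(degree; kinship)           = {mi_kinship:.3f} bits")
print(f"I(degree; (kinship, count))  = {mi_tuple:.3f} bits")
```

In this toy set the `k_mid` kinship bin is shared by both degrees, so the tuple feature carries more information about the degree than kinship alone by construction. Real kinship coefficients and segment counts are continuous or high-cardinality, so an actual analysis would need a binning or density-estimation choice that this sketch does not address.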