
Biobank-scale inference of multi-individual identity by descent and gene conversion

We present a method for efficiently identifying clusters of identical-by-descent haplotypes in biobank-scale sequence data. Our multi-individual approach enables much more efficient collection and storage of identity by descent (IBD) information than approaches that detect and store pairwise IBD seg...

Bibliographic Details
Main authors: Browning, Sharon R., Browning, Brian L.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635131/
https://www.ncbi.nlm.nih.gov/pubmed/37961601
http://dx.doi.org/10.1101/2023.11.03.565574
author Browning, Sharon R.
Browning, Brian L.
collection PubMed
description We present a method for efficiently identifying clusters of identical-by-descent haplotypes in biobank-scale sequence data. Our multi-individual approach enables much more efficient collection and storage of identity by descent (IBD) information than approaches that detect and store pairwise IBD segments. Our method’s computation time, memory requirements, and output size scale linearly with the number of individuals in the dataset. We also present a method for using multi-individual IBD to detect alleles changed by gene conversion. Application of our methods to the autosomal sequence data for 125,361 White British individuals in the UK Biobank detects more than 9 million converted alleles. This is 2900 times more alleles changed by gene conversion than were detected in a previous analysis of familial data. We estimate that more than 250,000 sequenced probands and a much larger number of additional genomes from multi-generational family members would be required to find a similar number of alleles changed by gene conversion using a family-based approach.
format Online
Article
Text
id pubmed-10635131
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10635131 2023-11-13 Biobank-scale inference of multi-individual identity by descent and gene conversion Browning, Sharon R. Browning, Brian L. bioRxiv Article Cold Spring Harbor Laboratory 2023-11-05 /pmc/articles/PMC10635131/ /pubmed/37961601 http://dx.doi.org/10.1101/2023.11.03.565574 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Biobank-scale inference of multi-individual identity by descent and gene conversion
topic Article