Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes

The release of 150,119 UK Biobank sequences represents an unprecedented opportunity as a reference panel to impute low-coverage whole-genome sequencing data with high accuracy but current methods cannot cope with the size of the data. Here we introduce GLIMPSE2, a low-coverage whole-genome sequencin...

Bibliographic Details
Main Authors: Rubinacci, Simone, Hofmeister, Robin J., Sousa da Mota, Bárbara, Delaneau, Olivier
Format: Online Article Text
Language: English
Published: Nature Publishing Group US 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10335927/
https://www.ncbi.nlm.nih.gov/pubmed/37386250
http://dx.doi.org/10.1038/s41588-023-01438-3
_version_ 1785071099735703552
author Rubinacci, Simone
Hofmeister, Robin J.
Sousa da Mota, Bárbara
Delaneau, Olivier
author_facet Rubinacci, Simone
Hofmeister, Robin J.
Sousa da Mota, Bárbara
Delaneau, Olivier
author_sort Rubinacci, Simone
collection PubMed
description The release of 150,119 UK Biobank sequences represents an unprecedented opportunity as a reference panel to impute low-coverage whole-genome sequencing data with high accuracy but current methods cannot cope with the size of the data. Here we introduce GLIMPSE2, a low-coverage whole-genome sequencing imputation method that scales sublinearly in both the number of samples and markers, achieving efficient whole-genome imputation from the UK Biobank reference panel while retaining high accuracy for ancient and modern genomes, particularly at rare variants and for very low-coverage samples.
format Online
Article
Text
id pubmed-10335927
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group US
record_format MEDLINE/PubMed
spelling pubmed-10335927 2023-07-13 Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes Rubinacci, Simone Hofmeister, Robin J. Sousa da Mota, Bárbara Delaneau, Olivier Nat Genet Brief Communication The release of 150,119 UK Biobank sequences represents an unprecedented opportunity as a reference panel to impute low-coverage whole-genome sequencing data with high accuracy but current methods cannot cope with the size of the data. Here we introduce GLIMPSE2, a low-coverage whole-genome sequencing imputation method that scales sublinearly in both the number of samples and markers, achieving efficient whole-genome imputation from the UK Biobank reference panel while retaining high accuracy for ancient and modern genomes, particularly at rare variants and for very low-coverage samples. Nature Publishing Group US 2023-06-29 2023 /pmc/articles/PMC10335927/ /pubmed/37386250 http://dx.doi.org/10.1038/s41588-023-01438-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Brief Communication
Rubinacci, Simone
Hofmeister, Robin J.
Sousa da Mota, Bárbara
Delaneau, Olivier
Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
title Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
title_full Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
title_fullStr Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
title_full_unstemmed Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
title_short Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes
title_sort imputation of low-coverage sequencing data from 150,119 uk biobank genomes
topic Brief Communication
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10335927/
https://www.ncbi.nlm.nih.gov/pubmed/37386250
http://dx.doi.org/10.1038/s41588-023-01438-3
work_keys_str_mv AT rubinaccisimone imputationoflowcoveragesequencingdatafrom150119ukbiobankgenomes
AT hofmeisterrobinj imputationoflowcoveragesequencingdatafrom150119ukbiobankgenomes
AT sousadamotabarbara imputationoflowcoveragesequencingdatafrom150119ukbiobankgenomes
AT delaneauolivier imputationoflowcoveragesequencingdatafrom150119ukbiobankgenomes