Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation

The short lengths of short-read sequencing reads challenge the analysis of paralogous genomic regions in exome and genome sequencing data. Most genetic variants within these homologous regions therefore remain unidentified in standard analyses. Here, we present a method (Chameleolyser) that accurate...

Bibliographic Details
Main authors: Steyaert, Wouter, Haer-Wigman, Lonneke, Pfundt, Rolph, Hellebrekers, Debby, Steehouwer, Marloes, Hampstead, Juliet, de Boer, Elke, Stegmann, Alexander, Yntema, Helger, Kamsteeg, Erik-Jan, Brunner, Han, Hoischen, Alexander, Gilissen, Christian
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10611741/
https://www.ncbi.nlm.nih.gov/pubmed/37891200
http://dx.doi.org/10.1038/s41467-023-42531-9
author Steyaert, Wouter
Haer-Wigman, Lonneke
Pfundt, Rolph
Hellebrekers, Debby
Steehouwer, Marloes
Hampstead, Juliet
de Boer, Elke
Stegmann, Alexander
Yntema, Helger
Kamsteeg, Erik-Jan
Brunner, Han
Hoischen, Alexander
Gilissen, Christian
collection PubMed
description The short lengths of short-read sequencing reads challenge the analysis of paralogous genomic regions in exome and genome sequencing data. Most genetic variants within these homologous regions therefore remain unidentified in standard analyses. Here, we present a method (Chameleolyser) that accurately identifies single nucleotide variants and small insertions/deletions (SNVs/Indels), copy number variants and ectopic gene conversion events in duplicated genomic regions using whole-exome sequencing data. Application to a cohort of 41,755 exome samples yields 20,432 rare homozygous deletions and 2,529,791 rare SNVs/Indels, of which we show that 338,084 are due to gene conversion events. None of the SNVs/Indels are detectable using regular analysis techniques. Validation by high-fidelity long-read sequencing in 20 samples confirms >88% of called variants. Focusing on variation in known disease genes leads to a direct molecular diagnosis in 25 previously undiagnosed patients. Our method can readily be applied to existing exome data.
format Online
Article
Text
id pubmed-10611741
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10611741 2023-10-29 Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation Nat Commun Article Nature Publishing Group UK 2023-10-27 /pmc/articles/PMC10611741/ /pubmed/37891200 http://dx.doi.org/10.1038/s41467-023-42531-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/.
title Systematic analysis of paralogous regions in 41,755 exomes uncovers clinically relevant variation
topic Article