
A gene pathogenicity tool ‘GenePy’ identifies missed biallelic diagnoses in the 100,000 Genomes Project

The 100,000 Genomes Project (100KGP) diagnosed a quarter of recruited affected participants, but 26% of diagnoses were in genes not on the chosen gene panel(s), many of them de novo variants of high impact. However, assessing biallelic variants without a gene panel is challenging because of the numb...


Bibliographic Details
Main authors: Seaby, Eleanor G., Leggatt, Gary, Cheng, Guo, Thomas, N. Simon, Ashton, James J, Stafford, Imogen, Baralle, Diana, Rehm, Heidi L., O’Donnell-Luria, Anne, Ennis, Sarah
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081430/
https://www.ncbi.nlm.nih.gov/pubmed/37034701
http://dx.doi.org/10.1101/2023.03.21.23287545
_version_ 1785021123227811840
author Seaby, Eleanor G.
Leggatt, Gary
Cheng, Guo
Thomas, N. Simon
Ashton, James J
Stafford, Imogen
Baralle, Diana
Rehm, Heidi L.
O’Donnell-Luria, Anne
Ennis, Sarah
author_sort Seaby, Eleanor G.
collection PubMed
description The 100,000 Genomes Project (100KGP) diagnosed a quarter of recruited affected participants, but 26% of diagnoses were in genes not on the chosen gene panel(s), many of them de novo variants of high impact. However, assessing biallelic variants without a gene panel is challenging because of the number of variants requiring scrutiny. We sought to identify potential missed biallelic diagnoses, independent of the gene panel applied, using GenePy, a whole-gene pathogenicity metric. GenePy scores all variants called in a given individual, incorporating allele frequency, zygosity, and a user-defined deleteriousness metric (CADD v1.6 applied herein). GenePy then combines all variant scores for individual genes, generating an aggregate score per gene, per participant. We calculated GenePy scores for 2862 recessive disease genes in 78,216 individuals in 100KGP. For each gene, we ranked participant GenePy scores for that gene and scrutinised affected individuals without a diagnosis whose scores ranked amongst the top 5 for that gene. We assessed these participants’ phenotypes for overlap with the phenotype associated with the disease gene for which they were highly ranked. Where phenotypes overlapped, we extracted rare variants in the gene of interest and applied phasing, ClinVar and ACMG classification to look for putative causal biallelic variants. In total, 3184 affected individuals without a molecular diagnosis had a top-5 ranked GenePy gene score, and 682/3184 (21%) had phenotypes overlapping with one of the top-ranking genes. After removing 13 withdrawn participants, in 122/669 (18%) of the phenotype-matched cases we identified a putative missed diagnosis in a top-ranked gene supported by phasing, ClinVar and ACMG classification. A further 334/669 (50%) of cases have a possible missed diagnosis but require functional validation. Applying GenePy at scale has identified potential diagnoses for 456/3183 (14%) of undiagnosed participants who had a top-5 ranked GenePy score in a recessive disease gene, whilst adding only 1.2 additional variants (per individual) for assessment.
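The scoring and ranking steps described in the abstract can be illustrated with a minimal Python sketch. This record does not give the GenePy formula, so the per-variant weighting below (deleteriousness multiplied by the negative log of the genotype's allele-frequency product) is an assumption chosen only to show how allele frequency, zygosity and a deleteriousness metric might be combined. The function names, the 0 to 1 deleteriousness scale, and the gene and participant labels are all hypothetical.

```python
import math
from collections import defaultdict


def variant_score(deleteriousness, alt_freq, zygosity):
    """Illustrative per-variant score (assumed form, not taken from this record).

    deleteriousness: value scaled to 0..1 (assumption; e.g. a rescaled CADD score)
    alt_freq: population frequency of the alternate allele, 0 < alt_freq < 1
    zygosity: "het" or "hom"
    """
    if zygosity == "hom":
        freq_product = alt_freq * alt_freq
    else:  # heterozygous: one alternate allele, one reference allele
        freq_product = alt_freq * (1.0 - alt_freq)
    # Rare genotypes give a large negative log, hence a large positive score.
    return -deleteriousness * math.log10(freq_product)


def gene_scores(variants):
    """Sum variant scores into one aggregate score per (participant, gene)."""
    totals = defaultdict(float)
    for v in variants:
        key = (v["participant"], v["gene"])
        totals[key] += variant_score(v["deleteriousness"], v["alt_freq"], v["zygosity"])
    return dict(totals)


def top_ranked(scores, gene, n=5):
    """Return the n participants with the highest aggregate score for a gene."""
    per_gene = [(p, s) for (p, g), s in scores.items() if g == gene]
    per_gene.sort(key=lambda item: item[1], reverse=True)
    return [p for p, _ in per_gene[:n]]


# Hypothetical toy data: one gene, three participants.
variants = [
    # P1: a rare, damaging homozygous variant
    {"participant": "P1", "gene": "GENE1", "deleteriousness": 0.9, "alt_freq": 0.001, "zygosity": "hom"},
    # P2: two rare heterozygous variants in the same gene
    {"participant": "P2", "gene": "GENE1", "deleteriousness": 0.8, "alt_freq": 0.01, "zygosity": "het"},
    {"participant": "P2", "gene": "GENE1", "deleteriousness": 0.8, "alt_freq": 0.01, "zygosity": "het"},
    # P3: a common, mild heterozygous variant
    {"participant": "P3", "gene": "GENE1", "deleteriousness": 0.2, "alt_freq": 0.3, "zygosity": "het"},
]

scores = gene_scores(variants)
print(top_ranked(scores, "GENE1", n=2))
```

In this toy data the homozygous carrier and the participant with two rare heterozygous variants both outrank the carrier of a single common variant, which is the kind of signal a per-gene ranking is meant to surface. The sketch does not check phase, so two heterozygous variants on the same haplotype would score the same as a true compound heterozygote. The abstract describes phasing as a later, separate step.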
format Online
Article
Text
id pubmed-10081430
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10081430 2023-04-08 medRxiv Article Cold Spring Harbor Laboratory 2023-03-30 /pmc/articles/PMC10081430/ /pubmed/37034701 http://dx.doi.org/10.1101/2023.03.21.23287545 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title A gene pathogenicity tool ‘GenePy’ identifies missed biallelic diagnoses in the 100,000 Genomes Project
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081430/
https://www.ncbi.nlm.nih.gov/pubmed/37034701
http://dx.doi.org/10.1101/2023.03.21.23287545