IMPROVE-DD: Integrating multiple phenotype resources optimizes variant evaluation in genetically determined developmental disorders

Bibliographic Details
Main authors: Aitken, Stuart; Firth, Helen V.; Wright, Caroline F.; Hurles, Matthew E.; FitzPatrick, David R.; Semple, Colin A.
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9763511/
https://www.ncbi.nlm.nih.gov/pubmed/36561149
http://dx.doi.org/10.1016/j.xhgg.2022.100162
author Aitken, Stuart
Firth, Helen V.
Wright, Caroline F.
Hurles, Matthew E.
FitzPatrick, David R.
Semple, Colin A.
collection PubMed
description Diagnosing rare developmental disorders using genome-wide sequencing data commonly necessitates review of multiple plausible candidate variants, often using ontologies of categorical clinical terms. We show that Integrating Multiple Phenotype Resources Optimizes Variant Evaluation in Developmental Disorders (IMPROVE-DD) by incorporating additional classes of data commonly available to clinicians and recorded in health records. In doing so, we quantify the distinct contributions of sex, growth, and development in addition to Human Phenotype Ontology (HPO) terms and demonstrate added value from these readily available information sources. We use likelihood ratios for nominal and quantitative data and propose a classifier for HPO terms in this framework. This Bayesian framework results in more robust diagnoses. Using data systematically collected in the Deciphering Developmental Disorders study, we considered 77 genes with pathogenic/likely pathogenic variants in ≥10 individuals. All genes showed at least a satisfactory prediction by receiver operating characteristic when testing on training data (AUC ≥ 0.6), and HPO terms were the best predictor for the majority of genes, though a minority (13/77) of genes were better predicted by other phenotypic data types. Overall, classifiers based upon multiple integrated phenotypic data sources performed better than those based upon any individual source, and importantly, integrated models produced notably fewer false positives. Finally, we show that IMPROVE-DD models with good predictive performance on cross-validation can be constructed from relatively few individuals. This suggests new strategies for candidate gene prioritization and highlights the value of systematic clinical data collection to support diagnostic programs.
format Online
Article
Text
id pubmed-9763511
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9763511 2022-12-21 HGG Adv Report Elsevier 2022-11-24 /pmc/articles/PMC9763511/ /pubmed/36561149 http://dx.doi.org/10.1016/j.xhgg.2022.100162 Text en © 2022 The Authors. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title IMPROVE-DD: Integrating multiple phenotype resources optimizes variant evaluation in genetically determined developmental disorders
topic Report
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9763511/
https://www.ncbi.nlm.nih.gov/pubmed/36561149
http://dx.doi.org/10.1016/j.xhgg.2022.100162
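
The description in this record mentions likelihood ratios for nominal and quantitative data combined in a Bayesian framework. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes the evidence sources are conditionally independent, models a quantitative measure with normal distributions, and uses made-up probabilities, distribution parameters, and prior. All function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def nominal_lr(p_given_gene, p_given_other):
    """Likelihood ratio for a categorical observation (for example, sex)."""
    return p_given_gene / p_given_other


def quantitative_lr(x, gene_params, other_params):
    """Likelihood ratio for a continuous observation (for example, a growth z-score).

    Each *_params argument is a (mean, sd) pair; the normal model is an assumption.
    """
    return gaussian_pdf(x, *gene_params) / gaussian_pdf(x, *other_params)


def posterior_probability(prior, likelihood_ratios):
    """Bayes' rule in odds form: posterior odds = prior odds x product of LRs.

    Multiplying the ratios assumes the evidence sources are conditionally
    independent, which is a simplification made for this sketch.
    """
    log_odds = math.log(prior / (1.0 - prior))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical inputs, chosen only to illustrate the arithmetic.
lrs = [
    nominal_lr(0.7, 0.5),                            # categorical evidence
    quantitative_lr(-2.0, (-2.0, 1.0), (0.0, 1.0)),  # continuous evidence
]
posterior = posterior_probability(0.01, lrs)
print(f"posterior probability = {posterior:.3f}")
```

Working in log-odds keeps the update numerically stable when many likelihood ratios are multiplied, and a ratio of 1 leaves the prior unchanged, so an uninformative data source has no effect on the result.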