
VarSight: prioritizing clinically reported variants with binary classification algorithms

BACKGROUND: When applying genomic medicine to a rare disease patient, the primary goal is to identify one or more genomic variants that may explain the patient’s phenotypes. Typically, this is done through annotation, filtering, and then prioritization of variants for manual curation. However, prior...


Bibliographic Details
Main authors: Holt, James M., Wilk, Brandon, Birch, Camille L., Brown, Donna M., Gajapathy, Manavalan, Moss, Alexander C., Sosonkina, Nadiya, Wilk, Melissa A., Anderson, Julie A., Harris, Jeremy M., Kelly, Jacob M., Shaterferdosian, Fariba, Uno-Antonison, Angelina E., Weborg, Arthur, Worthey, Elizabeth A.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6792253/
https://www.ncbi.nlm.nih.gov/pubmed/31615419
http://dx.doi.org/10.1186/s12859-019-3026-8
author Holt, James M.
Wilk, Brandon
Birch, Camille L.
Brown, Donna M.
Gajapathy, Manavalan
Moss, Alexander C.
Sosonkina, Nadiya
Wilk, Melissa A.
Anderson, Julie A.
Harris, Jeremy M.
Kelly, Jacob M.
Shaterferdosian, Fariba
Uno-Antonison, Angelina E.
Weborg, Arthur
Worthey, Elizabeth A.
author_sort Holt, James M.
collection PubMed
description BACKGROUND: When applying genomic medicine to a rare disease patient, the primary goal is to identify one or more genomic variants that may explain the patient’s phenotypes. Typically, this is done through annotation, filtering, and then prioritization of variants for manual curation. However, prioritization of variants in rare disease patients remains a challenging task due to the high degree of variability in phenotype presentation and molecular source of disease. Thus, methods that can identify and/or prioritize variants to be clinically reported in the presence of such variability are of critical importance. METHODS: We tested the application of classification algorithms that ingest variant annotations along with phenotype information for predicting whether a variant will ultimately be clinically reported and returned to a patient. To test the classifiers, we performed a retrospective study on variants that were clinically reported to 237 patients in the Undiagnosed Diseases Network. RESULTS: We treated the classifiers as variant prioritization systems and compared them to four variant prioritization algorithms and two single-measure controls. We showed that the trained classifiers outperformed all other tested methods with the best classifiers ranking 72% of all reported variants and 94% of reported pathogenic variants in the top 20. CONCLUSIONS: We demonstrated how freely available binary classification algorithms can be used to prioritize variants even in the presence of real-world variability. Furthermore, these classifiers outperformed all other tested methods, suggesting that they may be well suited for working with real rare disease patient datasets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-3026-8) contains supplementary material, which is available to authorized users.
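The "top 20" figures in the abstract come from treating each classifier as a ranker: variants in a case are ordered by classifier score, and a reported variant counts as found if it lands within the first k positions. The following is a minimal sketch of that style of evaluation, not the authors' code. The function names, the toy scores, and the choice to pool reported variants across cases are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical helpers for illustration; none of these names come from VarSight.


def rank_variants(scores: Dict[str, float]) -> List[str]:
    """Order variant IDs from highest to lowest classifier score."""
    return sorted(scores, key=lambda v: scores[v], reverse=True)


def top_k_fraction(cases: List[Tuple[Dict[str, float], Set[str]]], k: int = 20) -> float:
    """Fraction of reported variants ranked within the top k of their own case.

    Each case is (scores, reported): a mapping from variant ID to classifier
    score, and the set of variant IDs that were clinically reported.
    Reported variants are pooled across all cases (an assumption).
    """
    found = 0
    total = 0
    for scores, reported in cases:
        top = set(rank_variants(scores)[:k])
        found += len(reported & top)
        total += len(reported)
    return found / total if total else 0.0


# Toy data: two cases with three candidate variants each (made-up scores).
case_a = ({"v1": 0.91, "v2": 0.15, "v3": 0.40}, {"v1"})
case_b = ({"v4": 0.20, "v5": 0.75, "v6": 0.60}, {"v4"})

frac = top_k_fraction([case_a, case_b], k=2)
print(frac)  # 0.5: v1 is in its case's top 2, v4 is not
```

In practice the scores would come from a trained binary classifier's predicted probability of the "reported" class, and k would be set to the cutoff of interest (20 in the abstract).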
format Online
Article
Text
id pubmed-6792253
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6792253 2019-10-21 VarSight: prioritizing clinically reported variants with binary classification algorithms BMC Bioinformatics Research Article BioMed Central 2019-10-15 /pmc/articles/PMC6792253/ /pubmed/31615419 http://dx.doi.org/10.1186/s12859-019-3026-8 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title VarSight: prioritizing clinically reported variants with binary classification algorithms
topic Research Article