The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity
Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen‐2, SIFT, FatHMM, Muta...
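Main authors: | Grimm, Dominik G.; Azencott, Chloé‐Agathe; Aicheler, Fabian; Gieraths, Udo; MacArthur, Daniel G.; Samocha, Kaitlin E.; Cooper, David N.; Stenson, Peter D.; Daly, Mark J.; Smoller, Jordan W.; Duncan, Laramie E.; Borgwardt, Karsten M. |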
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4409520/ https://www.ncbi.nlm.nih.gov/pubmed/25684150 http://dx.doi.org/10.1002/humu.22768 |
_version_ | 1782368205157695488 |
---|---|
author | Grimm, Dominik G. Azencott, Chloé‐Agathe Aicheler, Fabian Gieraths, Udo MacArthur, Daniel G. Samocha, Kaitlin E. Cooper, David N. Stenson, Peter D. Daly, Mark J. Smoller, Jordan W. Duncan, Laramie E. Borgwardt, Karsten M. |
author_facet | Grimm, Dominik G. Azencott, Chloé‐Agathe Aicheler, Fabian Gieraths, Udo MacArthur, Daniel G. Samocha, Kaitlin E. Cooper, David N. Stenson, Peter D. Daly, Mark J. Smoller, Jordan W. Duncan, Laramie E. Borgwardt, Karsten M. |
author_sort | Grimm, Dominik G. |
collection | PubMed |
description | Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen‐2, SIFT, FatHMM, MutationTaster‐2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Due to the wealth of these methods, an important practical question to answer is which of these tools generalize best, that is, correctly predict the pathogenic character of new variants. We here demonstrate in a study of 10 tools on five datasets that such a comparative evaluation of these tools is hindered by two types of circularity: they arise due to (1) the same variants or (2) different variants from the same protein occurring both in the datasets used for training and for evaluation of these tools, which may lead to overly optimistic results. We show that comparative evaluations of predictors that do not address these types of circularity may erroneously conclude that circularity confounded tools are most accurate among all tools, and may even outperform optimized combinations of tools. |
format | Online Article Text |
id | pubmed-4409520 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4409520 2016-05-01 The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity Grimm, Dominik G. Azencott, Chloé‐Agathe Aicheler, Fabian Gieraths, Udo MacArthur, Daniel G. Samocha, Kaitlin E. Cooper, David N. Stenson, Peter D. Daly, Mark J. Smoller, Jordan W. Duncan, Laramie E. Borgwardt, Karsten M. Hum Mutat Research Articles Prioritizing missense variants for further experimental investigation is a key challenge in current sequencing studies for exploring complex and Mendelian diseases. A large number of in silico tools have been employed for the task of pathogenicity prediction, including PolyPhen‐2, SIFT, FatHMM, MutationTaster‐2, MutationAssessor, Combined Annotation Dependent Depletion, LRT, phyloP, and GERP++, as well as optimized methods of combining tool scores, such as Condel and Logit. Due to the wealth of these methods, an important practical question to answer is which of these tools generalize best, that is, correctly predict the pathogenic character of new variants. We here demonstrate in a study of 10 tools on five datasets that such a comparative evaluation of these tools is hindered by two types of circularity: they arise due to (1) the same variants or (2) different variants from the same protein occurring both in the datasets used for training and for evaluation of these tools, which may lead to overly optimistic results. We show that comparative evaluations of predictors that do not address these types of circularity may erroneously conclude that circularity confounded tools are most accurate among all tools, and may even outperform optimized combinations of tools. John Wiley and Sons Inc. 2015-03-26 2015-05 /pmc/articles/PMC4409520/ /pubmed/25684150 http://dx.doi.org/10.1002/humu.22768 Text en © 2015 The Authors. Human Mutation published by Wiley Periodicals, Inc. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial 4.0 (http://creativecommons.org/licenses/by-nc/4.0/) License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes. |
spellingShingle | Research Articles Grimm, Dominik G. Azencott, Chloé‐Agathe Aicheler, Fabian Gieraths, Udo MacArthur, Daniel G. Samocha, Kaitlin E. Cooper, David N. Stenson, Peter D. Daly, Mark J. Smoller, Jordan W. Duncan, Laramie E. Borgwardt, Karsten M. The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity |
title | The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity |
title_full | The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity |
title_fullStr | The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity |
title_full_unstemmed | The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity |
title_short | The Evaluation of Tools Used to Predict the Impact of Missense Variants Is Hindered by Two Types of Circularity |
title_sort | evaluation of tools used to predict the impact of missense variants is hindered by two types of circularity |
topic | Research Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4409520/ https://www.ncbi.nlm.nih.gov/pubmed/25684150 http://dx.doi.org/10.1002/humu.22768 |
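
The description field defines the two types of circularity in terms of overlap between training and evaluation data: (1) the same variants, or (2) different variants from the same protein. A minimal sketch of what filtering an evaluation set for each type might look like is given below. It is an illustration only, not the authors' procedure: the `Variant` record, its field names, the helper functions, and the toy identifiers are all hypothetical.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    # All fields are hypothetical placeholders for illustration.
    protein: str      # protein identifier
    change: str       # amino-acid substitution label
    pathogenic: bool  # known label of the variant


def remove_type1(train: List[Variant], evaluation: List[Variant]) -> List[Variant]:
    """Drop evaluation variants that also occur in the training data."""
    seen = {(v.protein, v.change) for v in train}
    return [v for v in evaluation if (v.protein, v.change) not in seen]


def remove_type2(train: List[Variant], evaluation: List[Variant]) -> List[Variant]:
    """Drop evaluation variants from any protein present in the training data."""
    seen_proteins = {v.protein for v in train}
    return [v for v in evaluation if v.protein not in seen_proteins]


# Toy data: identifiers are made up and carry no biological meaning.
train = [Variant("P1", "A10T", True), Variant("P2", "G5R", False)]
evaluation = [
    Variant("P1", "A10T", True),   # same variant as in training (type 1)
    Variant("P1", "L20P", True),   # same protein, different variant (type 2)
    Variant("P3", "K7E", False),   # protein unseen in training
]

no_type1 = remove_type1(train, evaluation)
no_type2 = remove_type2(train, evaluation)
```

Filtering by protein is the stricter of the two: it also removes every exact duplicate, so `no_type2` is always a subset of `no_type1`.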