PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions

An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically...

Bibliographic Details
Main authors: Bendl, Jaroslav, Musil, Miloš, Štourač, Jan, Zendulka, Jaroslav, Damborský, Jiří, Brezovský, Jan
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4880439/
https://www.ncbi.nlm.nih.gov/pubmed/27224906
http://dx.doi.org/10.1371/journal.pcbi.1004962
author Bendl, Jaroslav
Musil, Miloš
Štourač, Jan
Zendulka, Jaroslav
Damborský, Jiří
Brezovský, Jan
collection PubMed
description An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. 
A user-friendly web interface was developed that provides easy access to the five tools’ predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. To enable comprehensive evaluation of variants, the predictions are complemented with annotations from eight databases. The web server is freely available to the community at http://loschmidt.chemi.muni.cz/predictsnp2.
format Online
Article
Text
id pubmed-4880439
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4880439 2016-06-09 PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions Bendl, Jaroslav Musil, Miloš Štourač, Jan Zendulka, Jaroslav Damborský, Jiří Brezovský, Jan PLoS Comput Biol Research Article Public Library of Science 2016-05-25 /pmc/articles/PMC4880439/ /pubmed/27224906 http://dx.doi.org/10.1371/journal.pcbi.1004962 Text en © 2016 Bendl et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4880439/
https://www.ncbi.nlm.nih.gov/pubmed/27224906
http://dx.doi.org/10.1371/journal.pcbi.1004962