GESPA: classifying nsSNPs to predict disease association
BACKGROUND: Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus determining the clinical significance of each nsSNP is of great importance. Potential detrimental nsSNPs may be identified by genetic association stu...
Main authors: | Khurana, Jay K.; Reeder, Jay E.; Shrimpton, Antony E.; Thakar, Juilee |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4513380/ https://www.ncbi.nlm.nih.gov/pubmed/26206375 http://dx.doi.org/10.1186/s12859-015-0673-2 |
_version_ | 1782382636225789952 |
---|---|
author | Khurana, Jay K. Reeder, Jay E. Shrimpton, Antony E. Thakar, Juilee |
author_sort | Khurana, Jay K. |
collection | PubMed |
description | BACKGROUND: Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus, determining the clinical significance of each nsSNP is of great importance. Potentially detrimental nsSNPs may be identified by genetic association studies or by functional analysis in the laboratory, both of which are expensive and time-consuming. Existing computational methods lack the accuracy and features needed to facilitate nsSNP classification for clinical use. We developed the GESPA (GEnomic Single nucleotide Polymorphism Analyzer) program to predict the pathogenicity and disease phenotype of nsSNPs. RESULTS: GESPA is a user-friendly software package for classifying disease association of nsSNPs. It allows flexibility in acceptable input formats and predicts the pathogenicity of a given nsSNP by assessing the conservation of amino acids in orthologs and paralogs and supplementing this information with data from medical literature. GESPA was developed and tested using the humsavar, ClinVar and humvar datasets. GESPA also predicts the disease phenotype associated with an nsSNP with high accuracy, a feature unavailable in existing software. GESPA’s overall accuracy exceeds that of existing computational methods for predicting nsSNP pathogenicity. The usability of GESPA is enhanced by fast SQL-based cloud storage and retrieval of data. CONCLUSIONS: GESPA is a novel bioinformatics tool to determine the pathogenicity and phenotypes of nsSNPs. We anticipate that GESPA will become a useful clinical framework for predicting the disease association of nsSNPs. The program, executable jar file, source code, GPL 3.0 license, user guide, and test data with instructions are available at http://sourceforge.net/projects/gespa. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0673-2) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4513380 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4513380 2015-07-25 GESPA: classifying nsSNPs to predict disease association Khurana, Jay K. Reeder, Jay E. Shrimpton, Antony E. Thakar, Juilee BMC Bioinformatics Software BioMed Central 2015-07-25 /pmc/articles/PMC4513380/ /pubmed/26206375 http://dx.doi.org/10.1186/s12859-015-0673-2 Text en © Khurana et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Khurana, Jay K. Reeder, Jay E. Shrimpton, Antony E. Thakar, Juilee GESPA: classifying nsSNPs to predict disease association |
title | GESPA: classifying nsSNPs to predict disease association |
title_sort | gespa: classifying nssnps to predict disease association |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4513380/ https://www.ncbi.nlm.nih.gov/pubmed/26206375 http://dx.doi.org/10.1186/s12859-015-0673-2 |