GESPA: classifying nsSNPs to predict disease association

Bibliographic Details
Main authors: Khurana, Jay K., Reeder, Jay E., Shrimpton, Antony E., Thakar, Juilee
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4513380/
https://www.ncbi.nlm.nih.gov/pubmed/26206375
http://dx.doi.org/10.1186/s12859-015-0673-2
author Khurana, Jay K.
Reeder, Jay E.
Shrimpton, Antony E.
Thakar, Juilee
collection PubMed
description BACKGROUND: Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus, determining the clinical significance of each nsSNP is of great importance. Potentially detrimental nsSNPs may be identified by genetic association studies or by functional analysis in the laboratory, both of which are expensive and time-consuming. Existing computational methods lack the accuracy and features needed to facilitate nsSNP classification for clinical use. We developed the GESPA (GEnomic Single nucleotide Polymorphism Analyzer) program to predict the pathogenicity and disease phenotype of nsSNPs. RESULTS: GESPA is a user-friendly software package for classifying disease association of nsSNPs. It allows flexibility in acceptable input formats and predicts the pathogenicity of a given nsSNP by assessing the conservation of amino acids in orthologs and paralogs and supplementing this information with data from medical literature. The development and testing of GESPA were performed using the humsavar, ClinVar and humvar datasets. GESPA also predicts the disease phenotype associated with an nsSNP with high accuracy, a feature unavailable in existing software. GESPA’s overall accuracy exceeds that of existing computational methods for predicting nsSNP pathogenicity. The usability of GESPA is enhanced by fast SQL-based cloud storage and retrieval of data. CONCLUSIONS: GESPA is a novel bioinformatics tool to determine the pathogenicity and phenotypes of nsSNPs. We anticipate that GESPA will become a useful clinical framework for predicting the disease association of nsSNPs. The program, executable jar file, source code, GPL 3.0 license, user guide, and test data with instructions are available at http://sourceforge.net/projects/gespa. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0673-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4513380
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4513380 2015-07-25 GESPA: classifying nsSNPs to predict disease association Khurana, Jay K. Reeder, Jay E. Shrimpton, Antony E. Thakar, Juilee BMC Bioinformatics Software BioMed Central 2015-07-25 /pmc/articles/PMC4513380/ /pubmed/26206375 http://dx.doi.org/10.1186/s12859-015-0673-2 Text en © Khurana et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title GESPA: classifying nsSNPs to predict disease association
topic Software