
The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations

BACKGROUND: Understanding and predicting the effects of mutations on protein structure and phenotype is an increasingly important area. Genes for many genetically linked diseases are now routinely sequenced in the clinic. Previously we focused on understanding the structural effects of mutations, cr...


Bibliographic Details
Main Authors: Al-Numair, Nouf S, Martin, Andrew CR
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665582/
https://www.ncbi.nlm.nih.gov/pubmed/23819919
http://dx.doi.org/10.1186/1471-2164-14-S3-S4
_version_ 1782271276719538176
author Al-Numair, Nouf S
Martin, Andrew CR
author_facet Al-Numair, Nouf S
Martin, Andrew CR
author_sort Al-Numair, Nouf S
collection PubMed
description BACKGROUND: Understanding and predicting the effects of mutations on protein structure and phenotype is an increasingly important area. Genes for many genetically linked diseases are now routinely sequenced in the clinic. Previously we focused on understanding the structural effects of mutations, creating the SAAPdb resource. RESULTS: We have updated SAAPdb to include 41% more SNPs and 36% more PDs. Introducing a hydrophobic residue on the surface, or a hydrophilic residue in the core, no longer shows significant differences between SNPs and PDs. We have improved some of the analyses significantly enhancing the analysis of clashes and of mutations to-proline and from-glycine. A new web interface has been developed allowing users to analyze their own mutations. Finally we have developed a machine learning method which gives a cross-validated accuracy of 0.846, considerably out-performing well known methods including SIFT and PolyPhen2 which give accuracies between 0.690 and 0.785. CONCLUSIONS: We have updated SAAPdb and improved its analyses, but with the increasing rate with which mutation data are generated, we have created a new analysis pipeline and web interface. Results of machine learning using the structural analysis results to predict pathogenicity considerably outperform other methods.
format Online
Article
Text
id pubmed-3665582
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-36655822013-06-05 The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations Al-Numair, Nouf S Martin, Andrew CR BMC Genomics Research BACKGROUND: Understanding and predicting the effects of mutations on protein structure and phenotype is an increasingly important area. Genes for many genetically linked diseases are now routinely sequenced in the clinic. Previously we focused on understanding the structural effects of mutations, creating the SAAPdb resource. RESULTS: We have updated SAAPdb to include 41% more SNPs and 36% more PDs. Introducing a hydrophobic residue on the surface, or a hydrophilic residue in the core, no longer shows significant differences between SNPs and PDs. We have improved some of the analyses significantly enhancing the analysis of clashes and of mutations to-proline and from-glycine. A new web interface has been developed allowing users to analyze their own mutations. Finally we have developed a machine learning method which gives a cross-validated accuracy of 0.846, considerably out-performing well known methods including SIFT and PolyPhen2 which give accuracies between 0.690 and 0.785. CONCLUSIONS: We have updated SAAPdb and improved its analyses, but with the increasing rate with which mutation data are generated, we have created a new analysis pipeline and web interface. Results of machine learning using the structural analysis results to predict pathogenicity considerably outperform other methods. BioMed Central 2013-05-28 /pmc/articles/PMC3665582/ /pubmed/23819919 http://dx.doi.org/10.1186/1471-2164-14-S3-S4 Text en Copyright © 2013 Al-Numair and Martin; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Al-Numair, Nouf S
Martin, Andrew CR
The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations
title The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations
title_sort saap pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3665582/
https://www.ncbi.nlm.nih.gov/pubmed/23819919
http://dx.doi.org/10.1186/1471-2164-14-S3-S4