
SNPranker 2.0: a gene-centric data mining tool for diseases associated SNP prioritization in GWAS


Bibliographic Details
Main Authors:	Merelli, Ivan; Calabria, Andrea; Cozzi, Paolo; Viti, Federica; Mosca, Ettore; Milanesi, Luciano
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2013
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3548692/ https://www.ncbi.nlm.nih.gov/pubmed/23369106 http://dx.doi.org/10.1186/1471-2105-14-S1-S9

collection	PubMed
description	BACKGROUND: Correlating specific genotypes with human diseases remains complex despite the advantages of high-throughput technologies such as Genome Wide Association Studies (GWAS). New tools for interpreting genetic variants and prioritizing Single Nucleotide Polymorphisms (SNPs) are needed. Given a list of the most relevant SNPs statistically associated with a specific pathology as the result of a genotype study, a critical issue is identifying the genes that are effectively related to the disease by re-scoring the importance of the identified genetic variations. Vice versa, given a list of genes, it can be of great importance to predict which SNPs may be involved in the onset of a particular disease, in order to focus research on their effects. RESULTS: We propose a new bioinformatics approach to support biological data mining in the analysis and interpretation of SNPs associated with pathologies. This system can be employed to design custom genotyping chips for disease-oriented studies and to re-score GWAS results. The proposed method relies (1) on the integration of data from public resources using a gene-centric database design, (2) on the evaluation of a set of static biomolecular annotations, defined as features, and (3) on the SNP scoring function, which computes SNP scores using parameters and weights set by users. We employed a machine learning classifier to set default feature weights and an ontological annotation layer to enable the enrichment of the input gene set. We implemented our method as a web tool called SNPranker 2.0 (http://www.itb.cnr.it/snpranker), improving on our first published release of this system. A user-friendly interface allows users to input a list of genes, SNPs or a biological process, and to customize the feature set with relative weights. As a result, SNPranker 2.0 returns a list of SNPs, located within the input and ontologically enriched genes, together with their prioritization scores. CONCLUSIONS: Several databases and resources are already available for SNP annotation, but they do not prioritize or re-score SNPs on the basis of a priori biomolecular knowledge. SNPranker 2.0 attempts to fill this gap through a user-friendly integrated web resource. End users, such as researchers in medical genetics and epidemiology, may find in SNPranker 2.0 a new tool for data mining and interpretation that supports SNP analysis. Possible scenarios are GWAS data re-scoring, SNP selection for custom genotyping arrays and SNP/disease association studies.
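The scoring step described in the RESULTS part of the abstract (features combined with user-set weights to rank SNPs) can be illustrated with a minimal sketch. The abstract does not give the actual scoring formula, so this assumes a plain weighted sum; the feature names, weights, and SNP identifiers are all hypothetical and are not taken from SNPranker 2.0.

```python
from typing import Dict, List, Tuple

# Hypothetical default weights (assumption). The abstract says defaults come
# from a machine learning classifier but does not list features or values.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "in_coding_region": 2.0,
    "non_synonymous": 3.0,
    "in_regulatory_region": 1.5,
}


def score_snp(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values; a feature with no weight contributes 0."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank_snps(
    snps: Dict[str, Dict[str, float]],
    weights: Dict[str, float] = DEFAULT_WEIGHTS,
) -> List[Tuple[str, float]]:
    """Return (snp_id, score) pairs sorted from highest to lowest score."""
    scored = [(snp_id, score_snp(feats, weights)) for snp_id, feats in snps.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder identifiers, not real dbSNP entries.
    example = {
        "snp_A": {"in_coding_region": 1, "non_synonymous": 1},
        "snp_B": {"in_regulatory_region": 1},
        "snp_C": {"in_coding_region": 1},
    }
    for snp_id, score in rank_snps(example):
        print(snp_id, score)
```

Passing a different `weights` dictionary to `rank_snps` mimics the user-customizable weighting mentioned in the abstract: the same annotations can yield a different ranking.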
id	pubmed-3548692
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-3548692 2013-02-04 BMC Bioinformatics Research BioMed Central 2013-01-14 /pmc/articles/PMC3548692/ /pubmed/23369106 http://dx.doi.org/10.1186/1471-2105-14-S1-S9 Text en Copyright ©2013 Merelli et al.; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
