Phenotype forecasting with SNPs data through gene-based Bayesian networks
BACKGROUND: Bayesian networks are powerful instruments to learn genetic models from association studies data. They are able to derive the existing correlation between genetic markers and phenotypic traits and, at the same time, to find the relationships between the markers themselves. However, learn...
Main Authors: | Malovini, Alberto; Nuzzo, Angelo; Ferrazzi, Fulvia; Puca, Annibale A; Bellazzi, Riccardo |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2646249/ https://www.ncbi.nlm.nih.gov/pubmed/19208195 http://dx.doi.org/10.1186/1471-2105-10-S2-S7 |
_version_ | 1782164832302137344 |
---|---|
author | Malovini, Alberto Nuzzo, Angelo Ferrazzi, Fulvia Puca, Annibale A Bellazzi, Riccardo |
author_facet | Malovini, Alberto Nuzzo, Angelo Ferrazzi, Fulvia Puca, Annibale A Bellazzi, Riccardo |
author_sort | Malovini, Alberto |
collection | PubMed |
description | BACKGROUND: Bayesian networks are powerful instruments to learn genetic models from association studies data. They are able to derive the existing correlation between genetic markers and phenotypic traits and, at the same time, to find the relationships between the markers themselves. However, learning Bayesian networks is often non-trivial due to the high number of variables to be taken into account in the model with respect to the instances of the dataset. Therefore, it becomes very interesting to use an abstraction of the variable space that suitably reduces its dimensionality without losing information. In this paper we present a new strategy to achieve this goal by mapping the SNPs related to the same gene to one meta-variable. In order to assign states to the meta-variables we employ an approach based on classification trees. RESULTS: We applied our approach to data coming from a genome-wide scan on 288 individuals affected by arterial hypertension and 271 nonagenarians without history of hypertension. After pre-processing, we focused on a subset of 24 SNPs. We compared the performance of the proposed approach with the Bayesian network learned with SNPs as variables and with the network learned with haplotypes as meta-variables. The results were obtained by running a hold-out experiment five times. The mean accuracy of the new method was 64.28%, while the mean accuracy of the SNPs network was 58.99% and the mean accuracy of the haplotype network was 54.57%. CONCLUSION: The new approach presented in this paper is able to derive a gene-based predictive model based on SNPs data. Such model is more parsimonious than the one based on single SNPs, while preserving the capability of highlighting predictive SNPs configurations. The prediction performance of this approach was consistently superior to the SNP-based and the haplotype-based one in all the test sets of the evaluation procedure. 
The method can then be considered as an alternative way to analyze the data coming from association studies. |
format | Text |
id | pubmed-2646249 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-26462492009-02-23 Phenotype forecasting with SNPs data through gene-based Bayesian networks Malovini, Alberto Nuzzo, Angelo Ferrazzi, Fulvia Puca, Annibale A Bellazzi, Riccardo BMC Bioinformatics Proceedings BACKGROUND: Bayesian networks are powerful instruments to learn genetic models from association studies data. They are able to derive the existing correlation between genetic markers and phenotypic traits and, at the same time, to find the relationships between the markers themselves. However, learning Bayesian networks is often non-trivial due to the high number of variables to be taken into account in the model with respect to the instances of the dataset. Therefore, it becomes very interesting to use an abstraction of the variable space that suitably reduces its dimensionality without losing information. In this paper we present a new strategy to achieve this goal by mapping the SNPs related to the same gene to one meta-variable. In order to assign states to the meta-variables we employ an approach based on classification trees. RESULTS: We applied our approach to data coming from a genome-wide scan on 288 individuals affected by arterial hypertension and 271 nonagenarians without history of hypertension. After pre-processing, we focused on a subset of 24 SNPs. We compared the performance of the proposed approach with the Bayesian network learned with SNPs as variables and with the network learned with haplotypes as meta-variables. The results were obtained by running a hold-out experiment five times. The mean accuracy of the new method was 64.28%, while the mean accuracy of the SNPs network was 58.99% and the mean accuracy of the haplotype network was 54.57%. CONCLUSION: The new approach presented in this paper is able to derive a gene-based predictive model based on SNPs data. Such model is more parsimonious than the one based on single SNPs, while preserving the capability of highlighting predictive SNPs configurations. 
The prediction performance of this approach was consistently superior to the SNP-based and the haplotype-based one in all the test sets of the evaluation procedure. The method can then be considered as an alternative way to analyze the data coming from association studies. BioMed Central 2009-02-05 /pmc/articles/PMC2646249/ /pubmed/19208195 http://dx.doi.org/10.1186/1471-2105-10-S2-S7 Text en Copyright © 2009 Malovini et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Proceedings Malovini, Alberto Nuzzo, Angelo Ferrazzi, Fulvia Puca, Annibale A Bellazzi, Riccardo Phenotype forecasting with SNPs data through gene-based Bayesian networks |
title | Phenotype forecasting with SNPs data through gene-based Bayesian networks |
title_full | Phenotype forecasting with SNPs data through gene-based Bayesian networks |
title_fullStr | Phenotype forecasting with SNPs data through gene-based Bayesian networks |
title_full_unstemmed | Phenotype forecasting with SNPs data through gene-based Bayesian networks |
title_short | Phenotype forecasting with SNPs data through gene-based Bayesian networks |
title_sort | phenotype forecasting with snps data through gene-based bayesian networks |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2646249/ https://www.ncbi.nlm.nih.gov/pubmed/19208195 http://dx.doi.org/10.1186/1471-2105-10-S2-S7 |
work_keys_str_mv | AT malovinialberto phenotypeforecastingwithsnpsdatathroughgenebasedbayesiannetworks AT nuzzoangelo phenotypeforecastingwithsnpsdatathroughgenebasedbayesiannetworks AT ferrazzifulvia phenotypeforecastingwithsnpsdatathroughgenebasedbayesiannetworks AT pucaannibalea phenotypeforecastingwithsnpsdatathroughgenebasedbayesiannetworks AT bellazziriccardo phenotypeforecastingwithsnpsdatathroughgenebasedbayesiannetworks |