Evaluation of potential novel variations and their interactions related to bipolar disorders: analysis of genome-wide association study data
BACKGROUND: Multifactor dimensionality reduction (MDR) is a nonparametric approach that can be used to detect relevant interactions between single-nucleotide polymorphisms (SNPs). The aim of this study was to build the best genomic model based on SNP associations and to identify candidate polymorphi...
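As a rough illustration of how MDR reduces a two-SNP genotype table to a single high-risk/low-risk attribute, the sketch below labels each joint genotype cell by comparing its case/control ratio with the overall ratio, then scores every SNP pair by classification accuracy. It is a simplified sketch, not the authors' pipeline: the function names, the 0/1/2 genotype coding, and the toy data are assumptions made up for illustration, and the cross-validation step used in practice is omitted.

```python
from collections import Counter
from itertools import combinations


def mdr_fit(genotypes, labels, snp_pair):
    """Label each joint genotype cell of snp_pair as high risk (True) or low risk (False).

    Assumes labels are 1 (case) or 0 (control) and that both classes are present.
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    threshold = n_cases / n_controls  # overall case/control ratio

    cases, controls = Counter(), Counter()
    for row, y in zip(genotypes, labels):
        cell = tuple(row[i] for i in snp_pair)
        if y == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1

    # A cell is high risk when cases/controls exceeds the threshold;
    # written as a product so that cells with zero controls need no division.
    return {
        cell: cases[cell] > threshold * controls[cell]
        for cell in set(cases) | set(controls)
    }


def mdr_accuracy(genotypes, labels, snp_pair, high_risk):
    """Fraction of samples whose case/control status matches the cell label."""
    correct = 0
    for row, y in zip(genotypes, labels):
        cell = tuple(row[i] for i in snp_pair)
        predicted = 1 if high_risk.get(cell, False) else 0
        correct += predicted == y
    return correct / len(labels)


def best_pair(genotypes, labels):
    """Score every two-SNP combination and return (accuracy, pair) for the best one."""
    n_snps = len(genotypes[0])
    scored = []
    for pair in combinations(range(n_snps), 2):
        model = mdr_fit(genotypes, labels, pair)
        scored.append((mdr_accuracy(genotypes, labels, pair, model), pair))
    return max(scored)


# Toy data (invented): SNP 0 and SNP 1 interact, SNP 2 is noise.
GENOTYPES = [
    (0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 1),
    (0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0),
]
LABELS = [0, 1, 1, 0, 0, 1, 1, 0]

if __name__ == "__main__":
    print(best_pair(GENOTYPES, LABELS))
```

In the toy data neither SNP 0 nor SNP 1 separates cases from controls on its own, but their joint genotype does, which is the kind of interaction MDR is designed to surface.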
Main Authors: | Acikel, Cengizhan; Aydin Son, Yesim; Celik, Cemil; Gul, Husamettin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Dove Medical Press 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5127431/ https://www.ncbi.nlm.nih.gov/pubmed/27920536 http://dx.doi.org/10.2147/NDT.S112558 |
_version_ | 1782470244724375552 |
---|---|
author | Acikel, Cengizhan; Aydin Son, Yesim; Celik, Cemil; Gul, Husamettin |
author_sort | Acikel, Cengizhan |
collection | PubMed |
description | BACKGROUND: Multifactor dimensionality reduction (MDR) is a nonparametric approach that can be used to detect relevant interactions between single-nucleotide polymorphisms (SNPs). The aim of this study was to build the best genomic model based on SNP associations and to identify candidate polymorphisms that are the underlying molecular basis of the bipolar disorders. METHODS: This study was performed on Whole-Genome Association Study of Bipolar Disorder (dbGaP [database of Genotypes and Phenotypes] study accession number: phs000017.v3.p1) data. After preprocessing of the genotyping data, three classification-based data mining methods (ie, random forest, naïve Bayes, and k-nearest neighbor) were performed. Additionally, as a nonparametric, model-free approach, the MDR method was used to evaluate the SNP profiles. The validity of these methods was evaluated using true classification rate, recall (sensitivity), precision (positive predictive value), and F-measure. RESULTS: Random forests, naïve Bayes, and k-nearest neighbors identified 16, 13, and ten candidate SNPs, respectively. Surprisingly, the top six SNPs were reported by all three methods. Random forests and k-nearest neighbors were more successful than naïve Bayes, with recall values >0.95. On the other hand, MDR generated a model with comparable predictive performance based on five SNPs. Although different SNP profiles were identified in MDR compared to the classification-based models, all models mapped SNPs to the DOCK10 gene. CONCLUSION: Three classification-based data mining approaches, random forests, naïve Bayes, and k-nearest neighbors, have prioritized similar SNP profiles as predictors of bipolar disorders, in contrast to MDR, which has found different SNPs through analysis of two-way and three-way interactions. The reduced number of associated SNPs discovered by MDR, without loss in the classification performance, would facilitate validation studies and decision support models, and would reduce the cost to develop predictive and diagnostic tests. Nevertheless, we need to emphasize that translation of genomic models to the clinical setting requires models with higher classification performance. |
format | Online Article Text |
id | pubmed-5127431 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Dove Medical Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-5127431 2016-12-05 Neuropsychiatr Dis Treat Original Research Dove Medical Press 2016-11-24 /pmc/articles/PMC5127431/ /pubmed/27920536 http://dx.doi.org/10.2147/NDT.S112558 Text en © 2016 Acikel et al. This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution – Non Commercial (unported, v3.0) License (http://creativecommons.org/licenses/by-nc/3.0/). By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. |
title | Evaluation of potential novel variations and their interactions related to bipolar disorders: analysis of genome-wide association study data |
topic | Original Research |
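
The abstract evaluates the models with true classification rate, recall (sensitivity), precision (positive predictive value), and F-measure. A minimal sketch of how these four quantities follow from a binary confusion matrix is given below. The function name and the counts in the example are hypothetical and are not taken from the study; the F-measure is assumed to be the usual harmonic mean of precision and recall.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the four evaluation measures named in the abstract.

    tp, fp, fn, tn are the counts of true positives, false positives,
    false negatives, and true negatives in a binary confusion matrix.
    Assumes tp + fn > 0 and tp + fp > 0 so that no division by zero occurs.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total    # true classification rate
    recall = tp / (tp + fn)         # sensitivity
    precision = tp / (tp + fp)      # positive predictive value
    f_measure = 2 * precision * recall / (precision + recall)
    return {
        "true_classification_rate": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
EXAMPLE = classification_metrics(tp=90, fp=30, fn=10, tn=70)

if __name__ == "__main__":
    for name, value in EXAMPLE.items():
        print(f"{name}: {value:.3f}")
```

With these invented counts, recall is 90/100 and precision is 90/120, which shows how a model can reach high recall while its precision, and therefore its F-measure, stays noticeably lower.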