
Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm

A genetic risk score could be beneficial in assisting clinical diagnosis for complex diseases with high heritability. With large-scale genome-wide association (GWA) data, the current study constructed a genetic risk model with a machine learning approach for bipolar disorder (BPD). The GWA dataset o...


Bibliographic Details
Main authors:	Chuang, Li-Chung, Kuo, Po-Hsiu
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group 2017
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5206749/ https://www.ncbi.nlm.nih.gov/pubmed/28045094 http://dx.doi.org/10.1038/srep39943

_version_	1782490295934386176
author	Chuang, Li-Chung Kuo, Po-Hsiu
author_facet	Chuang, Li-Chung Kuo, Po-Hsiu
author_sort	Chuang, Li-Chung
collection	PubMed
description	A genetic risk score could be beneficial in assisting clinical diagnosis for complex diseases with high heritability. With large-scale genome-wide association (GWA) data, the current study constructed a genetic risk model with a machine learning approach for bipolar disorder (BPD). The GWA dataset of BPD from the Genetic Association Information Network was used as the training data for model construction, and the Systematic Treatment Enhancement Program (STEP) GWA data were used as the validation dataset. A random forest algorithm was applied to pre-filtered markers, and variable importance indices were assessed. 289 candidate markers were selected by random forest procedures with good discriminability; the area under the receiver operating characteristic curve was 0.944 (0.935–0.953) in the training set and 0.702 (0.681–0.723) in the STEP dataset. Using a score with the cutoff of 184, the sensitivity and specificity for BPD were 0.777 and 0.854, respectively. Pathway analyses revealed important biological pathways for identified genes. In conclusion, the present study identified informative genetic markers to differentiate BPD from healthy controls with acceptable discriminability in the validation dataset. In the future, diagnosis classification can be further improved by assessing more comprehensive clinical risk factors and jointly analysing them with genetic data in large samples.
format	Online Article Text
id	pubmed-5206749
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Nature Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-5206749 2017-01-04 Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm Chuang, Li-Chung Kuo, Po-Hsiu Sci Rep Article A genetic risk score could be beneficial in assisting clinical diagnosis for complex diseases with high heritability. With large-scale genome-wide association (GWA) data, the current study constructed a genetic risk model with a machine learning approach for bipolar disorder (BPD). The GWA dataset of BPD from the Genetic Association Information Network was used as the training data for model construction, and the Systematic Treatment Enhancement Program (STEP) GWA data were used as the validation dataset. A random forest algorithm was applied to pre-filtered markers, and variable importance indices were assessed. 289 candidate markers were selected by random forest procedures with good discriminability; the area under the receiver operating characteristic curve was 0.944 (0.935–0.953) in the training set and 0.702 (0.681–0.723) in the STEP dataset. Using a score with the cutoff of 184, the sensitivity and specificity for BPD were 0.777 and 0.854, respectively. Pathway analyses revealed important biological pathways for identified genes. In conclusion, the present study identified informative genetic markers to differentiate BPD from healthy controls with acceptable discriminability in the validation dataset. In the future, diagnosis classification can be further improved by assessing more comprehensive clinical risk factors and jointly analysing them with genetic data in large samples. Nature Publishing Group 2017-01-03 /pmc/articles/PMC5206749/ /pubmed/28045094 http://dx.doi.org/10.1038/srep39943 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle	Article Chuang, Li-Chung Kuo, Po-Hsiu Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm
title	Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm
title_full	Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm
title_fullStr	Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm
title_full_unstemmed	Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm
title_short	Building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm
title_sort	building a genetic risk model for bipolar disorder from genome-wide association data with random forest algorithm
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5206749/ https://www.ncbi.nlm.nih.gov/pubmed/28045094 http://dx.doi.org/10.1038/srep39943
work_keys_str_mv	AT chuanglichung buildingageneticriskmodelforbipolardisorderfromgenomewideassociationdatawithrandomforestalgorithm AT kuopohsiu buildingageneticriskmodelforbipolardisorderfromgenomewideassociationdatawithrandomforestalgorithm
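
The description above reports an area under the ROC curve together with sensitivity and specificity at a score cutoff of 184. As a rough illustration of how such figures are derived from per-subject risk scores, here is a minimal Python sketch. It is not the authors' code: the toy scores and labels are invented for illustration, the function names are hypothetical, and the rule that a score at or above the cutoff is classified as a case is an assumption, since the record does not state the direction of the score.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], cutoff: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when score >= cutoff is called a case.

    labels: 1 = case, 0 = control. The >= direction is an assumption.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)  # cases flagged
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)   # cases missed
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)   # controls cleared
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)  # controls flagged
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve as the probability that a randomly chosen
    case scores higher than a randomly chosen control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four cases followed by four controls.
scores = [190, 186, 185, 180, 183, 170, 175, 188]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(scores, labels, cutoff=184)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc(scores, labels):.3f}")
```

The pairwise AUC above is quadratic in sample size, which is fine for a sketch; on data of GWA scale a rank-based implementation from a statistics library would be the more practical choice.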


Similar items