
Genotype-driven identification of a molecular network predictive of advanced coronary calcium in ClinSeq® and Framingham Heart Study cohorts

BACKGROUND: One goal of personalized medicine is leveraging the emerging tools of data science to guide medical decision-making. Achieving this using disparate data sources is most daunting for polygenic traits. To this end, we employed random forests (RFs) and neural networks (NNs) for predictive m...


Bibliographic Details
Main authors: Oguz, Cihan; Sen, Shurjo K.; Davis, Adam R.; Fu, Yi-Ping; O’Donnell, Christopher J.; Gibbons, Gary H.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5659034/
https://www.ncbi.nlm.nih.gov/pubmed/29073909
http://dx.doi.org/10.1186/s12918-017-0474-5
author Oguz, Cihan
Sen, Shurjo K.
Davis, Adam R.
Fu, Yi-Ping
O’Donnell, Christopher J.
Gibbons, Gary H.
collection PubMed
description BACKGROUND: One goal of personalized medicine is leveraging the emerging tools of data science to guide medical decision-making. Achieving this using disparate data sources is most daunting for polygenic traits. To this end, we employed random forests (RFs) and neural networks (NNs) for predictive modeling of coronary artery calcium (CAC), which is an intermediate endo-phenotype of coronary artery disease (CAD). METHODS: Model inputs were derived from advanced cases in the ClinSeq® discovery cohort (n=16) and the FHS replication cohort (n=36) from the 89th-99th CAC score percentile range, and age-matched controls (ClinSeq® n=16, FHS n=36) with no detectable CAC (all subjects were Caucasian males). These inputs included clinical variables and genotypes of 56 single nucleotide polymorphisms (SNPs) ranked highest in terms of their nominal correlation with the advanced CAC state in the discovery cohort. Predictive performance was assessed by computing the areas under receiver operating characteristic curves (ROC-AUC). RESULTS: RF models trained and tested with clinical variables generated ROC-AUC values of 0.69 and 0.61 in the discovery and replication cohorts, respectively. In contrast, in both cohorts, the set of SNPs derived from the discovery cohort was highly predictive (ROC-AUC ≥0.85) with no significant change in predictive performance upon integration of clinical and genotype variables. Using the 21 SNPs that produced optimal predictive performance in both cohorts, we developed NN models trained with ClinSeq® data and tested with FHS data and obtained high predictive accuracy (ROC-AUC=0.80-0.85) with several topologies. Several CAD and “vascular aging” related biological processes were enriched in the network of genes constructed from the predictive SNPs. CONCLUSIONS: We identified a molecular network predictive of advanced coronary calcium using genotype data from ClinSeq® and FHS cohorts.
Our results illustrate that machine learning tools, which utilize complex interactions between disease predictors intrinsic to the pathogenesis of polygenic disorders, hold promise for deriving predictive disease models and networks. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-017-0474-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5659034
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5659034 2017-11-01 Genotype-driven identification of a molecular network predictive of advanced coronary calcium in ClinSeq® and Framingham Heart Study cohorts Oguz, Cihan Sen, Shurjo K. Davis, Adam R. Fu, Yi-Ping O’Donnell, Christopher J. Gibbons, Gary H. BMC Syst Biol Research Article BioMed Central 2017-10-26 /pmc/articles/PMC5659034/ /pubmed/29073909 http://dx.doi.org/10.1186/s12918-017-0474-5 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Genotype-driven identification of a molecular network predictive of advanced coronary calcium in ClinSeq® and Framingham Heart Study cohorts
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5659034/
https://www.ncbi.nlm.nih.gov/pubmed/29073909
http://dx.doi.org/10.1186/s12918-017-0474-5