
Two-stage approach for identifying single-nucleotide polymorphisms associated with rheumatoid arthritis using random forests and Bayesian networks

Bibliographic Details
Main Authors: Meng, Yan; Yang, Qiong; Cuenco, Karen T; Cupples, L Adrienne; DeStefano, Anita L; Lunetta, Kathryn L
Format: Text
Language: English
Published: BioMed Central 2007
Subjects: Proceedings
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367609/
https://www.ncbi.nlm.nih.gov/pubmed/18466556
_version_ 1782154333125607424
author Meng, Yan
Yang, Qiong
Cuenco, Karen T
Cupples, L Adrienne
DeStefano, Anita L
Lunetta, Kathryn L
author_sort Meng, Yan
collection PubMed
description We used the simulated data set from Genetic Analysis Workshop 15 Problem 3 to assess a two-stage approach for identifying single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). In the first stage, we used random forests (RF) to screen large amounts of genetic data using the variable importance measure, which takes into account SNP interaction effects as well as main effects without requiring model specification. We used the 9187 simulated SNPs mimicking a 10K SNP chip, along with the covariates DR (the simulated DRB1 genotype), smoking, and sex, as input to the RF analyses, with a training set consisting of 750 unrelated RA cases and 750 controls. We used an iterative RF screening procedure to identify a smaller set of variables for further analysis. In the second stage, we used the software program CaMML to produce Bayesian networks, and developed complex etiologic models for RA risk using the variables identified by our RF screening procedure. We evaluated the performance of this method using independent test data sets for up to 100 replicates.
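The iterative screening idea in the description can be illustrated with a minimal sketch. The record does not give the authors' settings, so everything here is an assumption: the function names, the `keep_fraction` and `target_size` parameters, and the toy data are hypothetical. The importance score is a simple case/control mean-difference stand-in, not the random forest variable importance measure; in practice an RF-based importance function would be passed in its place.

```python
import random
from statistics import mean


def mean_difference_importance(rows, labels, features):
    """Stand-in importance score (NOT random forest variable importance):
    absolute difference in mean genotype between cases (1) and controls (0)."""
    scores = {}
    for f in features:
        cases = [r[f] for r, y in zip(rows, labels) if y == 1]
        controls = [r[f] for r, y in zip(rows, labels) if y == 0]
        scores[f] = abs(mean(cases) - mean(controls))
    return scores


def iterative_screen(rows, labels, features, importance_fn,
                     keep_fraction=0.5, target_size=5):
    """Repeatedly rank the surviving variables and keep the top fraction
    until at most `target_size` remain. Both parameters are hypothetical."""
    if not 0 < keep_fraction < 1:
        raise ValueError("keep_fraction must be strictly between 0 and 1")
    if target_size < 1:
        raise ValueError("target_size must be at least 1")
    current = list(features)
    while len(current) > target_size:
        # Importance is recomputed on the reduced variable set each round.
        scores = importance_fn(rows, labels, current)
        ranked = sorted(current, key=lambda f: scores[f], reverse=True)
        n_keep = max(target_size, int(len(ranked) * keep_fraction))
        current = ranked[:n_keep]
    return current


def make_toy_data(n_per_group=100, n_snps=20, causal="snp_3", seed=0):
    """Toy genotypes coded 0/1/2; one SNP perfectly separates the groups.
    This is illustrative data, not the GAW15 simulation."""
    rng = random.Random(seed)
    features = [f"snp_{i}" for i in range(n_snps)]
    rows, labels = [], []
    for label in (1, 0):
        for _ in range(n_per_group):
            row = {f: rng.choice((0, 1, 2)) for f in features}
            row[causal] = 2 if label == 1 else 0
            rows.append(row)
            labels.append(label)
    return rows, labels, features


rows, labels, features = make_toy_data()
selected = iterative_screen(rows, labels, features, mean_difference_importance)
print(selected)
```

Recomputing the scores inside the loop is the point of the iterative design: a variable's rank can change once weaker variables are removed. The reduced set would then be passed to a second-stage modeling step.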
format Text
id pubmed-2367609
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2367609 2008-05-06 BMC Proc Proceedings BioMed Central 2007-12-18 /pmc/articles/PMC2367609/ /pubmed/18466556 Text en Copyright © 2007 Meng et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Two-stage approach for identifying single-nucleotide polymorphisms associated with rheumatoid arthritis using random forests and Bayesian networks
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367609/
https://www.ncbi.nlm.nih.gov/pubmed/18466556