Two-stage approach for identifying single-nucleotide polymorphisms associated with rheumatoid arthritis using random forests and Bayesian networks
We used the simulated data set from Genetic Analysis Workshop 15 Problem 3 to assess a two-stage approach for identifying single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). In the first stage, we used random forests (RF) to screen large amounts of genetic data using th...
Main authors: | Meng, Yan; Yang, Qiong; Cuenco, Karen T; Cupples, L Adrienne; DeStefano, Anita L; Lunetta, Kathryn L |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367609/ https://www.ncbi.nlm.nih.gov/pubmed/18466556 |
_version_ | 1782154333125607424 |
---|---|
author | Meng, Yan; Yang, Qiong; Cuenco, Karen T; Cupples, L Adrienne; DeStefano, Anita L; Lunetta, Kathryn L |
author_sort | Meng, Yan |
collection | PubMed |
description | We used the simulated data set from Genetic Analysis Workshop 15 Problem 3 to assess a two-stage approach for identifying single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). In the first stage, we used random forests (RF) to screen large amounts of genetic data using the variable importance measure, which takes into account SNP interaction effects as well as main effects without requiring model specification. We used the simulated 9187 SNPs mimicking a 10K SNP chip, along with covariates DR (the simulated DRB1 genotype), smoking, and sex as input to the RF analyses with a training set consisting of 750 unrelated RA cases and 750 controls. We used an iterative RF screening procedure to identify a smaller set of variables for further analysis. In the second stage, we used the software program CaMML for producing Bayesian networks, and developed complex etiologic models for RA risk using the variables identified by our RF screening procedure. We evaluated the performance of this method using independent test data sets for up to 100 replicates. |
format | Text |
id | pubmed-2367609 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2367609 2008-05-06 Two-stage approach for identifying single-nucleotide polymorphisms associated with rheumatoid arthritis using random forests and Bayesian networks Meng, Yan; Yang, Qiong; Cuenco, Karen T; Cupples, L Adrienne; DeStefano, Anita L; Lunetta, Kathryn L BMC Proc Proceedings We used the simulated data set from Genetic Analysis Workshop 15 Problem 3 to assess a two-stage approach for identifying single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). In the first stage, we used random forests (RF) to screen large amounts of genetic data using the variable importance measure, which takes into account SNP interaction effects as well as main effects without requiring model specification. We used the simulated 9187 SNPs mimicking a 10K SNP chip, along with covariates DR (the simulated DRB1 genotype), smoking, and sex as input to the RF analyses with a training set consisting of 750 unrelated RA cases and 750 controls. We used an iterative RF screening procedure to identify a smaller set of variables for further analysis. In the second stage, we used the software program CaMML for producing Bayesian networks, and developed complex etiologic models for RA risk using the variables identified by our RF screening procedure. We evaluated the performance of this method using independent test data sets for up to 100 replicates. BioMed Central 2007-12-18 /pmc/articles/PMC2367609/ /pubmed/18466556 Text en Copyright © 2007 Meng et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Two-stage approach for identifying single-nucleotide polymorphisms associated with rheumatoid arthritis using random forests and Bayesian networks |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367609/ https://www.ncbi.nlm.nih.gov/pubmed/18466556 |
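
The description mentions an iterative screening procedure that shrinks a large set of SNPs to a smaller set of variables, but this record does not give its details. The following is only a minimal sketch of what such a loop could look like, under stated assumptions: the function names, the `keep_fraction` and `target_size` parameters, and the toy simulation are all hypothetical, and the scorer is a simple case/control mean difference used as a stand-in so the example needs only the Python standard library. It is not the random forest variable importance measure and, unlike that measure, it ignores interaction effects.

```python
import random
from typing import Callable, Dict, List

Row = Dict[str, float]
Scorer = Callable[[List[Row], List[int], List[str]], Dict[str, float]]


def mean_difference_scorer(rows: List[Row], labels: List[int],
                           variables: List[str]) -> Dict[str, float]:
    """Stand-in importance (NOT random forest importance):
    absolute difference in mean value between cases (1) and controls (0)."""
    cases = [r for r, y in zip(rows, labels) if y == 1]
    controls = [r for r, y in zip(rows, labels) if y == 0]
    scores = {}
    for v in variables:
        case_mean = sum(r[v] for r in cases) / len(cases)
        control_mean = sum(r[v] for r in controls) / len(controls)
        scores[v] = abs(case_mean - control_mean)
    return scores


def iterative_screen(rows: List[Row], labels: List[int], variables: List[str],
                     scorer: Scorer, keep_fraction: float = 0.5,
                     target_size: int = 10) -> List[str]:
    """Re-score the surviving variables each round and keep the top fraction,
    stopping once at most `target_size` variables remain."""
    if not 0.0 < keep_fraction < 1.0:
        raise ValueError("keep_fraction must be strictly between 0 and 1")
    if target_size < 1:
        raise ValueError("target_size must be at least 1")
    selected = list(variables)
    while len(selected) > target_size:
        scores = scorer(rows, labels, selected)
        ranked = sorted(selected, key=lambda v: scores[v], reverse=True)
        # Never drop below target_size; the set shrinks on every pass.
        n_keep = max(target_size, int(len(ranked) * keep_fraction))
        selected = ranked[:n_keep]
    return selected


def simulate(n_per_group: int = 150, n_snps: int = 200, seed: int = 0):
    """Toy data (hypothetical): genotypes coded 0/1/2; only 'snp_7'
    differs in allele frequency between cases and controls."""
    rng = random.Random(seed)
    names = [f"snp_{i}" for i in range(n_snps)]
    rows, labels = [], []
    for label in (1, 0):
        for _ in range(n_per_group):
            row = {}
            for name in names:
                freq = 0.5
                if name == "snp_7":
                    freq = 0.8 if label == 1 else 0.2
                row[name] = sum(rng.random() < freq for _ in range(2))
            rows.append(row)
            labels.append(label)
    return rows, labels, names


rows, labels, names = simulate()
selected = iterative_screen(rows, labels, names, mean_difference_scorer)
print(selected)
```

Because the scorer is passed in as a function, a random forest importance routine could in principle be substituted for `mean_difference_scorer` without changing the loop; the variables that survive would then be the input to a second-stage model such as the Bayesian networks named in the description.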