
A Two-Stage Random Forest-Based Pathway Analysis Method

Pathway analysis provides a powerful approach for identifying the joint effect of genes grouped into biologically-based pathways on disease. Pathway analysis is also an attractive approach for a secondary analysis of genome-wide association study (GWAS) data that may still yield new results from the...


Bibliographic Details
Main Authors: Chung, Ren-Hua, Chen, Ying-Erh
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3346727/
https://www.ncbi.nlm.nih.gov/pubmed/22586488
http://dx.doi.org/10.1371/journal.pone.0036662
collection PubMed
description Pathway analysis provides a powerful approach for identifying the joint effect of genes grouped into biologically-based pathways on disease. Pathway analysis is also an attractive approach for a secondary analysis of genome-wide association study (GWAS) data that may still yield new results from these valuable datasets. Most of the current pathway analysis methods focused on testing the cumulative main effects of genes in a pathway. However, for complex diseases, gene-gene interactions are expected to play a critical role in disease etiology. We extended a random forest-based method for pathway analysis by incorporating a two-stage design. We used simulations to verify that the proposed method has the correct type I error rates. We also used simulations to show that the method is more powerful than the original random forest-based pathway approach and the set-based test implemented in PLINK in the presence of gene-gene interactions. Finally, we applied the method to a breast cancer GWAS dataset and a lung cancer GWAS dataset and interesting pathways were identified that have implications for breast and lung cancers.
format Online
Article
Text
id pubmed-3346727
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3346727 2012-05-14 PLoS One Research Article Public Library of Science 2012-05-07 /pmc/articles/PMC3346727/ /pubmed/22586488 http://dx.doi.org/10.1371/journal.pone.0036662 Text en Chung, Chen. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Two-Stage Random Forest-Based Pathway Analysis Method
topic Research Article