A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning
Understanding the root molecular and genetic causes driving complex traits is a fundamental challenge in genomics and genetics. Numerous studies have used variation in gene expression to understand complex traits, but the underlying genomic variation that contributes to these expression changes is n...
Main authors: | Bennett, Brian D.; Xiong, Qing; Mukherjee, Sayan; Furey, Terrence S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3441565/ https://www.ncbi.nlm.nih.gov/pubmed/23028573 http://dx.doi.org/10.1371/journal.pone.0044635 |
_version_ | 1782243320129388544 |
---|---|
author | Bennett, Brian D. Xiong, Qing Mukherjee, Sayan Furey, Terrence S. |
author_facet | Bennett, Brian D. Xiong, Qing Mukherjee, Sayan Furey, Terrence S. |
author_sort | Bennett, Brian D. |
collection | PubMed |
description | Understanding the root molecular and genetic causes driving complex traits is a fundamental challenge in genomics and genetics. Numerous studies have used variation in gene expression to understand complex traits, but the underlying genomic variation that contributes to these expression changes is not well understood. In this study, we developed a framework to integrate gene expression and genotype data to identify biological differences between samples from opposing complex trait classes that are driven by expression changes and genotypic variation. This framework utilizes pathway analysis and multi-task learning to build a predictive model and discover pathways relevant to the complex trait of interest. We simulated expression and genotype data to test the predictive ability of our framework and to measure how well it uncovered pathways with genes both differentially expressed and genetically associated with a complex trait. We found that the predictive performance of the multi-task model was comparable to other similar methods. Also, methods like multi-task learning that considered enrichment analysis scores from both data sets found pathways with both genetic and expression differences related to the phenotype. We used our framework to analyze differences between estrogen receptor (ER) positive and negative breast cancer samples. An analysis of the top 15 gene sets from the multi-task model showed they were all related to estrogen, steroids, cell signaling, or the cell cycle. Although our study suggests that multi-task learning does not enhance predictive accuracy, the models generated by our framework do provide valuable biological pathway knowledge for complex traits. |
format | Online Article Text |
id | pubmed-3441565 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3441565 2012-10-01 A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning Bennett, Brian D. Xiong, Qing Mukherjee, Sayan Furey, Terrence S. PLoS One Research Article Understanding the root molecular and genetic causes driving complex traits is a fundamental challenge in genomics and genetics. Numerous studies have used variation in gene expression to understand complex traits, but the underlying genomic variation that contributes to these expression changes is not well understood. In this study, we developed a framework to integrate gene expression and genotype data to identify biological differences between samples from opposing complex trait classes that are driven by expression changes and genotypic variation. This framework utilizes pathway analysis and multi-task learning to build a predictive model and discover pathways relevant to the complex trait of interest. We simulated expression and genotype data to test the predictive ability of our framework and to measure how well it uncovered pathways with genes both differentially expressed and genetically associated with a complex trait. We found that the predictive performance of the multi-task model was comparable to other similar methods. Also, methods like multi-task learning that considered enrichment analysis scores from both data sets found pathways with both genetic and expression differences related to the phenotype. We used our framework to analyze differences between estrogen receptor (ER) positive and negative breast cancer samples. An analysis of the top 15 gene sets from the multi-task model showed they were all related to estrogen, steroids, cell signaling, or the cell cycle. Although our study suggests that multi-task learning does not enhance predictive accuracy, the models generated by our framework do provide valuable biological pathway knowledge for complex traits.
Public Library of Science 2012-09-13 /pmc/articles/PMC3441565/ /pubmed/23028573 http://dx.doi.org/10.1371/journal.pone.0044635 Text en © 2012 Bennett et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Bennett, Brian D. Xiong, Qing Mukherjee, Sayan Furey, Terrence S. A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning |
title | A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning |
title_full | A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning |
title_fullStr | A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning |
title_full_unstemmed | A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning |
title_short | A Predictive Framework for Integrating Disparate Genomic Data Types Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task Learning |
title_sort | predictive framework for integrating disparate genomic data types using sample-specific gene set enrichment analysis and multi-task learning |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3441565/ https://www.ncbi.nlm.nih.gov/pubmed/23028573 http://dx.doi.org/10.1371/journal.pone.0044635 |
work_keys_str_mv | AT bennettbriand apredictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning AT xiongqing apredictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning AT mukherjeesayan apredictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning AT fureyterrences apredictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning AT bennettbriand predictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning AT xiongqing predictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning AT mukherjeesayan predictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning AT fureyterrences predictiveframeworkforintegratingdisparategenomicdatatypesusingsamplespecificgenesetenrichmentanalysisandmultitasklearning |