PaIRKAT: A pathway integrated regression-based kernel association test with applications to metabolomics and COPD phenotypes

Bibliographic Details
Main Authors: Carpenter, Charlie M.; Zhang, Weiming; Gillenwater, Lucas; Severn, Cameron; Ghosh, Tusharkanti; Bowler, Russell; Kechris, Katerina; Ghosh, Debashis
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8565741/
https://www.ncbi.nlm.nih.gov/pubmed/34679079
http://dx.doi.org/10.1371/journal.pcbi.1008986
author Carpenter, Charlie M.
Zhang, Weiming
Gillenwater, Lucas
Severn, Cameron
Ghosh, Tusharkanti
Bowler, Russell
Kechris, Katerina
Ghosh, Debashis
collection PubMed
description High-throughput data such as metabolomics, genomics, transcriptomics, and proteomics have become familiar data types within the “-omics” family. For this work, we focus on subsets that interact with one another and represent these “pathways” as graphs. Observed pathways often have disjoint components, i.e., nodes or sets of nodes (metabolites, etc.) not connected to any other within the pathway, which notably lessens testing power. In this paper we propose the Pathway Integrated Regression-based Kernel Association Test (PaIRKAT), a new kernel machine regression method for incorporating known pathway information into the semi-parametric kernel regression framework. This work extends previous kernel machine approaches. This paper also contributes an application of a graph kernel regularization method for overcoming disconnected pathways. By incorporating a regularized or “smoothed” graph into a score test, PaIRKAT can provide more powerful tests for associations between biological pathways and phenotypes of interest and will be helpful in identifying novel pathways for targeted clinical research. We evaluate this method through several simulation studies and an application to real metabolomics data from the COPDGene study. Our simulation studies illustrate the robustness of this method to incorrect and incomplete pathway knowledge, and the real data analysis shows meaningful improvements of testing power in pathways. PaIRKAT was developed for application to metabolomic pathway data, but the techniques are easily generalizable to other data sources with a graph-like structure.
format Online
Article
Text
id pubmed-8565741
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8565741 2021-11-04
PLoS Comput Biol
Research Article
Public Library of Science 2021-10-22
/pmc/articles/PMC8565741/
/pubmed/34679079
http://dx.doi.org/10.1371/journal.pcbi.1008986
Text
en
© 2021 Carpenter et al
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title PaIRKAT: A pathway integrated regression-based kernel association test with applications to metabolomics and COPD phenotypes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8565741/
https://www.ncbi.nlm.nih.gov/pubmed/34679079
http://dx.doi.org/10.1371/journal.pcbi.1008986