Highly accurate disease diagnosis and highly reproducible biomarker identification with PathFormer
Biomarker identification, for example by fold change or regression analysis, is critical for precise disease diagnosis and for understanding disease pathogenesis in omics data analysis. Graph neural networks (GNNs) have been the dominant deep learning model for analyzing graph-structured data. However, we f...
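Main Authors: | Li, Fuhai; Dong, Zehao; Zhao, Qihang; Payne, Philip; Province, Michael; Cruchaga, Carlos; Zhang, Muhan; Zhao, Tianyu; Chen, Yixin |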
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Journal Experts, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10680938/ https://www.ncbi.nlm.nih.gov/pubmed/38014034 http://dx.doi.org/10.21203/rs.3.rs-3576068/v1 |
_version_ | 1785150747478851584 |
---|---|
author | Li, Fuhai; Dong, Zehao; Zhao, Qihang; Payne, Philip; Province, Michael; Cruchaga, Carlos; Zhang, Muhan; Zhao, Tianyu; Chen, Yixin |
author_sort | Li, Fuhai |
collection | PubMed |
description | Biomarker identification, for example by fold change or regression analysis, is critical for precise disease diagnosis and for understanding disease pathogenesis in omics data analysis. Graph neural networks (GNNs) have been the dominant deep learning model for analyzing graph-structured data. However, we found two major limitations of existing GNNs in omics data analysis: limited prediction (diagnosis) accuracy and limited reproducibility of biomarker identification across multiple datasets. The root of these challenges is the unique graph structure of biological signaling pathways, which consists of a large number of targets and dense, complex signaling interactions among these targets. To resolve these two challenges, we present a novel GNN model architecture, named PathFormer, which systematically integrates signaling networks, prior knowledge, and omics data to rank biomarkers and predict disease diagnosis. In comparisons, PathFormer significantly outperformed existing GNN models in prediction accuracy (~30% accuracy improvement in disease diagnosis) and in the reproducibility of biomarker ranking across different datasets. The improvement was confirmed using two independent Alzheimer’s Disease (AD) and cancer transcriptomic datasets. The PathFormer model can be directly applied to other omics data analysis studies. |
format | Online Article Text |
id | pubmed-10680938 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Journal Experts |
record_format | MEDLINE/PubMed |
spelling | pubmed-10680938 2023-11-27 Res Sq Article American Journal Experts 2023-11-16 /pmc/articles/PMC10680938/ /pubmed/38014034 http://dx.doi.org/10.21203/rs.3.rs-3576068/v1 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
title | Highly accurate disease diagnosis and highly reproducible biomarker identification with PathFormer |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10680938/ https://www.ncbi.nlm.nih.gov/pubmed/38014034 http://dx.doi.org/10.21203/rs.3.rs-3576068/v1 |