Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data
The early symptoms of lung adenocarcinoma patients are inapparent, and the clinical diagnosis of lung adenocarcinoma is primarily through X-ray examination and pathological section examination, whereas the discovery of biomarkers points out another direction for the diagnosis of lung adenocarcinoma...
Main authors: | Qiu, Wang-Ren; Qi, Bei-Bei; Lin, Wei-Zhong; Zhang, Shou-Hua; Yu, Wang-Ke; Huang, Shun-Fa |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9280023/ https://www.ncbi.nlm.nih.gov/pubmed/35846148 http://dx.doi.org/10.3389/fgene.2022.926927 |
_version_ | 1784746540843139072 |
---|---|
author | Qiu, Wang-Ren Qi, Bei-Bei Lin, Wei-Zhong Zhang, Shou-Hua Yu, Wang-Ke Huang, Shun-Fa |
author_facet | Qiu, Wang-Ren Qi, Bei-Bei Lin, Wei-Zhong Zhang, Shou-Hua Yu, Wang-Ke Huang, Shun-Fa |
author_sort | Qiu, Wang-Ren |
collection | PubMed |
description | The early symptoms of lung adenocarcinoma are inconspicuous, and clinical diagnosis relies primarily on X-ray examination and pathological section examination. With the development of bioinformatics technology, the discovery of biomarkers offers another direction for diagnosis. However, diagnosis is not accurate or trustworthy when it rests on omics data with high-dimension, low-sample-size (HDLSS) features or on biomarkers derived from a single type of omics data. To address these problems, feature selection methods from biological analysis are used to reduce the dimension of the gene expression data (GSE19188) and the DNA methylation data (GSE139032, GSE49996). In addition, the Cartesian product method is used to expand the sample set and to integrate the gene expression and DNA methylation data. The classifier is built with a deep neural network and evaluated by K-fold cross-validation. Moreover, gene ontology analysis and literature retrieval are used to analyze the biological relevance of the selected genes, and the TCGA database is used for survival analysis of these potential genes through Kaplan-Meier estimates, with the aim of uncovering the detailed molecular mechanism of lung adenocarcinoma. Survival analysis shows that COL5A2 and SERPINB5 are significant for identifying lung adenocarcinoma and are considered biomarkers of lung adenocarcinoma. |
format | Online Article Text |
id | pubmed-9280023 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-92800232022-07-15 Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data Qiu, Wang-Ren Qi, Bei-Bei Lin, Wei-Zhong Zhang, Shou-Hua Yu, Wang-Ke Huang, Shun-Fa Front Genet Genetics The early symptoms of lung adenocarcinoma patients are inapparent, and the clinical diagnosis of lung adenocarcinoma is primarily through X-ray examination and pathological section examination, whereas the discovery of biomarkers points out another direction for the diagnosis of lung adenocarcinoma with the development of bioinformatics technology. However, it is not accurate and trustworthy to diagnose lung adenocarcinoma due to omics data with high-dimension and low-sample size (HDLSS) features or biomarkers produced by utilizing only single omics data. To address the above problems, the feature selection methods of biological analysis are used to reduce the dimension of gene expression data (GSE19188) and DNA methylation data (GSE139032, GSE49996). In addition, the Cartesian product method is used to expand the sample set and integrate gene expression data and DNA methylation data. The classification is built by using a deep neural network and is evaluated on K-fold cross validation. Moreover, gene ontology analysis and literature retrieving are used to analyze the biological relevance of selected genes, TCGA database is used for survival analysis of these potential genes through Kaplan-Meier estimates to discover the detailed molecular mechanism of lung adenocarcinoma. Survival analysis shows that COL5A2 and SERPINB5 are significant for identifying lung adenocarcinoma and are considered biomarkers of lung adenocarcinoma. Frontiers Media S.A. 2022-06-30 /pmc/articles/PMC9280023/ /pubmed/35846148 http://dx.doi.org/10.3389/fgene.2022.926927 Text en Copyright © 2022 Qiu, Qi, Lin, Zhang, Yu and Huang. 
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Qiu, Wang-Ren Qi, Bei-Bei Lin, Wei-Zhong Zhang, Shou-Hua Yu, Wang-Ke Huang, Shun-Fa Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data |
title | Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data |
title_full | Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data |
title_fullStr | Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data |
title_full_unstemmed | Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data |
title_short | Predicting the Lung Adenocarcinoma and Its Biomarkers by Integrating Gene Expression and DNA Methylation Data |
title_sort | predicting the lung adenocarcinoma and its biomarkers by integrating gene expression and dna methylation data |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9280023/ https://www.ncbi.nlm.nih.gov/pubmed/35846148 http://dx.doi.org/10.3389/fgene.2022.926927 |
work_keys_str_mv | AT qiuwangren predictingthelungadenocarcinomaanditsbiomarkersbyintegratinggeneexpressionanddnamethylationdata AT qibeibei predictingthelungadenocarcinomaanditsbiomarkersbyintegratinggeneexpressionanddnamethylationdata AT linweizhong predictingthelungadenocarcinomaanditsbiomarkersbyintegratinggeneexpressionanddnamethylationdata AT zhangshouhua predictingthelungadenocarcinomaanditsbiomarkersbyintegratinggeneexpressionanddnamethylationdata AT yuwangke predictingthelungadenocarcinomaanditsbiomarkersbyintegratinggeneexpressionanddnamethylationdata AT huangshunfa predictingthelungadenocarcinomaanditsbiomarkersbyintegratinggeneexpressionanddnamethylationdata |